Three-dimensional reconstruction using passive or active stereo vision has been used in industrial environments [1, 2], medical imaging [3] and other fields. High-speed inline measurement systems require accelerated image processing algorithms. One solution is the use of FPGAs [4, 5]. Real-time preprocessing near the sensor, such as lens distortion correction and image rectification, is therefore an advantage.
The physical inhomogeneity of the camera sensor causes non-linear lens effects (radial distortion), and the relative positioning offsets of the camera cause tangential distortion. For industrial image processing it is necessary to correct the lens distortion and to rectify the images. After rectification, the epipolar lines in both stereoscopic images run parallel to the image rows [6, pp. 239]. Dense matching therefore searches along one dimension instead of two (both stereo images are line-correspondent).
The following describes the reverse transformation used for lens distortion correction and image rectification (as used in FPGAs [4, 7, 8]). The reverse transformation (see Figure 1) uses two lens undistortion and image rectification transformation maps per camera. These maps define the horizontal and vertical pixel positions in the raw image. The grey value at such a position is usually an intermediate value within a two-dimensional rectangular grid and is calculated by bilinear interpolation. This requires the four neighboring pixels around the position (see Figure 1, neighboring pixels G0 – G3) and the fractional part of the position. The interpolated grey value is the grey value of the rectified image to be calculated [9, p. 273].
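The reverse transformation with bilinear interpolation can be sketched in a few lines of Python. This is a minimal software sketch, not the FPGA implementation: the function names `bilinear_sample` and `remap`, the list-of-rows image layout, the `invalid` fill value and the ordering of G0 to G3 (top-left, top-right, bottom-left, bottom-right) are assumptions made for illustration.

```python
import math

def bilinear_sample(image, x, y, invalid=0):
    """Interpolate the grey value at fractional position (x, y) of a raw image.

    `image` is a list of rows. Positions whose 2x2 neighbourhood is not
    completely inside the image return `invalid` (assumed behaviour).
    """
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0                     # fractional parts of the position
    if x0 < 0 or y0 < 0 or y0 + 1 >= len(image) or x0 + 1 >= len(image[0]):
        return invalid
    g0, g1 = image[y0][x0], image[y0][x0 + 1]          # upper neighbours
    g2, g3 = image[y0 + 1][x0], image[y0 + 1][x0 + 1]  # lower neighbours
    top = g0 + fx * (g1 - g0)                   # interpolate along the upper row
    bottom = g2 + fx * (g3 - g2)                # interpolate along the lower row
    return top + fy * (bottom - top)            # interpolate between both rows

def remap(image, map_x, map_y):
    """Build the rectified image: one raw-image lookup per rectified pixel."""
    return [
        [bilinear_sample(image, map_x[r][c], map_y[r][c])
         for c in range(len(map_x[0]))]
        for r in range(len(map_x))
    ]
```

Because every rectified pixel is computed from a lookup into the raw image, the rectified image has no holes, which is the usual reason for preferring the reverse over the forward transformation.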
LENS UNDISTORTION AND IMAGE RECTIFICATION MODULE
Figure 1 shows an overview of the lens undistortion and rectification module. One module per camera is required. The module calculates the rectified images by reverse transformation (see section 2): each integer pixel in the rectified image corresponds to one pixel coordinate in the raw image. Since the corresponding pixel position in the raw image is generally not an integer, a bilinear interpolation is performed from the nearby integer pixel locations [10, pp. 437]. Two inputs must be available at the module: the raw image of the camera and the pre-computed compressed rectification map (see sections 2 and 3.1.1). The module consists of four subsystems: dual-port RAM with address calculator, decompression of the compressed rectification map, bilinear interpolation and verification of invalidity. The following subsections describe the two necessary preparatory steps (calibration and generation of compressed rectification maps), the four subsystems and the processing pipeline.
Before the lens undistortion and image rectification module can be used, a camera calibration and the generation of a compressed rectification map per camera (camera setup k, k ∈ {0, 1}) are required.
For the camera calibration, a calibration plate with a circle grid pattern is used [5, 10, pp. 428]. At least 20 images per camera are taken with different positions of the calibration plate. The four rectification maps are calculated from these calibration images. The following OpenCV functions are used to calculate the rectification maps:
1. findCirclesGrid(): Determination of the image points of the circle grid [11].
2. stereoCalibrate(): Used for camera calibration to calculate the camera matrices, the distortion coefficients of both cameras and the rotation and translation vectors between them [11].
3. stereoRectify(): Calculation of the rectification transforms (rotation and projection matrices) for both cameras from the calibration results of step 2.
4. initUndistortRectifyMap(): Generation of rectification maps using the parameters calculated in step 3. Two maps (x and y) are generated for each camera; they contain the floating-point pixel positions in the raw images [11].
Generation of compressed rectification maps
To reduce bandwidth utilization, the rectification maps must be compressed. Algorithm 1 shows how the two UndistRectMaps mapk_x and mapk_y of one camera k are compressed into one map mapk_c. Redundant information is removed using lossy compression. The resulting map mapk_c has a size of N × M bytes, where N is the image width (number of columns) and M is the image height (number of rows). In the first step, the pixel index is subtracted from both UndistRectMaps so that the maps contain only the relative pixel offsets [8]. Each UndistRectMap is then subsampled according to a checkerboard pattern. For mapk_y, the index of this checkerboard pattern is offset by one compared to mapk_x, so that only one missing value has to be interpolated per clock cycle during decompression. Both subsampled UndistRectMaps are merged into mapk_c. The last step reinterprets the values of mapk_c. The offset values from the second column onward are quantized to seven fractional bits. The offset deviations are then calculated as the difference between each value and the value located two columns before it [8], and are stored with a sign bit in addition to the seven fractional bits. The values of the first two columns are rounded, so they consist only of absolute pixel offsets without a fractional part.
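The compression steps can be illustrated with a small Python sketch. It is not Algorithm 1 itself but one possible reading of the description above, under several assumptions: columns are counted from zero (so the "first two columns" are indices 0 and 1), x offsets occupy the checkerboard fields with even row + column, deviations are computed against the value a decoder would reconstruct, and deviations outside the sign + 7-bit range are clamped. The names `compress_maps` and `decompress_row` are hypothetical.

```python
FRAC_BITS = 7                  # seven fractional bits, as stated in the text
SCALE = 1 << FRAC_BITS         # 128 quantization steps per pixel
MAX_DELTA = SCALE - 1          # largest magnitude of a sign + 7-bit code

def compress_maps(map_x, map_y):
    """Merge two float maps (lists of rows) into one map of integer codes."""
    rows, cols = len(map_x), len(map_x[0])
    compressed = []
    for r in range(rows):
        codes, recon = [], []  # recon mirrors what the decoder reconstructs
        for c in range(cols):
            # Steps 1-3: relative offset, checkerboard subsampling, merging.
            if (r + c) % 2 == 0:
                offset = map_x[r][c] - c      # horizontal offset
            else:
                offset = map_y[r][c] - r      # vertical offset
            if c < 2:
                code = round(offset)          # absolute offset, no fraction
                value = float(code)
            else:
                # Deviation from the same-type value two columns before.
                delta = round((offset - recon[c - 2]) * SCALE)
                code = max(-MAX_DELTA, min(MAX_DELTA, delta))
                value = recon[c - 2] + code / SCALE
            codes.append(code)
            recon.append(value)
        compressed.append(codes)
    return compressed

def decompress_row(codes):
    """Rebuild the merged offsets of one row from its codes."""
    values = []
    for c, code in enumerate(codes):
        values.append(float(code) if c < 2 else values[c - 2] + code / SCALE)
    return values
```

In this reading, the value two columns before is always of the same type (x or y) because the merged row alternates between both maps. As long as no deviation is clamped, the reconstructed offsets from column index 2 onward differ from the original offsets by at most half a quantization step (1/256 pixel).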
The proposed lens undistortion and image rectification module was realized with the System Generator tool. The generated IP core is integrated into a Vivado project (with image acquisition via LVDS and image output via HDMI and GigE). The passive stereo system used is based on a Xilinx Zynq-7020 SoC. The two cameras are equipped with e2v EV76C570 CMOS sensors. The lens undistortion and image rectification IP core is implemented in the programmable logic (Zynq PL).
The module consists essentially of four main parts (see Figure 1): dual-port RAM with BRAM address calculator, decompression, bilinear interpolation and verification of invalidity. Figure 2 shows the modified hardware architecture of the lens undistortion and image rectification module for one camera. The bold arrows show the main path of the image. In contrast to the hardware architecture presented in [8], a total of four dual-port 36 k RAMs are used for intermediate buffering of 50 rows of the raw image. The number of rows to be stored depends on the maximum possible vertical displacement, which in turn depends on the base distance and the angle between both cameras. The maximum vertical displacement is obtained from the transformation map UndistRectMapk_y. The compressed rectification map is loaded synchronously by VDMA from the external memory. Figure 3 shows a block diagram of the decompression subsystem. The compressed transformation map mapk_c is decompressed in reverse order to compression Algorithm 1 (see subsection 3.1.2). A line buffer is required to interpolate the missing values. After decompression, two integer and two fractional values are available, which describe the vertical and horizontal displacement [5]. The BRAM address calculator uses the two integer values to determine the position of the grey value G0 stored in the dual-port RAM. The bilinear interpolation is calculated from the two fractional values and the four neighbouring grey values G0 − G3 (see Figure 1), which serve as its grid points. The Verify of Invalidity subsystem checks whether the pixel position of the calculated value is within the image area.
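How a decompressed displacement might be split into an integer and a fractional part, and how the integer parts could address a ring buffer, is sketched below. The sketch is an illustration only: it models the raw-image buffer as a single linear memory of 50 rows rather than four dual-port RAMs, and the names `split_offset` and `buffer_address` as well as the fixed-point layout are assumptions.

```python
def split_offset(offset, frac_bits=7):
    """Split a displacement into (integer part, fractional part in 1/2**frac_bits)."""
    fixed = round(offset * (1 << frac_bits))     # fixed-point representation
    integer = fixed >> frac_bits                 # floor, also for negative values
    fraction = fixed & ((1 << frac_bits) - 1)    # always non-negative
    return integer, fraction

def buffer_address(row, col, dy_int, dx_int, width, depth=50):
    """Linear address of G0 in a ring buffer holding `depth` raw-image rows."""
    src_row = row + dy_int                       # row of G0 in the raw image
    src_col = col + dx_int                       # column of G0 in the raw image
    return (src_row % depth) * width + src_col   # rows wrap around the buffer
```

For example, `split_offset(-0.25)` yields `(-1, 96)`, since -1 + 96/128 = -0.25. The fractional parts would then act as the interpolation weights, while the integer parts select G0.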
Interfaces and processing pipeline
Figure 2 shows the architecture and interfaces of this module. The module calculates a rectified image pixel by pixel. To do this, both data streams, which are transferred over an AXI4-Stream interface [12, pp. 5], must be synchronous but offset against each other accordingly, which makes the data management complex. Figure 4 shows a simplified timing diagram. It contains the two input streams and the output stream: the buffered values of the raw image stream, the decompressed values of the compressed transformation map mapk_c and the values of the rectified image stream. Both input signals are provided via slave AXI4-Stream interfaces. Initially, some rows are buffered in four ring buffers (see range A). The number of buffered rows depends on the baseline between the two cameras and on the camera orientation. In our case 50 rows are sufficient. This buffering is necessary so that grey values are available at the calculated coordinates during the reverse transformation (see section 2). After a rectified value has been calculated, the next pixel of the raw image stream is buffered in one of the ring buffers (see range B). The pixels of the stored compressed transformation map are requested with a time offset to this image data stream. The map is requested from the VDMA via the signal fsync. A rectified pixel is calculated as soon as the ring buffer is filled accordingly and the two corresponding decompressed shift values (horizontal and vertical from the transformation map) are available. This condition is fulfilled for the first time at the beginning of range B. The rectification continues for a further 50 rows after the entire raw image has been written to the ring buffers (see range C). During this time it is possible to write the first 50 rows of the next raw image into the ring buffers. The calculated rectified image is output via the master AXI4-Stream interface m_axis.
“This module is also configured via the Slave AXI4-Lite interface s_axi_lite. This interface inform about possible error messages such as synchronization error or an overflow when the fill level of AXI FIFO ImgRect is exceeded.” [5]
PERFORMANCE OF THE SYSTEM
Use of resources
The undistortion and rectification IP core needs the following resources (see Table 1) for a maximum image size of 1280 Px × 960 Px and a maximum and minimum permissible vertical shift of ± 24 Px. These values can be changed if desired. The IP core occupies 12% of the BRAM of a Xilinx Zynq 7020 SoC. This is made up of four ring buffers (buffered raw image: 2 · (24 Px + 1) · imgwidth_max · 8 bit) and one line buffer (buffered compressed rectification map: imgwidth_max · 8 bit).
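The two buffer formulas can be evaluated for the stated configuration with a few lines of Python. The sketch reads the ring-buffer formula as the total size of the buffered raw image, which is an assumption; the function names are hypothetical, and the mapping of these bits onto individual BRAM primitives is not modelled.

```python
def ring_buffer_bits(max_shift_px, width_px, bits_per_pixel=8):
    """Raw-image buffer: 2 * (max shift + 1) rows of the maximum image width."""
    return 2 * (max_shift_px + 1) * width_px * bits_per_pixel

def line_buffer_bits(width_px, bits_per_entry=8):
    """Line buffer for one row of the compressed rectification map."""
    return width_px * bits_per_entry

ring_bits = ring_buffer_bits(max_shift_px=24, width_px=1280)   # 50 rows
line_bits = line_buffer_bits(width_px=1280)
print(ring_bits, line_bits)    # 512000 10240
```

With a vertical shift of ± 24 Px the formula yields 2 · 25 = 50 buffered rows, which matches the 50 rows named for the ring buffers.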
Table 1. Resource utilization of the UndistRect IP core in a Xilinx Zynq 7020 SoC for an image size of 1280 Px × 960 Px and a maximum vertical shift of 24 rows.
Deviations from different calculated rectified images
The lens distortion correction and image rectification were calculated both without and with a compressed transformation map. Compared to the use of uncompressed transformation maps, the rectified image deviates by at most one to five grey values (along strong gradients) when the offset values are stored with a precision of seven fractional bits.
CONCLUSION AND FUTURE WORK
This paper presents an efficient FPGA-based lens distortion correction and image rectification module. To reduce the bandwidth utilization, it is necessary to use compressed lens undistortion and rectification transformation maps. The effective lossy compression of these maps in particular makes pixel-by-pixel rectification of an image possible. Table 2 shows the bandwidth utilization of the different compressed undistortion and rectification transformation maps. With the new extension of the compression algorithm (see subsection 3.1.2), the bandwidth utilization is reduced by a factor of eight. The extension of the compression algorithm in Ref. 8 is the additional subsampling of the vertical and horizontal maps (see Algorithm 1). Rectified images generated with compressed transformation maps show only minor grey value deviations from those generated with uncompressed maps (see subsection 4.2).
Table 2. Difference in memory utilization between OpenCV and different compressed UndistRectMaps (abbr. maps).
|Maps per camera||Map type||Data type||Size (byte)||Memory load (MB/s) at 2 Mpix, 60 fps||Bandwidth utilization (a) (%)|
|OpenCV||x||float||(M × N) · 4||960 (x and y combined)||22.5 (x and y combined)|
|OpenCV||y||float||(M × N) · 4||(included above)||(included above)|
|Compressed||merged [8]||unsigned short||(M × N) · 2||240||5.6|
|Compressed||subsampled and merged||unsigned short||(M × N) · 1||120||2.8|
(a) Zynq-7000 32-bit DDR3 memory controller: maximum theoretical bandwidth 4267 MB/s [13, p. 13].
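The memory load and bandwidth utilization figures follow directly from the map size, the frame rate and the theoretical DDR3 bandwidth. The short Python sketch below reproduces them; the function names are hypothetical, and 2 Mpix is taken as 2 · 10^6 pixels, which is an assumption about the rounding used.

```python
DDR3_BANDWIDTH_MB_S = 4267        # maximum theoretical bandwidth from the footnote

def map_load_mb_per_s(pixels, bytes_per_pixel, fps):
    """Memory load in MB/s for streaming the rectification map(s) of one camera."""
    return pixels * bytes_per_pixel * fps / 1e6

def utilization_percent(load_mb_per_s):
    """Share of the theoretical DDR3 bandwidth in percent."""
    return 100.0 * load_mb_per_s / DDR3_BANDWIDTH_MB_S

PIXELS, FPS = 2_000_000, 60
for name, bytes_per_pixel in [
    ("OpenCV (x and y, float)", 2 * 4),   # two float maps, 4 bytes each
    ("compressed, merged", 2),
    ("compressed, subsampled and merged", 1),
]:
    load = map_load_mb_per_s(PIXELS, bytes_per_pixel, FPS)
    print(f"{name}: {load:.0f} MB/s, {utilization_percent(load):.1f} %")
```

The ratio between the first and the last variant is 8 bytes to 1 byte per pixel, which corresponds to the reduction by a factor of eight.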
There are several applications of this FPGA-based lens undistortion and image rectification module, such as its use in a stereo-based phase measuring profilometry system [5]. The presented lens undistortion and rectification IP core is used in a passive stereo system (using a Xilinx Zynq-7020 SoC) without structured-light illumination.
Within the research group DIADEM, the stereo system will be integrated into a sensor arrangement with free-form projection in future work. The planned stereo sensor setup has a working distance of 500 mm, a baseline of about 100 mm and a triangulation angle of 11°.
This research is supported by the Free State of Thuringia, the European Social Fund (ESF) of the European Union and the Thüringer Aufbaubank (TAB) within the research group DIADEM (2016 FGR 0044). Furthermore, this work is funded by the Federal Ministry of Education and Research within the project FASTER (BMBF, FKZ: 03ZZZZ0442E).