Open Access Paper
FPGA-based lens undistortion and image rectification for stereo vision applications
17 September 2019
Christina Junger, Albrecht Heß, Maik Rosenberger, and Gunther Notni
Proceedings Volume 11144, Photonics and Education in Measurement Science 2019; 1114416 (2019) https://doi.org/10.1117/12.2530692
Event: Joint TC1 - TC2 International Symposium on Photonics and Education in Measurement Science 2019, Jena, Germany
Abstract
Lens undistortion and image rectification is a common pre-processing step, e.g. for active or passive stereo vision, to reduce the complexity of the search for matching points. The undistortion and rectification are implemented in a field programmable gate array (FPGA). The algorithm is performed pixel by pixel. The challenges of the implementation are the synchronisation of the data streams and the limited memory bandwidth. Due to the memory constraints, the algorithm utilises a pre-computed lossy compression of the rectification maps with a compression ratio of eight. The compressed maps occupy less space by omitting the pixel indices, sub-sampling both maps, and reducing repeated information within a row by forming differences to adjacent pixels. Undistorted and rectified images are calculated once without and once with the compressed transformation map; the deviation between the two computed images is minimal and negligible. The functionality of the hardware module, the decompression algorithm and the processing pipeline are described. The algorithm is validated on a Xilinx Zynq-7020 SoC. The stereo setup has a baseline of 46 mm and non-converging optical axes. The cameras are configured at 1.3 Mpix @ 60 fps, and distortion correction and rectification are performed in real time during image capture. With a camera resolution of 1280 pixels × 960 pixels and a maximum vertical shift of ±20 pixels, the efficient hardware implementation utilizes 12 % of the available block RAM resources.

1.

INTRODUCTION

Three-dimensional reconstruction using passive or active stereo vision has been used in applications such as industrial environments,1, 2 medical imaging3 and other fields. For high-speed inline measurement systems, an acceleration of the image processing algorithms is required. One solution is the use of FPGAs.4, 5 Real-time pre-processing near the sensor, such as lens distortion correction and image rectification, is therefore an advantage.

2.

THEORETICAL BACKGROUND

Physical imperfections of the lens cause non-linear distortion effects (radial distortion), and the relative positioning offset between lens and sensor causes tangential distortion. For industrial image processing it is necessary to correct the lens distortion and to rectify the images. After rectification, the epipolar lines in both stereoscopic images run parallel to the image rows [6, pp. 239]. Thus, dense matching takes place in a 1D search area instead of a 2D search area (corresponding points lie in the same row of both stereo images).

The following describes the reverse transformation for lens undistortion and image rectification (as used in FPGAs4, 7, 8). The reverse transformation (see Figure 1) uses two lens undistortion and image rectification transformation maps per camera. These maps define the horizontal and vertical pixel positions in the raw image. Since this position generally falls between the grid points of the two-dimensional rectangular pixel grid, the grey value there is calculated by bilinear interpolation, which requires the four neighbouring pixels (see Figure 1, neighbouring pixels G0 – G3) and the fractional part of the pixel position. The interpolated grey value corresponds to the grey value of the rectified image to be calculated. [9, p. 273]

Figure 1.

Overview of the lens undistortion and image rectification module.

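As an illustration of this reverse transformation, the following Python/NumPy sketch computes a rectified image from a raw image and the two float transformation maps; the function and variable names are ours, not the authors':

```python
import numpy as np

def remap_bilinear(raw, map_x, map_y):
    """Reverse transformation: for every integer pixel (m, n) of the
    rectified image, look up the fractional source position
    (map_y[m, n], map_x[m, n]) in the raw image and interpolate the
    grey value bilinearly from the four neighbours G0..G3."""
    H, W = raw.shape
    out = np.zeros(map_x.shape, dtype=np.float64)
    for m in range(map_x.shape[0]):
        for n in range(map_x.shape[1]):
            x, y = map_x[m, n], map_y[m, n]
            x0, y0 = int(np.floor(x)), int(np.floor(y))
            if not (0 <= x0 < W - 1 and 0 <= y0 < H - 1):
                continue                       # invalid position: keep 0
            dx, dy = x - x0, y - y0            # fractional parts
            g0, g1 = raw[y0, x0], raw[y0, x0 + 1]          # upper neighbours
            g2, g3 = raw[y0 + 1, x0], raw[y0 + 1, x0 + 1]  # lower neighbours
            out[m, n] = ((1 - dy) * ((1 - dx) * g0 + dx * g1)
                         + dy * ((1 - dx) * g2 + dx * g3))
    return np.round(out).astype(raw.dtype)
```

This is the same computation that OpenCV's remap() performs with bilinear interpolation; the FPGA module described below realizes the identical arithmetic as a streaming pipeline.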

3.

LENS UNDISTORTION AND IMAGE RECTIFICATION MODULE

Figure 1 shows an overview of the lens undistortion and rectification module. One module per camera is required. The module calculates the rectified image by reverse transformation (see section 2): each integer pixel position in the rectified image corresponds to one pixel coordinate in the raw image. Since the corresponding pixel position in the raw image is in general not an integer, a bilinear interpolation is performed from the nearby integer pixel locations [10, pp. 437]. Two data streams must be available at the input of the module: the raw image of the camera and the pre-computed compressed rectification map (see sections 2 and 3.1.1). The module consists of four subsystems: a dual-port RAM with address calculator, the decompression of the compressed rectification map, a bilinear interpolation, and the verification of invalid values. The following subsections describe the two necessary preparatory steps (calibration and generation of the compressed rectification maps), the four subsystems and the processing pipeline.

3.1

Preparatory steps

Before using the lens undistortion and image rectification module, a camera calibration and the generation of a compressed rectification map per camera (camera index k, k ∈ {0, 1}) are required.

3.1.1

Calibration

For the camera calibration, a calibration plate with a circle grid pattern is used [5, 10, pp. 428]. At least 20 images per camera are taken with different positions of the calibration plate. The four rectification maps are calculated from these calibration images using the following OpenCV functions (a sketch of these steps follows the list):

  1. findCirclesGrid(): Determines the image points of the circle grid.11

  2. stereoCalibrate(): Performs the stereo camera calibration; it calculates the camera matrices, the distortion coefficients of both cameras and the rotation and translation between them.11

  3. stereoRectify(): Calculates the rectification transformation matrices for both cameras and the 4 × 4 reprojection matrix Q (necessary for 3D reconstruction5).11

  4. initUndistortRectifyMap(): Generates the rectification maps from the parameters calculated in step 3: two maps (x and y) per camera containing the floating-point pixel positions in the raw images.11
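The four steps can be sketched with the OpenCV Python bindings as follows. This is an illustration only: the circle-grid geometry, the grid pitch and the iterable calibration_pairs of grayscale stereo image pairs are our assumptions, not values from the paper.

```python
import cv2
import numpy as np

grid_size = (11, 7)      # circles per row x per column (assumption)
spacing = 0.02           # grid pitch in metres (assumption)
img_size = (1280, 960)   # sensor resolution from section 4.1

# Ideal object points of the circle grid on the z = 0 plane.
objp = np.zeros((grid_size[0] * grid_size[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:grid_size[0], 0:grid_size[1]].T.reshape(-1, 2) * spacing

obj_pts, pts_l, pts_r = [], [], []
for img_l, img_r in calibration_pairs:   # >= 20 stereo image pairs (see text)
    # Step 1: detect the circle-grid image points in both views.
    ok_l, c_l = cv2.findCirclesGrid(img_l, grid_size)
    ok_r, c_r = cv2.findCirclesGrid(img_r, grid_size)
    if ok_l and ok_r:
        obj_pts.append(objp)
        pts_l.append(c_l)
        pts_r.append(c_r)

# Step 2: intrinsics per camera, then the stereo extrinsics
# (rotation R and translation T between the two cameras).
_, K1, d1, _, _ = cv2.calibrateCamera(obj_pts, pts_l, img_size, None, None)
_, K2, d2, _, _ = cv2.calibrateCamera(obj_pts, pts_r, img_size, None, None)
_, K1, d1, K2, d2, R, T, E, F = cv2.stereoCalibrate(
    obj_pts, pts_l, pts_r, K1, d1, K2, d2, img_size,
    flags=cv2.CALIB_FIX_INTRINSIC)

# Step 3: rectification transforms R1/R2, projections P1/P2 and the
# 4 x 4 reprojection matrix Q needed for 3D reconstruction.
R1, R2, P1, P2, Q, roi1, roi2 = cv2.stereoRectify(K1, d1, K2, d2,
                                                  img_size, R, T)

# Step 4: per camera two float maps (x and y) that hold, for every
# rectified pixel, the corresponding pixel position in the raw image.
map0_x, map0_y = cv2.initUndistortRectifyMap(K1, d1, R1, P1, img_size,
                                             cv2.CV_32FC1)
map1_x, map1_y = cv2.initUndistortRectifyMap(K2, d2, R2, P2, img_size,
                                             cv2.CV_32FC1)
```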

3.1.2

Generation of compressed rectification maps

To reduce the bandwidth utilization, the rectification maps must be compressed. Algorithm 1 shows the steps to compress the two UndistRectMaps mapk_x and mapk_y of one camera k into one compressed map mapk_c. Redundant information is removed using lossy compression. The resulting map mapk_c has a size of N × M bytes, where N is the image width (number of columns) and M is the image height (number of rows). In the first step, the pixel indices are subtracted from both UndistRectMaps, so that the maps contain only the relative pixel offsets.8 Then each UndistRectMap is subsampled according to a checkerboard pattern. For mapk_y, this checkerboard pattern is offset by one compared to mapk_x, so that during decompression only one missing value has to be interpolated per clock cycle. Both subsampled UndistRectMaps are merged into mapk_c. The last step is the requantization of the values of mapk_c. From the third column onward, the offset values are quantized to seven binary fractional bits, and the offset deviations are calculated by subtracting from each value the value located two columns before;8 these deviations are stored with a sign bit in addition to the seven fractional bits. The values of the first two columns are rounded instead and thus consist only of absolute pixel offsets without fractional places.
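Since Algorithm 1 is reproduced only as a figure, the following Python/NumPy sketch re-implements the compression as described above. The bit packing (an int16 array standing in for the 8-bit codes) and the use of the reconstructed value as the difference reference are our assumptions:

```python
import numpy as np

def compress_maps(map_x, map_y):
    """Compress the two UndistRectMaps of one camera into a single
    byte-per-pixel map (software model of Algorithm 1)."""
    M, N = map_x.shape
    cols, rows = np.meshgrid(np.arange(N), np.arange(M))

    # Step 1: subtract the pixel index -> relative pixel offsets only.
    off_x = map_x - cols
    off_y = map_y - rows

    # Step 2: checkerboard subsampling; the pattern for the y map is
    # offset by one against the x map, merged into one M x N map.
    merged = np.where((rows + cols) % 2 == 0, off_x, off_y)

    # Step 3: requantize. The first two columns hold rounded absolute
    # offsets (integers only); every further column holds the difference
    # to the value two columns before (same source map within a row),
    # quantized to a sign bit plus seven fractional bits (1/128 px steps).
    map_c = np.zeros((M, N), dtype=np.int16)  # int16 stands in for 8-bit codes
    map_c[:, :2] = np.round(merged[:, :2])
    recon = merged.copy()
    recon[:, :2] = map_c[:, :2]               # what the decoder reconstructs
    for n in range(2, N):
        diff = merged[:, n] - recon[:, n - 2] # reference the reconstruction
        code = np.round(diff * 128)           # assumed 8-bit signed code
        map_c[:, n] = code
        recon[:, n] = recon[:, n - 2] + code / 128.0
    return map_c
```

One byte per pixel thus replaces the 2 × 4 bytes per pixel of the two float maps, which is the compression ratio of eight stated in the abstract.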

3.2

Hardware architecture

The proposed lens undistortion and image rectification module was realized with the System Generator tool. The generated IP core is integrated into a Vivado project (with image acquisition via LVDS and image output via HDMI and GigE). The passive stereo system used is based on a Xilinx Zynq-7020 SoC. The two cameras are equipped with e2v EV76C570 CMOS sensors. The lens undistortion and image rectification IP core is implemented in the programmable logic (Zynq PL).

Algorithm 1

Compress UndistRectMaps mapk_* for camera k


The module consists essentially of four main parts (see Figure 1): a dual-port RAM with BRAM address calculator, the decompression, the bilinear interpolation and the verification of invalid values. Figure 2 shows the modified hardware architecture of the lens undistortion and image rectification module for one camera. The bold arrows show the main path of the image. In contrast to the hardware architecture presented in Ref. 8, a total of four dual-port 36 Kb RAMs is used for the intermediate buffering of 50 rows of the raw image. The number of rows to be stored depends on the maximum possible vertical displacement, which in turn depends on the baseline and the angle between the two cameras. The maximum vertical displacement is obtained from the transformation map UndistRectMapk_y. The compressed rectification map is loaded synchronously by VDMA from the external memory. Figure 3 shows a block diagram of the decompression subsystem. The compressed transformation map mapk_c is decompressed by reversing the compression of Algorithm 1 (see subsection 3.1.2). A line buffer is required to interpolate the missing values. After decompression, two integer and two fractional values are available, which encode the vertical and horizontal displacement.5 The BRAM address calculator uses the two integer values to determine the position of the grey value G0 stored in the dual-port RAM. The bilinear interpolation is calculated from the two fractional values and the four neighbouring grey values G0 − G3 (see Figure 1), which serve as the grid points of the interpolation. The Verify of Invalidity subsystem checks whether the pixel position of the calculated value lies within the image area.

Figure 2.

Architecture of the lens undistortion and image rectification module.


Figure 3.

Decompression subsystem from figures 1 and 2: recalculation of the horizontal x(m,n) and vertical y(m,n) displacement values from the compressed rectification map (M × N).

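In software, the decompression can be modelled as the inverse of the compression sketch from subsection 3.1.2. This is an illustration, not the RTL; in particular, the neighbourhood used to fill the checkerboard holes (the step for which the hardware needs its line buffer) is our assumption:

```python
import numpy as np

def decompress_map(map_c):
    """Inverse of compress_maps(): recover the horizontal x(m, n) and
    vertical y(m, n) displacement fields from the merged byte map."""
    M, N = map_c.shape
    merged = np.zeros((M, N))
    merged[:, :2] = map_c[:, :2]              # absolute integer offsets
    for n in range(2, N):
        # Undo the difference coding: add the decoded 1/128 px step to
        # the reconstructed value two columns before.
        merged[:, n] = merged[:, n - 2] + map_c[:, n] / 128.0

    cols, rows = np.meshgrid(np.arange(N), np.arange(M))
    x = np.where((rows + cols) % 2 == 0, merged, np.nan)  # x samples
    y = np.where((rows + cols) % 2 == 1, merged, np.nan)  # y samples

    # Fill the checkerboard holes from the row neighbours (assumption).
    for a in (x, y):
        for m in range(M):
            for n in range(N):
                if np.isnan(a[m, n]):
                    nb = [a[m, n + d] for d in (-1, 1)
                          if 0 <= n + d < N and not np.isnan(a[m, n + d])]
                    a[m, n] = sum(nb) / len(nb)
    return x, y
```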

3.3

Interfaces and processing pipeline

Figure 2 shows the architecture and interfaces of the module. The module calculates the rectified image pixel by pixel. To do this, both data streams, which are transferred over AXI4-Stream interfaces [12, pp. 5], must be synchronous, but offset accordingly, which makes the data management complex. Figure 4 shows a simplified timeline. The two input streams and the output stream are visible: buffered values from the raw image stream, decompressed values from the compressed transformation map mapk_c, and rectified values of the output image stream. Both input signals are provided via slave AXI4-Stream interfaces. Initially, some rows are buffered in four ring buffers (range A). The number of buffered rows depends on the baseline between the two cameras and the camera orientation; in our case 50 rows are sufficient. This is necessary so that grey values are available at the calculated coordinates during the reverse transformation (see section 2). After a rectified value has been calculated, the next pixel of the raw image stream is buffered in one of the ring buffers (range B). The pixels of the stored compressed transformation map are requested with a time offset to this image data stream; the map is requested from the VDMA via the signal fsync. A rectified pixel is calculated as soon as the ring buffer is sufficiently filled and the two corresponding decompressed shift values (horizontal and vertical from the transformation map) are available. This is fulfilled for the first time at the beginning of range B. The rectification continues for a further 50 rows after the entire raw image has been written to the ring buffers (range C). During this time, the first 50 rows of the next raw image can already be written into the ring buffers. The calculated rectified image is provided via the master AXI4-Stream interface m_axis.

Figure 4.

Simplified timeline of the processing pipeline: filling the buffers and starting to decompress the transformation map (range A); starting the pixelwise rectification (range B); rectifying the values of the last 50 rows and starting to save the values of the next raw image (range C).

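The addressing behind this pipeline can be summarised in a small software model. It simplifies the four dual-port RAMs to one linear buffer holding the most recent 50 raw rows and assumes that the decompressed integer values are relative displacements added to the output coordinates; all names are ours:

```python
ROWS_BUF = 50       # 2 * (24 + 1) buffered raw-image rows (section 4.1)
IMG_WIDTH = 1280    # raw-image width in pixels

def g0_buffer_address(m, n, y_int, x_int):
    """Ring-buffer address of grey value G0 for output pixel (m, n),
    given the integer parts (y_int, x_int) of the decompressed
    displacement. Rows are kept modulo the buffer depth, so a raw row
    stays addressable until ROWS_BUF newer rows have arrived."""
    src_row = m + y_int   # vertical source position in the raw image
    src_col = n + x_int   # horizontal source position
    return (src_row % ROWS_BUF) * IMG_WIDTH + src_col
```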

The module is also configured via the slave AXI4-Lite interface s_axi_lite; this interface reports possible errors such as a synchronization error or an overflow when the fill level of the AXI FIFO ImgRect is exceeded.5

4.

PERFORMANCE OF THE SYSTEM

4.1

Use of resources

The undistortion and rectification IP core requires the following resources (see Table 1) for a maximum image size of 1280 Px × 960 Px and a maximum/minimum permissible vertical shift of ±24 Px; these values are configurable. The IP core produces a BRAM load of 12 % on a Xilinx Zynq-7020 SoC. This is made up of four ring buffers (buffered raw image: 2 · (24 Px + 1) · imgwidth_max · 8 bit) and one line buffer (buffered compressed rectification map: imgwidth_max · 8 bit).

Table 1.

Resource utilization of the UndistRect IP core on a Xilinx Zynq-7020 SoC, at a set image size of 1280 Px × 960 Px and a set vertical shift maximum of 24 rows.

Resource        | LUT | LUTRAM | FF | BRAM | DSP | IO | BUFG
Utilization (%) | 3   | 1      | 2  | 12   | 7   | 33 | 3
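As a plausibility check, the buffer sizes quoted above can be evaluated against the 140 36 Kb block RAMs of a Zynq-7020; the mapping of bits to BRAMs is idealized here:

```python
img_width_max = 1280
rows_buffered = 2 * (24 + 1)                       # = 50 raw-image rows

ring_buffer_bits = rows_buffered * img_width_max * 8   # 512,000 bits
line_buffer_bits = img_width_max * 8                   # 10,240 bits
total_bits = ring_buffer_bits + line_buffer_bits       # 522,240 bits

bram36_bits = 36 * 1024                  # capacity of one 36 Kb BRAM
brams_zynq7020 = 140                     # 36 Kb BRAMs in a Zynq-7020

print(total_bits / bram36_bits)                           # ~14.2 BRAMs
print(100 * total_bits / (bram36_bits * brams_zynq7020))  # ~10.1 %
```

The gap between this idealized 10 % and the reported 12 % is plausibly due to the granularity of BRAM allocation and the additional AXI FIFOs.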

4.2

Deviations between the differently calculated rectified images

The lens distortion correction and image rectification was calculated both without and with the compressed transformation map. Compared to the use of uncompressed transformation maps, the rectified image shows a deviation of at most one to five grey values (along strong gradients) when using a binary precision of seven fractional bits for the offset values.
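Using the software sketches from sections 2 and 3.1, this comparison can be reproduced as follows (raw denotes a captured raw image and map0_x, map0_y the float maps of camera 0; the one-to-five grey-value figure above is the paper's measured result, not ours):

```python
import numpy as np

M, N = map0_x.shape
cols, rows = np.meshgrid(np.arange(N), np.arange(M))

# Round-trip the maps through compression and decompression, then rectify
# once with the original and once with the round-tripped maps.
x_off, y_off = decompress_map(compress_maps(map0_x, map0_y))
ref = remap_bilinear(raw, map0_x, map0_y)               # uncompressed maps
test = remap_bilinear(raw, x_off + cols, y_off + rows)  # compressed maps

print(np.abs(test.astype(int) - ref.astype(int)).max())  # max. deviation
```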

5.

CONCLUSION AND FUTURE WORK

This paper presents an efficient FPGA-based lens distortion correction and image rectification module. To reduce the bandwidth utilization, compressed lens undistortion and rectification transformation maps are used. It is in particular the effective lossy compression of these maps that makes the pixel-by-pixel rectification of an image possible. Table 2 shows the bandwidth utilization of the differently compressed undistortion and rectification transformation maps. With the new extension of the compression algorithm (see subsection 3.1.2), a reduction of the bandwidth utilization by a factor of eight is achieved. The extension over the compression algorithm in Ref. 8 is the additional subsampling of the vertical and horizontal maps (see Algorithm 1). Between the rectified images generated without and with compressed transformation maps, only minor grey-value deviations occur (see subsection 4.2).

Table 2.

Difference in memory utilization between the OpenCV maps and the differently compressed UndistRectMaps (abbr. maps).

Maps per camera | Map type            | Data type      | Size (byte) | Memory load (MB/s) @ 2 Mpix, 60 fps | Bandwidth utilization a (%)
OpenCV          | x                   | float          | (M × N) · 4 | 960                                 | 22.5
OpenCV          | y                   | float          | (M × N) · 4 |                                     |
Compressed      | merged8             | unsigned short | (M × N) · 2 | 240                                 | 5.6
Compressed      | subsampled & merged | unsigned char  | (M × N) · 1 | 120                                 | 2.8

a Zynq-7000 32-bit DDR3 memory controller: maximum theoretical bandwidth 4267 MB/s [13, p. 13]
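The memory-load column follows directly from the map size, the data type and the frame rate. For the two uncompressed float maps, for example, the load is 2 · 4 byte · 2 Mpix · 60 fps = 960 MB/s, i.e. 960/4267 ≈ 22.5 % of the theoretical bandwidth; the single subsampled and merged byte map analogously yields 1 · 1 byte · 2 Mpix · 60 fps = 120 MB/s, i.e. 2.8 %.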

There are several applications for this FPGA-based lens undistortion and image rectification module, such as its use in a stereo-based phase measuring profilometry system.5 The presented lens undistortion and rectification IP core is used in a passive stereo system (based on a Xilinx Zynq-7020 SoC) without structured-light illumination.

Within the research group DIADEM, the stereo system will be integrated into a sensor arrangement with free-form projection in future work. The planned stereo sensor setup has a working distance of 500 mm, a baseline of about 100 mm and a triangulation angle of 11°.

ACKNOWLEDGMENTS

This research is supported by the Free State of Thuringia, the European Social Fund (ESF) of the European Union and the Thüringer Aufbaubank (TAB) within the research group DIADEM (2016 FGR 0044). Furthermore, this work is funded by the Federal Ministry of Education and Research within the project FASTER (BMBF, FKZ: 03ZZZZ0442E).

REFERENCES

[1] 

Winkler, S., Rosenberger, M., Höhne, D., Munkelt, C., Liu, C., and Notni, G., "3D image acquisition and processing with high continuous data throughput for human-machine-interaction and adaptive manufacturing," in Engineering for a Changing World: Proceedings, 59th IWK, Ilmenau Scientific Colloquium (2017).

[2] 

Munkelt, C., Heinze, M., Zimmermann, T., Kühmstedt, P., and Notni, G., "3D-Sensornetzwerk mit geringer Latenz für die Echtzeit-Objektrekonstruktion," 119. Jahrestagung der DGaO, Aalen (2018).

[3] 

Nam, K. W., Park, J., Kim, I., and Kim, K., "Application of stereo-imaging technology to medical field," Healthcare Informatics Research 18, 158–163 (2012). https://doi.org/10.4258/hir.2012.18.3.158

[4] 

Zhan, G., Tang, H., Zhong, K., Li, Z., Shi, Y., and Wang, C., "High-speed FPGA-based phase measuring profilometry architecture," Opt. Express 25, 10553–10564 (2017). https://doi.org/10.1364/OE.25.010553

[5] 

Hess, A., Junger, C., Rosenberger, M., and Notni, G., "FPGA-based phase measuring profilometry system," Proc. SPIE (2019). https://doi.org/10.1117/12.2520916

[6] 

Hartley, R. and Zisserman, A., Multiple View Geometry in Computer Vision, 2nd ed., Cambridge University Press, New York, NY (2003).

[7] 

Staudinger, E., Humenberger, M., and Kubinger, W., "FPGA-based rectification and lens undistortion for a real-time embedded stereo vision sensor," (2008).

[8] 

Junger, C., Heß, A., Rosenberger, M., and Notni, G., "FPGA-accelerated phase rectification for a stereo-based phase measuring profilometry system," Journal of Physics: Conference Series 1065, 032017 (2018).

[9] 

Akin, A., Gaemperle, L. M., Najibi, H., Schmid, A., and Leblebici, Y., "Enhanced compressed look-up-table based real-time rectification hardware," in VLSI-SoC: At the Crossroads of Emerging Trends, 21st IFIP WG 10.5/IEEE International Conference on Very Large Scale Integration, VLSI-SoC 2013, 227–248 (2013).

[10] 

Bradski, G. and Kaehler, A., Learning OpenCV: Computer Vision with the OpenCV Library, 1st ed., O'Reilly Media, Sebastopol, CA (2008).

[11] 

OpenCV dev team, "Camera calibration and 3D reconstruction," (2019). https://docs.opencv.org/2.4/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html

[13] 

Lucero, J. and Slous, B., "Designing high-performance video systems with the Zynq-7000 All Programmable SoC using IP Integrator (XAPP1205)," (2014). https://www.xilinx.com/support/documentation/application_notes/xapp1205-high-performance-video-zynq.pdf