With the rapid development of machine vision technology and the increasing demand for three-dimensional (3-D) measurements, binocular stereo vision technology has been widely applied in fields such as noncontact measurement, robot navigation, and on-line monitoring.1–3 Traditionally, a binocular stereo vision system is composed of two cameras, or of one moving camera, capturing images of the object from different directions. However, sensors built with two cameras tend to be large and inflexible, while those that use a single moving camera cannot capture the two views simultaneously. Against this backdrop, virtual binocular stereo vision devices that combine a camera with catoptric mirrors have become a popular research topic in recent years.4–6 Compared with conventional two-camera vision systems, a virtual binocular stereo vision system offers good synchronization, a compact structure, low cost, and high flexibility. However, when such systems are used for 3-D measurements, the two pivotal tasks, system calibration and feature matching, differ from their counterparts in traditional two-camera systems.
When using a two-camera stereo vision system, calibration refers to determining the intrinsic parameters of the two cameras and their structural parameters.7 Feature matching is then performed, with the feature points constrained by the conventional epipolar constraint model.8 In a single-camera virtual binocular vision system, however, each captured image contains two parts, one projected by the real target and the other by its mirror reflection. The usual approach is to separate the two parts of the image, creating a binocular system with two virtual cameras. The calibration task of such a system therefore consists of determining the intrinsic parameters of the single camera and the structural parameters of the two virtual cameras. Based on this interpretation of a virtual binocular system, the two-step calibration method (TSCM)9,10 proceeds as follows. First, the catoptric mirrors are removed, and the single camera alone captures the calibration target images from which its intrinsic parameters are calculated. Then, the catoptric mirrors are reinstalled and one calibration target image is captured to determine the system’s structural parameters.
The epipolar geometry of catoptric stereo vision systems with mirrors has received increasing attention.11–13 However, previous works mainly addressed the case in which the two views of the target reach the single camera after the same number of reflections. In many practical configurations, the two views reach the camera after different numbers of reflections, yielding a mirror relation between the two image parts.14–17 The two real cameras are thus replaced by virtual cameras formed by a single camera and mirrors.18 For a single-camera mirror system, the region of interest (ROI) for matching usually covers the entire image, which significantly increases the matching error. To establish the geometric principles of feature matching in a single-camera mirror binocular stereo vision system, we derived the epipolar constraint between the two image parts for a single-camera model. This model combines the traditional epipolar constraint with the particular geometry of a single-camera mirror binocular stereo vision system, providing the constraint between the image coordinates of the real target and those of its mirror reflection.
Review of the Traditional Epipolar Constraint Model
Epipolar geometry is convenient for describing and analyzing multicamera vision systems. It is used to represent the geometric relationship between two viewpoints of the same scene based on a few corresponding points in a pair of images. This relationship, which is formulated as a matrix (called the fundamental matrix), can further be used for simplifying the ROI, computing the displacements between cameras, and rectifying the stereo image pairs.
Image formation in a camera can be described by a widely used pin-hole model.19 The coordinates of a 3-D point in the world coordinate system and its image plane coordinate are related through
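For reference, the pin-hole model is commonly written in the form below. The symbol names are assumptions chosen for illustration and may differ from those of Eq. (1); zero skew is also assumed.

```latex
\[
s\,\tilde{\mathbf{m}} = \mathbf{A}\,\bigl[\,\mathbf{R}\;\;\mathbf{t}\,\bigr]\,\tilde{\mathbf{M}},
\qquad
\mathbf{A} =
\begin{bmatrix}
f_x & 0   & u_0 \\
0   & f_y & v_0 \\
0   & 0   & 1
\end{bmatrix}
\]
```

In this notation, $\tilde{\mathbf{M}}$ is the homogeneous world point, $\tilde{\mathbf{m}}$ the homogeneous image point, $s$ a nonzero scale factor, $\mathbf{A}$ the intrinsic matrix with focal lengths $f_x, f_y$ and principal point $(u_0, v_0)$, and $(\mathbf{R}, \mathbf{t})$ the rotation and translation from the world frame to the camera frame.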
Two-Camera Epipolar Constraint Model
The epipolar constraint is one of the most important principles in binocular stereo vision, and is also a fundamental constraint underlying all self-calibration techniques.20 Consider the two-camera stereo vision system shown in Fig. 1, in which a 3-D point is projected onto the left and right images through the optical centers of the left and right cameras, respectively. The plane defined by the 3-D point and the two optical centers is known as the epipolar plane. The intersection of the epipolar plane with an image plane is termed the epipolar line. Thus, the point in one image that corresponds to a given point in the other image is constrained to lie on the epipolar line. This model can also be described by geometric deduction, as shown in Fig. 1.
Consider the intrinsic matrices of the two cameras and the transformation between the coordinate systems of the two cameras. Under the pin-hole model, the following equation holds:
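As a hedged reference, the two-camera epipolar constraint is usually stated through the fundamental matrix as below. The notation is an assumption made here for illustration: $\mathbf{A}_1, \mathbf{A}_2$ are the intrinsic matrices, and $(\mathbf{R}, \mathbf{t})$ maps coordinates of the first camera to the second, $\mathbf{X}_2 = \mathbf{R}\mathbf{X}_1 + \mathbf{t}$. The paper's own equation may use different symbols or conventions.

```latex
\[
\tilde{\mathbf{m}}_2^{\mathsf T}\,\mathbf{F}\,\tilde{\mathbf{m}}_1 = 0,
\qquad
\mathbf{F} = \mathbf{A}_2^{-\mathsf T}\,[\mathbf{t}]_{\times}\,\mathbf{R}\,\mathbf{A}_1^{-1},
\qquad
\mathbf{l}_2 = \mathbf{F}\,\tilde{\mathbf{m}}_1
\]
```

Here $[\mathbf{t}]_{\times}$ is the skew-symmetric matrix of $\mathbf{t}$, and $\mathbf{l}_2$ is the epipolar line in the second image on which the match of $\tilde{\mathbf{m}}_1$ must lie.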
Single-Camera Mirror Epipolar Geometry
The epipolar geometry of a two-mirror system was first investigated by Gluckman and Nayar.21 They showed that the number of free parameters in the fundamental matrix can be reduced from 7 to 6 for a two-mirror system with no constraint on the locations of mirrors. In what follows, we develop a precise description of the epipolar constraint model of a single-camera binocular vision system.
Single-Camera Mirror Binocular System
There are two types of mirror binocular stereo vision systems. In systems of the first category, the tested target is imaged using one real space path and one reflection path.16 A binocular stereo vision system of this type, with one real image, is shown in Fig. 2. The target point has two images, corresponding to the two different paths: the image of its mirror point is captured after one reflection, whereas the image of the real point is captured directly. Thus, this binocular vision device is equivalent to a device in which a real camera and a virtual camera occupy fixed mirror-symmetric positions.
In systems of the second category, the tested target is projected onto a single camera via two different reflection paths, after one or two reflections, as shown in Fig. 3.17 Imaging of the target in the field of view (FOV) can be separated into two reflection paths. Via the upper sloped mirror, the target is captured after one reflection, whereas imaging via the lower mirror requires two reflections. Therefore, for this virtual binocular structure, the left and right virtual cameras and images exhibit a mirror relationship. Figure 3(a) shows the two different paths from the target to the camera. Because the system is four-side symmetric, four pairs of target images from different directions can be captured simultaneously, as shown in Fig. 3(b). For each pair, the binocular images are mirror-symmetric, as shown in Fig. 3(c).
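The mirror relationship described above can be illustrated with plane reflections. The following is a minimal sketch, assuming each mirror is a plane with unit normal `n` and offset `d` (points `x` on the plane satisfy `n . x = d`) in the camera frame; the function names and numeric values are hypothetical and not taken from the paper. It shows that one reflection flips handedness (determinant -1), while two reflections restore it (determinant +1), which is why views formed after different numbers of reflections are mirror images of each other.

```python
def reflect_point(p, n, d):
    """Reflect the 3-D point p across the plane n . x = d (n must be a unit vector)."""
    dist = sum(pi * ni for pi, ni in zip(p, n)) - d  # signed distance to the mirror
    return tuple(pi - 2.0 * dist * ni for pi, ni in zip(p, n))


def reflection_matrix(n):
    """Linear part of the reflection, I - 2 n n^T, for a unit normal n."""
    return [[(1.0 if r == c else 0.0) - 2.0 * n[r] * n[c] for c in range(3)]
            for r in range(3)]


def matmul3(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


# Hypothetical mirrors (illustrative values only).
mirror_a = (0.0, 0.0, 1.0)
mirror_b = (0.6, 0.0, 0.8)

one_reflection = reflection_matrix(mirror_a)
two_reflections = matmul3(reflection_matrix(mirror_b), reflection_matrix(mirror_a))

# The virtual camera center is the reflection of the real camera center (the origin).
virtual_center = reflect_point((0.0, 0.0, 0.0), mirror_a, 5.0)
```

Reflecting the camera center gives the position of the virtual camera, and reflecting a point twice across the same mirror returns the original point.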
Single-Camera Mirror Epipolar Constraint Model
As is well known, in the traditional two-camera stereo reconstruction process, feature matching can be effectively performed using the epipolar constraint.22 However, for these mirror images, the epipolar constraint model exhibits different characteristics in comparison to traditional two-real-camera systems, because the same target point is captured from different paths by a single camera and forms two image points, as shown in Fig. 4.
Owing to the uniqueness of the real camera and the symmetry between the real camera and its reflection, the two virtual epipolar points should coincide in the single image. In addition, the two epipolar points and the two virtual target points should be coplanar. Thus, in the real image plane, the two epipolar points and the two target image points should be collinear, with the two epipolar points at the same position. The analysis starts from the target point and its symmetric point. According to Eq. (1), the perspective projection of the two points can be expressed as follows:
As stated in the previous section, the 3-D point can be denoted by its homogeneous coordinates . Thus, the three relationships can be written as follows:
To simplify this model of imaging, the same element can be eliminated from the above two equations, yielding the following expression for and
Considering the above equation from a purely geometrical perspective, the expression describes the vector in the image’s coordinate system, and is equivalent to the vector in the camera’s coordinate system. Thus, Eq. (9) can be interpreted as follows. The two-dimensional (2-D) vector is the projection of the 3-D vector from the camera’s coordinate system to the image’s coordinate system. Because is the projection center from any viewed point including , the principle shown in Eq. (9) can be used to derive another relationship among , , and
According to the above equation, the two image points of the target and the epipolar point lie on the same straight line in the image plane. Because of the uniqueness of the real camera and the symmetry between the virtual camera and the real one, the two virtual epipolar points coincide at a single point. In addition, the two epipolar points and the two target points are coplanar. Thus, in the real single image plane, the two epipolar points and the two target image points are collinear, and the two epipolar points are at the same position, as shown in Fig. 5.
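The collinearity stated above can be checked numerically. The sketch below rests on assumptions made here for illustration: an undistorted pin-hole camera, a single planar mirror with unit normal `n` and offset `d` in the camera frame, and intrinsic values that are hypothetical rather than the calibration results of this paper. Under these assumptions a point and its mirror point differ by a multiple of `n`, so the single epipolar point is the projection of the direction `n`, and the two image points are collinear with it.

```python
def project(K, P):
    """Pin-hole projection of the camera-frame point P with intrinsic matrix K."""
    x = [sum(K[r][c] * P[c] for c in range(3)) for r in range(3)]
    return (x[0] / x[2], x[1] / x[2])


def reflect_point(P, n, d):
    """Reflect P across the mirror plane n . x = d (n must be a unit vector)."""
    dist = sum(a * b for a, b in zip(P, n)) - d
    return tuple(a - 2.0 * dist * b for a, b in zip(P, n))


def collinearity_residual(a, b, e):
    """Twice the signed area of the triangle (a, b, e); zero when collinear."""
    return (b[0] - a[0]) * (e[1] - a[1]) - (b[1] - a[1]) * (e[0] - a[0])


# Hypothetical intrinsics and mirror pose (illustrative values only).
K = [[1000.0, 0.0, 800.0],
     [0.0, 1000.0, 600.0],
     [0.0, 0.0, 1.0]]
n, d = (0.6, 0.0, 0.8), 2.0
P = (0.1, 0.2, 2.0)  # target point in the camera frame

m_real = project(K, P)                          # image of the real point
m_mirror = project(K, reflect_point(P, n, d))   # image of its mirror point
epipole = project(K, n)                         # projection of the mirror normal direction
residual = collinearity_residual(m_real, m_mirror, epipole)
```

The residual vanishes up to floating-point rounding for any target point in front of the camera, provided the mirror normal is not parallel to the image plane (otherwise the epipolar point lies at infinity).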
Single-Camera Multimirror Epipolar Constraint Model
Here, we introduce a system that differs from the one described in the previous section. In this system, the space points in the FOV can be projected onto the single camera through one or two mirrors. As before, the analysis starts from the target point and its symmetric points , , and that are reflected by different mirrors, as shown in Fig. 6. Here, is the virtual point reflected by mirror , and is the virtual point reflected by mirror . is the symmetric point of for mirror . The symmetric virtual cameras and of the real camera are formed according to the same principle. The virtual cameras and make the system equivalent to a binocular system.
The virtual cameras and are the reflections of the real camera in mirrors and , each formed by one reflection. The virtual camera is symmetric with the virtual camera about mirror , and is formed by two reflections from the real camera . According to Eq. (12), the space point projects onto the real camera and the virtual camera , forming two image points and , which conform to the following relationship:
The same relationships exist for the real camera and the virtual camera , and the virtual camera is symmetric with the virtual camera about mirror . Thus, the following equations can be derived:
The common element can be eliminated by combining Eqs. (13)–(15). Thus, the relationship between two image points in the virtual cameras and can be derived as follows (Fig. 7):
To evaluate the performance of the proposed epipolar constraint model in practical applications, real experiments and analyses were performed using a mirror virtual binocular stereo vision system. Before the experiments, the virtual binocular vision system was calibrated, yielding the coordinates of the epipolar point and the principal point, the distortion coefficients, and the focal length of the camera lens. These calibration results are used directly in the subsequent experiments.
The experimental system was built according to the one-mirror configuration described in the previous section.16 It comprised a camera and a reflecting mirror, both fixed on the experimental platform. The camera was an IMPERX-IGV-B1601M, with a frame rate of 15 fps, a resolution of , a lens focal length of 8.5 mm, and a 2/3 in. charge-coupled device (CCD). The setup is shown in Fig. 8.
To calibrate the system precisely, we used both the TSCM9 and Zhou et al.’s method in Ref. 6. In the former, the system’s intrinsic parameters and structural parameters were calibrated separately, first with the mirror removed and then with it fixed in place; in the latter, these parameters were calibrated without removing the mirror. The calibration target used in this process was a ceramic plane with an array of circular features; the diameters of the dots and the center-to-center distances between adjacent dots were 4 and 8 mm, respectively. The calibration plane images captured for calibrating the system’s intrinsic parameters and structural parameters are shown in Fig. 9.
In the calibration process using the TSCM, eight calibration plane images were used to extract the intrinsic parameters; two of them are shown in Figs. 9(a) and 9(b). Calibration of the structural parameters requires only one image, which contains the real calibration plane image and its mirror reflection, as shown in Fig. 9(c). In the calibration process using Zhou et al.’s method, by contrast, four mirror images of the calibration plane were captured to extract the intrinsic parameters, providing a total of eight calibration plane subimages, as shown in Fig. 10. Calibration of the structural parameters again requires only one image, as in the TSCM. The results of the two calibration procedures are listed in Table 1.
Table 1. System calibration results using the two methods.
|Parameters||Results using TSCM||Results using Zhou et al.’s method|
|Focal lengths (pixel)||1948.149, 1348.217||1948.158, 1348.175|
|Principal point (pixel)||(827.193, 620.490)||(827.253, 620.587)|
|Lens distortion coefficients||, 0.02315||, 0.02434|
|Epipolar point (pixel)||(, 712.248)||(, 712.587)|
The parameters are the focal lengths of the lens along the two image axes; the image coordinates of the principal point; the two lens distortion coefficients; the rotation matrix and the translation vector of the virtual binocular structure; the reprojection error; and the coordinates of the single epipolar point. Because the structural parameters were calibrated using the same image, both methods share the same rotation matrix and translation vector. For a single-camera vision system, the reprojection error is an important indicator of the mapping accuracy from 3-D space to 2-D images. According to the calibration results, the TSCM achieved higher reprojection accuracy than Zhou et al.’s method. Thus, the TSCM calibration results are used in the following feature-matching experiment.
The experimental setup was precisely calibrated using the TSCM, as described in the previous section. To validate the proposed single-image mirror epipolar constraint model, a feature-matching experiment on the calibration plane was performed using a single image. The validation process is shown in Fig. 11. First, a test image of the calibration plane was captured by the calibrated experimental setup, as shown in Fig. 11(a). The calibration plane was the same as that used in the calibration experiment.
Then, we extracted the coordinates of all of the feature points in the test image using the ellipse fitting method.23 The results of this extraction, after correcting for the image distortion, are shown in Fig. 11(b). Next, we established the correspondence between the two arrays of feature-point coordinates point by point, according to the proposed epipolar constraint model, as shown in Fig. 11(c). Finally, by reconstructing all 49 feature points according to the stereo vision model and the previous calibration results, we obtained their 3-D space coordinates. For a more rigorous validation of the proposed mirror epipolar constraint model, three additional experiments were performed following the same process as the first. The four reconstruction results are shown in Fig. 11(d).
An error analysis of the four experiments was performed to evaluate precisely the errors between the real feature points and the reconstructed points. We calculated the coordinates of all points in the real camera coordinate system and analyzed the offset distances between these points and the real calibration plane by comparing the measured distances between adjacent feature points with the true value of 8 mm. The results of the four experiments are listed in Table 2. The average absolute and relative errors are 0.05 mm and 0.6%, respectively.
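The error measure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' evaluation code: the function name is hypothetical, and the input is assumed to be reconstructed 3-D points ordered along one row of the calibration grid, so that consecutive points are adjacent features with a nominal spacing of 8 mm.

```python
import math


def adjacent_distance_errors(points, nominal=8.0):
    """Mean absolute error (mm) and mean relative error (%) of adjacent spacings.

    points: 3-D points ordered along one grid row, in millimetres.
    nominal: true center-to-center spacing of adjacent features, in millimetres.
    """
    if len(points) < 2:
        raise ValueError("at least two points are needed to form a spacing")
    distances = [math.dist(a, b) for a, b in zip(points, points[1:])]
    abs_errors = [abs(dist - nominal) for dist in distances]
    mean_abs = sum(abs_errors) / len(abs_errors)
    # The nominal spacing is constant, so the mean relative error is mean_abs / nominal.
    return mean_abs, 100.0 * mean_abs / nominal


# Made-up sample row, for illustration only (not measured data).
sample_row = [(0.0, 0.0, 500.0), (8.1, 0.0, 500.0), (16.0, 0.0, 500.0)]
mean_abs_mm, mean_rel_percent = adjacent_distance_errors(sample_row)
```

With a nominal spacing of 8 mm, an average absolute error of 0.05 mm corresponds to 0.05/8, or about 0.6%, which is consistent with the two figures reported above.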
Table 2. Errors of the measured distances between adjacent feature points.
|Image||Number of feature points||Average absolute errors (mm)||Average relative errors (%)|
Real Feature-Matching Experiment
To validate the proposed epipolar constraint rule in a practical application, a feature-matching experiment on a real target was performed and is reported in this section. For comparison, we also report the results of feature matching performed without the epipolar constraint. The real target image captured by the experimental system is shown in Fig. 12(a). Features were extracted using the oriented FAST and rotated BRIEF (ORB) method,24 and the matches obtained without the constraint rule are shown in Fig. 12(b); many of them are visibly incorrect. The matches obtained using the proposed epipolar constraint model are shown in Fig. 12(c). These results are visibly better, and we conclude that the proposed model increases the accuracy of the feature-matching process. Note that the proposed epipolar constraint model constrains the target feature point to a line, which differs from a point-to-point matching method.
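One way to apply such a line constraint to candidate matches is sketched below. It is an illustration under assumptions made here, not the authors' implementation: candidate matches are assumed to be pairs of undistorted image points `(p, q)` from the two image parts, the single epipolar point is assumed known from calibration, and the function names and the pixel tolerance are hypothetical. A candidate is kept only if `q` lies close to the line through the epipolar point and `p`.

```python
import math


def distance_to_epipolar_line(epipole, p, q):
    """Distance (pixels) from q to the line through the epipole and p."""
    dx, dy = p[0] - epipole[0], p[1] - epipole[1]
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        # p coincides with the epipole, so the line is undefined; reject the pair.
        return float("inf")
    return abs(dx * (q[1] - epipole[1]) - dy * (q[0] - epipole[0])) / norm


def filter_matches(matches, epipole, tol=1.0):
    """Keep the candidate pairs (p, q) that satisfy the line constraint within tol pixels."""
    return [(p, q) for p, q in matches
            if distance_to_epipolar_line(epipole, p, q) <= tol]


# Made-up candidates, for illustration only.
candidates = [((1.0, 1.0), (2.0, 2.0)),   # q lies on the line through the epipole and p
              ((1.0, 1.0), (2.0, 5.0))]   # q lies well off that line
kept = filter_matches(candidates, epipole=(0.0, 0.0), tol=0.5)
```

Because the constraint only restricts a match to a line, it removes candidates that violate the geometry but cannot by itself choose among several candidates lying on the same line; descriptor similarity is still needed for that.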
In many applications of mirror virtual binocular vision systems, the same target is imaged by one camera through different reflections, leading to a mirror relation between the left and right parts of the captured image. In this situation, the two real cameras of the traditional binocular system are replaced by virtual cameras formed by a single camera and mirrors. To perform feature matching effectively, a single-camera mirror feature-matching rule, i.e., a mirror epipolar constraint model for use in the 3-D reconstruction process, was established here for a mirror virtual binocular vision system. To validate the proposed model and to evaluate its performance in practical applications, system calibration experiments, an error analysis, and realistic feature-matching experiments were performed using a virtual binocular stereo vision system. The results showed that the proposed epipolar constraint method is feasible and can increase the accuracy of feature matching.
This work was supported by the National Natural Science Foundation of China (No. 61372177).
Xinghua Chai received his MS degree from Beijing Information Science and Technology University in 2013. He is working toward his PhD in the School of Instrumentation Science and Optoelectronics Engineering, Beihang University. His research directions are vision measurement and vision sensors.
Fuqiang Zhou received his BS, MS and PhD degrees from the School of Instrument, Measurement and Test Technology from Tianjin University in 1994, 1997, and 2000, respectively. He joined the School of Automation Science and Electrical Engineering at Beihang University as a postdoctoral research fellow in 2000. Now, he is a professor in the School of Instrumentation Science and Optoelectronics Engineering at Beihang University. His research directions include computer vision, image processing, and optical metrology.