Recently, 3-D imaging systems using multiview images that allow viewers to take any viewing posture by providing full parallax images have been gaining popularity. IVR provides a solution for the problem presented by large data involved in displaying natural 3-D images.^{1} The disparity estimation module is a key processor for IVR. A pixel-based approach is better than a block-based approach in terms of quality, but it demands a larger computational complexity because the disparities of all pixels have to be estimated.^{2} For that reason, a fast algorithm with good performance is required.

Fast algorithms^{3}
^{4} that are based on the block-based approach are for stereoscopic image coding and can process only stereo images that are taken from parallel camera. Therefore, they aren’t suitable for IVR applications, which demand high-quality images. Fast algorithms^{5} are for estimating dense disparities. However, the algorithm in Ref. 5 isn’t useful in implementing a real-time disparity estimation processor for 720×480 stereo images because of insufficient reduction in the computational complexity.

In this letter, we propose a new fast algorithm that is based on the pixel-based approach and reduces computational complexity in comparison with the conventional pixel-based full search approach. The algorithm can process stereo images taken from parallel and convergent cameras in implementing a real-time disparity estimation processor.

Figure 1 shows parallel and convergent camera geometries.^{6} For parallel cameras, the respective locations of image points X_{cl}
and X_{cr}
on the left and right imaging sensor can be given by

*f*is the lens focal length,

*t*is the camera separation,

*h*is the sensor axial offset.

^{6}

For convergent cameras, imaged points on sensor X_{cl}
and X_{cr}
are

## (2)

$${X}_{cl}=f\times \mathrm{tan}(\alpha -\phi ),\text{}{X}_{cr}=-f\times \mathrm{tan}(\beta -\phi )$$_{0})/2Z

_{0}and β=arctan(t−2X

_{0})/2Z

_{0}.

^{6}

If X_{0}>X_{1}>X_{2}
and Z_{0}<C are assumed in the parallel and convergent camera geometries as shown in Fig. 1, the conditions of X_{0cl}>X_{1cl}>X_{2cl}
and X_{0cr}>X_{1cr}>X_{2cr}
are met. Also, if Z_{0}>C is assumed in the convergent cameras, the conditions of X_{0cl}<X_{1cl}<X_{2cl}
and X_{0cr}<X_{1cr}<X_{2cr}
are met. We realize the important property of stereo matching from the above example: in stereo images, the disparity vector of a pixel is located between those of previous and subsequent pixels.

The search range can be predicted by using the property of the referred stereo matching. Figure 2 shows an example. First, the disparities of the 0’th and λ’th pixels are estimated by Eq. (3). Then the disparity of (λ/2)’th pixel is estimated by Eq. (4). Since the predicted search range (PSR) is usually smaller than the search range, computational complexity is reduced.

## (3)

$${d}^{f}(i)=\text{arg}\underset{r}{\text{min}}[U\left\{{I}_{cur}(i),\text{}{I}_{ref}(i+r)\right\}],$$## (4)

$${d}^{p}(i)=\text{arg}\underset{\overline{r}}{\text{min}}[U\{{I}_{cur}(i),\text{}{I}_{ref}(i+\overline{r})\}],$$^{f}is the disparity estimated using the full search approach, d

^{p}is the disparity estimated using the predictive search approach, I

_{cur}is the intensity of current image, I

_{ref}is the intensity of reference image,

*r*is the search range,

*r¯*is PSR, and

*U*is the cost function, such as the sum of absolute difference.

The procedure of the predictive search approach of the proposed algorithm, which is based on Eqs. (3) and (4), is as follows. First, *x*, *y*, and *n* are initialized to 0, and λ is determined. Second, the disparities of the pixels of the λ space are estimated by using the full search approach until *x* is smaller than the image size. Third, if *x* is larger than the image size, *x* is changed to λ/2^{n+1}. Fourth, the disparities of the pixels of the λ/2^{n}
space are estimated by using the predictive search approach until *x* is smaller than the image size. Then if PSR is smaller than 0, PSR is changed with the full search range to prevent wrong disparity estimation. Fifth, if *x* is larger than the image size, *n* is increased by 1. Sixth, we repeat steps 3 to 5 until the disparity estimation of all pixels of a line is finished. Seventh, if *y* is smaller than the image height, *y* is increased by 1 and both *x* and *n* are initialized to 0. Eighth, we repeat steps 2 to 7 until *y* is equal to the image height.

The proposed algorithm predicts the search range but it almost doesn’t cause the propagation and the prediction errors because the disparities for the pixels of the λ space are estimated by using the full search approach, and the PSR that is smaller than 0 is changed with the full search range. As a result, the proposed algorithm is as exact as the full search approach.

Both the full search approach and the proposed algorithm have been tested for the 450×375 “Teddy,” 434×380 “Sawtooth,” and 434×383 “Venus” stereo images,^{7} which are taken from a parallel camera, and for the 720×480 “Puppy” stereo images recorded by the Realistic Broadcasting Research Team of the Electronics and Telecommunications Research Institute, Daejoen, Korea, which are taken from a convergent camera. The full search range for disparity estimation is ±32 pixels. The window size of 16 by 16 has been used. In view of optimal trade-off between quality and computational complexity, λ of 16 is chosen.

Figure 3 compares the proposed algorithm with the full search approach in views of the computational complexity, the PSNR, and the reconstructed images. The reduction of computational complexity is evaluated by dividing the number of searched windows in the full search approach to the number of searched windows in the proposed algorithm. The PSNR is computed by the reconstructed image and the original image.

In Fig. 3, the proposed algorithm achieves competitive or slightly better PSNR and reduces the computational complexity by about 8 times in comparison with the full search approach. Therefore, the proposed algorithm is useful in implementing the real-time disparity estimation processor, which achieves over 30 frames/s for arbitrary 720×480 stereo images. The number of frames can be given by (maximum frequency×reduced computational complexity)/(image size×search range)=[over (90×10^{6})×7.4]/(720×480×64)⩾30.11 frames/s.

We have proposed a predictive search algorithm based on a pixel-based approach for fast and efficient disparity estimation. By using the matching property of stereo imaging, we reduce the computational complexity. The experimental results show that the proposed algorithm is about 8 times faster than the full search approach while achieving competitive or slightly better PSNR. The algorithm is proven to be effective in the implementation of the real-time disparity estimation processor for arbitrary 720×480 stereo images. Hence, the algorithm can be applied to stereo image applications that require real-time and high-performance solutions, such as 3-D TV.

## Acknowledgments

This research was supported by University IT Research Center Project and partly by the Digital Media R&D Center, Samsung Electronic Co., Ltd.

## REFERENCES

*Applications of Digital Image Processing*, Proc. SPIE 5203, 590–598 (2003). Google Scholar

*Stereoscopic Display and Applications IV*, Proc. SPIE 1915, 36–47 (1993). Google Scholar