An arbitrary view synthesis method based on 2D-plus-depth images for real-time auto-stereoscopic display is presented.
Traditional methods use depth image based rendering (DIBR) technology, which is a process of synthesizing “virtual”
views of a scene from still or moving images and associated per-pixel depth information. All the virtual view images are generated first, and the ultimate stereo-image is then synthesized from them. DIBR can greatly decrease the number of reference images
and is flexible and efficient because depth images are used. However, it causes problems such as holes appearing in the rendered image and depth discontinuities occurring on object surfaces in the virtual image plane.
Here, reversed disparity shift pixel rendering is used to generate the stereo-image directly, so that no holes appear in the target image.
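The abstract gives no formulas for this step, so the following is only a minimal sketch of one plausible reading of reversed disparity-shift rendering, shown in C++ for a single image row: instead of forward-shifting reference pixels (which leaves holes), each target pixel searches backward for the reference pixel whose own disparity lands on it. The 8-bit depth map, the linear depth-to-disparity mapping, and the nearest-wins rule are all assumptions, not the authors' exact formulation.

```cpp
#include <cstdint>
#include <cmath>
#include <vector>

// Backward ("reversed") disparity-shift rendering of one image row.
// Every target pixel searches the reference row for a source pixel whose
// own disparity maps it here; the foremost candidate (largest disparity)
// wins, so every target pixel is filled and no holes appear.
// ASSUMPTIONS (not from the abstract): 8-bit depth, linear depth->disparity.
std::vector<uint8_t> renderTargetRow(const std::vector<uint8_t>& srcRow,
                                     const std::vector<uint8_t>& depthRow,
                                     double maxDisparity /* in pixels */) {
    const int w = static_cast<int>(srcRow.size());
    std::vector<uint8_t> dst(w, 0);
    const int dMax = static_cast<int>(std::ceil(maxDisparity));
    for (int x = 0; x < w; ++x) {
        double best = -1.0;                       // largest disparity so far
        for (int off = 0; off <= dMax; ++off) {   // candidate source pixels
            int xs = x + off;
            if (xs >= w) break;
            double disp = maxDisparity * depthRow[xs] / 255.0;
            // Does source pixel xs land on target pixel x after shifting?
            if (static_cast<int>(std::lround(xs - disp)) == x && disp > best) {
                best = disp;
                dst[x] = srcRow[xs];
            }
        }
        if (best < 0.0 && x > 0) dst[x] = dst[x - 1];  // rare miss: repeat neighbor
    }
    return dst;
}
```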
To avoid duplicated calculation and to match any specific three-dimensional display, a selecting table is designed to pick the appropriate virtual viewpoints for the auto-stereoscopic display. According to the selecting table, only the sub-pixels of the appropriate virtual viewpoints are calculated, so the amount of calculation is independent of the number of virtual viewpoints.
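The layout of the selecting table is display-specific and not given in the abstract; the sketch below assumes a generic slanted-lenticular sub-pixel interleaving in the spirit of van Berkel's formula, with the slope as an illustrative parameter.

```cpp
#include <cstdint>
#include <vector>

// Precompute which virtual viewpoint feeds each sub-pixel of the panel.
// At render time only the sub-pixels named by this table are synthesized,
// so the cost is fixed by the panel resolution, not by the view count.
// ASSUMPTION: a generic slanted-lenticular interleaving (van Berkel style);
// the real layout depends on the specific 3D display.
std::vector<uint8_t> buildSelectTable(int width, int height, int numViews,
                                      double slope /* sub-pixels per row */) {
    std::vector<uint8_t> table(static_cast<size_t>(width) * height * 3);
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            for (int c = 0; c < 3; ++c) {         // R, G, B sub-pixels
                int sub = 3 * x + c;              // horizontal sub-pixel index
                int view = static_cast<int>(sub + y * slope) % numViews;
                table[(static_cast<size_t>(y) * width + x) * 3 + c] =
                    static_cast<uint8_t>(view);
            }
    return table;
}
```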
In addition, 3D image warping is used to translate depth information into parallax between the virtual viewpoints, and the viewer can conveniently adjust the zero-parallax-setting (ZPS) plane and change the parallax to suit his or her personal preferences.
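The abstract does not state the depth-to-parallax mapping; one common linear form, given here purely as an assumption, is

\[ p_i(x, y) = \alpha \,(i - i_0)\,\bigl(d(x, y) - d_{\mathrm{ZPS}}\bigr), \]

where \(d(x, y)\) is the per-pixel depth value, \(i\) indexes the virtual viewpoint (with \(i_0\) the central view), \(d_{\mathrm{ZPS}}\) selects the zero-parallax-setting plane, and the gain \(\alpha\) lets the viewer scale the overall parallax. Pixels with \(d = d_{\mathrm{ZPS}}\) receive zero parallax and appear on the screen plane, so shifting \(d_{\mathrm{ZPS}}\) moves the scene in front of or behind the screen.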
The proposed method is implemented with OpenGL and demonstrated on a laptop computer with a 2.3 GHz Intel Core i5 CPU and an NVIDIA GeForce GT 540M GPU, where a frame rate of 30 frames per second is achieved with 4096×2340 video. High synthesis efficiency and good stereoscopic sense are obtained, and the presented method can meet the requirements of real-time ultra-HD super multi-view auto-stereoscopic display.
High-immersion three-dimensional (3D) displays are valuable tools for many applications, such as designing and constructing buildings, industrial architecture design, aeronautics, scientific research, entertainment, media advertisement, military areas, and so on. However, most technologies provide the 3D display in front of screens that are parallel to the walls, which decreases the sense of immersion.
To get a correct multi-view stereo ground image, the cameras' photosensitive surfaces should be parallel to the common focus plane, and the cameras' optical axes should be offset toward the center of the common focus plane in both the vertical and horizontal directions. It is very common to use virtual cameras, i.e., ideal pinhole cameras, to display a 3D model in a computer system. We can use virtual cameras to simulate the shooting method of the multi-view ground-based stereo image. Here, two virtual shooting methods for ground-based
high-immersion 3D display are presented. The position of the virtual camera is determined by the observer's eye position in the real world. When the observer stands within the circumcircle of the 3D ground display, offset perspective projection virtual cameras are used. If the observer stands outside the circumcircle of the 3D ground display, offset perspective projection virtual cameras and orthogonal projection virtual cameras are adopted together.
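As an illustration of the first shooting method, the sketch below computes the offset (off-axis) perspective frustum from the eye position relative to the display plane; the construction is the standard asymmetric-frustum derivation (glFrustum-style), stated here as an assumption since the abstract gives no formulas.

```cpp
#include <array>

// Asymmetric (offset) perspective frustum for a ground-based display.
// The eye position is measured in display coordinates: the display plane is
// z = 0, its rectangle spans [left..right] x [bottom..top], and the viewer
// is at (ex, ey, ez) with ez > 0.  Projecting the screen rectangle onto the
// near clip plane yields the glFrustum-style parameters, which is why the
// near clip plane setting is the key parameter of this method.
// ASSUMPTION: column-major 4x4 matrix, OpenGL clip-space conventions.
std::array<float, 16> offsetFrustum(float ex, float ey, float ez,
                                    float left, float right,
                                    float bottom, float top,
                                    float zNear, float zFar) {
    const float s = zNear / ez;             // scale screen rect to near plane
    const float l = (left   - ex) * s;
    const float r = (right  - ex) * s;
    const float b = (bottom - ey) * s;
    const float t = (top    - ey) * s;
    std::array<float, 16> m{};              // zero-initialised, column-major
    m[0]  = 2.0f * zNear / (r - l);
    m[5]  = 2.0f * zNear / (t - b);
    m[8]  = (r + l) / (r - l);
    m[9]  = (t + b) / (t - b);
    m[10] = -(zFar + zNear) / (zFar - zNear);
    m[11] = -1.0f;
    m[14] = -2.0f * zFar * zNear / (zFar - zNear);
    return m;
}
```

The view matrix is then simply a translation by −(ex, ey, ez); for the second method, an orthogonal projection matrix is combined with a rotation of the virtual camera instead, with the rotation angle playing the role that the near clip plane plays here.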
In this paper, we mainly discuss the parameter setting of the virtual cameras. The Near Clip Plane setting is the main point of the first method, while the rotation angle of the virtual cameras is the main point of the second method. In order to validate the results, we use Direct3D and OpenGL to render scenes from different viewpoints
and generate a stereoscopic image. A realistic visualization system for 3D models is constructed and
demonstrated for viewing horizontally, which provides high-immersion 3D visualization. The displayed 3D
scenes are compared with the real objects in the real world.
With the progress of 3D technology, real-time autostereoscopic display requires a huge computing capacity. Because sub-pixel allocation is complicated, masks providing the arranged sub-pixels are fabricated in advance to reduce the real-time computation. However, the binary mask has inherent drawbacks. To solve these problems, weighted masks based on partial sub-pixels are used in the display. Nevertheless, the corresponding computation grows tremendously and becomes unbearable for the CPU. To improve the calculating speed, Graphics Processing Unit (GPU) processing with its parallel computing ability is adopted. Here, the principle of the partial sub-pixel is presented, and the texture array of Direct3D 10 is used to increase the number of computable textures. When dealing with an HD display and multiple viewpoints, a low-end GPU still permits fluent real-time display, while the performance of even a high-end CPU is not acceptable.
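As a reference for what the weighted-mask computation does, the following CPU sketch evaluates each output sub-pixel as a weighted sum over the view images; the weight layout is display-specific and therefore assumed. On the GPU, the view images reside in a single Direct3D 10 texture array and a pixel shader evaluates the same sum in parallel.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// CPU reference of partial-sub-pixel synthesis with a weighted mask.
// Each output sub-pixel is a weighted sum of the SAME sub-pixel taken from
// several view images, instead of the all-or-nothing choice a binary mask
// makes.  On the GPU this loop becomes one pixel-shader evaluation per
// sub-pixel, with the views stored in a Direct3D 10 texture array.
// ASSUMPTION: weights[k] and views[k] are both width*height*3 arrays.
std::vector<uint8_t> blendViews(const std::vector<std::vector<uint8_t>>& views,
                                const std::vector<std::vector<float>>& weights,
                                std::size_t width, std::size_t height) {
    const std::size_t n = width * height * 3;   // sub-pixel count
    std::vector<uint8_t> out(n);
    for (std::size_t i = 0; i < n; ++i) {       // every sub-pixel of the panel
        float acc = 0.0f;
        for (std::size_t k = 0; k < views.size(); ++k)
            acc += weights[k][i] * views[k][i]; // weighted partial sub-pixel
        out[i] = static_cast<uint8_t>(acc > 255.0f ? 255.0f : acc);
    }
    return out;
}
```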
Meanwhile, after adopting the texture array, the performance under D3D10 can be two, and sometimes three, times that under D3D9. The proposed method has several distinguishing features, such as good portability, low overhead, and good stability. The GPU display system could also be used for the future Ultra HD autostereoscopic display.
Reconstruction of three-dimensional (3D) scenes is an active research topic in the field of computer vision and 3D
display. It is a challenge to model 3D objects rapidly and effectively. A 3D model can be extracted from multiple images. The system requires only a sequence of images taken with a camera, without knowledge of the camera parameters, which provides a high degree of flexibility. We focus on quickly merging point clouds of the object from depth map sequences.
The whole system combines algorithms of different areas in computer vision, such as camera calibration, stereo
correspondence, point cloud splicing and surface reconstruction. The procedure of 3D reconstruction is decomposed into
a number of successive steps. Firstly, the image sequence is captured by a camera moving freely around the object.
Secondly, the scene depth is obtained by a non-local stereo matching algorithm. Pairwise image matching is realized with the Scale-Invariant Feature Transform (SIFT) algorithm. An initial matching is then made for the first two images of the sequence.
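A minimal sketch of the pairwise SIFT step is given below, using OpenCV as a stand-in implementation (the abstract does not name a library); the ratio-test threshold is Lowe's customary value and is assumed rather than taken from the paper.

```cpp
#include <opencv2/features2d.hpp>
#include <vector>

// Pairwise SIFT matching for two consecutive frames of the sequence.
// OpenCV is used here only as a stand-in implementation.  The surviving
// matches seed the calibration and pose-estimation steps that follow.
std::vector<cv::DMatch> matchPair(const cv::Mat& imgA, const cv::Mat& imgB) {
    cv::Ptr<cv::SIFT> sift = cv::SIFT::create();
    std::vector<cv::KeyPoint> kpA, kpB;
    cv::Mat descA, descB;
    sift->detectAndCompute(imgA, cv::noArray(), kpA, descA);
    sift->detectAndCompute(imgB, cv::noArray(), kpB, descB);

    cv::BFMatcher matcher(cv::NORM_L2);
    std::vector<std::vector<cv::DMatch>> knn;
    matcher.knnMatch(descA, descB, knn, 2);      // two nearest neighbours

    std::vector<cv::DMatch> good;
    for (const auto& m : knn)                    // Lowe's ratio test (0.8 assumed)
        if (m.size() == 2 && m[0].distance < 0.8f * m[1].distance)
            good.push_back(m[0]);
    return good;
}
```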
For each subsequent image, which is processed together with the previous image, the points of interest corresponding to those in the previous images are refined or corrected. The vertical parallax between the images is eliminated. The next step is to calibrate the camera, and the intrinsic and external parameters of the camera are calculated; therefore, the relative position and orientation of the camera are obtained. A sequence of depth maps is acquired by using the non-local cost aggregation method for stereo matching. A point cloud sequence is then computed from the scene depths, and the frames are merged into a single point cloud model using the external parameters of the camera.
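A sketch of the per-frame step is given below: each depth map is back-projected through a standard pinhole model into camera space and moved into the common world frame with the camera's external parameters. The metric depth encoding and the camera-to-world convention for (R, t) are assumptions.

```cpp
#include <array>
#include <vector>

struct Point3 { float x, y, z; };

// Back-project one depth map into a camera-space point cloud and move it
// into the common world frame with the camera's external parameters, so
// consecutive frames can be spliced into one model.
// ASSUMPTIONS: metric depth, standard pinhole intrinsics (fx, fy, cx, cy),
// R row-major 3x3 and t given as a camera-to-world transform.
std::vector<Point3> depthToCloud(const std::vector<float>& depth,
                                 int width, int height,
                                 float fx, float fy, float cx, float cy,
                                 const std::array<float, 9>& R,
                                 const std::array<float, 3>& t) {
    std::vector<Point3> cloud;
    cloud.reserve(depth.size());
    for (int v = 0; v < height; ++v)
        for (int u = 0; u < width; ++u) {
            float z = depth[static_cast<std::size_t>(v) * width + u];
            if (z <= 0.0f) continue;          // invalid depth: no point
            // Pinhole back-projection into camera coordinates.
            float xc = (u - cx) * z / fx;
            float yc = (v - cy) * z / fy;
            // Rigid transform into the shared world frame.
            cloud.push_back({R[0]*xc + R[1]*yc + R[2]*z + t[0],
                             R[3]*xc + R[4]*yc + R[5]*z + t[1],
                             R[6]*xc + R[7]*yc + R[8]*z + t[2]});
        }
    return cloud;
}
```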
The point cloud model is then approximated by a triangular wire-frame mesh to reduce geometric complexity and to tailor the model to the requirements of computer graphics visualization systems. Finally, the texture is mapped onto the wire-frame model, which can also be used for 3D display.
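The meshing method itself is not specified in the abstract; one simple possibility, sketched below, treats the organized point cloud as an image grid and emits two triangles per 2x2 cell of valid samples.

```cpp
#include <cstdint>
#include <vector>

struct Tri { uint32_t a, b, c; };

// Simple grid meshing of an ORGANIZED point cloud: every 2x2 block of valid
// pixels yields two triangles.  This is only one easy way to obtain a
// triangular wire-frame; valid[i] marks pixels with a usable depth, and
// indices refer to the row-major pixel order of the point cloud.
std::vector<Tri> gridMesh(const std::vector<bool>& valid, int width, int height) {
    std::vector<Tri> tris;
    for (int y = 0; y + 1 < height; ++y)
        for (int x = 0; x + 1 < width; ++x) {
            uint32_t i00 = static_cast<uint32_t>(y) * width + x, i01 = i00 + 1;
            uint32_t i10 = i00 + width,                          i11 = i10 + 1;
            if (valid[i00] && valid[i10] && valid[i01]) tris.push_back({i00, i10, i01});
            if (valid[i01] && valid[i10] && valid[i11]) tris.push_back({i01, i10, i11});
        }
    return tris;
}
```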
According to the experimental results, the 3D point cloud model can be reconstructed more quickly and efficiently than with other methods.