In this paper, we analyze the modulation transfer function (MTF) of coded aperture imaging in a flat panel display. The flat panel display with a sensor panel forms lens-less multi-view cameras through the imaging pattern of the modified uniformly redundant arrays (MURA) on the display panel. To analyze the MTF of the coded aperture imaging implemented on the display panel, we first mathematically model the encoding process of coded aperture imaging, where the projected image on the sensor panel is modeled as a convolution of the scaled object and a function of the imaging pattern. Then, the system point spread function is determined by incorporating a decoding process, which depends on the pixel pitch of the display screen and the decoding function. Finally, the MTF of the system is derived as the magnitude of the Fourier transform of the determined system point spread function. To demonstrate the validity of the mathematically derived MTF, we build a coded aperture imaging system that can capture the scene in front of the display, where the system consists of a display screen and a sensor panel. Experimental results show that the derived MTF of coded aperture imaging in a flat panel display system corresponds well to the measured MTF.
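The final step of the derivation above, taking the MTF as the magnitude of the Fourier transform of the system point spread function, can be sketched numerically. This is a minimal illustration, not the paper's derived PSF: a 1D Gaussian stands in for the actual system PSF, and the sampling grid is arbitrary.

```python
import numpy as np

# Hypothetical 1D system PSF (a Gaussian stand-in for the derived PSF,
# which in the paper depends on the MURA pattern and display pixel pitch).
x = np.arange(-32, 32)
psf = np.exp(-x**2 / (2 * 2.0**2))
psf /= psf.sum()  # normalize so that MTF(0) = 1

# MTF = magnitude of the Fourier transform of the system PSF.
mtf = np.abs(np.fft.fft(psf))
freqs = np.fft.fftfreq(psf.size)  # spatial frequencies in cycles per sample
```

With a normalized PSF, the zero-frequency response is 1 and the magnitude rolls off with increasing spatial frequency, which is the behavior compared against the measured MTF in the experiments.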
This paper presents a human pose recognition method which simultaneously reconstructs a human volume based on an ensemble of voxel classifiers from a single depth image in real time. Human pose recognition is a difficult task since a single depth camera can capture only the visible surfaces of a human body. In order to recognize invisible (self-occluded) surfaces of a human body, the proposed algorithm employs voxel classifiers trained with multi-layered synthetic voxels. Specifically, ray-casting onto a volumetric human model generates synthetic voxels, where each voxel consists of a 3D position and an ID corresponding to a body part. The synthesized volumetric data, which contain both visible and invisible body voxels, are utilized to train the voxel classifiers. As a result, the voxel classifiers not only identify the visible voxels but also reconstruct the 3D positions and the IDs of the invisible voxels. The experimental results show improved performance on estimating human poses due to the capability of inferring the invisible human body voxels. It is expected that the proposed algorithm can be applied to many fields such as telepresence, gaming, virtual fitting, the wellness business, and real 3D content control on real 3D displays.
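The multi-layered voxel synthesis described above can be sketched as follows. This is a toy illustration under assumed data, not the paper's volumetric human model: a small 3D grid of body-part IDs replaces the model, and rays are cast along one axis, with the first occupied voxel on each ray treated as visible and later hits as self-occluded layers.

```python
import numpy as np

# Toy volumetric "body": a 3D grid of body-part IDs (0 = empty).
# Assumed layout: a front part (ID 1) partially occluding a back part (ID 2).
vol = np.zeros((8, 8, 8), dtype=int)
vol[2:6, 2:6, 2] = 1   # front surface at depth z = 2
vol[1:7, 1:7, 5] = 2   # back surface at depth z = 5

def ray_cast_layers(vol):
    """Cast one ray per (x, y) along +z and collect every occupied voxel.
    The first hit on a ray is the visible surface; later hits are the
    self-occluded layers used to train the voxel classifiers."""
    samples = []
    H, W, D = vol.shape
    for x in range(H):
        for y in range(W):
            hits = [(x, y, z, vol[x, y, z]) for z in range(D) if vol[x, y, z]]
            for layer, (px, py, pz, part_id) in enumerate(hits):
                samples.append({"pos": (px, py, pz), "id": part_id,
                                "visible": layer == 0})
    return samples

samples = ray_cast_layers(vol)
visible = [s for s in samples if s["visible"]]
occluded = [s for s in samples if not s["visible"]]
```

Each sample carries a 3D position and a body-part ID, and the occluded samples are exactly the voxels a single depth camera could never observe directly, which is what allows the trained classifiers to infer invisible body parts.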
In this paper, we propose a novel depth estimation method from multiple coded apertures for 3D interaction. A flat panel
display is transformed into lens-less multi-view cameras which consist of multiple coded apertures. The sensor panel
behind the display captures the scene in front of the display through the modified uniformly redundant array (MURA)
imaging pattern on the display panel. To estimate the depth of an object in the scene, we first generate a stack
of synthetically refocused images at various distances by using the shifting and averaging approach for the captured
coded images. An initial depth map is then obtained by applying a focus operator to the stack of refocused images
at each pixel. Finally, the depth is refined by fitting a parametric focus model to the response curves near the initial
depth estimates. To demonstrate the effectiveness of the proposed algorithm, we construct an imaging system to capture
the scene in front of the display. The system consists of a display screen and an x-ray detector without a scintillator
layer, so that it acts as a visible-light sensor panel. Experimental results confirm that the proposed method accurately determines the
depth of an object including a human hand in front of the display by capturing multiple MURA coded images, generating
refocused images at different depth levels, and refining the initial depth estimates.
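The depth-refinement step above, fitting a parametric focus model to the response curve near the initial estimate, can be sketched as follows. This is a minimal illustration under assumed data: a parabola stands in for the paper's parametric focus model, and the per-pixel focus responses are synthesized rather than computed from captured MURA images.

```python
import numpy as np

def refine_depth(responses, depths):
    """Refine the argmax depth estimate by fitting a parabola (a simple
    stand-in for a parametric focus model) to the focus responses
    around the initial peak, giving a sub-level depth estimate."""
    i = int(np.argmax(responses))           # initial depth estimate (coarse)
    if i == 0 or i == len(responses) - 1:
        return depths[i]                    # peak at boundary: cannot refine
    y0, y1, y2 = responses[i - 1], responses[i], responses[i + 1]
    # Vertex of the parabola through three uniformly spaced samples.
    offset = 0.5 * (y0 - y2) / (y0 - 2 * y1 + y2)
    step = depths[1] - depths[0]
    return depths[i] + offset * step

# Synthetic per-pixel focus-response curve peaking at depth 3.3 (assumed data).
depths = np.arange(0.0, 7.0, 1.0)
responses = np.exp(-(depths - 3.3) ** 2)
refined = refine_depth(responses, depths)
```

The coarse argmax over the focal stack can only resolve depth to the spacing of the refocused-image stack; the model fit recovers a continuous depth between stack levels.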
In this paper, we propose an efficient synthetic refocusing method from multiple coded aperture images for 3D user
interaction. The proposed method is applied to a flat panel display with a sensor panel, which together form lens-less multi-view cameras. To capture the scene in front of the display, modified uniformly redundant array (MURA) patterns are displayed on the LCD screen without the backlight. Through the imaging patterns on the LCD screen, MURA coded
images are captured on the sensor panel. Instead of decoding every coded image to synthetically generate a refocused
image, the proposed method circularly shifts and averages all coded images and then decodes only the single averaged image corresponding to the refocused image at a given distance. Further, based on the proposed refocusing method, the depth of an object in front of the display is estimated by finding the most focused image for each pixel through a stack of refocused images at different depth levels. Experimental results show that the proposed method captures an object in front of the display, generates refocused images at different depth levels, and accurately determines the depth of objects, including real human hands, near the display.
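The efficiency claim above rests on the linearity of the decoding step: decoding the average of the circularly shifted coded images gives the same result as decoding each shifted image and averaging. This can be sketched in 1D with assumed data, where circular correlation with a random binary pattern stands in for the actual MURA decoding and the shift amounts are arbitrary placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def decode(img, dec):
    """Circular-correlation decoding (a 1D stand-in for MURA decoding),
    computed in the Fourier domain."""
    return np.real(np.fft.ifft(np.fft.fft(img) * np.conj(np.fft.fft(dec))))

n = 64
dec = rng.integers(0, 2, n).astype(float)   # stand-in decoding pattern
coded = [rng.random(n) for _ in range(4)]   # coded images from 4 apertures
shifts = [0, 3, 6, 9]                       # assumed per-aperture shifts for one depth

# Baseline: decode every circularly shifted coded image, then average.
slow = np.mean([decode(np.roll(c, s), dec) for c, s in zip(coded, shifts)],
               axis=0)

# Proposed order: circularly shift and average first, then decode once.
fast = decode(np.mean([np.roll(c, s) for c, s in zip(coded, shifts)], axis=0),
              dec)
```

Because decoding is linear, `slow` and `fast` agree to floating-point precision, but the proposed order performs one decode per depth level instead of one per coded image, which is what makes building the full refocused stack cheaper.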