This paper introduces a dual-projector phase measuring profiler that adds a second projector to a traditional structured light illumination system to improve the overall quality of 3D scanning. With this method, two projectors are synchronized to a single camera, but each projects structured light patterns of a unique frequency. The system benefits from a wider projection angle and doubled light intensity. In particular, a detailed hardware implementation of the system is described. Moreover, the major difference between phase unwrapping in our dual-projector system and in a single-projector system is discussed, and a LUT-based phase unwrapping scheme is proposed.
Structured Light Imaging (SLI) is a means of digital reconstruction, or Three-Dimensional (3D) scanning, with uses that span many disciplines. A projector, a camera, and a Personal Computer (PC) are required to perform such 3D scans. Slight variations in synchronization between these three devices can cause malfunctions in the process owing to the limitations of PC graphics processors as real-time systems. Previous work used a Field Programmable Gate Array (FPGA) to both drive the projector and trigger the camera, eliminating these timing issues but still requiring an external camera. This work proposes integrating the camera with the FPGA SLI controller by means of a custom printed circuit board (PCB) design. Featuring a high-speed image sensor as well as High Definition Multimedia Interface (HDMI) input and output, this PCB enables the FPGA to perform SLI scans as well as pass HDMI video through to the projector for Spatial Augmented Reality (SAR) purposes. Minimizing ripple noise on the power supply through effective circuit design and PCB layout yields a compact and cost-effective machine vision sensing solution.
Traditionally, temporal phase unwrapping for phase measuring profilometry needs to employ the phase computed from unit-frequency patterned images; however, it has recently been reported that two phases with co-prime frequencies can be used to absolutely unwrap each other. A manually constructed look-up table for the two known frequencies has to be used to unwrap the phases correctly, and if the two co-prime frequencies are changed, the look-up table has to be manually rebuilt. In this paper, a universal phase unwrapping algorithm is proposed to unwrap phase flexibly and automatically. The basis of the proposed algorithm is converting a signal-processing problem into a geometric analysis one. First, we normalize the two wrapped phases so that they share the same slope. Second, by using a modular operation, we unify the integer-valued difference of the two normalized phases over each wrapping interval. Third, by analyzing the properties of this uniform difference mathematically, we can automatically build a look-up table that records the correct fringe order for every wrapping interval. Even if the frequencies are changed, the look-up table is automatically updated for the new frequencies. Finally, with the order information stored in the look-up table, the wrapped phases can be correctly unwrapped. Both simulations and experimental results verify the correctness of the proposed algorithm.
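As a concrete illustration of the normalize / modular-difference / look-up-table idea, the sketch below builds and applies such a table for two co-prime frequencies. It is an illustrative reconstruction that follows the three steps above, not necessarily the authors' exact formulation; the function names and the synthetic phase ramp are ours.

```python
import numpy as np

def build_order_lut(f1, f2):
    """LUT mapping the integer-valued difference of the normalized wrapped
    phases to the fringe order k1 of frequency f1 (assumes gcd(f1, f2) = 1)."""
    lut = np.full(f1 * f2, -1, dtype=int)
    for k1 in range(f1):
        for k2 in range(f2):
            # the pair (k1, k2) occurs only if the wrapping intervals
            # [k1/f1, (k1+1)/f1) and [k2/f2, (k2+1)/f2) overlap
            if k1 / f1 < (k2 + 1) / f2 and k2 / f2 < (k1 + 1) / f1:
                lut[(f1 * k2 - f2 * k1) % (f1 * f2)] = k1
    return lut

def unwrap(phi1, phi2, f1, f2, lut):
    """Unwrap phi1 (wrapped phase of frequency f1, radians) using phi2,
    the wrapped phase of the co-prime frequency f2."""
    u1, u2 = phi1 / (2 * np.pi), phi2 / (2 * np.pi)      # cycles in [0, 1)
    # scale both phases to a common slope f1*f2; their difference is then
    # integer-valued and constant over each wrapping interval
    d = np.rint(f2 * u1 - f1 * u2).astype(int) % (f1 * f2)
    return phi1 + 2 * np.pi * lut[d]

# usage on a synthetic absolute phase ramp
f1, f2 = 5, 7
x = np.linspace(0, 1, 1000, endpoint=False)
phi1 = (2 * np.pi * f1 * x) % (2 * np.pi)
phi2 = (2 * np.pi * f2 * x) % (2 * np.pi)
lut = build_order_lut(f1, f2)
assert np.allclose(unwrap(phi1, phi2, f1, f2, lut), 2 * np.pi * f1 * x)
```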
Coded aperture compressive spectral imagers sense a three-dimensional cube by using two-dimensional projections of the coded and spectrally dispersed source. These imaging systems often rely on focal plane array (FPA) detectors, spatial light modulators (SLMs), digital micromirror devices (DMDs), and dispersive elements. The use of DMDs to implement the coded apertures facilitates the capture of multiple projections, each admitting a different coded aperture pattern. The DMD makes it possible not only to collect a sufficient number of measurements for spectrally rich or spatially detailed scenes but also to design the spatial structure of the coded apertures so as to maximize the information content of the compressive measurements. Although sparsity is the only signal characteristic usually assumed for reconstruction in compressive sensing, other forms of prior information, such as side information, have been included as a way to improve the quality of the reconstructions. This paper presents the coded aperture design in a compressive spectral imager with side information in the form of RGB images of the scene. The use of RGB images as side information for the compressive sensing architecture has two main advantages: the RGB images are used not only to improve the reconstruction quality but also to optimally design the coded apertures for the sensing process. The coded aperture design is based on the RGB scene, and thus the coded aperture structure exploits key features such as scene edges. Real reconstructions from noisy compressed measurements demonstrate the benefit of the designed coded apertures in addition to the improvement in reconstruction quality obtained by the use of side information.
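As a loose illustration of how RGB side information can shape a coded aperture, the sketch below opens aperture entries where the RGB edge strength is largest. This is only a hypothetical thresholding stand-in for the paper's optimized design; the function name and the open_fraction parameter are ours.

```python
import numpy as np

def edge_weighted_aperture(rgb, open_fraction=0.5):
    """Binary coded aperture favouring scene edges extracted from the RGB
    side information (illustrative; the paper optimizes the aperture
    structure rather than thresholding a gradient map)."""
    gray = rgb.astype(float).mean(axis=2)          # H x W x 3 -> H x W
    gy, gx = np.gradient(gray)
    strength = np.hypot(gx, gy)                    # edge strength per pixel
    thresh = np.quantile(strength, 1.0 - open_fraction)
    return (strength >= thresh).astype(np.uint8)   # 1 = open element
```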
In structured light illumination (SLI), the nonlinear distortion of the optical devices dramatically degrades the accuracy of three-dimensional reconstruction when only a small number of patterns is projected. We propose a universal algorithm that calibrates these device nonlinearities so that the patterns can be accurately precompensated. Thus, no postprocessing is needed to correct for the distortions, while the number of patterns can be reduced to a minimum. Theoretically, the proposed method can be applied to any SLI pattern strategy. Using a three-pattern SLI method, our experimental results show a 25× to 60× reduction in surface variance for a flat target, depending upon any surface smoothing applied to remove Gaussian noise.
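A minimal sketch of the precompensation idea, assuming the projector/camera intensity response has been measured once over all commanded gray levels: invert that response into a LUT and push each sinusoidal pattern through it before projection. The helper names and the synthetic gamma curve are illustrative assumptions, not the paper's calibration procedure.

```python
import numpy as np

def inverse_lut(measured):
    """Invert a monotonic 256-entry intensity response into a
    precompensation LUT: commanded_level = inv[desired_level]."""
    measured = (measured - measured.min()) / (measured.max() - measured.min())
    desired = np.linspace(0.0, 1.0, 256)
    # for each desired linear output, find the commanded level producing it
    return np.interp(desired, measured, np.arange(256)).astype(np.uint8)

def precompensated_pattern(width, freq, phase, inv):
    """Generate one PMP sinusoid, then map it through the inverse LUT so the
    *observed* intensity is sinusoidal despite the device nonlinearity."""
    x = np.arange(width)
    ideal = 0.5 + 0.5 * np.cos(2 * np.pi * freq * x / width + phase)
    levels = np.clip(np.round(ideal * 255), 0, 255).astype(int)
    return inv[levels]

# example with a synthetic gamma-2.2 response standing in for the calibration
measured = (np.arange(256) / 255.0) ** 2.2
inv = inverse_lut(measured)
pattern = precompensated_pattern(1024, freq=8, phase=0.0, inv=inv)
```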
Phase measurement profilometry is a well-known technique for making 3D measurements. The technique involves the projection of patterns with a sinusoidally varying spatial intensity. This approach has been used extensively to make highly accurate measurements of static scenes. Using structured light to make highly accurate measurements of human subjects is more difficult because of the inherent motion of the subject under test. In this paper, we discuss the implementation of LUT-based processing in combination with novel architectures to enable accurate measurements of human subjects. Two specific applications are reviewed: human body scanning and intra-oral dental scanning.
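For reference, the wrapped phase behind such measurements is typically computed from three phase-shifted images as sketched below; in LUT-based pipelines, the arctangent and normalization can be replaced by table lookups indexed by the small-integer numerator and denominator. The code is a generic three-pattern PMP sketch, not the specific architecture of either application.

```python
import numpy as np

def pmp3_phase(i1, i2, i3):
    """Wrapped phase and modulation from three PMP images captured with
    phase shifts of -2*pi/3, 0, +2*pi/3 (generic sketch)."""
    num = np.sqrt(3.0) * (i1 - i3)
    den = 2.0 * i2 - i1 - i3
    phase = np.arctan2(num, den)                   # wrapped phase in (-pi, pi]
    modulation = np.sqrt(num**2 + den**2) / 3.0    # low values flag unreliable pixels
    return phase, modulation
```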
The use of fingerprints as a biometric is both the oldest mode of computer-aided personal identification and the most relied-upon technology in use today. However, current acquisition methods have some challenging and peculiar difficulties. For higher-performance fingerprint data acquisition and verification, a novel noncontact 3-D fingerprint scanner is investigated, in which both detailed 3-D shape and albedo information of the finger are obtained. The obtained high-resolution 3-D prints are further converted into 3-D unraveled prints to be compatible with traditional 2-D automatic fingerprint identification systems. As a result, many limitations imposed on conventional fingerprint capture and processing can be reduced by the unobtrusiveness of this approach and the extra depth information acquired. To compare the quality and matching performance of 3-D unraveled prints with traditional 2-D plain fingerprints, we collect both 3-D prints and their 2-D plain counterparts. Print quality and matching performance are evaluated and analyzed using National Institute of Standards and Technology fingerprint software. Experimental results show that the 3-D unraveled print outperforms the 2-D print in both quality and matching performance.
Structured-light illumination (SLI) involves projecting a series of structured or striped patterns from a projector onto an object and then using a camera, placed at an angle to the projector, to record the target's 3-D shape. Because these structured patterns are multiplexed in time, traditional SLI systems require the target object to remain still during the scanning process. Thus, the technique of composite-pattern design was introduced as a means of combining multiple SLI patterns, using principles of frequency modulation, into a single pattern that can be continuously projected and from which the 3-D surface can be reconstructed from a single image, thereby enabling the recording of 3-D video. But the associated process of modulation and demodulation is limited by the spatial bandwidth of the projector-camera pair, which introduces distortion near surface or albedo discontinuities. Therefore, this paper introduces a postprocessing step to refine the reconstructed depth surface. Simulated experiments show a 78% reduction in depth error.
Human visual system (HVS) modeling has become a critical component in the design of digital halftoning algorithms. Methods that exploit the characteristics of the HVS include the direct binary search (DBS) and optimized tone-dependent halftoning approaches. The spatial sensitivity of the HVS is low-pass in nature, reflecting the physiological characteristics of the eye. Several HVS models have been proposed in the literature, among them the broadly used Näsänen exponential model, which was later shown to be constrained in shape. Richer models are needed to attain better halftone attributes and to control the appearance of undesired patterns. As an alternative, models based on mixtures of bivariate Gaussian density functions have been proposed. The mathematical characteristics of the HVS model thus play a key role in the synthesis of model-based halftoning. In this work, alpha stable functions, an elegant class of functions richer than mixed Gaussians, are exploited to design HVS models to be used in two different contexts: monochrome halftoning over rectangular and hexagonal sampling grids. In the two scenarios, alpha stable models prove to be more efficient than Gaussian mixtures, as they use fewer parameters to characterize the tails and bandwidth of the model. It is shown that a decrease in the model's bandwidth leads to homogeneous halftone patterns and, conversely, that models with heavier tails yield smoother textures. These characteristics, added to their simplicity, make alpha stable models a powerful tool for HVS characterization.
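One simple way to realize such a model is a radially symmetric alpha-stable frequency response, sketched below: alpha controls the tails and gamma the bandwidth, with alpha = 2 recovering a Gaussian shape and alpha = 1 an exponential one. The parameter values and the cycles-per-degree conversion are illustrative assumptions, not the fitted models from this work.

```python
import numpy as np

def alpha_stable_hvs(shape, dpi, view_dist_in, alpha=1.3, gamma=0.05):
    """Radially symmetric HVS frequency response of alpha-stable form
    H(rho) = exp(-(gamma * rho)**alpha), with rho in cycles/degree."""
    rows, cols = shape
    fy = np.fft.fftfreq(rows) * dpi               # cycles per inch on the page
    fx = np.fft.fftfreq(cols) * dpi
    # one degree of visual angle subtends roughly D * pi / 180 inches on the page
    cyc_per_deg = view_dist_in * np.pi / 180.0
    rho = np.hypot(*np.meshgrid(fx, fy)) * cyc_per_deg
    return np.exp(-(gamma * rho) ** alpha)

# e.g. a 256x256 filter for a 300 dpi print viewed from 12 inches
H = alpha_stable_hvs((256, 256), dpi=300, view_dist_in=12.0)
```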
We present an eight-million-point structured light illumination scanner design. It has a single-patch projection resolution of 12,288 lines along the phase direction. The Basler CMOS video cameras have a resolution of 2352 by 1726 pixels. The configuration consists of a custom Boulder Nonlinear Systems spatial light modulator for the projection system and dual four-megapixel digital video cameras. The camera fields of view are tiled with a minimal overlap region and a potential capture rate of 24 frames per second. This is a status report on a project still under development. We report on the concept of applying a 1D-square-footprint projection chip and give preliminary results of single-camera scans. The structured light illumination technique we use is the multi-pattern, multi-frequency phase measuring profilometry technique already published by our group.
Structured light illumination refers to a scanning process of projecting a series of patterns such that, when viewed from an angle, a camera is able to extract range information. Ultimately, resolution in depth is controlled by the number of patterns projected, which, in turn, increases the total time that the target object must remain still. By adding a second camera sensor, it becomes possible not only to achieve wrap-around scanning but also to reduce the number of patterns needed to achieve a given degree of depth resolution. A second camera also makes it possible to reconstruct 3-D surfaces through stereo-vision techniques, triangulating between the cameras instead of between the cameras and the projector. For both of these tasks, correspondence between points in the two cameras is essential. In this paper, we develop a new method to find the correspondence between the two cameras using both the phase information generated by the temporally multiplexed illumination patterns and stereo triangulation. We also analyze the resulting correspondence accuracy as a function of the number of structured patterns as well as the geometric position of the projector.
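The sketch below illustrates the phase-matching half of such a correspondence search, assuming a rectified camera pair so that matches lie along the same row; the triangulation constraint and validation described in the paper are omitted, and the function name is ours.

```python
import numpy as np

def phase_correspondence(phase_l, phase_r):
    """For each pixel of the left camera's unwrapped-phase image, find the
    sub-pixel column in the right image (same row, rectified pair assumed)
    whose phase matches, and return the disparity map."""
    rows, cols = phase_l.shape
    disparity = np.full((rows, cols), np.nan)
    for r in range(rows):
        pr = phase_r[r]
        for c in range(cols):
            target = phase_l[r, c]
            j = int(np.argmin(np.abs(pr - target)))      # nearest-phase column
            if 0 < j < cols - 1 and pr[j + 1] != pr[j - 1]:
                # linear interpolation between neighbours for sub-pixel accuracy
                frac = 2.0 * (target - pr[j]) / (pr[j + 1] - pr[j - 1])
                disparity[r, c] = c - (j + frac)
            else:
                disparity[r, c] = c - j
    return disparity
```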
Fingerprints are one of the most commonly used and relied-upon biometric technologies. But often the captured fingerprint image is far from ideal due to imperfect acquisition techniques that can be slow and cumbersome to use without providing complete fingerprint information. Most of the difficulties arise due to the contact of the fingerprint surface with the sensor platen. To overcome these difficulties, we have been developing a noncontact scanning system for acquiring a 3-D scan of a finger with sufficiently high resolution, which is then converted into a 2-D rolled-equivalent image. In this paper, we describe certain quantitative measures for evaluating scanner performance. Specifically, we use image software components developed by the National Institute of Standards and Technology to derive our performance metrics. Of the eleven identified metrics, three were found to be most suitable for evaluating scanner performance. A comparison is also made between 2D fingerprint images obtained by traditional means and the 2D images obtained after unrolling the 3D scans, and the quality of the acquired scans is quantified using the metrics.
Structured light illumination refers to a technique of acquiring 3-D surface scans through triangulation between a camera and a projector. Because traditional structured-light systems use multiple patterns projected sequentially in time, SLI is not typically associated with applications involving moving surfaces. To address this problem, the authors have introduced a technique referred to as composite pattern projection, which combines a set of standard SLI patterns into a single, continuously projected pattern such that depth can be recovered from a single captured image. As such, composite patterns can be used for tracking moving objects in 3-D space. The problem with composite patterns, though, is the added computational complexity associated with demodulating the captured image to extract the component SLI patterns. So in this paper, we introduce a means of achieving real-time pattern demodulation through the use of optical correlators, with demonstrated results achieving a processing rate of over 100 frames per second.
LIDAR-based systems measure the time-of-flight of a laser source onto the scene and back to the sensor, building a wide-field-of-view 3D raster image, but because this is a scanning process, there are problems associated with motion inside the scene over the duration of the scan. By illuminating the entire scene simultaneously using a broad laser pulse, a 2D camera equipped with a high-speed shutter can measure the time-of-flight over the entire field of view (FOV), thereby recording an instantaneous snapshot of the entire scene. However, spreading the laser reduces the range. What is required, then, is a programmable system that can track multiple regions of interest by varying the field of regard to (1) a single direction, (2) the entire FOV, or (3) intermediate views of interest as required by the evolving scene environment. In this project, the investigators intend to add this variable illumination capability to existing instantaneous ranging hardware by using a liquid crystal spatial light modulator (SLM) beam steering system that adaptively varies the (single or multi) beam intensity profiles and pointing directions. For autonomous satellite rendezvous, docking, and inspection, the system can perform long-range sensing with a narrow FOV while being able to expand the FOV as the target object approaches the sensor. To this end, in a previous paper we analyzed the performance of a commercially available TOF sensor (3DVSystems' Zmini) in terms of depth sensitivity versus target range and albedo. In this paper, we analyze the laser system specifications versus the range of field of view when beam steering is performed by means of a Boulder Nonlinear Systems phase-only liquid crystal SLM. Experimental results show that adjusting the laser beam FOV largely compensates for the reduction in reflected image grayscale from objects at long range, and they demonstrate the feasibility of extending range with the projection from the SLM.
The use of 3-Dimensional information in face recognition requires pose estimation. We present the use of a 3-Dimensional composite correlation filter to obtain pose estimates without the need for feature identification. Composite correlation filter research has been vigorously pursued over the last three decades due to its applications in many areas, but mainly in distortion-invariant pattern recognition. While most of this research is in two-dimensional space, we have extended our study of composite filters to three dimensions, specifically emphasizing the Linear Phase Coefficient Composite Filter (LPCCF). Unlike previous approaches to composite filter design, this method considers the filter design and the training set selection simultaneously. In this research, we demonstrate the potential of implementing the LPCCF in head pose estimation. We introduce the use of the LPCCF for head pose recovery through full correlation using a set of 3-D voxel maps instead of the typical 2-D pixel images or silhouettes. Unlike some existing approaches to pose estimation, we are able to acquire 3-D head pose without locating salient features of a subject. In theory, the correlation phase response contains information about the angle of head rotation of the subject. Pose estimation experiments are conducted for two degrees of freedom in rotation, that is, yaw and pitch angles. The results obtained are very much in line with our theoretical hypothesis on head orientation estimation.
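The correlation step itself can be sketched as a standard frequency-domain 3-D correlation, as below. The LPCCF design of the filter from the training set, which is the contribution here, is not shown; the peak-phase readout is only the generic mechanism by which rotation information would be extracted.

```python
import numpy as np

def correlate_3d(voxels, filt):
    """Frequency-domain correlation of a 3-D voxel map with a 3-D filter;
    returns the correlation volume, the peak location, and the peak phase."""
    V = np.fft.fftn(voxels)
    F = np.fft.fftn(filt, s=voxels.shape)
    c = np.fft.ifftn(V * np.conj(F))
    peak = np.unravel_index(np.argmax(np.abs(c)), c.shape)
    return c, peak, np.angle(c[peak])
```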
The use of fingerprints as a biometric is both the oldest mode of computer-aided personal identification and the most relied-upon technology in use today. But current fingerprint scanning systems have some challenging and peculiar difficulties. Often skin conditions and imperfect acquisition circumstances cause the captured fingerprint image to be far from ideal. Also, some of the acquisition techniques can be slow and cumbersome to use and may not provide the complete information required for reliable feature extraction and fingerprint matching. Most of the difficulties arise due to the contact of the fingerprint surface with the sensor platen. To attain a fast-capture, non-contact fingerprint scanning technology, we are developing a scanning system that employs structured light illumination as a means for acquiring a 3-D scan of the finger with sufficiently high resolution to record ridge-level details. In this paper, we describe the postprocessing steps used for converting the acquired 3-D scan of the subject's finger into a 2-D rolled-equivalent image.
Human visual system (HVS) modeling has become a critical component in the design of digital halftoning algorithms. Methods that exploit the characteristics of the HVS include the direct binary search (DBS) and optimized tone-dependent halftoning approaches. The spatial sensitivity of the HVS is lowpass in nature, reflecting the physiological characteristics of the eye. Several HVS models have been proposed in the literature, among them the broadly used Näsänen exponential model. As shown experimentally by Kim and Allebach [1], Näsänen's model is constrained in shape, and richer models are needed in order to attain better halftone attributes and to control the appearance of undesired patterns. As an alternative, they proposed a class of HVS models based on mixtures of bivariate Gaussian density functions. The mathematical characteristics of the HVS model thus play a key role in the synthesis of model-based halftoning. In this work, alpha stable functions, an elegant class of models richer than mixed Gaussians, are exploited. These are more efficient than Gaussian mixtures, as they use fewer parameters to characterize the tails and bandwidth of the model. It is shown that a decrease in the model's bandwidth leads to homogeneous halftone patterns and, conversely, models with heavier tails yield smoother textures. These characteristics, added to their simplicity, make alpha stable models a powerful tool for HVS characterization.
The focus of this paper is on the broad extension of multi-spectral color to both scientists and consumers by creating camera/projector arrays composed of commodity hardware. In contrast to expensive, high-maintenance systems that rely on the physical registration of device spaces, we rely on the virtual alignment of viewing spaces in software, where real-time alignment is achieved using the processing capacity of the graphics processing units of consumer PC video cards. Specifically, this paper focuses on the inclusion of real-time, composite-pattern structured light illumination (SLI) as a means of recording the 3D shape of objects, which is then used for the registration of single-color images taken from multiple viewpoints simultaneously. As such, the described system is able to achieve a cost per unit that scales linearly with the number of color primaries.
Halftoning approaches to image rendering on binary devices have traditionally relied on rectangular grids for dot placement. This practice has been followed mainly due to restrictions of printer hardware technology. However, recent advances in printing devices, coupled with the availability of efficient interpolation and resampling algorithms, are making the implementation of halftone prints over alternative dot-placement tessellations feasible. This is of particular interest since blue noise dithering principles indicate that the visual artifacts at several tone densities, which appear in rectangular-grid halftones, can be overcome through the use of hexagonal tessellations. While the spectral analysis of blue noise dithering provides the desired spectral characteristics one must attain, it does not provide the dithering structures needed to achieve them. In this paper, these optimal dithering mechanisms are developed through modifications of the Direct Binary Search (DBS) algorithm extensively used for rectangular grids. Special attention is given to the effects of the new geometry on the Human Visual System (HVS) models and on the efficient implementation of the hexagonal-grid DBS. This algorithm provides the best possible output at the expense of high computational complexity, and while the DBS algorithm is not practical for most applications, it provides a performance benchmark for other, more practical algorithms. Finally, a tone-dependent, hexagonal-grid, error-diffusion algorithm is developed, in which the DBS algorithm is used to optimize the underlying filter weights. The characteristics of the HVS are thus implicitly used in the optimization. Extensive simulations show that hexagonal grids do indeed reduce disturbing artifacts, providing smoother halftone textures over the entire gray-scale region. Results also show that tone-dependent error diffusion can provide results comparable to those of the DBS algorithm but at a significantly lower computational complexity.
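To show the structure being optimized, the sketch below implements tone-dependent error diffusion on an ordinary rectangular grid, with Floyd-Steinberg weights standing in for the DBS-optimized, hexagonal-grid weights developed in the paper; the weight table and serpentine scan are illustrative choices.

```python
import numpy as np

def tone_dependent_error_diffusion(img, weights_for_tone):
    """Serpentine error diffusion where the filter weights are chosen per
    input tone.  weights_for_tone(g) returns the weights for the
    (right, down-left, down, down-right) neighbours in scan order."""
    f = img.astype(float).copy()
    out = np.zeros_like(f)
    rows, cols = f.shape
    for r in range(rows):
        step = 1 if r % 2 == 0 else -1
        scan = range(cols) if step == 1 else range(cols - 1, -1, -1)
        for c in scan:
            out[r, c] = 255.0 if f[r, c] >= 128.0 else 0.0
            err = f[r, c] - out[r, c]
            w_r, w_dl, w_d, w_dr = weights_for_tone(img[r, c])
            if 0 <= c + step < cols:
                f[r, c + step] += err * w_r
            if r + 1 < rows:
                if 0 <= c - step < cols:
                    f[r + 1, c - step] += err * w_dl
                f[r + 1, c] += err * w_d
                if 0 <= c + step < cols:
                    f[r + 1, c + step] += err * w_dr
    return out

# Floyd-Steinberg weights used as a tone-independent placeholder
fs = lambda g: (7 / 16, 3 / 16, 5 / 16, 1 / 16)
```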
Phase-only spatial light modulators provide active pattern projection. Unlike incoherent techniques, the pattern energy is inversely proportional to the total pattern area, so if the patterns consist of spots or regions of light energy, it is possible to achieve a high signal-to-noise ratio within these regions. The 3DV Systems Zmini range finder works by fast switching of the illumination source to form a “light wall” and fast gating of the reflected image entering the camera. The Zmini operates by using a high-speed shutter to temporally clip the energy field going to one camera chip while allowing the full pulsed energy to go to a second camera chip. The second chip captures the albedo, which is effectively divided out of the shuttered chip pixelwise, leaving values that are proportional to depth. Thus, video-rate time-of-flight depth information is attained. By combining these two technologies, we can extend the operating range of the Zmini shuttered depth finder significantly. In this paper, we present a feasibility report on the range properties of the Zmini. The spatial light modulator under investigation is a 512 × 512 element, phase-only, liquid crystal device recently produced by Boulder Nonlinear Systems.
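A simplified sketch of the shuttered-over-full ratio principle described above, under an idealized light-wall model in which the gated fraction varies linearly with distance. The sign of the mapping and the calibration constants depend on the actual gate timing, so this is not the Zmini's calibrated processing.

```python
import numpy as np

def gated_tof_depth(shuttered, full, max_range_m):
    """Normalize the gated (shuttered) image by the full-pulse (albedo)
    image; in the idealized model the ratio is proportional to depth within
    the gate.  Depending on gate timing the mapping may instead be
    (1 - ratio); this sketch assumes the direct proportionality."""
    ratio = np.where(full > 0, shuttered / np.maximum(full, 1e-6), 0.0)
    ratio = np.clip(ratio, 0.0, 1.0)
    return ratio * max_range_m
```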
Interacting with computer technology while wearing a space suit is difficult at best. We present a sensor that can interpret body gestures in three dimensions. Having the depth dimension allows simple thresholding to isolate the hands as well as the use of their position and orientation as input controls to digital devices such as computers and/or robotic devices. Structured light pattern projection is a well-known method of accurately extracting 3-Dimensional information from a scene. Traditional structured light methods require several different patterns to recover the depth without ambiguity and albedo sensitivity, and are corrupted by object motion during the projection/capture process. The authors have developed a methodology for combining multiple patterns into a single composite pattern by using 2-Dimensional spatial modulation techniques. A single composite pattern projection does not require synchronization with the camera, so the data acquisition rate is limited only by the video rate. We have incorporated dynamic programming to greatly improve the resolution of the scan. Other applications include machine vision, remote-controlled robotic interfacing in space, advanced cockpit controls, and computer interfacing for the disabled. We will present performance analysis, experimental results, and video examples.
For consumer imaging applications, multi-spectral color refers to capturing and displaying images in more than three primary colors in order to achieve color gamuts significantly larger than those produced by RGB devices. In this paper, we describe the building of both a multi-camera recording system and a multi-projector display system using off-the-shelf components that, unlike existing multi-camera/projector systems relying on expensive and time-consuming optical alignment of camera/projector views, depend upon the virtual alignment of views performed in software. Once images are properly aligned, the described systems represent recording/display platforms that scale linearly in cost with the number of color primaries, where new colors are added by simply attaching more devices. In this paper, we illustrate frames of the color video produced using a five-camera system as well as an image of the aligned six projectors of the display system.
Structured light pattern projection is a well-known method of accurately extracting 3-Dimensional information from a scene. Traditional multi-frame structured light methods require several different patterns to recover the depth without ambiguity and albedo sensitivity, and are corrupted by object motion during the projection/capture process. The authors have developed a methodology for combining multiple patterns into a single composite pattern by using spatial modulation techniques. A single composite pattern projection does not require synchronization with the camera, so the data acquisition rate is limited only by the video rate and is therefore suitable for high-speed depth measurement. However, the composite pattern is constrained by the spatial bandwidth, which is directly related to the number of embedded patterns and the lateral resolution of the camera. Another problem is that the processing requires image demodulation, which is computationally intensive. As part of a NASA Phase I STTR, we address the first limitation by analyzing the source of the error and post-processing the reconstruction data with a dynamic programming approach. For the second problem, we propose the use of a 4-f optical correlator, not as a correlator, but instead as an optical demodulator. Simulation results show reasonable depth reconstruction using our strategy for the composite pattern after post-processing.
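The modulation step can be sketched as below: each sinusoidal SLI component (varying along x) amplitude-modulates a distinct carrier frequency along y, and the sum is normalized for projection. Recovery then amounts to band-pass filtering each carrier and taking its envelope, which is the operation the 4-f correlator is proposed to perform optically. The frequencies and normalization here are illustrative assumptions, not the exact patterns used in the paper.

```python
import numpy as np

def composite_pattern(height, width, stripe_freqs, carrier_freqs):
    """Combine several PMP sinusoids (varying along x) into one composite
    pattern by amplitude-modulating each onto a distinct carrier along y."""
    y = np.arange(height)[:, None] / height
    x = np.arange(width)[None, :] / width
    comp = np.zeros((height, width))
    for f_s, f_c in zip(stripe_freqs, carrier_freqs):
        component = 0.5 + 0.5 * np.cos(2 * np.pi * f_s * x)   # SLI component
        carrier = np.cos(2 * np.pi * f_c * y)                 # carrier along y
        comp += component * carrier
    comp -= comp.min()
    return comp / comp.max()          # normalize to [0, 1] for projection

# e.g. four components on well-separated carriers
pattern = composite_pattern(480, 640, stripe_freqs=[8, 8, 8, 1],
                            carrier_freqs=[40, 80, 120, 160])
```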
Based on recent discoveries, we present a method to project a single structured pattern and then reconstruct three-dimensional range from the distortions in the reflected and captured image. Traditional structured light methods require several different patterns to recover the depth without ambiguity and albedo sensitivity, and are corrupted by object movement during the projection/capture process. Our method efficiently combines multiple patterns into a single composite pattern projection, allowing for real-time implementations. Because structured light techniques require only standard image capture and projection technology, unlike time-of-arrival techniques, they are relatively low cost. Attaining low-cost 3D video acquisition would have a profound impact on most applications that are presently limited to 2D video imaging. Furthermore, it would enable many other applications. In particular, we are studying real-time depth imagery for tracking hand motion and rotation as an interface to a virtual reality. Applications include remote-controlled robotic interfacing in space, advanced cockpit controls, and computer interfacing for the disabled.
Green noise is the mid-frequency component of white noise and has been shown to have visually pleasing attributes when applied to digital halftoning. Unlike blue noise dither patterns, which are composed exclusively of isolated pixels, green noise dither patterns are composed of pixel clusters, making them less susceptible to image degradation from non-ideal printing artifacts such as dot loss. These patterns clearly reduce the spatial variation in tone produced by electrophotographic printers when printing a constant shade of gray, but to date, no study has been presented showing the amount of reduction. In this paper, we address this problem by studying the effects of changing the average cluster size in a green noise dither pattern, measuring the resulting spatial variations for a Lexmark Optra laser printer in 1200 dpi mode. Print quality is evaluated in terms of the visibility of printer mechanism noise and the average change in tone across the printed page.