We have developed a shape and structure capture system that constructs accurate, realistic 3D models from video imagery taken with a single, freely moving handheld camera. Using an inexpensive off-the-shelf acquisition system such as a handheld video camera, we demonstrate the feasibility of fast and accurate generation of these 3D models at very low cost.
In our approach, the operator moves the camera freely within some very simple constraints. Our process identifies and tracks high-interest image features and computes the relative pose of the camera from those tracks. Using a RANSAC-like approach, we solve for the camera pose and 3D structure based on a homography or an essential matrix.
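The hypothesize-and-verify loop behind a RANSAC-like estimator can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the threshold, and the iteration count are assumptions, and a pure 2D translation between frames stands in for the homography or essential matrix so that the example stays short and dependency-free.

```python
import random


def ransac(matches, fit, residual, sample_size, threshold, iterations=200, seed=0):
    """Generic RANSAC loop: sample, fit, count inliers, keep the best.

    matches     -- list of correspondences between two frames
    fit         -- function(list of matches) -> model
    residual    -- function(model, match) -> non-negative error
    sample_size -- minimal number of matches needed to fit a model
    """
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        sample = rng.sample(matches, sample_size)
        model = fit(sample)
        inliers = [m for m in matches if residual(model, m) < threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    if best_inliers:
        # Refit on the whole consensus set for a less noisy estimate.
        best_model = fit(best_inliers)
    return best_model, best_inliers


# Stand-in motion model (an assumption for illustration only):
# a pure 2D translation (dx, dy) between tracked feature positions.
def fit_translation(sample):
    n = len(sample)
    dx = sum(q[0] - p[0] for p, q in sample) / n
    dy = sum(q[1] - p[1] for p, q in sample) / n
    return (dx, dy)


def translation_error(model, match):
    (px, py), (qx, qy) = match
    dx, dy = model
    return ((px + dx - qx) ** 2 + (py + dy - qy) ** 2) ** 0.5


def demo_matches():
    """Twenty consistent feature tracks plus three gross mismatches."""
    good = [((float(i), float(2 * i)), (i + 5.0, 2 * i - 3.0)) for i in range(20)]
    bad = [((0.0, 0.0), (40.0, 40.0)),
           ((1.0, 1.0), (-30.0, 12.0)),
           ((2.0, 5.0), (9.0, 90.0))]
    return good + bad


if __name__ == "__main__":
    model, inliers = ransac(demo_matches(), fit_translation,
                            translation_error, sample_size=1, threshold=1.0)
    print(model, len(inliers))
```

For a homography or an essential matrix, only `fit`, `residual`, and `sample_size` would change; the surrounding loop stays the same.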
Once we have the pose for many frames in the sequence, we perform correlation-based stereo to obtain dense point clouds. After these point clouds are computed, we integrate them into an octree. By replacing the points in a particular cell with statistics representing the point distribution, we can store the computed model efficiently. Besides being efficient, the integration technique enables filtering based on occupancy counts, which eliminates many stereo outliers and results in an aesthetically pleasing, viewable 3D model.
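The idea of replacing points with per-cell statistics and filtering on occupancy can be illustrated with a short sketch. It makes several assumptions: a flat hash of fixed-size cells stands in for the octree (roughly the leaves at a single depth), the only statistics kept are a count and a coordinate sum, and the names `integrate`, `filter_by_occupancy`, `cell_size`, and `min_count` are hypothetical.

```python
import math


class CellStats:
    """Running count and coordinate sums for the points that fell in one cell."""

    def __init__(self):
        self.count = 0
        self.sx = self.sy = self.sz = 0.0

    def add(self, x, y, z):
        self.count += 1
        self.sx += x
        self.sy += y
        self.sz += z

    def mean(self):
        return (self.sx / self.count, self.sy / self.count, self.sz / self.count)


def integrate(points, cell_size, cells=None):
    """Accumulate 3D points into cells; raw points are not stored."""
    if cells is None:
        cells = {}
    for x, y, z in points:
        key = (math.floor(x / cell_size),
               math.floor(y / cell_size),
               math.floor(z / cell_size))
        cells.setdefault(key, CellStats()).add(x, y, z)
    return cells


def filter_by_occupancy(cells, min_count):
    """Keep one representative point per cell that was hit often enough."""
    return [c.mean() for c in cells.values() if c.count >= min_count]


if __name__ == "__main__":
    cloud = [(0.1, 0.1, 0.1), (0.2, 0.2, 0.2), (0.3, 0.3, 0.3),
             (5.5, 5.5, 5.5)]  # the last point plays the role of a stereo outlier
    cells = integrate(cloud, cell_size=1.0)
    print(filter_by_occupancy(cells, min_count=2))
```

Because each cell holds only a few numbers, memory depends on the number of occupied cells rather than the number of stereo points, and isolated outliers fall below the occupancy threshold.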
In this paper we describe our approach in detail and show reconstruction results for a synthetic room, an empty room, a lightly furnished room, and an experimental vehicle.
This work addresses terrain classification for path planning on an Unmanned Ground Vehicle (UGV) platform. We are interested in classifying features such as rocks, bushes, trees, and dirt roads. Currently, the data is acquired from a color camera mounted on the UGV; range data from a second sensor can be added in the future. Classification is accomplished by first coarsely segmenting a frame and then refining the initial segmentation through a convenient user interface. After the first frame, temporal information is exploited to improve the quality of the image segmentation and to help the classification adapt to changes in ambient lighting, shadows, and scene content as the platform moves. The Mean Shift Classifier algorithm provides the segmentation of the current frame data. We have tested these algorithms on four sequences of frames acquired in an environment with terrain representative of what we expect to see in the field. For each frame in a sequence, the results were compared with accurate, manually segmented (ground-truth) data.
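The mode-seeking step at the core of mean shift segmentation can be sketched as follows. This is a simplified illustration under stated assumptions, not the classifier used in this work: it operates on one-dimensional intensity values rather than color pixels, uses a flat kernel, and the `bandwidth` parameter and function names are hypothetical.

```python
def mean_shift_mode(value, samples, bandwidth, max_iter=100, tol=1e-6):
    """Move a value uphill to the nearest density mode (flat kernel)."""
    for _ in range(max_iter):
        window = [s for s in samples if abs(s - value) <= bandwidth]
        if not window:
            return value
        new_value = sum(window) / len(window)
        if abs(new_value - value) < tol:
            return new_value
        value = new_value
    return value


def segment(samples, bandwidth):
    """Label each sample by the mode it converges to.

    Modes closer than half the bandwidth are merged into one segment.
    """
    modes = []   # one representative mode per segment
    labels = []  # segment index for each input sample
    for s in samples:
        m = mean_shift_mode(s, samples, bandwidth)
        for idx, existing in enumerate(modes):
            if abs(existing - m) < bandwidth / 2:
                labels.append(idx)
                break
        else:
            modes.append(m)
            labels.append(len(modes) - 1)
    return labels, modes


if __name__ == "__main__":
    # Two well-separated intensity groups, e.g. dark soil and bright rock.
    intensities = [10, 12, 11, 200, 205, 198, 13]
    labels, modes = segment(intensities, bandwidth=20)
    print(labels, modes)
```

The number of segments is not fixed in advance; it follows from the bandwidth, which is one reason mean shift suits scenes whose content changes as the platform moves.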
This system provides real-time guidance for training and problem-solving on production-line machinery. A prototype of a wearable, interactive, real-time video guidance system for use in manufacturing has been developed and demonstrated. The anticipated benefits are that relatively inexperienced personnel can service machines and that dependency on the vendor to repair or maintain equipment is significantly reduced. Additionally, servicing, training, and part change-over schedules can be exercised more predictably and with less training. The approach uses Head Worn Display, or Head Mounted Display (HMD), technology that can be readily adapted to various machines on the factory floor, with training steps for each new location. Such a system can support manufacturing applications such as direct video guidance and scheduled maintenance and training, helping to resolve servicing emergencies effectively and to reduce machine downtime. It can also be used to train inexperienced operators and maintenance personnel. The gap between production-line complexity and the ability of production personnel to maintain equipment effectively is expected to widen in the future, and advanced equipment will require complex servicing procedures that are neither well documented nor user-friendly. The system increases manufacturing equipment availability by facilitating effective servicing and training, and it can interface to a server system for additional computational resources on an as-needed basis. It uses markers to guide the user and enforces a well-defined sequence of operations, augmenting the information on the display to provide guidance in real time.