To maximize the use of a robotic probe during its limited lifetime, scientists must immediately be provided with 3D data
products of the best achievable visual quality. The EU FP7-SPACE Project PRoVisG (2008-2012) develops
technology for the rapid processing and effective representation of visual data by improving ground processing facilities.
In September 2011 PRoVisG held a Field Trials campaign in the Caldera of Tenerife to verify the implemented 3D
Vision processing mechanisms and to collect various sets of reference data in a representative environment. The campaign
was strongly supported by the Astrium UK Rover Bridget as a representative platform which allows simultaneous onboard
mounting and powering of various vision sensors such as the Aberystwyth ExoMars PanCam Emulator (AUPE).
The paper covers the preparation work for such a campaign and highlights the experiments that include standard
operations- and science-related components but also data capture to verify specific processing functions.
We give an overview of the captured data and the compiled and envisaged processing results, as well as a summary of
the test sites, logistics and test assets utilized during the campaign.
Mobile systems exploring planetary surfaces will require more autonomy in future than they have today. The EU FP7-SPACE
Project PRoViScout (2010-2012) establishes the building blocks of such autonomous exploration systems in terms of
robotics vision by a decision-based combination of navigation and scientific target selection, and integrates them into a
framework ready for and exposed to field demonstration.
The PRoViScout on-board system consists of mission management components such as an Executive, a Mars Mission
On-Board Planner and Scheduler, a Science Assessment Module, and Navigation & Vision Processing modules. The
platform hardware consists of the rover with the sensors and pointing devices.
We report on the major building blocks and their functions & interfaces, emphasizing the computer vision parts such
as image acquisition (using a novel zoomed 3D-Time-of-Flight & RGB camera), mapping from 3D-TOF data,
panoramic image & stereo reconstruction, hazard and slope maps, visual odometry and the recognition of potential
scientifically interesting targets.
Ski jumping has continuously attracted major public interest since the early 1970s, mainly in Europe and
Japan. The sport undergoes high-level analysis and development based, among other things, on biodynamic measurements
during the take-off and flight phase of the jumper. We report on a vision-based solution for such measurements that
provides a full 3D trajectory of unique points on the jumper's shape. During the jump, synchronized stereo images are
taken by a calibrated camera system at video rate. Using methods stemming from video surveillance, the jumper is
detected and localized in the individual stereo images, and learning-based deformable shape analysis identifies the
jumper's silhouette. The 3D reconstruction of the trajectory is based on standard stereo forward intersection of
distinct shape points, such as helmet top or heel. In the reported study, the measurements are verified against an
independent GPS receiver mounted on top of the jumper's helmet, synchronized to the timing of camera exposures.
Preliminary estimates indicate an accuracy of +/-20 cm at 30 Hz imaging frequency within a 40 m trajectory. The system is
ready for fully-automatic on-line application on ski-jumping sites that allow stereo camera views with an approximate
base-distance ratio of 1:3 within the entire area of investigation.
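The forward intersection step can be illustrated with a minimal sketch. It assumes that each camera contributes a projection centre and a viewing ray towards the same shape point (for example the helmet top) and returns the midpoint of the shortest segment between the two rays. The function name `forward_intersection` and the tuple-based vectors are illustrative assumptions, not the implementation used in the study.

```python
def forward_intersection(c1, d1, c2, d2):
    """Midpoint of the shortest segment between rays c1 + s*d1 and c2 + t*d2.

    c1, c2: camera projection centres; d1, d2: ray directions (3-tuples).
    Hypothetical helper for illustration only.
    """
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    w = tuple(a - b for a, b in zip(c1, c2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are (nearly) parallel; no stable intersection")
    # Ray parameters that minimise the distance between the two rays.
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = tuple(ci + s * di for ci, di in zip(c1, d1))
    p2 = tuple(ci + t * di for ci, di in zip(c2, d2))
    return tuple((u + v) / 2 for u, v in zip(p1, p2))


# Two cameras 1 m apart, both observing the point (0.5, 0.2, 3.0).
point = forward_intersection((0.0, 0.0, 0.0), (0.5, 0.2, 3.0),
                             (1.0, 0.0, 0.0), (-0.5, 0.2, 3.0))
```

With noisy image measurements the two rays no longer meet exactly, and the length of the segment between them gives a simple consistency check.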
One of the most important monitoring tasks of tunnel inspection is the observation of cracks. This paper describes an approach for crack following using mid-resolution (2-5 mm per pixel) images of the tunnel surface. A mosaic on the basis of the tunnel design surface is built from images taken with a mobile platform. On this image, which represents the unwrapped tunnel surface texture, the starting points of each crack are found semiautomatically using a modified Hough transform. Crack following takes place on the basis of local line fitting and exhaustive search in both directions of the crack, taking into account several restrictions, rules and optimization criteria to find the correct crack trajectory. A practical implementation polygonizes the extracted cracks and feeds them into a tunnel inspection database. The method is applicable to various types of background texture as expected in the tunnel environment.
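The idea of following a crack in both directions from a starting point can be sketched in a much simplified form. The sketch below assumes that cracks appear darker than the background and replaces the local line fitting and exhaustive search of the paper with a greedy step: at each pixel it examines only the three neighbours that continue the current heading and moves to the darkest one below a gray-level threshold. All names (`follow_crack`, `max_gray`, the heading indices) are hypothetical.

```python
# Eight neighbour offsets (row, col), ordered clockwise starting at "east".
DIRS = [(0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1)]


def follow_one_way(image, seed, heading, max_gray, max_steps=10_000):
    """Greedily follow dark pixels from seed, turning at most 45 degrees per step."""
    rows, cols = len(image), len(image[0])
    r, c = seed
    visited = {seed}
    path = []
    for _ in range(max_steps):
        best = None
        for turn in (0, -1, 1):  # straight ahead first, then the two slight turns
            k = (heading + turn) % 8
            nr, nc = r + DIRS[k][0], c + DIRS[k][1]
            if not (0 <= nr < rows and 0 <= nc < cols) or (nr, nc) in visited:
                continue
            gray = image[nr][nc]
            if gray <= max_gray and (best is None or gray < best[0]):
                best = (gray, k, nr, nc)
        if best is None:  # no dark continuation: the crack ends here
            break
        _, heading, r, c = best
        visited.add((r, c))
        path.append((r, c))
    return path


def follow_crack(image, seed, heading, max_gray):
    """Follow the crack from seed in the given heading and in the opposite one."""
    forward = follow_one_way(image, seed, heading, max_gray)
    backward = follow_one_way(image, seed, (heading + 4) % 8, max_gray)
    return backward[::-1] + [seed] + forward


# Synthetic 5x5 surface with a dark diagonal crack on a bright background.
surface = [[10 if r == c else 200 for c in range(5)] for r in range(5)]
polyline = follow_crack(surface, seed=(2, 2), heading=1, max_gray=50)
```

The returned pixel chain is the kind of input a later polygonization step would simplify into a polyline.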
Within the European Mars Express Mission, to be launched in 2003, the Beagle2 Lander will provide access to stereoscopic views of the surrounding Martian surface after touchdown. For scientific purposes, a high-resolution three-dimensional (3D) reconstruction of the landing site is clearly necessary. A lander vision subsystem is therefore used that is capable of reconstructing the landing site and its vicinity using a stereo camera mounted on the robotic arm of the lander. The geometric camera parameters (position and pointing with respect to each other, position and pointing with respect to the lander, intrinsic parameters and lens distortion) are determined in a calibration step on the ground before launch. The 3D reconstruction of the landing site is performed after landing by means of stereo matching using the transmitted images. Several stereo reconstructions are merged using the respective robotic arm states during image acquisition for calibration. This paper describes the full processing chain consisting of calibration of the sensor system, stereo matching, 3D reconstruction and merging of results. Emphasis is laid on the stereo reconstruction step. A software system configuration is proposed. Tests using Mars Pathfinder images as example data show the feasibility of the approach and give accuracy estimations.
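The reconstruction step that follows stereo matching can be illustrated for the simplest case. The sketch assumes an ideal rectified stereo pair without lens distortion, so that depth follows directly from the disparity; the abstract does not state that the lander cameras are used in this configuration. The function and parameter names are illustrative assumptions.

```python
def disparity_to_point(u, v, disparity, focal_px, baseline_m, cx, cy):
    """Convert a matched pixel of a rectified stereo pair to a 3D point.

    u, v: pixel in the left image; disparity: u_left - u_right in pixels;
    focal_px: focal length in pixels; baseline_m: camera separation in metres;
    cx, cy: principal point. Returns (x, y, z) in the left camera frame.
    """
    if disparity <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    z = focal_px * baseline_m / disparity
    x = (u - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return (x, y, z)


# A pixel 50 columns right of the principal point with 10 px disparity.
x, y, z = disparity_to_point(370, 240, 10, focal_px=500, baseline_m=0.2, cx=320, cy=240)
```

Because depth is inversely proportional to disparity, a fixed matching error of a fraction of a pixel produces a range error that grows roughly with the square of the distance.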
One approach to stereo matching is to use different local features to find correspondences. The selection of an optimum feature set is the subject of this paper. An operational software tool based on the principle of comparing feature vectors is used for stereo matching. A relatively large set of different local features is searched for optimum combinations of 6 to 10 of them. This is done by a genetic process that uses an intrinsic quality criterion that evaluates the correctness of each individual match. The convergence of the genetic feature selection process is demonstrated on a real stereo pair of a tunnel surface. Four areas were used for individual optimization. After several hundred generations for each of the areas, it is shown that the identified feature sets result in a considerably better stereo matching result than the currently used features, which were the result of an initial manual choice. The experiments described in this paper use a 'super-set' of 145 features for every pixel, which are created by filtering the image with convolution kernels (averaging, Gaussian filters, bandpass, highpass), median filters and Gabor kernels. From these 145 filters, the genetic feature selection process selects an optimal set of operators. Using the selected filters results in a 15% improvement in matching accuracy and robustness.
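A genetic search over feature subsets can be sketched as follows. This is a generic genetic algorithm with elitism, union-based crossover and single-feature mutation, not the specific process of the paper; the `fitness` callable stands in for the paper's intrinsic match-quality criterion, and all names and default parameters are assumptions.

```python
import random


def genetic_select(n_features, subset_size, fitness, generations=50,
                   pop_size=20, mutation_rate=0.2, seed=0):
    """Search for a high-fitness subset of feature indices.

    fitness: callable taking a sorted tuple of indices and returning a score.
    Returns the best subset found and the best score of each generation.
    """
    rng = random.Random(seed)
    population = [tuple(sorted(rng.sample(range(n_features), subset_size)))
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        parents = ranked[: pop_size // 2]
        children = [ranked[0]]  # elitism: the best subset always survives
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Crossover: draw the child from the union of both parents.
            pool = sorted(set(a) | set(b))
            child = set(rng.sample(pool, subset_size))
            if rng.random() < mutation_rate:
                # Mutation: swap one selected feature for a random other one.
                child.discard(rng.choice(sorted(child)))
                unused = [i for i in range(n_features) if i not in child]
                child.add(rng.choice(unused))
            children.append(tuple(sorted(child)))
        population = children
    return max(population, key=fitness), history


# Toy criterion: pretend that features 0..7 are the useful ones.
useful = set(range(8))
best, history = genetic_select(145, 8, lambda s: len(useful & set(s)))
```

In a real setting each fitness evaluation would run the matcher with the candidate feature set, which is why the number of generations and the population size dominate the run time.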
A vision-based navigation system is a basic tool for the autonomous operation of unmanned vehicles. For off-road navigation, this means that a vehicle equipped with a stereo vision system and perhaps a laser ranging device must be able to maintain a high level of autonomy under various illumination conditions and with little a priori information about the underlying scene. The task becomes particularly important for unmanned planetary exploration with the help of autonomous rovers. For example, in the LEDA Moon exploration project currently a focus of the European Space Agency (ESA), the vehicle (rover) should perform the following operations in autonomous mode: on-board absolute localization, elevation model (DEM) generation, obstacle detection and relative localization, global path planning and execution. The focus of this article is a computational solution for fully autonomous path planning and path execution. An operational DEM generation method based on stereoscopy is introduced. Self-localization on the DEM and robust natural feature tracking are used as basic navigation steps, supported by inertial sensor systems. The following operations are performed on the basis of stereo image sequences: 3D scene reconstruction, risk map generation, local path planning, camera position update during the motion on the basis of landmark tracking, and obstacle avoidance. Experimental verification is done with the help of a laboratory terrain mockup and a high precision camera mounting device. It is shown that standalone tracking using automatically identified landmarks is robust enough to give navigation data for further stereoscopic reconstruction of the surrounding terrain. Iterative tracking and reconstruction lead to a complete description of the vehicle path and its surroundings with an accuracy high enough to meet the specifications for autonomous outdoor navigation.
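One simple way to derive a risk map from a DEM is to threshold the local terrain slope. The sketch below is an assumption about what such a step could look like, not the risk criterion of the article: it estimates the slope with finite differences on a regular grid and flags cells steeper than a limit. `slope_map`, `risk_map` and `max_slope_deg` are hypothetical names.

```python
import math


def slope_map(dem, cell_size):
    """Slope in degrees per DEM cell (grid must be at least 2x2).

    Uses central differences inside the grid and one-sided ones at the border.
    """
    rows, cols = len(dem), len(dem[0])
    slopes = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            r0, r1 = max(r - 1, 0), min(r + 1, rows - 1)
            c0, c1 = max(c - 1, 0), min(c + 1, cols - 1)
            dz_dy = (dem[r1][c] - dem[r0][c]) / ((r1 - r0) * cell_size)
            dz_dx = (dem[r][c1] - dem[r][c0]) / ((c1 - c0) * cell_size)
            slopes[r][c] = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    return slopes


def risk_map(dem, cell_size, max_slope_deg):
    """True where the terrain is steeper than the vehicle is assumed to tolerate."""
    return [[s > max_slope_deg for s in row] for row in slope_map(dem, cell_size)]


# A ramp rising 1 m per 1 m cell: every cell has a 45 degree slope.
ramp = [[float(c) for c in range(4)] for _ in range(3)]
hazards = risk_map(ramp, cell_size=1.0, max_slope_deg=30.0)
```

A path planner could then treat flagged cells as obstacles; a practical risk measure would likely also consider roughness and step height.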
3D reconstruction of highly textured surfaces on unvegetated terrain is of major interest for stereo vision based mapping applications. We describe a prototype system for automatic modeling of such scenes. It is based on two frame CCD cameras, which are tightly attached to each other, ensuring constant relative orientation. One camera is used to acquire known reference points to get the exterior orientation of the system, the other records the surface images. The system is portable to keep image acquisition as short as possible. Automatic calibration using the images acquired by the calibration camera permits the computation of exterior orientation parameters of the surface camera via a transformation matrix. A robust matching method providing dense disparities together with a flexible reconstruction algorithm renders an accurate grid of 3D points on arbitrarily shaped surfaces. The results of several stereo reconstructions are merged. Projection onto the global shape allows easy evaluation of volumes, and thematic mapping with respect to the desired surface geometry in construction processes. We report on accuracy and emphasize practical usage. It is shown that the prototype system is able to generate a proper data set of surface descriptions that is accurate and dense enough to serve as a documentation, planning and accounting basis.
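Deriving the exterior orientation of the surface camera from that of the calibration camera amounts to chaining two rigid transformations. The sketch below assumes both are given as 4x4 homogeneous matrices, with the fixed relative orientation known from a prior calibration; the frame names and the helper `compose` are illustrative assumptions.

```python
def compose(t_ab, t_bc):
    """Return the 4x4 matrix product t_ab @ t_bc (pose of frame c in frame a)."""
    return [[sum(t_ab[i][k] * t_bc[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


# Pose of the calibration camera in the world frame, as obtained from the
# reference points: rotated 90 degrees about z and located at (1, 2, 3).
world_from_calib = [[0.0, -1.0, 0.0, 1.0],
                    [1.0, 0.0, 0.0, 2.0],
                    [0.0, 0.0, 1.0, 3.0],
                    [0.0, 0.0, 0.0, 1.0]]

# Fixed mounting of the surface camera relative to the calibration camera:
# same pointing, offset by 0.5 m along the calibration camera's x axis.
calib_from_surface = [[1.0, 0.0, 0.0, 0.5],
                      [0.0, 1.0, 0.0, 0.0],
                      [0.0, 0.0, 1.0, 0.0],
                      [0.0, 0.0, 0.0, 1.0]]

world_from_surface = compose(world_from_calib, calib_from_surface)
```

The last column of `world_from_surface` holds the surface camera position in world coordinates, here (1, 2.5, 3), and its upper-left 3x3 block holds the camera's orientation.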
Planetary space exploration by unmanned missions strongly relies on automatic navigation methods. Computer vision has been recognized as a key to the feasibility of robust navigation, landing site identification and hazard avoidance. We present a scenario that uses computer vision methods for the early identification of landing spots, emphasizing the phase between ten kilometers from ground and the identification of the lander position relative to the selected landing site. The key element is a robust matching procedure between the elevation model (and imagery) acquired during orbit, and ground features observed during approach to the desired landing site. We describe how (1) preselection of characteristic landmarks reduces the computational effort, and (2) a hierarchical data structure (pyramid) on graylevels and elevation models can be successfully combined to achieve a robust landing and navigation system. The behavior of such a system is demonstrated by simulation experiments carried out in a laboratory mock-up. This paper follows up on previous work we have performed for the Mars mission scenario, and shows relevant changes that emerge in the Moon mission case of the European Space Agency.
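The hierarchical data structure mentioned above can be illustrated with a minimal pyramid builder. The sketch assumes a plain 2x2 averaging scheme, which applies equally to a graylevel image and to an elevation grid; the abstract does not specify the reduction filter, and `build_pyramid` is a hypothetical name.

```python
def build_pyramid(grid, levels):
    """Return a list of grids, each half the resolution of the previous one.

    Every coarser cell is the mean of a 2x2 block; odd trailing rows or
    columns are dropped. Stops early once a level would become empty.
    """
    pyramid = [grid]
    for _ in range(levels - 1):
        prev = pyramid[-1]
        rows, cols = len(prev) // 2, len(prev[0]) // 2
        if rows == 0 or cols == 0:
            break
        pyramid.append([
            [(prev[2 * r][2 * c] + prev[2 * r][2 * c + 1]
              + prev[2 * r + 1][2 * c] + prev[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(cols)]
            for r in range(rows)
        ])
    return pyramid


elevation = [[0, 2, 4, 6],
             [0, 2, 4, 6],
             [8, 8, 8, 8],
             [8, 8, 8, 8]]
levels = build_pyramid(elevation, levels=5)
```

Matching would typically start at the coarsest level, where few cells have to be compared, and use each result to narrow the search at the next finer level.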
This paper deals with stereo matching, which is reformulated as a statistical pattern recognition problem. In stereo, the computation of correspondences of image points in the right and left image is viewed as a two-class pattern recognition problem. The two matching left-right points are said to constitute class 1 (matching) and the points in the neighborhood of these points form class 2 (non-matching). We have argued before that matching can be drastically improved by using several features rather than just graylevels (usually called area-based matching) or edges (usually called edge-based matching). Based on this formulation of matching as a pattern recognition problem, well-known theories to optimize feature extraction and feature selection should be applied to stereo as well. In the paper we show the results of experiments to support the statistical framework for stereo and how the performance of a stereo system can be improved by taking into account the findings of statistical pattern recognition.
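The two-class view of matching can be made concrete with a small sketch. It assumes rectified images, so that candidates lie on the same scanline, and uses a nearest-neighbour rule in feature space: the candidate whose feature vector is closest to that of the left pixel is assigned to the matching class, all others to the non-matching class. The decision rule and all names are illustrative assumptions, not the classifier of the paper.

```python
def match_along_scanline(left_row, right_row, x, max_disparity):
    """Find the disparity whose right-image feature vector is closest to left_row[x].

    left_row, right_row: one feature vector (tuple of floats) per pixel.
    Returns (disparity, squared distance) of the best candidate.
    """
    best_d, best_dist = None, float("inf")
    for d in range(max_disparity + 1):
        xr = x - d  # a left pixel at x maps to x - d in the right image
        if xr < 0:
            break
        dist = sum((a - b) ** 2 for a, b in zip(left_row[x], right_row[xr]))
        if dist < best_dist:
            best_d, best_dist = d, dist
    return best_d, best_dist


# Two features per pixel; the left row is the right row shifted by 2 pixels.
right = [(float(i), float(i * i)) for i in range(10)]
left = [(-1.0, -1.0)] * 2 + right[:8]
disparity, distance = match_along_scanline(left, right, x=5, max_disparity=4)
```

Adding a threshold on the returned distance would allow pixels without a convincing candidate to be rejected.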
3D reconstruction of highly textured surfaces like those found in roads, as well as unvegetated (rock-like) terrain, is of major interest for applications like autonomous navigation, or the 3D modeling of terrain for mapping purposes. We describe a system for automatic modeling of such scenes. It is based on two frame CCD cameras, which are tightly attached to each other to ensure constant relative orientation. One camera is used for the acquisition of photogrammetrically measured reference points, the other records the surface images. The system is moved from the first position to the next by an operator carrying it. Automatic calibration using the images acquired by the calibration camera permits the computation of exterior orientation parameters of the surface camera. A fast matching method providing dense disparities together with a robust reconstruction algorithm renders an accurate grid of 3D points. We also describe procedures to merge stereo reconstruction results from all images taken, and report on accuracy, computational complexity, and practical experience in a road engineering application.
Planetary space exploration by unmanned missions strongly relies on automatic navigation. Computer vision has been recognized as a key to the feasibility of robust navigation, landing site identification and hazard avoidance. We are studying a scenario in which remote sensing methods from the orbit around the planet are used to preselect a landing site. The accuracy of the atmospheric entry is restricted by various parameters. One area of uncertainty results from inexact estimation of the landing position. The touchdown point must be located within an elliptic image area which is called the 'ellipse of uncertainty'. During landing, the early recognition of the preselected landing site in this image is an important factor. It improves the probability of a successful touchdown, since it allows real-time corrections of the trajectory to reach the planned touchdown spot. We present a scenario that uses computer vision methods for this early identification, emphasizing the phase between ten kilometers from ground and the identification of the lander position relative to the selected landing site.
This paper describes a system for real-time inspection of 2D surfaces. It was initially planned as a system for classification of wooden surfaces, but was also used successfully in other inspection tasks such as metallic surface inspection and leather inspection. The system has two major modules. One is a 2D object segmentation and recognition part, key elements of which have been published before. This includes hierarchical processing of the incoming gray-level images leading to a symbolic description of the surface; syntactic segmentation; and the decision network methodology used. Beyond these features, a new track has been added, which is entirely devoted to texture classification in real-time. This two-way analysis of wooden surfaces was first implemented on a heterogeneous architecture containing Zoran vector processors and Transputers (all commercially available). The current version uses only TMS320C40 processors. The system has been successfully implemented in a production plant in Austria. We describe major elements of the system and the underlying algorithms.
Three-dimensional reconstruction of alpine terrain based on stereoscopic views from remote line scanners like SPOT requires robust and fast methods that find correspondences between homologous scene points in the stereo images. This matching of a large number of corresponding pairs of points is the most crucial step in the processing chain, both in computational effort and demands on robustness and accuracy. We propose a novel method coming from statistical pattern recognition that utilizes the advantages of hierarchical image representations. It is verified with a SPOT stereo pair. A comparison with the traditional gray-level correlation based matching philosophy shows better accuracy, while the necessary computational effort is reduced by several orders of magnitude. A final discussion lists problems and tradeoffs that arise for the matching step, especially in alpine terrain.
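For reference, the traditional gray-level correlation measure used as the baseline in such comparisons can be sketched as a normalized cross-correlation between two image patches. This is the textbook definition, shown under the assumption that it is representative of the correlation baseline meant here; the function name `ncc` is illustrative.

```python
import math


def ncc(patch_a, patch_b):
    """Normalized cross-correlation of two equally sized gray-level patches.

    Returns a value in [-1, 1]; 1 means identical up to gain and offset.
    Returns 0.0 if either patch has no gray-level variation.
    """
    a = [v for row in patch_a for v in row]
    b = [v for row in patch_b for v in row]
    if not a or len(a) != len(b):
        raise ValueError("patches must be non-empty and of equal size")
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0:
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / denom


patch = [[1, 2], [3, 5]]
brighter = [[2 * v + 10 for v in row] for row in patch]
score = ncc(patch, brighter)  # close to 1 despite the gain and offset change
```

Evaluating this measure for every pixel and every candidate disparity at full resolution is what makes correlation-based matching expensive, which is the cost a hierarchical scheme aims to avoid.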
Reconstructing a surface from already matched corresponding points in stereo images (disparities) is a nontrivial task. This paper presents a new technique, the so-called Locus method, that exploits sensor geometry to efficiently build a terrain representation from stereo disparities. The power of this approach is the efficient and direct computation of a dense elevation map in arbitrary resolution. Additionally, it offers a way to handle problems like occlusions, ambiguities, and uncertainties caused by stereo matching errors. We extended the Locus method for active range finder data to the stereo disparity mapping case. For this purpose, a newly developed fast matching method is utilized that provides dense disparity maps, that is, a disparity for each input pixel. Once this data set is given, the Locus method can be applied in a straightforward and efficient way to gain a robust 3D reconstruction of the observed surface. It operates directly in image space, using dense and uniform measurements instead of first converting to object space. Experiments on synthetic and natural environment data show that the Locus method is less sensitive to disparity noise than traditional reconstruction.
A novel technique for automatic elevation model computation is introduced. The technique employs a binocular camera system and an algorithm termed hierarchical feature vector matching to derive an elevation model, as well as to compute the interframe correspondences for tracking. It is argued that this algorithm unifies the procedures of range estimation (i.e., stereo correspondence), tracking (i.e., interframe correspondence), and recognition (input/model correspondence). This technique is demonstrated using a physical model of the Mars surface and a binocular camera system with seven geometrical degrees of freedom. This system provides a tool to generate realistic test imagery for the mock-up of a spacecraft approaching the landing site. The trajectory of the spacecraft can be predefined and is controlled by a computer interfaced to seven motorized positioners. Several experiments defined to estimate the accuracy of the computer vision system are reported.