Over the last two decades, the number of visual odometry algorithms has grown rapidly. A qualitative result is straightforward to obtain by checking whether the shape of the trajectory agrees with the movement of the camera, but a quantitative evaluation is needed to assess performance and to compare algorithms. To do so, one needs to establish a ground truth either for the overall trajectory or for each camera pose. To this end, several datasets have been created. We propose a review of the datasets created over the last decade. We compare them in terms of acquisition settings, environment, type of motion, and the ground truth they provide. The purpose is to allow researchers to rapidly identify the datasets that best fit their work. While the datasets cover a variety of techniques for establishing a ground truth, we also provide the reader with techniques for creating one that were not present among the reviewed datasets.
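To make the idea of a quantitative evaluation against per-pose ground truth concrete, here is a minimal sketch of one common metric, the root-mean-square position error between an estimated and a ground-truth trajectory. It is an illustration, not the evaluation protocol of any reviewed dataset. The function name `ate_rmse` is hypothetical, and the sketch assumes both trajectories are already time-synchronized and expressed in the same reference frame.

```python
import math


def ate_rmse(estimated, ground_truth):
    """RMSE of the position error between two trajectories.

    Assumption: both inputs are equal-length sequences of (x, y, z)
    positions, already time-synchronized and expressed in the same frame.
    """
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("trajectories must be non-empty and of equal length")
    # Squared Euclidean distance between each estimated and true position.
    squared_errors = [
        sum((e - g) ** 2 for e, g in zip(est, gt))
        for est, gt in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    estimate = [(0.0, 0.1, 0.0), (1.0, -0.1, 0.0), (2.1, 0.0, 0.0)]
    print(ate_rmse(estimate, truth))
```

In practice the two trajectories usually need to be aligned (rigidly, or with a similarity transform for monocular systems) before such an error is meaningful; that step is deliberately left out here.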
In computer vision, the epipolar geometry embeds the geometrical relationship between two views of a scene. This geometry is degenerate for planar scenes, as they do not provide enough constraints to estimate it without ambiguity. Nearly planar scenes can provide the necessary constraints to resolve the ambiguity, but classic estimators such as the 5-point or 8-point algorithm combined with a random sampling strategy are likely to fail in this case: because a large part of the scene is planar, many trials are required to draw a nondegenerate sample. However, the planar part can be associated with a homographic model, and several links exist between the epipolar geometry and homographies. The epipolar geometry can indeed be recovered from at least two homographies, or from one homography and two noncoplanar points. The latter fits a wider variety of scenes, since a second homography cannot always be found among the noncoplanar points. This method is called plane-and-parallax. Because the parallax lines coincide with the epipolar lines, the epipole can be recovered as their common intersection, and the epipolar geometry follows. Robust implementations of the method are rarely given, and we encountered several limitations in our own implementation. Noisy image features and outliers prevent the lines from meeting at a common point. Off-plane features are also unequally affected by the noise level: we noticed that the larger the parallax, the smaller the influence of the noise. We therefore propose a model for the parallax that takes into account the noise on the feature locations to cope with these limitations. We call our method the “parallax beam.” The method is validated on the KITTI vision benchmark and on synthetic scenes with strong planar degeneracy. The results show that the parallax beam improves the estimation of the camera motion in scenes with planar degeneracy and remains usable when the scene has no particular planar structure.
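The plane-and-parallax step of recovering the epipole as the intersection of the parallax lines can be sketched as follows. This is a plain least-squares baseline under stated assumptions, not the "parallax beam" model: the homography `H` of the dominant plane is assumed to be known, the function names are hypothetical, and the `min_parallax` threshold is an illustrative way to discard near-zero parallax vectors whose direction is dominated by noise.

```python
import math


def apply_homography(H, point):
    """Map a 2D point through a 3x3 homography given as nested lists."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)


def line_through(p, q):
    """Line (a, b, c) through p and q, scaled so that a^2 + b^2 = 1."""
    a = p[1] - q[1]
    b = q[0] - p[0]
    c = p[0] * q[1] - q[0] * p[1]
    norm = math.hypot(a, b)
    return (a / norm, b / norm, c / norm)


def estimate_epipole(H, matches, min_parallax=1.0):
    """Least-squares intersection of the parallax lines.

    matches: list of ((x, y), (x2, y2)) off-plane correspondences.
    Each parallax line joins H*x (the point transferred via the plane)
    to its observed match in the second image.
    """
    saa = sab = sbb = sac = sbc = 0.0
    for first, second in matches:
        warped = apply_homography(H, first)
        if math.dist(warped, second) < min_parallax:
            continue  # parallax too small to define a reliable direction
        a, b, c = line_through(warped, second)
        saa += a * a
        sab += a * b
        sbb += b * b
        sac += a * c
        sbc += b * c
    det = saa * sbb - sab * sab
    if abs(det) < 1e-12:
        raise ValueError("parallax lines are (nearly) parallel or too few")
    # Solve the 2x2 normal equations that minimize the sum of squared
    # point-to-line distances (Cramer's rule).
    ex = (-sac * sbb + sab * sbc) / det
    ey = (-saa * sbc + sab * sac) / det
    return (ex, ey)
```

With noisy features the lines no longer meet at a single point, which is why this sketch minimizes point-to-line distances instead of intersecting two lines exactly. It weights every retained line equally, so it does not account for the observation that larger parallax is less sensitive to noise.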
In the past few years, a new type of camera has been emerging on the market: a digital camera capable of capturing both the intensity of the light emanating from a scene and the direction of the light rays. This technology, called a light-field camera, uses an array of lenses placed in front of a single image sensor or, more simply, an array of cameras attached together. We propose an optical device, a ring of four minilenses, that is inserted between the lens and the image sensor of a digital camera. This device prototype converts a regular digital camera into a light-field camera, as it makes it possible to record four subaperture images of the scene. It is a compact and cost-effective solution for both postcapture refocusing and depth estimation. The minilens ring also makes the plenoptic camera versatile: the parameters of the ring can be adjusted to reduce or increase the size of the projected image. Together with the proof of concept of this device, we propose a method to estimate the position of each optical component depending on the observed scene (object size and distance) and the optics parameters. Real-world results are presented to validate our device prototype.
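Postcapture refocusing from a small set of subaperture images is commonly done by shift-and-sum: each view is shifted in proportion to its viewpoint offset and the results are averaged. The sketch below illustrates that generic technique, not the processing used with the minilens ring. The function name `refocus`, the integer pixel shifts, and the viewpoint offsets are all assumptions made for illustration.

```python
def refocus(views, offsets, disparity):
    """Shift-and-sum refocusing of subaperture images.

    views:     list of equally sized grayscale images (lists of rows).
    offsets:   one (u, v) viewpoint offset per view (hypothetical layout).
    disparity: integer pixel shift per unit of viewpoint offset; it selects
               the depth plane that is brought into focus.
    """
    height, width = len(views[0]), len(views[0][0])
    weight = 1.0 / len(views)
    out = [[0.0] * width for _ in range(height)]
    for image, (u, v) in zip(views, offsets):
        dx, dy = disparity * u, disparity * v
        for y in range(height):
            for x in range(width):
                sx, sy = x + dx, y + dy
                # Pixels shifted outside the image contribute nothing.
                if 0 <= sx < width and 0 <= sy < height:
                    out[y][x] += weight * image[sy][sx]
    return out
```

Points whose disparity matches the chosen value add up coherently and appear sharp, while points at other depths are spread over several pixels. Scanning the disparity and measuring where each region is sharpest is one simple route to a coarse depth estimate.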
This work demonstrates the value of combining polarimetric and light-field imaging. Polarimetric imaging is known for its ability to highlight and reveal contrasts or surfaces that are not visible in standard intensity images. This imaging mode requires capturing multiple images with a set of different polarimetric filters. The images can be captured by either temporal or spatial multiplexing, depending on the polarimeter model used. Light-field imaging, which belongs to the field of computational imaging, is likewise based on a combination of images, from which 3D information about the scene can be extracted. In this case, images are acquired either with a camera array or with a multi-view camera such as a plenoptic camera. One of the major advantages of a light-field camera is its ability to produce different kinds of images, such as sub-aperture images used to compute depth images, all-in-focus images, or images refocused at a specific distance, which can be used to detect defects, for instance. In this paper, we show that the refocused images of a light-field camera can also be computed in the context of polarimetric imaging. The 3D information contained in the refocused images can be combined with the degree of linear polarization, and both can be obtained with a single device in one acquisition. An example illustrates how promising these two coupled imaging modes are, especially for vision-based industrial control and inspection.
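As an illustration of how a degree of linear polarization is derived from several filtered images, here is a minimal sketch. It assumes a common four-angle scheme with linear polarizers at 0, 45, 90, and 135 degrees, which is an assumption and not necessarily the filter set used in this work. The function names are hypothetical, and the inputs could equally be refocused images computed per filter.

```python
import math


def linear_stokes(i0, i45, i90, i135):
    """Linear Stokes parameters from four polarizer orientations (degrees)."""
    s0 = 0.5 * (i0 + i45 + i90 + i135)  # total intensity
    s1 = i0 - i90                       # horizontal vs. vertical
    s2 = i45 - i135                     # +45 vs. -45 degrees
    return s0, s1, s2


def dolp(i0, i45, i90, i135):
    """Degree of linear polarization for one pixel, in [0, 1] for ideal data."""
    s0, s1, s2 = linear_stokes(i0, i45, i90, i135)
    if s0 <= 0.0:
        return 0.0  # no light: polarization is undefined, report 0
    return math.hypot(s1, s2) / s0


def dolp_image(img0, img45, img90, img135):
    """Apply dolp pixel by pixel to four equally sized images (lists of rows)."""
    return [
        [dolp(a, b, c, d) for a, b, c, d in zip(r0, r45, r90, r135)]
        for r0, r45, r90, r135 in zip(img0, img45, img90, img135)
    ]
```

For fully linearly polarized light at 0 degrees with unit intensity, Malus's law gives measurements of 1, 0.5, 0, and 0.5, and the sketch returns a degree of linear polarization of 1; for unpolarized light all four measurements are equal and it returns 0.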
Conference Committee Involvement (1)
Fourteenth International Conference on Quality Control by Artificial Vision