This paper describes a multimodal vision sensor that integrates three types of cameras, including a stereo camera, a polarization camera and a panoramic camera. Each sensor provides a specific dimension of information: the stereo camera measures depth per pixel, the polarization obtains the degree of polarization, and the panoramic camera captures a 360° landscape. Data fusion and advanced environment perception could be built upon the combination of sensors. Designed especially for autonomous driving, this vision sensor is shipped with a robust semantic segmentation network. In addition, we demonstrate how cross-modal enhancement could be achieved by registering the color image and the polarization image. An example of water hazard detection is given. To prove the multimodal vision sensor’s compatibility with different devices, a brief runtime performance analysis is carried out.
Panoramic images have advantages in information capacity and scene stability due to their large field of view (FoV). In this paper, we propose a method to synthesize a new dataset of panoramic image. We managed to stitch the images taken from different directions into panoramic images, together with their labeled images, to yield the panoramic semantic segmentation dataset denominated as SYNTHIA-PANO. For the purpose of finding out the effect of using panoramic images as training dataset, we designed and performed a comprehensive set of experiments. Experimental results show that using panoramic images as training data is beneficial to the segmentation result. In addition, it has been shown that by using panoramic images with a 180 degree FoV as training data the model has better performance. Furthermore, the model trained with panoramic images also has a better capacity to resist the image distortion. Our codes and SYNTHIA-PANO dataset are available: https://github.com/Francis515/SYNTHIA-PANO.