Perceiving depth in images of unstructured scenes is an important problem for computer vision applications. This paper proposes a method that combines deep learning with a self-attention mechanism to infer the depth of unstructured indoor targets, effectively addressing the blurred image detail and insufficient layering that affect depth inference in unstructured scenes. First, a deep learning-based encoder-decoder model is trained on large 3D datasets to learn the depth of indoor scenes; the trained model performs well on general structured indoor scenes. Second, a soft self-attention mechanism is used to obtain the disparity information between the upper and lower sequences of the input image, which is then used to correct the depth map from the first step and improve depth accuracy. Finally, to obtain clear objects with distinct boundaries in the depth response map, nearest neighbor regression is used to correct object contours. Experimental results show that the proposed method infers depth well for unstructured indoor scenes. The resulting objects show distinct texture structure, strong geometric features, clear contour edges, and fine layering, and misleading depth estimates in reflective and highlight areas are eliminated.
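As an illustration of the soft self-attention step mentioned in this abstract, the following is a minimal sketch of scaled dot-product self-attention over a sequence of feature vectors. It assumes that each sequence element is a feature vector for one image row and omits the learned query, key, and value projections. The abstract does not specify the actual attention design or how its output corrects the depth map, so the function name `soft_self_attention` and the input layout are hypothetical.

```python
import math


def soft_self_attention(rows):
    """Scaled dot-product self-attention without learned projections.

    rows: list of T feature vectors (lists of floats), all of length D.
    Returns (outputs, weights); weights[i][j] is how strongly
    element i attends to element j, and each weights[i] sums to 1.
    """
    dim = len(rows[0])
    outputs, weights = [], []
    for query in rows:
        # Similarity of this element to every element, scaled by sqrt(D).
        scores = [
            sum(a * b for a, b in zip(query, key)) / math.sqrt(dim)
            for key in rows
        ]
        # Numerically stable softmax: every element gets a nonzero
        # weight, which is what makes the attention "soft".
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        w = [e / total for e in exps]
        weights.append(w)
        # Output is the attention-weighted average of all elements.
        outputs.append(
            [sum(w[j] * rows[j][c] for j in range(len(rows))) for c in range(dim)]
        )
    return outputs, weights
```

In a trained network the inputs would first pass through learned linear projections; the sketch keeps only the weighting scheme itself.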
Obstacle detection is one of the most important components of ADAS (Advanced Driver Assistance Systems). Cameras provide excellent recognition but limited range information, whereas LiDAR provides better range information but limited object identification. This paper addresses the problem of efficiently and accurately detecting on-road vehicles by fusing color images and LiDAR point clouds. First, a neural network is used to detect the road and vehicles; it achieves high detection accuracy and speed because its encoder is shared across tasks. Second, the point clouds are processed to remove invalid points, and positions that potentially represent targets are generated by clustering the remaining points. These positions are projected onto the image plane to obtain ROIs (Regions of Interest), which are then matched against the image detection results to check whether any targets were missed. We adopt the RANSAC (Random Sample Consensus) algorithm to remove ground points and propose a parameter-adaptive DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm to cluster points, whose parameters adapt to the characteristics of point clouds of different densities. The neural network is used to recognize obstacle types. Experiments on the KITTI dataset, using left color images and Velodyne64 point clouds, verify the method, and the results show satisfactory detection accuracy.
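The ground-removal step named in this abstract can be sketched as a RANSAC plane fit: repeatedly fit a plane through three random points, keep the plane with the most inliers, and discard those inliers as ground. This is a minimal pure-Python sketch, not the authors' implementation. The function names, the iteration count, and the 0.2 distance threshold are assumptions chosen for illustration, and a real pipeline would likely vectorize this over the full point cloud.

```python
import random


def fit_plane(p0, p1, p2):
    """Plane through three points as (a, b, c, d) with unit normal,
    or None if the points are (nearly) collinear."""
    ux, uy, uz = (p1[i] - p0[i] for i in range(3))
    vx, vy, vz = (p2[i] - p0[i] for i in range(3))
    # Normal vector is the cross product of the two edge vectors.
    a, b, c = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    norm = (a * a + b * b + c * c) ** 0.5
    if norm < 1e-9:
        return None
    a, b, c = a / norm, b / norm, c / norm
    return a, b, c, -(a * p0[0] + b * p0[1] + c * p0[2])


def ransac_ground(points, n_iters=200, dist_thresh=0.2, seed=0):
    """Return a list of booleans marking points on the best-supported plane.

    n_iters and dist_thresh are illustrative defaults, not values
    taken from the paper.
    """
    rng = random.Random(seed)
    best = [False] * len(points)
    for _ in range(n_iters):
        plane = fit_plane(*rng.sample(points, 3))
        if plane is None:
            continue
        a, b, c, d = plane
        # A point is an inlier if it lies within dist_thresh of the plane.
        mask = [abs(a * x + b * y + c * z + d) < dist_thresh for x, y, z in points]
        if sum(mask) > sum(best):
            best = mask
    return best


def remove_ground(points, **kwargs):
    """Drop the points that RANSAC assigns to the dominant plane."""
    mask = ransac_ground(points, **kwargs)
    return [p for p, is_ground in zip(points, mask) if not is_ground]
```

The sketch treats the dominant plane as the ground, which holds only when ground points outnumber any other planar surface in the scan. The points that remain after `remove_ground` would then be passed to the clustering stage.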