Panoptic segmentation is an important technique for UAV platforms in applications such as road condition monitoring and urban planning, because it provides more comprehensive information than semantic segmentation alone. In this paper, we design a panoptic segmentation framework for the UAV application scenario. UAV imagery covers large scenes in which the targets are small, so foreground targets are often missing from the segmentation results and the segmentation masks are of poor quality. To address these problems, we introduce deformable convolution into the feature extraction network to improve its feature extraction ability. In addition, we introduce a MaskIoU module into the instance segmentation branch to improve the overall quality of the foreground target masks. We also collect a series of UAV images and organize them into the UAV_OUC panoptic segmentation dataset. Experimental results on the UAV_OUC panoptic benchmark validate the effectiveness of the proposed method.
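The abstract does not spell out how the MaskIoU module works. As a minimal illustration, assuming the module scores a predicted instance mask by its intersection over union with the ground-truth mask, that quantity could be computed as in the sketch below. The function name `mask_iou` and the toy masks are hypothetical and are not taken from the paper.

```python
def mask_iou(pred, target):
    """IoU between two binary masks given as equally sized lists of 0/1 rows."""
    inter = 0
    union = 0
    for row_p, row_t in zip(pred, target):
        for p, t in zip(row_p, row_t):
            if p and t:
                inter += 1  # pixel is foreground in both masks
            if p or t:
                union += 1  # pixel is foreground in at least one mask
    # Two empty masks have no overlap to measure; return 0.0 by convention.
    return inter / union if union else 0.0


# Toy 3x3 masks (illustrative only).
predicted = [[0, 1, 1],
             [0, 1, 1],
             [0, 0, 0]]
ground_truth = [[0, 1, 1],
                [0, 1, 0],
                [0, 1, 0]]

print(mask_iou(predicted, ground_truth))  # 3 shared pixels out of 5 in the union
```

A low value for a mask that the classifier scored highly is the kind of mismatch that a mask-quality branch is meant to expose.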
Multiple people tracking is a significant sub-problem of object tracking that has been in high demand in recent years. In large-view scenes, the main difficulties are that the objects are small and may be occluded or change appearance suddenly. As a result, most existing methods produce many ID switches (an evaluation metric for multiple people tracking) in large-view scenes. We propose a multiple people tracking method that focuses on reducing ID switches in large-view scenes. For data association, our method combines intersection over union (IOU) information, which is insensitive to appearance changes, with a Euclidean distance-based appearance similarity, which helps handle occlusions. To make the Euclidean distance-based appearance similarity metric work better, we train a convolutional neural network (CNN) with a soft-margin loss function so that the features it extracts are better suited to the similarity metric, which allows our method to effectively reduce ID switches. Because IOU-based data association has low computational complexity and the CNN is lightweight, our method runs in real time. We also propose a multiple people tracking dataset of large-view scenes for research. We design the dataset according to the standards of the MOT Challenge benchmark and select the YOLOv3 detector, which performs relatively well on small objects, as the public detector. Finally, we compare our method with several multiple people tracking methods on our dataset. The experimental results show that our method performs better in large-view scenes.
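The two association cues named in this abstract can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the `(x1, y1, x2, y2)` box format, the weight `alpha`, and the way the two cues are blended into one cost are all hypothetical, since the abstract does not say how the cues are combined.

```python
import math


def iou(a, b):
    """IoU of two axis-aligned boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def euclidean(f, g):
    """Euclidean distance between two appearance feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(f, g)))


def association_cost(track_box, det_box, track_feat, det_feat, alpha=0.5):
    """Lower is better. `alpha` is a hypothetical weight between the two cues."""
    motion_term = 1.0 - iou(track_box, det_box)        # unaffected by appearance change
    appearance_term = euclidean(track_feat, det_feat)  # usable when boxes stop overlapping
    return alpha * motion_term + (1.0 - alpha) * appearance_term


# One track and two candidate detections (toy values).
track_box, track_feat = (10, 10, 20, 30), (0.1, 0.9)
near = ((11, 10, 21, 30), (0.1, 0.8))
far = ((50, 50, 60, 70), (0.9, 0.1))
costs = [association_cost(track_box, box, track_feat, feat) for box, feat in (near, far)]
best = costs.index(min(costs))  # index of the detection assigned to the track
```

In a full tracker these pairwise costs would fill a track-by-detection matrix that is then solved by greedy or Hungarian matching; that step is omitted here.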
Person re-identification (ReID) is an important task in video surveillance and has various practical applications. Traditional methods and deep learning models cannot cope with the real-world challenges of environmental complexity and scene dynamics, especially in fixed scenes. Moreover, most existing datasets are captured outdoors and have a single style, which makes them poorly suited to indoor person re-identification. To address these problems, this paper improves a Stride Convolutional Neural Network (S-CNN) to process indoor images based on multi-feature fusion. A deep model is established in which identity information, stride information, and other information are learned in order to handle more challenging indoor images. A metric learning method (Joint Bayesian) is then applied on top of the deep model. Finally, the entire classifier is retrained with supervised learning. Experiments are conducted on the OUC365 dataset, which we created by capturing images over 365 days so that it covers the styles of all seasons. Compared with other state-of-the-art methods, the proposed method yields the best results.
Pedestrian detection is a canonical sub-problem of object detection that has been in high demand in recent years. Although recent deep learning object detectors such as Fast/Faster R-CNN have shown excellent performance for general object detection, they have had limited success in detecting small pedestrians in large-view scenes. We find that the insufficient resolution of the feature maps leads to unsatisfactory accuracy when handling small instances. In this paper, we investigate issues involving Fast R-CNN for pedestrian detection. Driven by these observations, we propose a very simple but effective baseline for pedestrian detection based on Fast R-CNN: we employ the DPM detector to generate proposals for accuracy, and we train a Fast R-CNN style network to jointly optimize small-size pedestrian detection, using skip connections that concatenate features from different layers to address the coarseness of the feature maps. The proposed approach improves accuracy for small-size pedestrian detection in real large-view scenes.
With the development of earth observation programs, many multitemporal synthetic aperture radar (SAR) images of the same geographical area are available. Automatic change detection techniques are needed to take advantage of these images. Most existing techniques directly analyze the difference image (DI) and are therefore easily affected by speckle noise. We propose an SAR image change detection method based on frequency-domain analysis and random multigraphs. The proposed method follows a coarse-to-fine procedure. In the coarse changed-region localization stage, frequency-domain analysis is used to select distinctive and salient regions from the DI; nonsalient regions are thereby neglected, and noisy unchanged regions caused by speckle noise are suppressed. In the fine changed-region classification stage, random multigraphs are employed as the classification model. By selecting a subset of neighborhood features to create graphs, the proposed method can efficiently exploit the nonlinear relations between multitemporal SAR images. Experimental results on two real SAR datasets and one simulated dataset demonstrate the effectiveness of the proposed method.
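The abstract does not say which operator produces the difference image. One common choice for SAR data is the log-ratio operator, sketched below purely as an assumption. The function name `log_ratio_di`, the smoothing constant `eps`, and the toy pixel values are hypothetical.

```python
import math


def log_ratio_di(img1, img2, eps=1.0):
    """Pixelwise |log((I2 + eps) / (I1 + eps))| for two equally sized images.

    Images are lists of rows of non-negative intensities. `eps` is a
    hypothetical constant that keeps the ratio defined for zero-valued pixels.
    """
    return [
        [abs(math.log((b + eps) / (a + eps))) for a, b in zip(row1, row2)]
        for row1, row2 in zip(img1, img2)
    ]


# Toy 2x2 acquisitions of the same area at two dates; only one pixel changes.
before = [[10, 10],
          [10, 10]]
after = [[10, 10],
         [10, 120]]

di = log_ratio_di(before, after)
# Unchanged pixels map to 0.0; the changed pixel maps to log(121 / 11) = log(11).
```

The ratio turns multiplicative speckle into an additive term after the logarithm, but the resulting DI is still noisy, which is the weakness of direct DI analysis that the abstract points out.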