This paper proposes a new algorithm for detecting irregular objects in remote sensing images that differs from traditional approaches based on RoIs or rotated bounding boxes. The architecture jointly learns the four corner points of a bounding box and their association through two branches of the same sequential prediction process. Using a convolutional neural network (CNN), the algorithm predicts four key points of each object together with the connections between them, termed Bounding Box Fields (BBF), and thereby recovers the detailed spatial distribution of the objects.
To improve the localization accuracy of the key points, the network reduces its receptive field from large to small, stage by stage, and the resulting detector is RoI-free. The method frames object detection as CNN-based key point detection plus bounding box field detection, which yields a one-stage detector with high precision and high speed.
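The following is a minimal sketch of how corner heatmaps and a connection field might be decoded, not the paper's implementation. It assumes that each corner is read off as the peak of its own heatmap and that a candidate pair of corners is scored by averaging the field's alignment with the segment joining them, in the style of part affinity fields. The function names `peak` and `association_score`, the list-of-rows data layout, and the sample count are all hypothetical.

```python
import math


def peak(heatmap):
    """Return (x, y) of the highest-scoring cell in a 2D heatmap (list of rows)."""
    best = max(
        ((v, x, y) for y, row in enumerate(heatmap) for x, v in enumerate(row)),
        key=lambda t: t[0],
    )
    return best[1], best[2]


def association_score(field_x, field_y, p, q, samples=10):
    """Average alignment between a 2D vector field and the unit vector p -> q.

    field_x and field_y hold the x and y components of the field per pixel.
    This scoring rule is an assumption borrowed from part affinity fields.
    """
    dx, dy = q[0] - p[0], q[1] - p[1]
    norm = math.hypot(dx, dy)
    if norm == 0:
        return 0.0
    ux, uy = dx / norm, dy / norm
    total = 0.0
    for i in range(samples):
        t = i / (samples - 1)
        # Sample the field at evenly spaced points along the segment.
        x = int(round(p[0] + t * dx))
        y = int(round(p[1] + t * dy))
        total += field_x[y][x] * ux + field_y[y][x] * uy
    return total / samples


# Toy 5x5 example: the field points along +x on row 2 only.
size = 5
heat_a = [[0.0] * size for _ in range(size)]
heat_b = [[0.0] * size for _ in range(size)]
heat_a[2][1] = 0.9  # corner A near (x=1, y=2)
heat_b[2][3] = 0.8  # corner B near (x=3, y=2)
field_x = [[0.0] * size for _ in range(size)]
field_y = [[0.0] * size for _ in range(size)]
field_x[2] = [1.0] * size

corner_a, corner_b = peak(heat_a), peak(heat_b)
score_ab = association_score(field_x, field_y, corner_a, corner_b)
```

In this toy setup the pair on row 2 scores high because the field is aligned with the segment between them, whereas a pair joined by a vertical segment would score zero. A real decoder would also need non-maximum suppression to handle several objects per image.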
Experiments verify the effectiveness and efficiency of the algorithm. They show that the new data structure locates the attitude and spatial direction of objects more accurately, runs in real time, and is highly practical.
With the development of remote sensing technology, we can obtain more and more target information from remote sensing images. One such piece of information is the 6D pose, which comprises the position and attitude of the target relative to the camera in a three-dimensional coordinate system. Traditional algorithms derive the 6D pose from a predicted RoI or inclined bounding box. However, IoU, the standard evaluation metric for these methods, cannot reflect the direction of the target, and the inclination angle of an inclined box is ambiguous, for example between 0° and 180° or between 0° and 360°. In this paper, we present Anchor Points Prediction (APP), a new algorithm for predicting the 6D pose of targets in remote sensing images. Unlike previous methods, its final output carries direction information: a neural network predicts multiple feature points of the target, from which we obtain the homography between the object plane and the ground. The resulting 6D pose accurately describes the three-dimensional position and attitude of the target. We tested the algorithm on the HRSC2016 and DOTA datasets, obtaining accuracies of 0.863 and 0.701, respectively. The experimental results show that APP significantly improves target detection accuracy. The algorithm is also a one-stage predictor, which makes the computation simpler and more efficient.
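The angle ambiguity described above can be illustrated with a small sketch. A box rotated by 30° and the same box rotated by 210° cover exactly the same region, so their IoU is 1, yet ordered corner points still distinguish the two headings. The corner ordering (front-left, front-right, rear-right, rear-left) and the helper names `box_corners`, `heading_deg`, and `same_footprint` are assumptions made for illustration and are not taken from the paper.

```python
import math


def box_corners(cx, cy, w, h, theta_deg):
    """Corners of a w x h box centred at (cx, cy) and rotated by theta_deg.

    Assumed order: front-left, front-right, rear-right, rear-left,
    where "front" is the box's local +x direction.
    """
    t = math.radians(theta_deg)
    c, s = math.cos(t), math.sin(t)
    local = [(w / 2, h / 2), (w / 2, -h / 2), (-w / 2, -h / 2), (-w / 2, h / 2)]
    return [(cx + x * c - y * s, cy + x * s + y * c) for x, y in local]


def heading_deg(corners):
    """Heading in [0, 360): from the rear-edge midpoint to the front-edge midpoint."""
    fx = (corners[0][0] + corners[1][0]) / 2
    fy = (corners[0][1] + corners[1][1]) / 2
    rx = (corners[2][0] + corners[3][0]) / 2
    ry = (corners[2][1] + corners[3][1]) / 2
    return math.degrees(math.atan2(fy - ry, fx - rx)) % 360


def same_footprint(a, b):
    """True if two corner lists describe the same rectangle, ignoring order."""
    def key(p):
        return (round(p[0], 6), round(p[1], 6))
    return sorted(map(key, a)) == sorted(map(key, b))


box_a = box_corners(0, 0, 4, 2, 30)
box_b = box_corners(0, 0, 4, 2, 210)

# Identical footprint, so any region-overlap metric such as IoU cannot
# tell the two apart, while the ordered corners give opposite headings.
identical_region = same_footprint(box_a, box_b)
heading_a = heading_deg(box_a)
heading_b = heading_deg(box_b)
```

Here `identical_region` is true while `heading_a` and `heading_b` differ by 180°, which is why a representation built from ordered points can carry direction information that a box plus IoU cannot.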