Automatic ship detection in optical remote sensing images has attracted wide attention for its broad applications. Major challenges for this task include the interference of clouds, waves, and wakes, as well as high computational expense. We propose a fast and robust ship detection algorithm to address these issues. The framework is designed based on deep convolutional neural networks (CNNs), which provide the accurate locations of ship targets in an efficient way. First, a deep CNN is designed to extract features. Then, a region proposal network (RPN) is applied to discriminate ship targets and regress the detection bounding boxes, in which the anchors are designed according to the intrinsic shape of ship targets. Experimental results on numerous panchromatic images demonstrate that, in comparison with other state-of-the-art ship detection methods, our method is more efficient and achieves higher detection accuracy and more precise bounding boxes against various complex backgrounds.
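The shape-aware anchor design can be illustrated with a short sketch. This is not the authors' implementation; the base size, scales, and elongated aspect ratios below are hypothetical placeholders chosen only to reflect the fact that ships are much longer than they are wide:

```python
import numpy as np

def ship_anchors(base_size=16, scales=(8, 16, 32), ratios=(1/5, 1/3, 3, 5)):
    """Generate RPN anchor boxes (x1, y1, x2, y2) centered at the origin.

    The elongated aspect ratios (ratio = h / w) are illustrative values
    meant to mimic the intrinsic shape of ship targets.
    """
    anchors = []
    for scale in scales:
        area = (base_size * scale) ** 2
        for ratio in ratios:
            w = np.sqrt(area / ratio)  # width so that w * h == area
            h = w * ratio
            anchors.append([-w / 2, -h / 2, w / 2, h / 2])
    return np.array(anchors)

anchors = ship_anchors()  # 3 scales x 4 ratios = 12 anchors
```

In a full RPN these template boxes would be replicated at every feature-map location before classification and regression.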
Sparse coding exhibits good performance in many computer vision applications by finding bases that capture high-level semantics of the data and learning sparse coefficients in terms of those bases. However, because the bases are non-orthogonal, sparse coding can hardly preserve the similarity between samples, which is important for discrimination. In this paper, a new image representation method called maximum constrained sparse coding (MCSC) is proposed. A sparse representation with more active coefficients carries more similarity information, so an infinity-norm term is added to the solution for this purpose. We solve the optimization problem by constraining the maximum magnitude of the codes and redistributing the residual to other dictionary atoms. Experimental results on image clustering show that our method can preserve the similarity of adjacent samples while maintaining the sparsity of the codes.
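A toy version of the idea can be sketched with standard ISTA sparse coding plus a max-magnitude clip. This is a rough illustration, not the authors' solver: clipping the codes to an assumed bound `cmax` after each soft-thresholding step forces residual energy to be absorbed by other dictionary atoms, spreading activity across more coefficients:

```python
import numpy as np

def ista_maxclip(D, x, lam=0.1, cmax=0.5, iters=200):
    """Toy ISTA sparse coding with an extra ell-infinity clip.

    D: dictionary (features x atoms), x: signal, lam: sparsity weight,
    cmax: assumed bound on each code's magnitude. All parameter values
    are illustrative, not taken from the paper.
    """
    c = np.zeros(D.shape[1])
    L = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the gradient
    for _ in range(iters):
        g = D.T @ (D @ c - x)                               # gradient step
        z = c - g / L
        c = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0)  # soft threshold
        c = np.clip(c, -cmax, cmax)                          # max constraint
    return c
```

Tightening `cmax` trades a few large coefficients for more, smaller ones, which is the mechanism the abstract appeals to for preserving similarity.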
Traditional saliency detection can effectively detect possible objects using an attentional mechanism instead of automatic object detection, and thus is widely used in natural scene detection. However, it may fail to extract salient objects accurately from remote sensing images, which have their own characteristics such as large data volumes, multiple resolutions, illumination variation, and complex texture structure. We propose a sparsity-guided saliency detection model for remote sensing images that uses a sparse representation to obtain the high-level global and background cues for saliency map integration. Specifically, it first uses pixel-level global cues and background prior information to construct two dictionaries that are used to characterize the global and background properties of remote sensing images. It then employs a sparse representation for the high-level cues. Finally, a Bayesian formula is applied to integrate the saliency maps generated by both types of high-level cues. Experimental results on remote sensing image datasets that include various objects under complex conditions demonstrate the effectiveness and feasibility of the proposed method.
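The final integration step can be illustrated with a simplified pixel-wise Bayesian fusion. This is a generic sketch rather than the paper's exact formula: each map is treated as an independent foreground likelihood under a uniform prior, so the posterior is a normalized product:

```python
import numpy as np

def bayes_fuse(m1, m2, eps=1e-8):
    """Fuse two saliency maps with values in [0, 1].

    Agreement on 'salient' reinforces (product of likelihoods);
    agreement on 'background' suppresses; eps avoids division by zero.
    """
    num = m1 * m2
    den = num + (1 - m1) * (1 - m2) + eps
    return num / den
```

Two maps that both assign 0.9 to a pixel fuse to roughly 0.99, while two maps at 0.5 stay at 0.5, so the fusion sharpens consensus without inventing saliency where neither cue sees it.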
Airport target recognition in remote sensing images is generally based on image matching, which is significantly affected by variations in illumination, viewpoint, scale, and so on. As a well-known semantic model for target recognition, bag-of-features (BoF) performs k-means clustering on large sets of local feature descriptors and thus generates the visual words used to represent the images. We propose a fast automatic recognition framework for airport targets in low-resolution remote sensing images under complicated environments. It can be viewed as a two-phase procedure: detection followed by classification. Concretely, it first utilizes a visual attention model to locate the salient region, then detects possible candidate targets and extracts saliency-constrained scale invariant feature transform descriptors to build a high-level semantic model. Next, BoF is applied to mine the high-level semantics of targets. Unlike the k-means step in traditional BoF, we employ locality preserving indexing (LPI) to obtain the visual words. Because LPI considers the intrinsic local structure of the descriptors and further enhances the ability of the words to describe image content, it can accurately classify the detected candidate targets. Experiments on a dataset of 10 kinds of airport aerial images demonstrate the feasibility and effectiveness of the proposed method.
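The quantization step shared by any BoF pipeline, whether the vocabulary comes from k-means or from LPI, can be sketched as follows; the tiny two-word vocabulary is purely illustrative:

```python
import numpy as np

def bof_histogram(descriptors, words):
    """Quantize local descriptors against a visual vocabulary.

    descriptors: (N, D) local features; words: (K, D) visual words.
    Returns a normalized K-bin bag-of-features histogram.
    """
    # Squared Euclidean distance from every descriptor to every word.
    d2 = ((descriptors[:, None, :] - words[None, :, :]) ** 2).sum(-1)
    assign = d2.argmin(axis=1)                     # nearest visual word
    hist = np.bincount(assign, minlength=len(words)).astype(float)
    return hist / hist.sum()
```

Swapping the vocabulary source (k-means centers vs. LPI-derived words) leaves this quantization step unchanged, which is why the comparison in the abstract is well-posed.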
The conventional graph embedding framework uses the Euclidean distance to determine the similarity of neighboring samples, which makes the graph structure sensitive to outliers and lacking in physical interpretation. Moreover, graph construction suffers from the difficulty of selecting the neighborhood parameter. Although sparse representation (SR) based graph embedding methods can select the neighborhood parameter automatically, the computational cost of SR is high. On the other hand, most discriminant projection methods fail to perform feature selection. In this paper, we present a novel joint discriminant analysis and feature selection method that employs regularized least squares for graph construction and l2,1-norm minimization on the projection matrix for feature selection. Specifically, our method first uses the regularized least-squares coefficients to measure intraclass and interclass similarities from the viewpoint of reconstruction. Based on this graph structure, we formulate an objective function with a scatter-difference criterion for learning the discriminant projections, which avoids the small sample size problem. Simultaneously, l2,1-norm minimization on the projection matrix yields row sparsity for selecting useful features. Experiments on two face databases (ORL and AR) and the COIL-20 object database demonstrate that our method not only achieves better classification performance but also has lower computational cost than SR.
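The graph-construction step based on regularized least squares might look like the following sketch; the regularization weight `lam` is an assumed placeholder, not the paper's setting. Unlike sparse coding, this has a closed-form solution, which is the source of the claimed speed advantage:

```python
import numpy as np

def rls_coefficients(X, i, lam=0.1):
    """Regularized least-squares reconstruction of sample i.

    X: (features, samples). Sample i is reconstructed from all other
    columns; larger |w_j| means sample j contributes more, giving a
    similarity weight with a reconstruction interpretation.
    """
    x = X[:, i]
    B = np.delete(X, i, axis=1)  # dictionary of the remaining samples
    # Closed form: w = (B^T B + lam I)^{-1} B^T x
    w = np.linalg.solve(B.T @ B + lam * np.eye(B.shape[1]), B.T @ x)
    return w
```

Solving one small linear system per sample replaces the iterative l1 solver an SR graph would need, while still selecting informative neighbors automatically through the coefficient magnitudes.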
A real-time orientation feature descriptor for portable devices is introduced. The descriptor requires very few computational resources and has only 16 dimensions, shorter than all existing methods. The patch of a candidate feature is first segmented into polar-arranged sub-regions, which enables rotation invariance to be achieved rapidly. Furthermore, the principal orientation is used to describe each sub-region. The computations can be considerably accelerated by using an integral image. The descriptor is used for object tracking and achieves a frame rate of 25 fps on a mobile phone. Experimental results demonstrate that the proposed method offers sufficient matching performance.
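The integral-image acceleration mentioned above is the standard summed-area-table construction (shown here generically, not as code from the paper): after one pass over the image, the sum of any rectangle costs only four lookups, regardless of its size.

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero border: ii[y, x] = sum of img[:y, :x]."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1))
    ii[1:, 1:] = img.cumsum(0).cumsum(1)
    return ii

def rect_sum(ii, y0, x0, y1, x1):
    """Sum of img[y0:y1, x0:x1] from four table lookups (O(1) per query)."""
    return ii[y1, x1] - ii[y0, x1] - ii[y1, x0] + ii[y0, x0]
```

For a descriptor built from many sub-region sums, this turns a per-pixel accumulation into constant-time queries, which is what makes a mobile-phone frame rate plausible.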
In order to realize high-precision registration of high-resolution remote sensing images, an improved multi-scale phase correlation matching method based on the wavelet modulus is proposed. The method exploits the multi-scale noise resistance of wavelets by replacing the gray-scale information of the traditional phase correlation method with the wavelet modulus; it thus not only improves matching accuracy but is also robust to illumination changes, blurring, and occlusion. Furthermore, a coarse-to-fine matching scheme is adopted to reduce computational complexity by exploiting wavelet multi-resolution analysis: the algorithm is initialized by matching two low-resolution images, and accuracy is then increased by propagating the coarse matching result to the high-resolution images, making it possible to meet real-time requirements. Experimental results show that automatic registration of high-resolution remote sensing images is achieved quickly under complex background conditions with strong noise jamming. Compared with the traditional phase correlation method, the improvements in both noise robustness and processing speed are obvious.
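The underlying phase correlation step, shown here in its classical gray-scale form without the wavelet-modulus substitution, can be sketched as follows:

```python
import numpy as np

def phase_correlate(a, b):
    """Estimate the integer translation (dy, dx) such that b is a
    (circularly) shifted copy of a, via the cross-power spectrum."""
    Fa, Fb = np.fft.fft2(a), np.fft.fft2(b)
    R = Fb * np.conj(Fa)
    R /= np.abs(R) + 1e-12        # keep only phase information
    corr = np.fft.ifft2(R).real    # impulse at the shift for a pure translation
    dy, dx = np.unravel_index(corr.argmax(), corr.shape)
    h, w = a.shape
    if dy > h // 2:                # unwrap negative shifts
        dy -= h
    if dx > w // 2:
        dx -= w
    return dy, dx
```

The proposed method would feed wavelet-modulus images rather than raw gray levels into this step, and run it first at coarse wavelet scales to initialize the fine-scale match.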
This paper proposes a method for detecting large moving objects against complicated dynamic backgrounds that integrates phase correlation based on singular value decomposition (SVD) with the multiplication of multi-frame difference images. SVD-based phase correlation is insensitive to noise and to changes in gray level and contrast. Compared with more complex phase correlation algorithms and spatial-domain registration algorithms, our method not only suppresses noise effectively but also enhances registration precision, and it runs nearly twice as fast as the original phase correlation algorithm. Experiments confirm that the phase correlation matrix has rank one for a noise-free rigid translation model; a new phase correlation matrix is recast from this property, which effectively suppresses noise and gray-level changes. By estimating the global motion vector between two images using SVD-based phase correlation, the background is accurately matched. The matched images are then differenced pairwise: the first frame with the fourth, the second with the fifth, and the third with the sixth. Multiplying these difference images yields a clear edge of the moving object, and the accurate location of the object is obtained by computing the barycentre of the image. Finally, simulation results show that the proposed method copes well with lighting variations and noise, and that it is efficient and applicable for accurate moving-object localization in complicated dynamic backgrounds.
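The pairwise differencing, multiplication, and barycentre steps can be sketched as follows; this is a toy illustration on already-registered synthetic frames, not the paper's implementation:

```python
import numpy as np

def diff_product(frames):
    """Multiply three frame differences (1st vs 4th, 2nd vs 5th, 3rd vs 6th).

    Assumes six background-registered frames. Stationary background
    cancels in each difference; multiplication keeps only pixels that
    change in all three pairs, sharpening the moving object's edges.
    """
    f = [x.astype(float) for x in frames]
    d1 = np.abs(f[0] - f[3])
    d2 = np.abs(f[1] - f[4])
    d3 = np.abs(f[2] - f[5])
    return d1 * d2 * d3

def barycentre(resp):
    """Intensity-weighted centroid (row, col) of a response map."""
    total = resp.sum()
    ys, xs = np.indices(resp.shape)
    return (ys * resp).sum() / total, (xs * resp).sum() / total
```

Note that for a large object moving slowly, the three difference regions overlap, so their product is nonzero; for a point-like target the differences would be disjoint and the product would vanish, which is why the method targets large objects.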
This paper addresses the problem of rejecting the fixed-star background in star-background images. For most sensors with fine spatial resolution, phenomenological effects, such as background, and system effects, such as noise, contribute significant numbers of spurious points to each frame. In star-background images, fixed stars are the dominant source of spurious points. Since background and noise do not behave like targets, a good tracking algorithm will eventually reject the spurious points as non-targets. However, the computation required to determine which points in a frame come from the target grows geometrically with the number of points considered. Simply treating each of these points as a candidate target point unnecessarily burdens the tracking algorithm and in many cases would require computational resources that cannot be provided to the mission. In this paper, we propose a new method for rejecting fixed stars based on star-point matching in star-background images. We identify the fixed stars by matching points in the actual image against points in an ideal image derived from a star catalog. This work extends the applied domain of the Hausdorff distance (HD), one of the most commonly used measures for object matching. In our experiments, the Least Trimmed Square HD (LTS-HD) was used for point matching, and the results demonstrate its effectiveness.
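A minimal sketch of the directed LTS-HD used for the point matching is given below; the trimming fraction `frac` is an assumed parameter, not the paper's value:

```python
import numpy as np

def lts_hd(A, B, frac=0.8):
    """Directed Least Trimmed Square Hausdorff distance from A to B.

    A, B: (N, 2) and (M, 2) point sets. For each point of A, take the
    distance to its nearest neighbour in B, then average only the
    smallest `frac` fraction of those distances. Trimming makes the
    measure robust to outlier points (e.g. target points not in the
    catalog-derived ideal image).
    """
    d = np.sqrt(((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)).min(axis=1)
    k = max(1, int(frac * len(A)))
    return np.sort(d)[:k].mean()
```

Image points whose trimmed match distance to the catalog points is small would be labelled fixed stars and rejected, leaving only candidate target points for the tracker.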