Ship classification in remote sensing images has rarely been studied because of the relative scarcity of publicly available datasets. It is well known that datasets play an important role in object classification research, especially for CNN-based algorithms, which have been shown to perform well. In this paper, we introduce a public Dataset for Ship Classification in Remote sensing images (DSCR). We collect 1,951 remote sensing images from DOTA, HRSC2016, NWPU VHR-10, and Google Earth, containing warships and civilian ships of various scales. For object classification, we cut out ships of different categories from the collected images. The whole dataset contains about 20,675 instances divided into seven categories: aircraft carrier, destroyer, assault ship, combat ship, cruiser, other military ship, and civilian ship. Each image contains ships of a single category and is labeled with the category name. Since our dataset covers most models of major warships, it is relatively comprehensive for ship classification. To build a benchmark for ship classification, we evaluated six popular CNN-based object classification algorithms on our dataset: ResNet, ResNeXt, VGG, GoogLeNet, DenseNet, and AlexNet. Experiments demonstrate that our dataset can be used for verifying ship classification algorithms and may advance the development of ship classification in remote sensing images.
Crop classification is a representative problem in multispectral remote sensing image (RSI) classification, and has significance for national food security, ecological security, production estimation, crop growth supervision, and so on. It has attracted increasing attention from researchers around the world, especially after the development of convolutional neural networks (CNNs). General CNN-based multispectral RSI classification methods may not be suitable when labeled samples are limited in number and area, while other pixel-based classification methods are often affected by noise and ignore spatial information. Focusing on these problems, this paper presents an approach based on a lightened CNN for crop classification with a small number of tiny labeled samples in multispectral images. The contribution of this work is to construct a lightened CNN model for crop classification with small samples in multispectral images; it avoids the overfitting of deep CNNs and reduces the required size of training samples. We adopt a two-layer fully convolutional network (FCN) to extract features. The first layer uses a convolutional kernel of size 1 and outputs a 16-band feature map to capture spectral band information. Spatial information is extracted in the subsequent layer using a convolutional kernel of size 3 with stride 1 and padding 1, so the feature map after the FCN has the same size as the labeled area. Finally, we use a fully connected layer and a softmax classifier for classification. Our experiment was conducted on an 8-band multispectral image of size 50362-by-17810 pixels containing 5 classes: rice, soy, corn, non-crop, and uncertainty. The experimental result, 86.28% accuracy, indicates the good performance of our network for crop classification in multispectral RSIs.
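The two-layer FCN described above can be sketched to confirm its shape-preserving behavior: a 1x1 convolution mixes the spectral bands, and a 3x3 convolution with stride 1 and padding 1 keeps the spatial size unchanged. This is a minimal numpy sketch with random weights, not the trained model from the paper.

```python
import numpy as np

def conv2d(x, w, pad):
    """Naive 2-D convolution. x: (C_in, H, W), w: (C_out, C_in, k, k)."""
    c_out, c_in, k, _ = w.shape
    _, h, wd = x.shape
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros((c_out, h + 2 * pad - k + 1, wd + 2 * pad - k + 1))
    for o in range(c_out):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                out[o, i, j] = np.sum(xp[:, i:i + k, j:j + k] * w[o])
    return out

rng = np.random.default_rng(0)
patch = rng.normal(size=(8, 5, 5))           # tiny 8-band multispectral patch
w1 = rng.normal(size=(16, 8, 1, 1)) * 0.1    # 1x1 kernels -> 16-band spectral features
w2 = rng.normal(size=(16, 16, 3, 3)) * 0.1   # 3x3 kernels, stride 1, padding 1

f1 = np.maximum(conv2d(patch, w1, pad=0), 0)   # spectral layer + ReLU
f2 = np.maximum(conv2d(f1, w2, pad=1), 0)      # spatial layer + ReLU: same H, W
```

Because the feature map keeps the spatial size of the input, a label mask of the same size can supervise every pixel, which is what makes tiny labeled areas usable.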
Image deblurring is a challenging ill-posed problem in computer vision. In this paper, we propose two end-to-end generative networks to address blind image deblurring and its inverse, blurring. We chain them together so that they constantly enhance each other: the output of one generator is fed to the other, and a more realistic and relevant output is expected. We propose the deblur generator to generate sharp images from blurred ones, which is exactly what we want in blind image deblurring. We also propose a self-augmented block to enhance the performance of the generative network. Each generator is also paired with its own discriminator to form a conditional GAN that improves the generator's results. Additionally, to emphasize the edges of the image in the deblur generator, we use a reconstruction loss to constrain the generator. Experiments on benchmark datasets prove the effectiveness of the deblur generator against state-of-the-art algorithms both quantitatively and qualitatively.
Automatic ship detection in optical remote sensing images has attracted wide attention for its broad applications. Major challenges for this task include the interference of clouds, waves, and wakes, and the high computational expense. We propose a fast and robust ship detection algorithm to address these issues. The framework for ship detection is designed based on deep convolutional neural networks (CNNs), which provide the accurate locations of ship targets in an efficient way. First, a deep CNN is designed to extract features. Then, a region proposal network (RPN) is applied to discriminate ship targets and regress the detection bounding boxes, in which the anchors are designed according to the intrinsic shape of ship targets. Experimental results on numerous panchromatic images demonstrate that, in comparison with other state-of-the-art ship detection methods, our method is more efficient and achieves higher detection accuracy and more precise bounding boxes in various complex backgrounds.
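Shape-aware anchor design can be sketched as follows: ships are long and thin, so the anchor set skews toward high aspect ratios instead of the usual near-square boxes. The scales and ratios below are illustrative placeholders, not the paper's actual settings.

```python
# Hypothetical ship-shaped anchor generator; scales/ratios are illustrative.
def make_anchors(cx, cy, scales=(32, 64, 128), ratios=(3.0, 5.0, 8.0)):
    """Return (x1, y1, x2, y2) boxes centered at (cx, cy) whose area is
    scale**2 and whose long/short side ratio equals the given ratio."""
    anchors = []
    for s in scales:
        for r in ratios:
            w = s * r ** 0.5      # long side
            h = s / r ** 0.5      # short side
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors

boxes = make_anchors(100.0, 100.0)   # 9 elongated anchors per location
```

In a real RPN these boxes would also be replicated at a second orientation (or the network would regress angles), since ships appear at arbitrary headings.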
Since blur kernel estimation is an ill-posed problem, it is essential that it be constrained by parametric image priors. However, the previous normalized sparsity measure alters the kernel structure during estimation. To address the problem of single-image blur kernel estimation, a local smoothness prior is introduced to the normalized sparsity model to constrain the blurred image gradient to be similar to the unblurred one. Moreover, based on the inequality constraints, a kernel optimization algorithm is proposed to weaken the noise. Experimental results show that the proposed method is robust against noise and is able to estimate a stable blur kernel. It outperforms other state-of-the-art methods on both synthetic and real data.
Content-based image retrieval (CBIR) has been widely researched for medical images. When applied to histopathological images, two issues need to be carefully considered. The first is that a digital slide is stored as a spatially continuous image of more than 10K x 10K pixels. The second is that the size of the query image varies over a large range according to different diagnostic conditions. It is challenging to retrieve the eligible regions for a query image from a database that consists of whole slide images (WSIs). In this paper, we propose a CBIR framework for a WSI database and size-scalable query images. Each WSI in the database is encoded and stored as a matrix of binary codes. At retrieval time, the query image is first encoded into a set of binary codes and analyzed to preselect a set of candidate regions from the database using a hashing method. Then a multi-binary-code similarity measurement based on Hamming distance is designed to rank the proposal regions. Finally, the top relevant regions and their locations in the WSIs, along with the diagnostic information, are returned to assist pathologists in diagnosis. The effectiveness of the proposed framework is evaluated on a finely annotated WSI database of epithelial breast tumors. The experimental results show that the proposed framework is both effective and efficient for content-based whole slide image retrieval.
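The Hamming-distance ranking step can be sketched in pure Python, storing each binary code as an integer. This is a minimal illustration of ranking by code distance, not the paper's exact multi-code similarity measure.

```python
# Sketch of Hamming-distance ranking of binary codes; a real WSI database
# would hold one code matrix per slide rather than a flat list.
def hamming(a, b):
    """Number of differing bits between two integer-encoded binary codes."""
    return bin(a ^ b).count("1")

def rank_regions(query_code, region_codes):
    """Return region indices sorted by ascending Hamming distance."""
    return sorted(range(len(region_codes)),
                  key=lambda i: hamming(query_code, region_codes[i]))

db = [0b10110010, 0b10110011, 0b01001101, 0b11110000]
order = rank_regions(0b10110010, db)   # the identical code ranks first
```

Because XOR and popcount are cheap, this ranking scales to millions of candidate regions, which is why binary codes suit gigapixel WSI databases.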
In this paper, a new ship detection method is proposed after analyzing the characteristics of panchromatic remote sensing images and ship targets. First, AdaBoost (Adaptive Boosting) classifiers trained on Haar features are utilized for coarse detection of ship targets. Then the LSD (Line Segment Detector) is adopted to extract line features in the target slices for fine detection. Experimental results on a dataset of panchromatic remote sensing images with a spatial resolution of 2 m show that the proposed algorithm achieves a high detection rate and a low false alarm rate. Meanwhile, the algorithm can meet the needs of practical applications on a DSP (Digital Signal Processor).
Sparse coding exhibits good performance in many computer vision applications by finding bases that capture high-level semantics of the data and learning sparse coefficients in terms of those bases. However, because the bases are non-orthogonal, sparse coding can hardly preserve the samples' similarity, which is important for discrimination. In this paper, a new image representation method called maximum constrained sparse coding (MCSC) is proposed. A sparse representation with more active coefficients carries more similarity information, and the infinity norm is added to the solution for this purpose. We solve the optimization problem by constraining the maximum of the codes and releasing the residual to other dictionary atoms. Experimental results on image clustering show that our method can preserve the similarity of adjacent samples and maintain the sparsity of the codes simultaneously.
Traditional saliency detection can effectively detect possible objects using an attentional mechanism instead of automatic object detection, and thus is widely used in natural scene detection. However, it may fail to extract salient objects accurately from remote sensing images, which have their own characteristics such as large data volumes, multiple resolutions, illumination variation, and complex texture structure. We propose a sparsity-guided saliency detection model for remote sensing images that uses a sparse representation to obtain the high-level global and background cues for saliency map integration. Specifically, it first uses pixel-level global cues and background prior information to construct two dictionaries that are used to characterize the global and background properties of remote sensing images. It then employs a sparse representation for the high-level cues. Finally, a Bayesian formula is applied to integrate the saliency maps generated by both types of high-level cues. Experimental results on remote sensing image datasets that include various objects under complex conditions demonstrate the effectiveness and feasibility of the proposed method.
Most recent image deblurring methods use only the valid information found in the input image as the clue for restoring the blurred region. These methods usually suffer from insufficient prior information and relatively poor adaptiveness. Patch-based methods not only use the valid information of the input image itself, but also utilize prior information from sample images to improve adaptiveness. However, the cost function of this approach is quite time-consuming, and the method may also produce ringing artifacts. In this paper, we propose an improved non-blind deblurring algorithm based on learning patch likelihoods. On one hand, we consider the effect of the Gaussian mixture model components with different weights and normalize the weight values, which optimizes the cost function and reduces running time. On the other hand, a post-processing method is proposed to remove the ringing artifacts produced by the traditional patch-based method. Extensive experiments verify that our method can effectively reduce execution time, suppress ringing artifacts, and preserve the quality of the deblurred image.
Airport target recognition in remote sensing images is generally based on image matching, which is significantly affected by variations in illumination, viewpoint, scale, and so on. As a well-known semantic model for target recognition, bag-of-features (BoF) performs k-means clustering on numerous local feature descriptors and thus generates visual words to represent the images. We propose a fast automatic recognition framework for airport targets in low-resolution remote sensing images under complicated environments. It can be viewed as a two-phase procedure: detection and then classification. Concretely, it first utilizes a visual attention model to locate the salient region, then detects possible candidate targets and extracts saliency-constrained scale-invariant feature transform descriptors to build a high-level semantic model. Consequently, BoF is applied to mine the high-level semantics of targets. Different from the k-means step in traditional BoF, we employ locality preserving indexing (LPI) to obtain the visual words. Because LPI considers the intrinsic local structure of the descriptors and further enhances the ability of the words to describe image content, it can accurately classify the detected candidate targets. Experiments on a dataset of 10 kinds of airport aerial images demonstrate the feasibility and effectiveness of the proposed method.
Detecting aircraft is important in the field of remote sensing. In past decades, researchers used various approaches to detect aircraft with classifiers trained on whole aircraft. However, with the development of high-resolution images, the internal structures of aircraft should now also be taken into consideration. To address this issue, a novel aircraft detection method for satellite images based on a probabilistic topic model is presented. We model aircraft as connected structural elements rather than holistic features. The proposed method contains two major steps: 1) use a cascade-AdaBoost classifier to identify the structural elements of aircraft; 2) connect these structural elements into aircraft, where the relationships between elements are estimated by a hierarchical topic model. The model places strict spatial constraints on structural elements, which allows it to distinguish between similar features. The experimental results demonstrate the effectiveness of the approach.
Automatic target detection in remote sensing images remains a challenging problem. In this paper, we present a new oil tank detection method based on salient regions and geometric features. Salient region detection and Otsu thresholding are used for image segmentation to obtain candidate regions effectively, and four geometric features are employed to reduce false alarms. Experimental results show that our method provides a promising way to detect oil tanks accurately, and it is also robust in complicated conditions such as occlusion and shadow.
The grey world algorithm is a simple but widely used global white balance method for color-cast images. However, it only assumes that the mean values of the R, G, and B components tend to be equal, which may lead to false alarms in normal images with large areas of a single background color, for example, images with an ocean background. Another defect is that the grey world algorithm may cause luminance variations in channels having no cast. We note that although their mean values differ, the standard deviations of the three channels are supposed to converge in color-cast images, which does not hold for those false alarms. Based on this discrepancy, through a mathematical manipulation of both the mean values and the standard deviations of the three channels, a novel color correction model is proposed by weighting the gain coefficients in the grey world model. All three weighted gain coefficients in the proposed model tend to 1 on images containing large single-color regions, so as to avoid false alarms. For color-cast images, the channel with the color cast is given a weighted gain coefficient much less than 1 to correct the cast, while the other two channels receive weighted gain coefficients approximately equal to 1, ensuring that the proposed model has little negative effect on channels without a cast. Experiments show that our model gives better color-correction performance.
The conventional graph embedding framework uses the Euclidean distance to determine the similarities of neighboring samples, which makes the graph structure sensitive to outliers and lacking in physical interpretation. Moreover, graph construction suffers from the difficulty of neighbor parameter selection. Although sparse representation (SR) based graph embedding methods can select the neighbor parameter automatically, the computational cost of SR is high. On the other hand, most discriminant projection methods fail to perform feature selection. In this paper, we present a novel joint discriminant analysis and feature selection method that employs regularized least squares for graph construction and ℓ2,1-norm minimization on the projection matrix for feature selection. Specifically, our method first uses the regularized least squares coefficients to measure the intraclass and interclass similarities from the viewpoint of reconstruction. Based on this graph structure, we formulate an objective function with a scatter difference criterion for learning the discriminant projections, which avoids the small sample size problem. Simultaneously, ℓ2,1-norm minimization on the projection matrix is applied to obtain row sparsity for selecting useful features. Experiments on two face databases (ORL and AR) and the COIL-20 object database demonstrate that our method not only achieves better classification performance, but also has lower computational cost than SR.
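The ℓ2,1-norm that drives the row sparsity is simply the sum of the ℓ2 norms of the matrix rows, which a short sketch makes concrete:

```python
import numpy as np

def l21_norm(W):
    """Sum of the l2 norms of the rows of W. Minimizing this norm drives
    whole rows of the projection matrix to zero, i.e. discards features."""
    return np.sqrt((W ** 2).sum(axis=1)).sum()

W = np.array([[3.0, 4.0],    # row norm 5
              [0.0, 0.0],    # zero row: the feature is dropped entirely
              [1.0, 0.0]])   # row norm 1
val = l21_norm(W)            # 5 + 0 + 1 = 6.0
```

Unlike an element-wise ℓ1 penalty, which zeroes scattered entries, the ℓ2,1 penalty zeroes entire rows, so each original feature is either kept for all projection directions or removed from all of them.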
In this paper, we propose a two-step algorithm based on the combination of an exemplar-based algorithm and an illumination model to deal with specular images, especially those containing saturated pixels in the highlight areas. First, the proposed modified exemplar-based algorithm is employed to process the unsaturated specular pixels under the supervision of the illumination model. Then we inpaint the remaining regions, in which the pixels are saturated, with the original exemplar-based algorithm to obtain the final result. Experimental results demonstrate that the proposed algorithm performs better on images with saturated pixels in the highlight areas compared with classical highlight removal and image inpainting algorithms.
A real-time orientation feature descriptor for portable devices is introduced. The descriptor requires very low computational resources and, at 16 dimensions, is shorter than existing methods. The patch of a candidate feature is first segmented into polar-arranged sub-regions, which enables us to achieve rotation invariance rapidly. Furthermore, the principal orientation is used to describe each sub-region. The computations can be considerably accelerated by using an integral image. The descriptor is used for object tracking and achieves a 25 fps frame rate on a mobile phone. Experimental results demonstrate that the proposed method offers sufficient matching performance.
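The integral-image acceleration mentioned above works as follows: after one pass over the image, the sum of any rectangular region is obtained from four lookups. A minimal pure-Python sketch:

```python
def integral_image(img):
    """img: list of rows. Returns S with S[i][j] = sum of img[0..i-1][0..j-1]."""
    h, w = len(img), len(img[0])
    S = [[0] * (w + 1) for _ in range(h + 1)]
    for i in range(h):
        row_sum = 0
        for j in range(w):
            row_sum += img[i][j]
            S[i + 1][j + 1] = S[i][j + 1] + row_sum
    return S

def region_sum(S, top, left, bottom, right):
    """Sum over img[top..bottom][left..right] in O(1) via four lookups."""
    return (S[bottom + 1][right + 1] - S[top][right + 1]
            - S[bottom + 1][left] + S[top][left])

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
S = integral_image(img)
s = region_sum(S, 1, 1, 2, 2)   # 5 + 6 + 8 + 9 = 28
```

Every sub-region sum the descriptor needs then costs constant time regardless of the sub-region's size, which is what makes the per-frame cost low enough for mobile hardware.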
Digital pathological image retrieval plays an important role in computer-aided diagnosis of breast cancer. The retrieval results for an unknown pathological image, which are generally previous cases with diagnostic information, can provide doctors with assistance and reference. In this paper, we develop a novel pathological image retrieval method for breast cancer based on stain components and the probabilistic latent semantic analysis (pLSA) model. Specifically, the method first utilizes color deconvolution to obtain representations of the different stain components for cell nuclei and cytoplasm, and then block Gabor features are extracted from the cell nuclei and used to construct the codebook. Furthermore, the connections between the words of the codebook and the latent topics among images are modeled by pLSA. Therefore, each image can be represented by topics, and the high-level semantic concepts of the image can be described. Experiments on a pathological image database for breast cancer demonstrate the effectiveness of our method.
For building detection from single very high spatial resolution (VHR) satellite images, we take advantage of visual saliency and a Bayesian model to rapidly locate roof-top areas. We first generate a saliency map of an image with a salient-contrast filter using low-level features. This filter marks a pixel as salient if it is visually different from its surroundings in color or texture. Second, a Bayesian model is proposed to generate all closed rectangular contours as mid-level content in the image. We assume the area enclosed by a contour corresponds to high saliency values. Finally, the roof-top areas are extracted by fusing the different levels of information mentioned above. Experimental results demonstrate the effectiveness of our algorithm.
We present an effective solution for unsupervised texture segmentation by taking advantage of the latent Dirichlet allocation (LDA) model. LDA is a generative topic model that is capable of hierarchically organizing discrete data, including texts and images. We propose a new texture model by connecting texture primitives to the topics of LDA. The model is able to extract the characteristic features of a texture primitive and group them into a topic based on their frequencies of co-occurrence. Here, the feature descriptor is the concatenation of Haar-like features of multiple sizes. The segments of an image are finally obtained by identifying the homogeneous regions in the corresponding topic assignment map. Evaluation results on synthetic texture mosaics, remote sensing images, and natural scene images are illustrated.
Based on the four-wave mixing mechanism and the fanning effect, the threshold coupling constant for a mutually pumped phase conjugator (MPPC) with one interaction region and with two interaction regions is studied theoretically. The relation between the threshold coupling strength of the MPPC and the fanning intensity is studied. Owing to more efficient fanning, a preset grating in the MPPC can reduce the threshold coupling constant and improve output efficiency. These theoretically predicted characteristics have been used to explain previously observed experimental phenomena. The dependence of the threshold coupling constant of the MPPC on the amplitude ratio of the incident pump beams is presented. The threshold coupling constant for a one-interaction-region MPPC is lower than that for a two-interaction-region MPPC.
Acquiring optical images of space objects is one of the most important goals of space-based optical surveillance systems. However, it is difficult to obtain enough high-resolution optical images for space object recognition, attitude measurement, and situational awareness. To solve this problem, the imaging model of a space-based optical camera and the imaging characteristics of space objects are analyzed in this paper, and a novel image simulation method is proposed. The high-resolution images of space objects simulated by our method are visually similar to actual imaging results and may provide data support for further research on space technology.
The photorefractive adaptive optical heterodyne detection system (PAOHDS) is proposed. The dynamic properties of mutually pumped phase conjugation (MPPC), the key technology of the PAOHDS, are studied theoretically. The three-dimensional distribution of the MPPC refraction index grating along the time and length axes is simulated numerically. The dependence of the dynamic properties of the MPPC on the intensity of the fanning light is presented: the stronger the fanning light, the shorter the response time of the MPPC. The dependence of the dynamic properties of the MPPC on the coupling strength is also presented: the greater the coupling strength, the shorter the response time. These results provide a theoretical basis for reducing the response time of the PAOHDS.
Hyperspectral remote sensing is widely used in many fields such as agriculture, military detection, and mineral exploration. Hyperspectral data have very high spectral resolution, but much lower spatial resolution than the data obtained by other types of sensors, and this low spatial resolution restrains their wide application. On the contrary, we can easily obtain images with high spatial resolution but insufficient spectral resolution (such as panchromatic images). Naturally, one expects to obtain images that have both high spatial and high spectral resolution through hyperspectral image fusion. In this paper, a similarity measure-based variational method is proposed to achieve this fusion. The main idea is to transform the image fusion problem into an optimization problem based on a variational model. We first establish a fusion model that simultaneously constrains the spatial and spectral information of the original data, then use split Bregman iteration to obtain the final fused data. We also analyze the convergence of the method. Experiments on synthetic and real data show that the fusion method preserves the information of the original images efficiently, especially the spectral information.
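Split Bregman iteration alternates between a quadratic subproblem and an element-wise shrinkage (soft-thresholding) step; the shrinkage operator at the core of each iteration can be sketched directly (a generic building block, not the paper's full fusion model):

```python
import numpy as np

def shrink(x, lam):
    """Soft-thresholding operator used in each split Bregman iteration:
    argmin_d lam*|d| + 0.5*(d - x)**2, applied element-wise."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

x = np.array([-2.0, -0.3, 0.0, 0.5, 1.5])
y = shrink(x, 0.5)   # [-1.5, 0.0, 0.0, 0.0, 1.0]
```

Values smaller in magnitude than the threshold are set to zero, which is what makes the sparsity-promoting terms in the variational model tractable inside an otherwise quadratic solve.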
Automatic target detection is an important application in the hyperspectral image processing field. Most statistics-based detection algorithms use second-order statistics to construct detectors. However, for target detection in a real hyperspectral image, targets of interest usually occupy a few pixels with a small population. In this case, high-order statistics can characterize targets more effectively than second-order statistics. In addition, the inherent variation of target spectra is an obstacle to successful target detection. In this paper, we propose a regularized high-order matched filter (RHF) that uses high-order statistics to build an objective function and a regularization term to make the algorithm robust to target spectral variation. A gradient descent method is used to solve this optimization problem, and we derive the convergence properties of the RHF. Experiments on hyperspectral data show that the proposed algorithm performs better than classical second-order-statistics-based algorithms and some kernel-based methods.
Hair removal from skin melanoma images is one of the key problems for the precise segmentation and analysis of malignant skin melanoma. In this paper, an automatic hair removal algorithm for dermoscopy images of pigmented skin lesions is proposed. The algorithm includes three steps: first, the melanoma image with hairs is enhanced by a morphological closing-based top-hat operator and then segmented by statistical thresholding; second, the hairs are extracted based on the elongation of connected regions; third, the hair-occluded information is repaired by replacing the hair pixels with nearby non-hair pixels. With the morphological closing-based top-hat operator, both strong and weak hairs can be enhanced simultaneously, and the elongation of band-like connected regions can be correctly described by the elongation function proposed in this paper, allowing hairs to be measured effectively. Therefore, the unsupervised hair removal problem in dermoscopy melanoma images can be resolved well by combining hair extraction with information repair. The experimental results show that various hairs can be extracted accurately and that the repaired textures satisfy the requirements of medical diagnosis.
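The closing-based top-hat (black top-hat) operator responds strongly to thin dark structures such as hairs. A minimal 1-D sketch with flat structuring elements shows the idea (real dermoscopy pipelines apply this in 2-D over several orientations):

```python
def dilate(sig, k):
    """Grayscale dilation: running maximum over a window of size 2k+1."""
    n = len(sig)
    return [max(sig[max(0, i - k):min(n, i + k + 1)]) for i in range(n)]

def erode(sig, k):
    """Grayscale erosion: running minimum over a window of size 2k+1."""
    n = len(sig)
    return [min(sig[max(0, i - k):min(n, i + k + 1)]) for i in range(n)]

def black_tophat(sig, k):
    """Closing (dilation then erosion) minus the signal: narrow dark
    structures such as hairs give a strong positive response."""
    closed = erode(dilate(sig, k), k)
    return [c - s for c, s in zip(closed, sig)]

# Bright skin (value 200) crossed by a narrow dark hair (value 40).
scanline = [200] * 5 + [40] + [200] * 5
response = black_tophat(scanline, k=2)   # peaks exactly at the hair pixel
```

Because the closing fills any dark gap narrower than the structuring element, subtracting the original isolates those gaps, which is why both strong and faint hairs are enhanced by the same operator.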
This paper proposes a method for detecting large moving objects against a complicated dynamic background, which integrates a phase correlation technique based on singular value decomposition (SVD) with the multiplication of multi-frame difference images. The SVD-based phase correlation algorithm is insensitive to noise and to changes in gray level and contrast. Compared with many complex phase correlation algorithms and spatial-domain registration algorithms, our method not only suppresses noise effectively but also improves registration precision, and it runs nearly twice as fast as the original phase correlation algorithm. Experiments show that the phase correlation matrix has rank one for a noise-free rigid translation model. A new phase correlation matrix is recast based on this property, which effectively suppresses noise and gray-level changes. By estimating the global motion vector between two images using SVD-based phase correlation, the background is accurately matched. The matched images are then processed to compute the differences between the first and fourth, the second and fifth, and the third and sixth frames. After these difference images are multiplied, a clear edge of the moving object is obtained, and the accurate location of the object is found by computing the barycenter of the image. Finally, simulation results prove that the proposed method is robust to lighting variations and noise, and is efficient and applicable for accurate moving-object localization in complicated dynamic backgrounds.
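The background-registration step rests on basic phase correlation, which can be sketched with numpy's FFT. This plain version recovers a circular global translation and omits the paper's SVD rank-one refinement:

```python
import numpy as np

def phase_correlate(a, b):
    """Estimate the circular shift taking image b to image a via the
    normalized cross-power spectrum (basic version, without the SVD
    rank-one refinement described in the paper)."""
    Fa, Fb = np.fft.fft2(a), np.fft.fft2(b)
    R = Fa * np.conj(Fb)
    R /= np.abs(R) + 1e-12          # keep only the phase difference
    corr = np.fft.ifft2(R).real     # a delta peak at the shift
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    return int(dy), int(dx)

rng = np.random.default_rng(2)
img = rng.normal(size=(32, 32))
shifted = np.roll(img, shift=(5, 7), axis=(0, 1))
shift = phase_correlate(shifted, img)   # recovers (5, 7)
```

Because only the phase of the cross-power spectrum is kept, the peak location is unaffected by global gray-level and contrast changes, which is the property the SVD-based variant further exploits for noise robustness.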
This paper addresses the problem of rejecting the fixed-star background in star-background images. For most sensors with fine spatial resolution, phenomenological effects, such as background, and system effects, such as noise, contribute significant numbers of spurious points to each frame. In star-background images, fixed stars are the dominant source of spurious points. Since background and noise do not behave like targets, a good tracking algorithm would eventually reject the spurious points as non-targets. However, the computation required to decide which points in a frame come from the target grows geometrically with the number of points considered. Simply treating each of these points as a candidate target unnecessarily burdens the tracking algorithm and in many cases would require computational resources that cannot be provided to the mission. In this paper, we propose a new method for rejecting fixed stars based on star-point matching in star-background images. We identify the fixed stars by matching points in the actual image against points in an ideal image derived from a star catalog. This work extends the applied domain of the Hausdorff distance (HD), one of the commonly used measures for object matching. In our experiments, the least trimmed squares HD (LTS-HD) was used for point matching, and the results are effective.
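The directed LTS-HD can be sketched in pure Python: sort the point-to-set distances and average only the smallest fraction, so a few unmatched outlier points cannot dominate the measure. The trimming fraction below is an illustrative choice:

```python
def lts_hausdorff(A, B, frac=0.8):
    """Directed least-trimmed-squares Hausdorff distance: average the
    smallest `frac` fraction of point-to-set distances from A to B,
    discarding the largest ones so outlier points are ignored."""
    dists = sorted(min(((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5
                       for bx, by in B) for ax, ay in A)
    h = max(1, int(frac * len(dists)))
    return sum(dists[:h]) / h

detected = [(0, 0), (10, 0), (0, 10)]                  # points in the frame
catalog = [(0, 0), (10, 0), (0, 10), (50, 50)]         # ideal catalog points
d = lts_hausdorff(detected, catalog)                   # 0.0: all stars matched
```

A detected point whose LTS-HD neighborhood distance stays small is declared a fixed star and rejected from the candidate-target list, leaving only unexplained points for the tracker.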
In this paper, an effective correlation-based matching algorithm for medical micro-optical images is described. The algorithm includes the following steps: first, select a sub-area with distinctive characteristics in one of the two images as the template image; second, find the correct matching position in the other image; third, apply a coordinate transformation to merge the two images together. As an application of image matching to medical micro-optical images, this method overcomes the microscope's small field of view and makes it possible to observe a large object, or many objects, in one view. It also implements adaptive selection of the template image and achieves satisfactory matching speed and results.
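The matching-position search can be sketched as brute-force normalized cross-correlation: slide the template over the second image and keep the position with the highest correlation score. This is a generic illustration of the correlation step, not the paper's exact implementation:

```python
import numpy as np

def find_template(image, template):
    """Return the (row, col) where the template best matches the image
    under normalized cross-correlation (brute-force sketch)."""
    th, tw = template.shape
    t = template - template.mean()
    tn = np.sqrt((t ** 2).sum())
    best, best_pos = -np.inf, (0, 0)
    for i in range(image.shape[0] - th + 1):
        for j in range(image.shape[1] - tw + 1):
            win = image[i:i + th, j:j + tw]
            w = win - win.mean()
            denom = np.sqrt((w ** 2).sum()) * tn
            score = (w * t).sum() / denom if denom > 0 else 0.0
            if score > best:
                best, best_pos = score, (i, j)
    return best_pos

rng = np.random.default_rng(3)
scene = rng.normal(size=(20, 20))
patch = scene[8:13, 4:9].copy()       # the selected template sub-area
pos = find_template(scene, patch)     # locates the sub-area at (8, 4)
```

Subtracting each window's mean makes the score invariant to local brightness offsets, which matters because illumination is rarely uniform across adjacent microscope fields.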
An effective immune cell image segmentation algorithm based on mathematical morphology is presented in this paper. To obtain better segmentation results, a histogram potential function is introduced in addition to the morphology-based watershed growth algorithm; that is, the image's spectral information is combined with its spatial information. Obtaining an exact segmentation result is a major issue in immune cell image analysis, and obtaining effective and credible markers is a crucial step of watershed segmentation. By involving the histogram potential function, the markers suitable for watershed segmentation are clearly improved, the segmentation result is quite consistent with human vision, and the segmentation speed and repeatability are acceptable.