We propose a new dual-level convolutional neural network model based on Inception modules and residual connections. First, the Inception module applies filters with different kernel sizes, so its output feature maps cover receptive fields of different scales. Feature maps with wide receptive fields capture global information, while those with small receptive fields retain local detail; together, these multiscale features provide more comprehensive information. Second, the residual connections simplify the training process and help avoid overfitting. Third, the proposed network adopts two levels, i.e., a low level and a high level, and uses a feature fusion operation to take full advantage of the complementary and correlated information of the two levels. Fourth, we combine the spatial and spectral features of the hyperspectral image (HSI): each pixel to be classified, together with its neighborhood information, serves as the input of the neural network, realizing spectral–spatial classification of the HSI. Experimental results show that our model performs better than other state-of-the-art methods.
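To make the multiscale and residual design concrete, the following is a minimal PyTorch sketch of an Inception-style block with a residual connection; the branch widths, kernel sizes, 64-channel embedding, and 7×7 input patch are illustrative assumptions rather than the configuration of the proposed network.

```python
import torch
import torch.nn as nn

class InceptionResidualBlock(nn.Module):
    """Parallel convolutions with different kernel sizes plus a residual add."""
    def __init__(self, channels):
        super().__init__()
        # Branches with different kernel sizes yield receptive fields of different scales.
        self.branch1 = nn.Conv2d(channels, channels // 2, kernel_size=1)
        self.branch3 = nn.Conv2d(channels, channels // 4, kernel_size=3, padding=1)
        self.branch5 = nn.Conv2d(channels, channels // 4, kernel_size=5, padding=2)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        # Concatenate the multiscale feature maps along the channel axis.
        out = torch.cat([self.branch1(x), self.branch3(x), self.branch5(x)], dim=1)
        # Residual connection: the branch widths sum to the input width, so the add is valid.
        return self.relu(out + x)

# Example input: a batch of 7x7 neighborhoods around the pixels to classify,
# assuming a 64-channel spectral embedding produced by an earlier layer.
x = torch.randn(8, 64, 7, 7)
print(InceptionResidualBlock(64)(x).shape)  # torch.Size([8, 64, 7, 7])
```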
Most existing feature learning methods rely on inflexible handcrafted features, and the affinity matrix is constructed by shallow linear embedding methods. Different from these conventional methods, we pretrain a generative neural network by stacking convolutional autoencoders to learn the latent data representation and then construct an affinity graph from these representations as a prior. Based on the pretrained model and the constructed graph, we add a self-expressive layer to complete the generative model and then fine-tune it with a new loss function comprising the reconstruction loss and a deliberately defined locality-preserving loss. The locality-preserving loss, defined from the constructed affinity graph, serves as a prior that preserves the local structure during the fine-tuning stage, which in turn effectively improves the quality of the feature representation. Furthermore, the self-expressive layer between the encoder and the decoder is based on the assumption that each latent feature is a linear combination of the other latent features, so the weighted combination coefficients of the self-expressive layer are used to construct a new, refined affinity graph that represents the data structure. We conduct experiments on four datasets to demonstrate that the representation ability of the proposed model is superior to that of state-of-the-art methods.
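As an illustration of the self-expressive layer, the following PyTorch sketch reconstructs each latent feature as a weighted combination of the other latent features and exposes the coefficient matrix from which the refined affinity graph is built; the sample count, latent dimension, and initialization scale are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SelfExpressiveLayer(nn.Module):
    """Learns coefficients C such that Z ~ C Z for latent features Z (one row per sample)."""
    def __init__(self, n_samples):
        super().__init__()
        self.C = nn.Parameter(1e-4 * torch.randn(n_samples, n_samples))

    def forward(self, z):
        # Zero the diagonal so a sample cannot trivially reconstruct itself.
        c = self.C - torch.diag(torch.diag(self.C))
        return c @ z, c

z = torch.randn(100, 32)                      # latent features from the encoder
z_hat, c = SelfExpressiveLayer(100)(z)
self_expressive_loss = ((z_hat - z) ** 2).sum()
# The symmetrized magnitudes of the coefficients give the refined affinity graph.
affinity = 0.5 * (c.abs() + c.abs().t())
```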
An iterative joint bilateral filter is used to obtain a natural weight map, and images from different modalities are merged by a weighted-sum rule in the spatial domain. Saliency maps are first determined from the gradients of the raw image pair. By comparing the saliency maps pixel by pixel, a coarse weight map is obtained that indicates which source pixel is preferred. Since this coarse weight map is not subjectively natural, i.e., it is inconsistent with the human visual system, it is refined with an iterative joint bilateral filter, which makes the weight map natural. The refined weight map is then used to compose the fused image, so that images from different modalities are merged seamlessly and effectively. Experiments were conducted on several pairs of multimodal images to verify the effectiveness and superiority of the proposed image fusion algorithm compared to state-of-the-art methods.
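A minimal sketch of this pipeline for a pair of registered grayscale images is given below, assuming the opencv-contrib ximgproc module provides the joint bilateral filter; the gradient-based saliency definition, the filter parameters, and the iteration count are illustrative assumptions.

```python
import cv2
import numpy as np

def fuse_multimodal(img_a, img_b, iterations=3):
    a = img_a.astype(np.float32)
    b = img_b.astype(np.float32)
    # Saliency maps from the gradient magnitude of each raw image.
    sal_a = np.abs(cv2.Sobel(a, cv2.CV_32F, 1, 0)) + np.abs(cv2.Sobel(a, cv2.CV_32F, 0, 1))
    sal_b = np.abs(cv2.Sobel(b, cv2.CV_32F, 1, 0)) + np.abs(cv2.Sobel(b, cv2.CV_32F, 0, 1))
    # Coarse binary weight map from the pairwise saliency comparison.
    w = (sal_a >= sal_b).astype(np.float32)
    # Iteratively refine the weight map with a joint bilateral filter,
    # using source image A as the guidance image.
    for _ in range(iterations):
        w = cv2.ximgproc.jointBilateralFilter(a, w, 9, 25.0, 7.0)
    w = np.clip(w, 0.0, 1.0)
    # Weighted-sum rule in the spatial domain.
    return w * a + (1.0 - w) * b
```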
A spatial-domain multifocus image fusion method is proposed using a structure-preserving filter. In particular, a recent recursive filter (RF) is introduced as the structure-preserving filter in the proposed spatial-domain method. Moreover, a focused region detection method based on an average low-pass filter is presented to determine the initial weight maps. A fused image is then generated from the final weight maps, which are obtained by refining the initial weight maps with the RF and therefore preserve the structures of the source images well. Experimental results show that the proposed method is superior to the state-of-the-art multifocus fusion methods in terms of subjective and objective evaluation.
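The weight-map construction can be sketched as follows for two registered grayscale images; the domain-transform recursive filter (DTF_RC) from opencv-contrib is used here as a stand-in structure-preserving RF, and the low-pass window size and filter parameters are illustrative assumptions.

```python
import cv2
import numpy as np

def multifocus_fuse(img_a, img_b):
    a = img_a.astype(np.float32) / 255.0
    b = img_b.astype(np.float32) / 255.0
    # Focused region detection: energy removed by an average low-pass filter.
    fm_a = np.abs(a - cv2.blur(a, (7, 7)))
    fm_b = np.abs(b - cv2.blur(b, (7, 7)))
    # Initial binary weight map selects the sharper source at each pixel.
    w0 = (fm_a >= fm_b).astype(np.float32)
    # Refine the initial weight map with a recursive structure-preserving filter,
    # guided by one of the source images.
    w = cv2.ximgproc.dtFilter(a, w0, 60.0, 0.1, cv2.ximgproc.DTF_RC)
    w = np.clip(w, 0.0, 1.0)
    # The final weight maps drive a weighted-sum composition.
    return w * a + (1.0 - w) * b
```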
We present a framework based on an adaptive scalable kernel (ASK) for hyperspectral image classification, which excels at removing insignificant details while preserving crucial features. The proposed method consists of three steps. First, spectral feature extraction based on the interval gradient and a fast morphological filter is used to reduce the high dimensionality. Second, a powerful spatial structure extraction method based on the ASK is adopted to enhance the performance of structure-preserving filtering; using patch-based statistics, this model distinguishes small-scale texture from large-scale structure and finds an optimal per-pixel smoothing scale. Third, the obtained spectral-structure feature maps are classified with the large-margin distribution machine. The experimental results show that the proposed spatial structure extraction method based on the ASK achieves state-of-the-art performance in terms of classification accuracy and computational efficiency.
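The per-pixel scale selection in the second step can be illustrated by the sketch below, applied to one feature band; the patch statistic, the percentile thresholds, and the candidate Gaussian scales are illustrative assumptions rather than the ASK formulation itself.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, generic_filter

def adaptive_scale_smooth(band, scales=(1.0, 2.0, 4.0), patch=7):
    band = band.astype(np.float32)
    # Patch-based statistic: local standard deviation as a texture indicator.
    local_std = generic_filter(band, np.std, size=patch)
    # A simple rule maps larger local variation to a larger smoothing scale here.
    lo, hi = np.percentile(local_std, 25), np.percentile(local_std, 75)
    scale_idx = np.digitize(local_std, [lo, hi])          # 0, 1, or 2
    # Precompute the band smoothed at each candidate scale, then pick per pixel.
    smoothed = np.stack([gaussian_filter(band, s) for s in scales], axis=0)
    rows, cols = np.indices(band.shape)
    return smoothed[scale_idx, rows, cols]
```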
Image fusion aims at exploiting complementary information in multimodal images to create a single composite image with extended information content. An image fusion framework is proposed for different types of multimodal images with fast filtering in the spatial domain. First, the image gradient magnitude is used to detect contrast and image sharpness. Second, a fast morphological closing operation is performed on the image gradient magnitude to bridge gaps and fill holes. Third, the weight map is obtained from the multimodal image gradient magnitudes and is filtered by a fast structure-preserving filter. Finally, the fused image is composed using a weighted-sum rule. Experimental results on several groups of images show that the proposed fast fusion method performs better than the state-of-the-art methods, running up to four times faster than the fastest baseline algorithm.
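A minimal sketch of the four steps for a pair of registered grayscale images is shown below; the guided filter from opencv-contrib stands in for the fast structure-preserving filter, and the structuring-element size and filter parameters are illustrative assumptions.

```python
import cv2
import numpy as np

def fast_fuse(img_a, img_b):
    a = img_a.astype(np.float32) / 255.0
    b = img_b.astype(np.float32) / 255.0
    # Step 1: gradient magnitude detects contrast and sharpness.
    def grad_mag(x):
        return cv2.magnitude(cv2.Sobel(x, cv2.CV_32F, 1, 0), cv2.Sobel(x, cv2.CV_32F, 0, 1))
    ga, gb = grad_mag(a), grad_mag(b)
    # Step 2: morphological closing bridges gaps and fills holes.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))
    ga = cv2.morphologyEx(ga, cv2.MORPH_CLOSE, kernel)
    gb = cv2.morphologyEx(gb, cv2.MORPH_CLOSE, kernel)
    # Step 3: weight map from the gradient comparison, refined by a fast
    # structure-preserving (guided) filter.
    w = (ga >= gb).astype(np.float32)
    w = np.clip(cv2.ximgproc.guidedFilter(a, w, 8, 1e-3), 0.0, 1.0)
    # Step 4: weighted-sum composition.
    return w * a + (1.0 - w) * b
```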
The image intensity value is determined by both the albedo component and the shading component. The albedo component describes the physical nature of the different objects at the surface of the earth, and land-cover classes differ from each other because of their intrinsic physical materials. We therefore recover the intrinsic albedo feature of the hyperspectral image to exploit the spatial semantic information. Then, we use the support vector machine (SVM) to classify the recovered intrinsic albedo hyperspectral image; the SVM maximizes the minimum margin to achieve good generalization performance. Experimental results show that the SVM with the intrinsic albedo feature achieves better classification performance than the state-of-the-art methods in terms of visual quality and three quantitative metrics.
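The classification step can be sketched as follows, assuming the intrinsic albedo component has already been recovered as a feature cube; the training ratio and SVM hyperparameters are illustrative assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def classify_albedo(albedo_cube, labels):
    """albedo_cube: (H, W, B) recovered albedo feature; labels: (H, W) ground truth, 0 = unlabeled."""
    X = albedo_cube.reshape(-1, albedo_cube.shape[-1])
    y = labels.reshape(-1)
    mask = y > 0                                   # keep only labeled pixels
    X_tr, X_te, y_tr, y_te = train_test_split(
        X[mask], y[mask], train_size=0.1, stratify=y[mask], random_state=0)
    # RBF-kernel SVM maximizes the minimum margin between classes.
    clf = SVC(kernel="rbf", C=100.0, gamma="scale").fit(X_tr, y_tr)
    return clf, clf.score(X_te, y_te)              # classifier and overall accuracy
```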
The hyperchaotic sequence and the DNA sequence are utilized jointly for image encryption. A four-dimensional hyperchaotic system is used to generate a pseudorandom sequence, and the main idea is to apply this hyperchaotic sequence to almost all steps of the encryption. All intensity values of an input image are converted to a serial binary digit stream, and the bitstream is scrambled globally by the hyperchaotic sequence. DNA algebraic operations and complementation are then performed between the hyperchaotic sequence and the DNA sequence to obtain robust encryption performance. The experimental results demonstrate that the encryption algorithm achieves the performance of state-of-the-art methods in terms of quality, security, and robustness against noise and cropping attacks.
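The global bit-level scrambling step can be sketched as below; a one-dimensional logistic map is used here as a simple stand-in for the four-dimensional hyperchaotic sequence, and the key values are illustrative assumptions.

```python
import numpy as np

def scramble_bits(image, x0=0.3567, r=3.99):
    # Serialize all intensity values into one binary digit stream.
    bits = np.unpackbits(image.astype(np.uint8).ravel())
    # Generate a chaotic sequence of the same length as the bitstream.
    seq = np.empty(bits.size)
    x = x0
    for i in range(bits.size):
        x = r * x * (1.0 - x)
        seq[i] = x
    # Sort the chaotic values to obtain a key-dependent global permutation.
    perm = np.argsort(seq)
    scrambled = bits[perm]
    # Pack the scrambled bits back into an image-shaped array; the permutation
    # must be regenerated from the key for decryption.
    return np.packbits(scrambled).reshape(image.shape), perm
```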
Support vector machine (SVM) classifiers are widely applied to hyperspectral image (HSI) classification and provide significant advantages in terms of accuracy, simplicity, and robustness. The SVM is a well-known learning algorithm that maximizes the minimum margin. However, recent theoretical results have pointed out that maximizing the minimum margin leads to lower generalization performance than optimizing the margin distribution, and have proved that the margin distribution is more important. In this paper, a large margin distribution machine (LDM) is applied to HSI classification, and optimizing the margin distribution achieves better generalization performance than the SVM. Since the raw HSI feature space is not the most effective space for representing HSI, we adopt factor analysis to learn an effective HSI feature, and the learned features are further filtered by a structure-preserving filter to fully exploit the spatial structure information of the HSI; in this way, the spatial structure information is integrated into the feature learning process to obtain a better HSI feature. We then propose a multiclass LDM to classify the filtered HSI feature. Experimental results show that the proposed LDM with feature learning achieves the classification performance of state-of-the-art methods in terms of visual quality and three quantitative evaluations, indicating that the LDM has high generalization performance.
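The feature-learning side of the pipeline can be sketched as follows; a guided filter from opencv-contrib stands in for the structure-preserving filter, and the number of factors, guidance band, and filter parameters are illustrative assumptions.

```python
import cv2
import numpy as np
from sklearn.decomposition import FactorAnalysis

def learn_filtered_features(hsi_cube, n_components=20):
    H, W, B = hsi_cube.shape
    X = hsi_cube.reshape(-1, B).astype(np.float32)
    # Factor analysis maps the raw spectra to a more effective feature space.
    F = FactorAnalysis(n_components=n_components, random_state=0).fit_transform(X)
    F = F.reshape(H, W, n_components).astype(np.float32)
    # Structure-preserving filtering of every learned band, guided by the first band,
    # injects spatial structure information into the features.
    guide = F[..., 0].copy()
    for k in range(n_components):
        F[..., k] = cv2.ximgproc.guidedFilter(guide, F[..., k], 8, 0.01)
    return F.reshape(-1, n_components)   # per-pixel features for the multiclass LDM
```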
Motivated by the different strengths of synaptic connections between actual neurons, this paper proposes a heterogeneous pulse coupled neural network (HPCNN) algorithm to perform quantization on images. The HPCNN is developed from the traditional pulse coupled neural network (PCNN) model and assigns different parameters to different image regions. This allows pixels of different gray levels to be classified broadly into two categories: background regions and object regions. Moreover, the HPCNN is consistent with the characteristics of the human visual system. The parameters of the HPCNN model are calculated automatically according to these categories, so the quantized results are optimal and more suitable for human observation. Experimental results on natural images from a standard image library show the validity and efficiency of the proposed quantization method.
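The heterogeneous idea can be illustrated with the simplified PCNN iteration below, where pixels are split into object and background regions by a global threshold and each region receives its own linking strength; the constants and the region rule are illustrative simplifications, not the paper's automatic parameter estimation.

```python
import numpy as np
from scipy.ndimage import convolve

def hpcnn_quantize(image, n_iters=10):
    S = image.astype(np.float32) / 255.0            # normalized stimulus
    beta = np.where(S > S.mean(), 0.4, 0.1)         # object vs. background linking strength
    kernel = np.array([[0.5, 1.0, 0.5], [1.0, 0.0, 1.0], [0.5, 1.0, 0.5]])
    Y = np.zeros_like(S)                            # pulse output
    E = np.ones_like(S)                             # dynamic threshold
    fire_iter = np.zeros_like(S)                    # first firing epoch per pixel
    for t in range(1, n_iters + 1):
        L = convolve(Y, kernel, mode="constant")    # linking input from neighboring pulses
        U = S * (1.0 + beta * L)                    # internal activity
        Y = (U > E).astype(np.float32)              # fire when activity exceeds threshold
        fire_iter[(Y == 1) & (fire_iter == 0)] = t
        E = 0.7 * E + 20.0 * Y                      # threshold decays, then recharges after firing
    return fire_iter                                # firing epochs serve as quantized levels
```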
We address the problem of fusing multifocus images based on phase congruency (PC), which provides a sharpness feature of a natural image. The focus measure (FM) is defined as strong PC near distinctive image features, evaluated with the complex Gabor wavelet, and this PC-based FM is more robust against noise than other FMs. The fused image is obtained by a new fusion rule (FR), which selects the focused regions from the input images. Experimental results show that the proposed fusion scheme achieves the fusion performance of the state-of-the-art methods in terms of visual quality and quantitative evaluations.
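The focus measure and fusion rule can be sketched as below; the magnitude of complex Gabor responses is used here as a rough stand-in for the full phase-congruency computation, and the frequency, orientations, and consistency-window size are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import uniform_filter
from skimage.filters import gabor

def gabor_focus_measure(img, frequency=0.25, n_orient=4):
    img = img.astype(np.float64)
    fm = np.zeros_like(img)
    # Sum the complex Gabor response magnitudes over several orientations.
    for k in range(n_orient):
        real, imag = gabor(img, frequency=frequency, theta=k * np.pi / n_orient)
        fm += np.hypot(real, imag)
    return fm

def fuse_multifocus(img_a, img_b):
    fm_a, fm_b = gabor_focus_measure(img_a), gabor_focus_measure(img_b)
    # Fusion rule: take each pixel from the source with the larger focus measure,
    # with a local majority filter to keep the selected regions consistent.
    decision = uniform_filter((fm_a >= fm_b).astype(np.float64), size=15) > 0.5
    return np.where(decision, img_a, img_b)
```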