Vehicle color recognition is easily affected by subtle environmental changes, and existing recognition methods struggle to achieve accurate results under such conditions. A high-accuracy vehicle color recognition method for urban surveillance videos, based on a hierarchical fine-tuning strategy, is proposed. Unlike conventional convolutional neural network-based methods, which usually produce a single classification model, the proposed method combines pretraining with hierarchical fine-tuning to obtain several classification models that adapt to changes in illumination conditions. First, GoogLeNet is pretrained on the ILSVRC-2012 dataset to obtain the initial weight parameters of the network. In the first stage of fine-tuning, the whole vehicle color dataset is used to fine-tune the pretrained network, yielding the initial classification model. Then, an image quality assessment method is proposed to evaluate the illumination conditions of each image, and the whole vehicle color dataset is divided into several subdatasets according to the evaluation results. The second stage of fine-tuning is performed on the initial classification model using each subdataset, yielding a final classification model for each subdataset. Experimental results on different databases demonstrate that the proposed method achieves higher recognition accuracy than state-of-the-art methods.
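
The dataset split that precedes the second fine-tuning stage can be sketched as follows. This is a minimal illustration, not the authors' method: the abstract does not specify the image quality assessment metric, so mean luma (Rec. 601 weights) is assumed here as the illumination score, and the function names, thresholds, and three-way split are hypothetical.

```python
import numpy as np


def illumination_score(image_rgb: np.ndarray) -> float:
    """Assumed illumination score: mean luma in [0, 1] of an HxWx3 uint8 image.

    The source does not define its quality metric; this is a stand-in.
    """
    rgb = image_rgb.astype(np.float64) / 255.0
    luma = 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
    return float(luma.mean())


def illumination_bin(score: float, thresholds=(0.3, 0.7)) -> int:
    """Map a score to a subdataset index (hypothetical thresholds)."""
    return int(np.searchsorted(thresholds, score, side="right"))


def split_by_illumination(images, labels, thresholds=(0.3, 0.7)):
    """Group (image, label) pairs into one subdataset per illumination bin."""
    subsets = {i: [] for i in range(len(thresholds) + 1)}
    for image, label in zip(images, labels):
        index = illumination_bin(illumination_score(image), thresholds)
        subsets[index].append((image, label))
    return subsets
```

Each resulting subdataset would then drive its own second-stage fine-tuning run, starting from the initial classification model. Under the same assumption, `illumination_bin` could also select which fine-tuned model to apply to a new image.
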
In this paper, we propose a region-of-interest-based (RoI-adaptive) fusion algorithm for infrared and visible images
using the Laplacian pyramid method. First, we estimate the saliency map of the infrared image and then divide the infrared
image into two parts, the regions of interest (RoI) and the regions of non-interest (nRoI), by normalizing the saliency map.
The visible image is likewise segmented into two parts using a Gaussian high-pass filter: the regions of high frequency (RoH)
and the regions of low frequency (RoL). Second, we down-sample both the nRoI of the infrared image and the RoL of the
visible image as the input to the next pyramid level. Finally, we use the normalized saliency map of the infrared image as the
weighting coefficient to obtain the base image at the top level, and take the maximum gray value of the RoI of the infrared image and
the RoH of the visible image to obtain the detail image. In this way, our method preserves the target features of the infrared image and
the texture details of the visible image at the same time. Experimental results show that this fusion scheme performs
better than the other fusion algorithms in both visual assessment and quantitative metrics.
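
The two fusion rules described above (a saliency-weighted base image at the top level and a maximum-selection rule for the detail images) can be illustrated with a simplified sketch. It is not the RoI-adaptive algorithm itself: it uses a standard Laplacian pyramid, a 2x2 block average in place of Gaussian filtering, and the common max-absolute selection rule in place of the RoI/RoH split. All function names are hypothetical, and image sides are assumed to be divisible by 2 to the power of `levels`.

```python
import numpy as np


def downsample(img):
    # 2x2 block average; a stand-in for Gaussian blur plus decimation.
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def upsample(img):
    # Nearest-neighbor expansion by a factor of 2 in each direction.
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)


def laplacian_pyramid(img, levels):
    """Return (detail images from fine to coarse, top-level base image)."""
    details = []
    current = img.astype(np.float64)
    for _ in range(levels):
        low = downsample(current)
        details.append(current - upsample(low))
        current = low
    return details, current


def reconstruct(details, base):
    current = base
    for detail in reversed(details):
        current = upsample(current) + detail
    return current


def normalize(x):
    span = x.max() - x.min()
    if span == 0:
        return np.zeros_like(x, dtype=np.float64)
    return (x - x.min()) / span


def fuse(infrared, visible, saliency, levels=2):
    ir_details, ir_base = laplacian_pyramid(infrared, levels)
    vis_details, vis_base = laplacian_pyramid(visible, levels)
    # Bring the normalized saliency map down to the top-level resolution.
    weight = normalize(saliency.astype(np.float64))
    for _ in range(levels):
        weight = downsample(weight)
    # Base image: saliency-weighted average of the two top-level images.
    base = weight * ir_base + (1.0 - weight) * vis_base
    # Detail images: keep the coefficient with the larger magnitude (assumed rule).
    details = [
        np.where(np.abs(a) >= np.abs(b), a, b)
        for a, b in zip(ir_details, vis_details)
    ]
    return reconstruct(details, base)
```

With this decomposition, `reconstruct(*laplacian_pyramid(img, levels))` returns the input image up to floating-point error, so any change in the fused output comes only from the two fusion rules.
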
Accurate and fast detection of small infrared targets is very important for infrared precision guidance, early
warning, video surveillance, etc. Based on the human visual attention mechanism, an automatic detection algorithm for
small infrared targets is presented. Instead of searching for infrared targets directly, we model the regular patches that do
not attract much attention from our visual system. This is inspired by the property that regular patches in the spatial domain
turn out to correspond to the spikes in the amplitude spectrum. Unlike recent approaches using global spectral filtering,
we define the concept of local maxima suppression using local spectral filtering to smooth the spikes in the amplitude
spectrum, thereby making the infrared targets pop out. In the proposed method, we first compute the
amplitude spectrum of an input infrared image. Second, we find the local maxima of the amplitude spectrum using a cubic
facet model. Third, we suppress the local maxima by convolving the local spectrum with a low-pass Gaussian
kernel of an appropriate scale. Finally, the detection result in the spatial domain is obtained by reconstructing the 2D signal
from the original phase and the log amplitude spectrum with its local maxima suppressed. The experiments are performed
on real-life IR images, and the results show that the proposed method achieves satisfactory detection effectiveness and
robustness. It is also computationally efficient and can be further used for real-time detection and tracking.
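
The four steps above can be sketched in NumPy as follows. This is a rough illustration under stated assumptions rather than the published algorithm: a strict 8-neighbor comparison stands in for the cubic facet model, the Gaussian smoothing is applied circularly to the log amplitude spectrum, and the function names and the default `sigma` are hypothetical.

```python
import numpy as np


def gaussian_kernel1d(sigma, radius):
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    return kernel / kernel.sum()


def smooth_circular(a, sigma):
    """Separable Gaussian smoothing with wrap-around borders."""
    radius = int(3 * sigma + 0.5)
    kernel = gaussian_kernel1d(sigma, radius)
    rows = np.zeros_like(a)
    for i, weight in enumerate(kernel):
        rows += weight * np.roll(a, i - radius, axis=0)
    out = np.zeros_like(a)
    for i, weight in enumerate(kernel):
        out += weight * np.roll(rows, i - radius, axis=1)
    return out


def local_maxima(a):
    """Strict 8-neighbor local maxima (stand-in for the cubic facet model)."""
    mask = np.ones(a.shape, dtype=bool)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy or dx:
                mask &= a > np.roll(a, (dy, dx), axis=(0, 1))
    return mask


def detect(image, sigma=1.5):
    # Step 1: amplitude (in log form) and phase of the input image.
    spectrum = np.fft.fft2(image.astype(np.float64))
    log_amplitude = np.log(np.abs(spectrum) + 1e-12)
    phase = np.angle(spectrum)
    # Step 2: locate the spikes of the amplitude spectrum.
    peaks = local_maxima(log_amplitude)
    # Step 3: replace each spike with its locally smoothed value.
    smoothed = smooth_circular(log_amplitude, sigma)
    suppressed = np.where(peaks, smoothed, log_amplitude)
    # Step 4: reconstruct with the original phase; return the energy map.
    recon = np.fft.ifft2(np.exp(suppressed + 1j * phase))
    return np.abs(recon) ** 2
```

On a synthetic image made of a strong sinusoidal background plus one faint pixel, the sinusoid shows up as two spikes in the spectrum. Suppressing them leaves the faint pixel as the brightest location in the returned map, even though it is not the brightest pixel in the input.
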