This PDF file contains the front matter associated with SPIE Proceedings Volume 10608, including the Title Page, Copyright information, Table of Contents, Introduction, and Conference Committee listing.
VIBE algorithm is one of the effective methods that based on dynamic model, which can deal with the detection of moving objects in the slowly changing background. The traditional VIBE uses a fixed threshold to realize target cutting ,that would result in a poor detection when the target enters a background area whose pixel value is not much different from its own pixel value.Thus,we propose a method of moving target detection based on dynamic threshold in this paper.
The purpose of visual tracking is to associate the target object in a continuous video frame. In recent years, the method based on the kernel correlation filter has become the research hotspot. However, the algorithm still has some problems such as video capture equipment fast jitter, tracking scale transformation. In order to improve the ability of scale transformation and feature description, this paper has carried an innovative algorithm based on the multi feature fusion and multi-scale transform. The experimental results show that our method solves the problem that the target model update when is blocked or its scale transforms. The accuracy of the evaluation (OPE) is 77.0%, 75.4% and the success rate is 69.7%, 66.4% on the VOT and OTB datasets. Compared with the optimal one of the existing target-based tracking algorithms, the accuracy of the algorithm is improved by 6.7% and 6.3% respectively. The success rates are improved by 13.7% and 14.2% respectively.
Infrared maritime target detection using a probabilistic single Gaussian model of sea clutter in Fourier domain
For ship targets detection in cluttered infrared image sequences, a robust detection method, based on the probabilistic single Gaussian model of sea background in Fourier domain, is put forward. The amplitude spectrum sequences at each frequency point of the pure seawater images in Fourier domain, being more stable than the gray value sequences of each background pixel in the spatial domain, are regarded as a Gaussian model. Next, a probability weighted matrix is built based on the stability of the pure seawater’s total energy spectrum in the row direction, to make the Gaussian model more accurate. Then, the foreground frequency points are separated from the background frequency points by the model. Finally, the false-alarm points are removed utilizing ships’ shape features. The performance of the proposed method is tested by visual and quantitative comparisons with others.
Ship detection from optical images taken by high-altitude aircrafts such as unmanned long-endurance airships and unmanned aerial vehicles has broad applications in marine fishery management, ship monitoring and vessel salvage. However, the major challenge is the limited capability of information processing on unmanned high-altitude platforms. Furthermore, in order to guarantee the wide detection range, unmanned aircrafts generally cruise at high altitudes, resulting in imagery with low-resolution targets and strong clutters suffered by heavy clouds. In this paper, we propose a low-resolution ship detection method to extract ships from these high-altitude optical images. Inspired by a recent research on visual saliency detection indicating that small salient signals could be well detected by a gradient enhancement operation combined with Gaussian smoothing, we propose the facet kernel filtering to rapidly suppress cluttered backgrounds and delineate candidate target regions from the sea surface. Then, the principal component analysis (PCA) is used to compute the orientation of the target axis, followed by a simplified histogram of oriented gradient (HOG) descriptor to characterize the ship shape property. Finally, support vector machine (SVM) is applied to discriminate real targets and false alarms. Experimental results show that the proposed method actually has high efficiency in low-resolution ship detection.
In order to realize long-term on-orbit running of satellites, space stations, etc spacecrafts, in addition to the long life design of devices, The life of the spacecraft can also be extended by the on-orbit servicing and maintenance. Therefore, it is necessary to keep precise and detailed maintenance of key components. In this paper, a high-precision relative position and attitude measurement method used in the maintenance of key components is given. This method mainly considers the design of the passive cooperative marker, light-emitting device and high resolution camera in the presence of spatial stray light and noise. By using a series of algorithms, such as background elimination, feature extraction, position and attitude calculation, and so on, the high precision relative pose parameters as the input to the control system between key operation parts and maintenance equipment are obtained. The simulation results show that the algorithm is accurate and effective, satisfying the requirements of the precision operation technique.
This paper studies the initial orbit determination (IOD) based on space-based angle measurement. Commonly, these space-based observations have short durations. As a result, classical initial orbit determination algorithms give poor results, such as Laplace methods and Gauss methods. In this paper, an advanced analysis method of initial orbit determination is developed for space-based observations. The admissible region and triangulation are introduced in the method. Genetic algorithm is also used for adding some constraints of parameters. Simulation results show that the algorithm can successfully complete the initial orbit determination.
Since the resolution of remote sensing infrared images is low, the features of ship targets become unstable. The issue of how to recognize ships with fuzzy features is an open problem. In this paper, we propose a novel ship target recognition algorithm based on Gaussian mixture models (GMMs). In the proposed algorithm, there are mainly two steps. At the first step, the Hu moments of these ship target images are calculated, and the GMMs are trained on the moment features of ships. At the second step, the moment feature of each ship image is assigned to the trained GMMs for recognition. Because of the scale, rotation, translation invariance property of Hu moments and the power feature-space description ability of GMMs, the GMMs-based ship target recognition algorithm can recognize ship reliably. Experimental results of a large simulating image set show that our approach is effective in distinguishing different ship types, and obtains a satisfactory ship recognition performance.
Dim target detection is one key problem in digital image processing field. With development of multi-spectrum imaging sensor, it becomes a trend to improve the performance of dim target detection by fusing the information from different spectral images. In this paper, one dim target detection method based on salient graph fusion was proposed. In the method, Gabor filter with multi-direction and contrast filter with multi-scale were combined to construct salient graph from digital image. And then, the maximum salience fusion strategy was designed to fuse the salient graph from different spectral images. Top-hat filter was used to detect dim target from the fusion salient graph. Experimental results show that proposal method improved the probability of target detection and reduced the probability of false alarm on clutter background images.
Image location methods are the key technologies of visual navigation, most previous image location methods simply assume the ideal inputs without taking into account the real-world degradations (e.g. low resolution and blur). In view of such degradations, the conventional image location methods first perform image restoration and then match the restored image on the reference image. However, the defective output of the image restoration can affect the result of localization, by dealing with the restoration and location separately. In this paper, we present a joint image restoration and location (JRL) method, which utilizes the sparse representation prior to handle the challenging problem of low-quality image location. The sparse representation prior states that the degraded input image, if correctly restored, will have a good sparse representation in terms of the dictionary constructed from the reference image. By iteratively solving the image restoration in pursuit of the sparest representation, our method can achieve simultaneous restoration and location. Based on such a sparse representation prior, we demonstrate that the image restoration task and the location task can benefit greatly from each other. Extensive experiments on real scene images with Gaussian blur are carried out and our joint model outperforms the conventional methods of treating the two tasks independently.
Early detection of goals and high-precision of target tracking is two important performance indicators which need to be balanced in actual target search tracking system. This paper proposed a target tracking system with preliminary and precise two - stage compound. This system using a large field of view to achieve the target search. After the target was searched and confirmed, switch into a small field of view for two field of view target tracking. In this system, an appropriate filed switching strategy is the key to achieve tracking. At the same time, two groups PID parameters are add into the system to reduce tracking error. This combination way with preliminary and precise two-stage compound can extend the scope of the target and improve the target tracking accuracy and this method has practical value.
Discriminative correlation filter (DCF) based trackers have recently achieved excellent performance with great computational efficiency. However, DCF based trackers suffer boundary effects, which result in the unstable performance in challenging situations exhibiting fast motion. In this paper, we propose a novel method to mitigate this side-effect in DCF based trackers. We change the search area according to the prediction of target motion. When the object moves fast, broad search area could alleviate boundary effects and reserve the probability of locating object. When the object moves slowly, narrow search area could prevent effect of useless background information and improve computational efficiency to attain real-time performance. This strategy can impressively soothe boundary effects in situations exhibiting fast motion and motion blur, and it can be used in almost all DCF based trackers. The experiments on OTB benchmark show that the proposed framework improves the performance compared with the baseline trackers.
Image prominence is the most important region in an image, which can cause the visual attention and response of human beings. Preferentially allocating the computer resources for the image analysis and synthesis by the significant region is of great significance to improve the image area detecting. As a preprocessing of other disciplines in image processing field, the image prominence has widely applications in image retrieval and image segmentation. Among these applications, the super-pixel segmentation significance detection algorithm based on linear spectral clustering (LSC) has achieved good results. The significance detection algorithm proposed in this paper is better than the regional contrast ratio by replacing the method of regional formation in the latter with the linear spectral clustering image is super-pixel block. After combining with the latest depth learning method, the accuracy of the significant region detecting has a great promotion. At last, the superiority and feasibility of the super-pixel segmentation detection algorithm based on linear spectral clustering are proved by the comparative test.
Discriminative correlation filters (DCF) have recently shown excellent performance in visual object tracking area. In this paper we summarize the methods of updating model filter from discriminative correlation filter (DCF) based tracking algorithms and analyzes similarities and differences among these methods. We deduce the relationship among updating coefficient in high dimension (kernel trick), updating filter in frequency domain and updating filter in spatial domain, and analyze the difference among these different ways. We also analyze the difference between the updating filter directly and updating filter’s numerator (object response power) with updating filter’s denominator (filter’s power). The experiments about comparing different updating methods and visualizing the template filters are used to prove our derivation.
This paper presents a street rubbish detection algorithm based on image registration with Sift feature and RCNN. Firstly, obtain the rubbish region proposal on the real-time street image and set up the CNN convolution neural network trained by the rubbish samples set consists of rubbish and non-rubbish images; Secondly, for every clean street image, obtain the Sift feature and do image registration with the real-time street image to obtain the differential image, the differential image filters a lot of background information, obtain the rubbish region proposal rect where the rubbish may appear on the differential image by the selective search algorithm. Then, the CNN model is used to detect the image pixel data in each of the region proposal on the real-time street image. According to the output vector of the CNN, it is judged whether the rubbish is in the region proposal or not. If it is rubbish, the region proposal on the real-time street image is marked. This algorithm avoids the large number of false detection caused by the detection on the whole image because the CNN is used to identify the image only in the region proposal on the real-time street image that may appear rubbish. Different from the traditional object detection algorithm based on the region proposal, the region proposal is obtained on the differential image not whole real-time street image, and the number of the invalid region proposal is greatly reduced. The algorithm has the high mean average precision (mAP).
Complex background suppression using global-local registration strategy for the detection of small-moving target on moving platform
Target detection is a very important and basic problem of computer vision and image processing. The most often case we meet in real world is a detection task for a moving-small target on moving platform. The commonly used methods, such as Registration-based suppression, can hardly achieve a desired result. To crack this hard nut, we introduce a Global-local registration based suppression method. Differ from the traditional ones, the proposed Global-local Registration Strategy consider both the global consistency and the local diversity of the background, obtain a better performance than normal background suppression methods. In this paper, we first discussed the features about the small-moving target detection on unstable platform. Then we introduced a new strategy and conducted an experiment to confirm its noisy stability. In the end, we confirmed the background suppression method based on global-local registration strategy has a better perform in moving target detection on moving platform.
The detection and tracking of moving dim target in infrared image have been an research hotspot for many years. The target in each frame of images only occupies several pixels without any shape and structure information. Moreover, infrared small target is often submerged in complicated background with low signal-to-clutter ratio, making the detection very difficult. Different backgrounds exhibit different statistical properties, making it becomes extremely complex to detect the target. If the threshold segmentation is not reasonable, there may be more noise points in the final detection, which is unfavorable for the detection of the trajectory of the target. Single-frame target detection may not be able to obtain the desired target and cause high false alarm rate. We believe the combination of suspicious target detection spatially in each frame and temporal association for target tracking will increase reliability of tracking dim target. The detection of dim target is mainly divided into two parts, In the first part, we adopt bilateral filtering method in background suppression, after the threshold segmentation, the suspicious target in each frame are extracted, then we use LSTM(long short term memory) neural network to predict coordinates of target of the next frame. It is a brand-new method base on the movement characteristic of the target in sequence images which could respond to the changes in the relationship between past and future values of the values. Simulation results demonstrate proposed algorithm can effectively predict the trajectory of the moving small target and work efficiently and robustly with low false alarm.
To overcome the problem of extracting line segment from an image, a method of line segment detection was proposed based on the graph search algorithm. After obtaining the edge detection result of the image, the candidate straight line segments are obtained in four directions. For the candidate straight line segments, their adjacency relationships are depicted by a graph model, based on which the depth-first search algorithm is employed to determine how many adjacent line segments need to be merged. Finally we use the least squares method to fit the detected straight lines. The comparative experimental results verify that the proposed algorithm has achieved better results than the line segment detector (LSD).
Background modeling is the critical technology to detect the moving target for video surveillance. Most background modeling techniques are aimed at land monitoring and operated in the spatial domain. A background establishment becomes difficult when the scene is a complex fluctuating sea surface. In this paper, the background stability and separability between target are analyzed deeply in the discrete cosine transform (DCT) domain, on this basis, we propose a background modeling method. The proposed method models each frequency point as a single Gaussian model to represent background, and the target is extracted by suppressing the background coefficients. Experimental results show that our approach can establish an accurate background model for seawater, and the detection results outperform other background modeling methods in the spatial domain.
In this paper, two main algorithms about monocular visual odometry is introduced based on 3D-2D motion estimation. A 3D-2D motion estimation method needs to maintain a consistent and accurate set of triangulated 3D features and to create 3D-2D feature matches. Therefore, a keyframe selection strategy is proposed to construct the precise 3D point sets. Based on this strategy, an algorithm is designed to get more proper keyframes by restricting the number of feature points and taking translation amount into account. This keyframe selection strategy will discard inferior frames and construct more precise 3D point sets. We also designed a method to filter 3D-2D feature matches in two different ways. This method contributes to estimating camera pose more accurately. The effectiveness and feasibility of the proposed algorithms were verified in both KITTI outdoor dataset and a real indoor environment. The result of experiment showed that our algorithms can recover the motion trajectory of the camera accurately. And it meet the requirements of real-time and accuracy in monocular visual odometry.
The discriminant cut is used to segment the oil spills in synthetic aperture radar (SAR) images. The proposed approach is a region-based one, which is able to capture and utilize spatial information in SAR images. The real SAR images, i.e. ALOS-1 PALSAR and Sentinel-1 SAR images were collected and used to validate the accuracy of the proposed approach for oil spill segmentation in SAR images. The accuracy of the proposed approach is higher than that of the fuzzy C-means classification method.
With the development of sensors and computer vision research community, cameras, which are accurate, compact, wellunderstood and most importantly cheap and ubiquitous today, have gradually been at the center of robot location. Simultaneous localization and mapping (SLAM) using visual features, which is a system getting motion information from image acquisition equipment and rebuild the structure in unknown environment. We provide an analysis of bioinspired flights in insects, employing a novel technique based on SLAM. Then combining visual and inertial measurements to get high accuracy and robustness. we present a novel tightly-coupled Visual-Inertial Simultaneous Localization and Mapping system which get a new attempt to address two challenges which are the initialization problem and the calibration problem. experimental results and analysis show the proposed approach has a more accurate quantitative simulation of insect navigation, which can reach the positioning accuracy of centimeter level.
Subject to the complex battlefield environment, it is difficult to establish a complete knowledge base in practical application of vehicle recognition algorithms. The infrared vehicle recognition is always difficult and challenging, which plays an important role in remote sensing. In this paper we propose a new unsupervised feature learning method based on K-feature to recognize vehicle in infrared images. First, we use the target detection algorithm which is based on the saliency to detect the initial image. Then, the unsupervised feature learning based on K-feature, which is generated by Kmeans clustering algorithm that extracted features by learning a visual dictionary from a large number of samples without label, is calculated to suppress the false alarm and improve the accuracy. Finally, the vehicle target recognition image is finished by some post-processing. Large numbers of experiments demonstrate that the proposed method has satisfy recognition effectiveness and robustness for vehicle recognition in infrared images under complex backgrounds, and it also improve the reliability of it.
When dealing with high-resolution digital images, detection of feature points is usually the very first important step. Valid feature points depend on the threshold. If the threshold is too low, plenty of feature points will be detected, and they may be aggregated in the rich texture regions, which consequently not only affects the speed of feature description, but also aggravates the burden of following processing; if the threshold is set high, the feature points in poor texture area will lack. To solve these problems, this paper proposes a threshold auto-adjustment method of feature extraction based on grid. By dividing the image into numbers of grid, threshold is set in every local grid for extracting the feature points. When the number of feature points does not meet the threshold requirement, the threshold will be adjusted automatically to change the final number of feature points The experimental results show that feature points produced by our method is more uniform and representative, which avoids the aggregation of feature points and greatly reduces the complexity of following work.
In this paper, we proposed a Highly Coupled Network (HCNet) for joint objection detection and semantic segmentation. It follows that our method is faster and performs better than the previous approaches whose decoder networks of different tasks are independent. Besides, we present multi-scale loss architecture to learn better representation for different scale objects, but without extra time in the inference phase. Experiment results show that our method achieves state-of-the-art results on the KITTI datasets. Moreover, it can run at 35 FPS on a GPU and thus is a practical solution to object detection and semantic segmentation for autonomous driving.
Multi-spectral image contains abundant spectral information, which is widely used in all fields like resource exploration, meteorological observation and modern military. Image preprocessing, such as image feature extraction and matching, is indispensable while dealing with multi-spectral remote sensing image. Although the feature matching algorithm based on linear scale such as SIFT and SURF performs strong on robustness, the local accuracy cannot be guaranteed. Therefore, this paper proposes an improved KAZE algorithm, which is based on nonlinear scale, to raise the number of feature and to enhance the matching rate by using the adjusted-cosine vector. The experiment result shows that the number of feature and the matching rate of the improved KAZE are remarkably than the original KAZE algorithm.
Improved CORF model of simple cell combined with non-classical receptive field and its application on edge detection
Simple cells in primary visual cortex are believed to extract local edge information from a visual scene. In this paper, inspired by different receptive field properties and visual information flow paths of neurons, an improved Combination of Receptive Fields (CORF) model combined with non-classical receptive fields was proposed to simulate the responses of simple cell’s receptive fields. Compared to the classical model, the proposed model is able to better imitate simple cell’s physiologic structure with consideration of facilitation and suppression of non-classical receptive fields. And on this base, an edge detection algorithm as an application of the improved CORF model was proposed. Experimental results validate the robustness of the proposed algorithm to noise and background interference.
Precise parallax detection through definition evaluation and the adjustment of the assembly position of the objective lens or the reticle are important means of eliminating the parallax of the telescope system, so that the imaging screen and the reticle are clearly focused at the same time. An adaptive definition evaluation function based on Susan-Zernike moments is proposed. First, the image is preprocessed by the Susan operator to find the potential boundary edge. Then, the Zernike moments operator is used to determine the exact region of the reticle line with sub-pixel accuracy. The image definition is evaluated only in this related area. The evaluation function consists of the gradient difference calculated by the Zernike moments operator. By adjusting the assembly position of the objective lens, the imaging screen and the reticle will be simultaneously in the state of maximum definition, so the parallax can be eliminated. The experimental results show that the definition evaluation function proposed in this paper has the advantages of good focusing performance, strong anti-interference ability compared with the other commonly used definition evaluation functions.