A modular neural network classifier has been applied to the problem of automatic target recognition using forward-looking infrared (FLIR) imagery. The classifier consists of several independently trained neural networks, each operating on features extracted from a local portion of a target image. The classification decisions of the individual networks are combined to determine the final classification. Experiments show that decomposition of the input features results in performance superior to a fully connected network in terms of both network complexity and probability of correct classification. Performance of the classifier is further improved by the use of multi-resolution features and by the introduction of a higher-level neural network on top of the expert networks, a method known as stacked generalization. In addition to feature decomposition, we implemented a data decomposition classifier network and demonstrated improved performance. Experimental results are reported on a large set of FLIR images.
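The decision-combination step described above can be illustrated as a weighted fusion of expert outputs. The sketch below is a minimal illustration only, not the authors' trained networks: the probability values are made up, and uniform weights stand in for the higher-level network that stacked generalization would learn.

```python
import numpy as np

def combine_experts(expert_probs, weights=None):
    """Fuse the class-probability outputs of several expert networks.

    expert_probs: (n_experts, n_classes) array, one row per expert.
    weights: optional per-expert weights; in stacked generalization these
    would be produced by a higher-level network. Here they default to a
    uniform average (plain decision-level fusion).
    """
    expert_probs = np.asarray(expert_probs, dtype=float)
    if weights is None:
        weights = np.full(len(expert_probs), 1.0 / len(expert_probs))
    fused = weights @ expert_probs      # weighted sum of expert outputs
    return int(np.argmax(fused)), fused

# Three experts, each trained on a different feature subset,
# voting over two target classes (hypothetical outputs).
label, fused = combine_experts([[0.6, 0.4], [0.7, 0.3], [0.3, 0.7]])
```

Replacing the uniform `weights` with values learned from the experts' outputs on held-out data is what turns this simple average into stacked generalization.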
This paper describes a neural network based target detection system for Forward-Looking Infrared (FLIR) imagery. We apply a series of four algorithms (detection, two stages of clutter rejection, and centering) to successively reduce the false alarm rate while maintaining a high probability of detection (Pd). The detection stage scans the entire image to find regions approximately the size of a target with pixel statistics that differ from their local background. The clutter rejection stages eliminate a portion of these detections, while the centering algorithm moves each detection to the nearby point that most resembles prior examples of perfectly centered targets. The system was trained and tested on a large set of second-generation FLIR data.
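A toy version of such a cascade, detection by local pixel statistics, contrast-based clutter rejection, then centering on the local peak, might look like the following. The window size, thresholds, and stage logic are illustrative assumptions; the paper's stages are trained neural networks, not these hand-set rules.

```python
import numpy as np

def detect(image, win=8, k=2.0):
    """Detection stage: flag target-sized windows whose mean intensity
    differs from the global background by more than k std deviations."""
    mu, sigma = image.mean(), image.std() + 1e-9
    return [(r, c)
            for r in range(0, image.shape[0] - win + 1, win)
            for c in range(0, image.shape[1] - win + 1, win)
            if abs(image[r:r + win, c:c + win].mean() - mu) > k * sigma]

def reject_clutter(image, hits, win=8, min_contrast=5.0):
    """Clutter-rejection stage: keep only detections with enough
    internal contrast (a stand-in for the learned rejection networks)."""
    return [(r, c) for (r, c) in hits
            if np.ptp(image[r:r + win, c:c + win]) >= min_contrast]

def center(image, hits, win=8):
    """Centering stage: snap each detection to the brightest nearby pixel."""
    out = []
    for r, c in hits:
        patch = image[r:r + win, c:c + win]
        dr, dc = np.unravel_index(int(np.argmax(patch)), patch.shape)
        out.append((r + int(dr), c + int(dc)))
    return out

# A flat synthetic scene with one bright, target-sized blob.
scene = np.zeros((32, 32))
scene[8:16, 8:16] = 10.0
scene[10, 11] = 20.0               # hottest point inside the blob
hits = center(scene, reject_clutter(scene, detect(scene)))
```

Chaining the stages this way mirrors the paper's design goal: each stage only has to remove false alarms that survived the previous one.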
Detecting objects in images containing strong clutter is an important issue in a variety of applications such as medical imaging and automatic target recognition. Artificial neural networks are used as non-parametric pattern recognizers for such problems because of their inherent ability to learn from training data. In this paper we propose a neural approach based on the Random Neural Network model (Gelenbe 1989, 1990, 1991, 1993) to detect shaped targets using multiple neural networks whose outputs are combined to make decisions.
There are two basic unknowns in ATR: (1) the target types, and (2) the parametric representations of their occurrence (such as pose, location, and thermal profile). This paper addresses the question of what metrics can be used (1) for optimization in parameter space, and (2) for analyzing target recognition performance.
Enhancing image quality and combining observations into a coherent description are essential tools in various image processing applications such as multimedia publishing, target recognition, and medical imaging. In this paper we propose two novel approaches for image enlargement and image fusion using the Random Neural Network (RNN) model, which has already been applied successfully to problems such as still and moving image compression and image segmentation. The advantage of the RNN model is that it is closer to biophysical reality and mathematically more tractable than standard neural methods, especially when used as a recurrent structure.
A collection of related N by M images, such as a set of faces, may be modeled by a manifold embedded in an NM-dimensional Euclidean space, called an image manifold. By modeling image spaces as manifolds, geometrical properties of image manifolds can be studied either theoretically or experimentally. A practical result of this investigation is insight into image source entropy (i.e., image compressibility), a subject about which, oddly, little is known. The investigation begins with the most basic properties of a manifold: its dimension and its curvature. The study of dimensionality reveals a high embedding ratio, which holds promise of very high compression rates. The curvature of image manifolds is shown to be large, indicating that traditional linear transform techniques may not fulfill this promise.
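A minimal illustration of the dimensionality question: count how many principal components are needed to retain most of the variance of a point cloud. This is only a crude linear proxy (the abstract's point is precisely that image manifolds are curved, so a real estimate needs local or nonlinear methods); the data below are synthetic.

```python
import numpy as np

def pca_dimension(points, var_threshold=0.95):
    """Number of principal components needed to keep var_threshold of
    the total variance: a crude linear estimate of embedding dimension."""
    X = points - points.mean(axis=0)
    s = np.linalg.svd(X, compute_uv=False)       # singular values
    ratios = np.cumsum(s**2) / np.sum(s**2)      # cumulative variance
    return int(np.searchsorted(ratios, var_threshold) + 1)

# A 2-D plane embedded in 5-D space should report dimension 2.
rng = np.random.default_rng(0)
coords = rng.normal(size=(200, 2))               # intrinsic coordinates
basis = np.array([[1., 0., 0., 0., 0.],
                  [0., 1., 0., 0., 0.]])
cloud = coords @ basis
```

For a flat manifold this recovers the intrinsic dimension exactly; for a curved image manifold it over-counts, which is the gap between embedding ratio and achievable compression that the abstract discusses.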
A number of novel adaptive image compression methods have been developed using a new approach to data representation, a mixture of principal components (MPC). MPC, together with principal component analysis and vector quantization, forms a spectrum of representations. The MPC network partitions the space into a number of regions or subspaces; within each subspace the data are represented by the M principal components of that subspace. While Hebbian learning has been used effectively to extract principal components for the MPC, its stability remains a concern in practice. As a result, computationally more expensive methods such as batch eigendecomposition have produced more consistent results. This paper compares the performance of a number of Hebbian-based training schemes for the MPC network, including training the entire network, network growing techniques, and a new tree-structured method. In the tree-structured approach, each level M in the tree corresponds to an M-dimensional representation: a node and all its M - 1 parents represent a single M-dimensional subspace or class. The evaluation shows that the tree-structured approach improves training and reduces squared error.
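The MPC representation itself (partition the space, then keep the top M principal components per region) can be sketched with plain k-means and batch SVD. This shows the representation, not the Hebbian or tree-structured training schemes the paper compares; the clustering method and toy data are assumptions.

```python
import numpy as np

def kmeans(X, k, iters=20):
    """Plain Lloyd's k-means with deterministic farthest-point seeding."""
    centers = [X[0]]
    for _ in range(k - 1):
        d = np.min([((X - c) ** 2).sum(-1) for c in centers], axis=0)
        centers.append(X[int(np.argmax(d))])
    centers = np.array(centers)
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

def mpc_fit(X, k=2, m=1):
    """Partition the data into k regions; keep the top-m principal
    components of each region (the MPC representation)."""
    labels, centers = kmeans(X, k)
    bases = []
    for j in range(k):
        _, _, vt = np.linalg.svd(X[labels == j] - centers[j],
                                 full_matrices=False)
        bases.append(vt[:m])
    return centers, bases

def mpc_project(x, centers, bases):
    """Encode/decode x in the principal subspace of its nearest region."""
    j = int(np.argmin(((centers - x) ** 2).sum(-1)))
    return centers[j] + (bases[j] @ (x - centers[j])) @ bases[j]

# Two 1-D "subspaces": a horizontal and a vertical segment.
t = np.linspace(-1.0, 1.0, 50)
X = np.vstack([np.stack([t, 0 * t], 1),          # region A along x-axis
               np.stack([0 * t, t], 1) + 10.0])  # region B along y-axis
centers, bases = mpc_fit(X, k=2, m=1)
recon_a = mpc_project(np.array([0.5, 0.0]), centers, bases)
recon_b = mpc_project(np.array([10.0, 10.4]), centers, bases)
```

A single global PCA with m = 1 could not represent both segments; the per-region bases reconstruct points from each segment with essentially zero error, which is the motivation for the mixture.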
In this research, we introduce an artificial neural network model, the coupled lattice neural network, to reconstruct an original image from a degraded one by blind deconvolution, where neither the original image nor the blurring function is known. In the coupled lattice neural network, each neuron connects with its nearest neighbor neurons; the neighborhood corresponds to the weights of the neural network and is defined over a finite domain. The outputs of the neurons represent the intensity distribution of the estimated original image, while the weights of each neuron correspond to the estimated blur function and are shared across neurons. The coupled lattice neural network involves two main operations: a nearest-neighbor coupling or diffusion, and a local nonlinear reflection and learning. First, a rule for growing the blur function is introduced. The coupled lattice neural network then evolves the estimated original image based on the estimated blur function. Moreover, we define a growing error criterion to control the evolution of the network. When the error criterion is minimized, the network becomes stable: its outputs correspond to the reconstructed original image, and its weights to the blur function. In addition, we demonstrate a method for choosing the initial state variables of the coupled lattice neural network. The new approach to blind deconvolution can successfully recover a digital binary image, and the coupled lattice neural network can also be used to reconstruct gray-scale images.
Multi-valued neurons are neural processing elements with complex-valued weights, high functionality (a single neuron can implement an arbitrary mapping described by a partially defined multiple-valued function), and quickly converging learning algorithms. These features make multi-valued neurons applicable to many different kinds of problems. This paper considers a special kind of neural network with multi-valued neurons for image recognition; such a network analyzes the spectral coefficients corresponding to low frequencies. A quickly converging learning algorithm and an example of face recognition are also presented. A further application of multi-valued neurons proposed in this paper is their use as basic elements of a cellular neural network. This approach makes it possible to implement highly effective nonlinear multi-valued filters, which are very effective for reducing Gaussian, uniform, and speckle noise, and for frequency correction. Correcting the high and medium spatial frequencies with multi-valued filters leads to highly effective extraction of details and local contrast enhancement. Finally, two methods for super-resolution are proposed: prediction of the high-frequency coefficients with a multi-valued neuron, and correction of the high-frequency part of the spectrum by multi-valued filtering.
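The defining element here is the k-valued activation: the argument of the complex weighted sum is mapped onto the nearest of k equal sectors of the unit circle, and the neuron outputs the corresponding k-th root of unity. A minimal single-neuron sketch (no learning algorithm, toy weights):

```python
import numpy as np

def mvn_activation(z, k):
    """k-valued activation: if arg(z) falls in sector j of k equal
    sectors of the unit circle, output the root of unity exp(i*2*pi*j/k)."""
    sector = int(np.floor((np.angle(z) % (2 * np.pi)) / (2 * np.pi / k)))
    return np.exp(2j * np.pi * sector / k)

def mvn_neuron(inputs, weights, k):
    """Single multi-valued neuron: complex weighted sum, then activation."""
    return mvn_activation(np.dot(weights, inputs), k)

# With k = 4, a weighted sum with argument ~2 rad lies in the second
# sector, so the neuron outputs the root of unity i.
out = mvn_neuron(np.array([np.exp(1j * 2.0)]), np.array([1.0 + 0j]), k=4)
```

Because the output depends only on the argument of the weighted sum, a single neuron can realize a k-valued function of its inputs, which is the "high functionality" the abstract refers to.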
Neural networks have been applied to many kinds of image processing with good performance. For large images, however, a large number of neurons is required, which (1) makes the model more complex and (2) makes processing slower than traditional methods because of the heavy computational load. In this paper, an encoder-segmented neural network is constructed for image segmentation: the input data are compressed by an encoder network whose weight matrix retains the maximum region information, and a fuzzy clustering strategy applied to a Hopfield neural network performs the fine segmentation, eliminating the tedious work of finding weighting factors. The experimental results indicate that the performance of image segmentation can be improved effectively.
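The fuzzy clustering component can be illustrated with a tiny two-cluster fuzzy c-means on pixel intensities. This is a generic sketch, not the paper's Hopfield formulation; intensities are assumed normalized to [0, 1] so the memberships can be seeded deterministically from the data.

```python
import numpy as np

def fuzzy_cmeans(x, m=2.0, iters=50):
    """Two-cluster fuzzy c-means on 1-D intensities in [0, 1].

    Alternates the standard FCM centre and membership updates and
    returns the soft memberships U (n x 2) and the cluster centres."""
    x = np.asarray(x, dtype=float)
    U = np.column_stack([x, 1.0 - x])        # deterministic seeding
    for _ in range(iters):
        w = U ** m                           # fuzzified memberships
        centres = (w * x[:, None]).sum(0) / w.sum(0)
        d = np.abs(x[:, None] - centres) + 1e-12
        inv = d ** (-2.0 / (m - 1))
        U = inv / inv.sum(axis=1, keepdims=True)
    return U, centres

# Six pixels from a dark region and a bright region.
pixels = np.array([0.0, 0.1, 0.05, 1.0, 0.9, 0.95])
U, centres = fuzzy_cmeans(pixels)
labels = U.argmax(axis=1)                    # hard segmentation
```

The soft memberships U are what a fine-segmentation stage can exploit: pixels near a region boundary keep non-trivial membership in both clusters instead of being forced to one side.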
In this paper, a new 3D reconstruction approach for 3D object recognition in a neuro-vision system is presented. First, a phase-based stereo matching approach using a Hopfield neural network is presented, in which the stereo matching problem is treated in the frequency domain using local phase. Instead of matching features or texture, the matching process uses the local phases of the left and right images of a stereo pair. A Hopfield neural network is adopted to implement the stereo matching, and a suitable network architecture is established so that the computation can be carried out efficiently in parallel. A matching function is created from the local phase property, and an energy function for the network is constructed that satisfies the necessary constraints. The stereo matching process is then carried out by finding the minimum energy, which corresponds to the solution of the problem. Second, a 3D object reconstruction network is constructed using a BP (backpropagation) neural network, so that the 3D configuration and shape can be reconstructed. With multiple neural networks, the 3D reconstruction processes can be performed in parallel. Examples for both synthetic and real images are shown in the experiments, and good results are obtained.
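The local-phase cue itself can be computed with a complex Gabor filter, and for small shifts the disparity follows from the left/right phase difference divided by the filter frequency. The sketch below shows only that cue on a synthetic 1-D scanline; the paper embeds it in a Hopfield energy minimization, which is not reproduced here, and the filter frequency and width are arbitrary choices.

```python
import numpy as np

def local_phase(signal, omega, sigma=5.0):
    """Local phase of a 1-D signal via a complex Gabor filter."""
    n = np.arange(-int(3 * sigma), int(3 * sigma) + 1)
    gabor = np.exp(-n**2 / (2 * sigma**2)) * np.exp(1j * omega * n)
    return np.angle(np.convolve(signal, gabor, mode='same'))

def phase_disparity(left, right, omega):
    """Disparity estimate from the left/right local-phase difference."""
    dphi = local_phase(left, omega) - local_phase(right, omega)
    dphi = (dphi + np.pi) % (2 * np.pi) - np.pi   # wrap to [-pi, pi)
    return dphi / omega

# Right scanline is the left one shifted by 2 pixels.
x = np.arange(100, dtype=float)
omega = 0.5
left = np.cos(omega * x)
right = np.cos(omega * (x - 2.0))
disparity = phase_disparity(left, right, omega)
```

Phase is dense and sub-pixel, which is why a phase-based match avoids the sparse correspondences of feature-based matching; the wrap step limits each frequency channel to shifts below half its period.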
We present an improved unbiased algorithm for determining principal curves in high dimensional spaces, and then propose two novel applications of principal curves to feature extraction and pattern classification: the Principal Curve Feature Extractor (PCFE) and the Principal Curve Classifier (PCC). The PCFE extracts features from a subset of principal curves computed via the principal components of the input data. With its flexible partitioning choice and non-parametric nature, the PCFE is capable of modeling nonlinear data effectively. The PCC is a general non-parametric classification method that computes a principal curve template for each class during the training phase. In the test or application phase, an unlabeled data point is assigned the class label of the nearest principal curve template. The PCC performs well for non-Gaussian distributed data and data with low local intrinsic dimensionality. Experiments comparing the PCC to established classification methods are performed on selected benchmarks from the UC Irvine machine learning database and the PROBEN1 benchmark dataset, to highlight situations where the PCC is advantageous for feature extraction, data characterization, and classification.
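In the test phase the PCC is simply a nearest-template rule: with each class's principal curve stored as a polyline, classification reduces to point-to-polyline distance. The sketch below shows only this decision rule, with hand-made toy templates; fitting the principal curves is the harder training step and is omitted.

```python
import numpy as np

def point_to_segment(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    ab, ap = b - a, p - a
    t = np.clip(ap @ ab / (ab @ ab + 1e-12), 0.0, 1.0)
    return float(np.linalg.norm(p - (a + t * ab)))

def curve_distance(p, polyline):
    """Distance from p to a principal-curve template stored as a polyline."""
    return min(point_to_segment(p, polyline[i], polyline[i + 1])
               for i in range(len(polyline) - 1))

def pcc_classify(p, templates):
    """Assign p the label of the nearest principal-curve template."""
    return min(templates, key=lambda lbl: curve_distance(p, templates[lbl]))

# Two toy class templates: a flat curve and a raised one.
templates = {
    "class_a": np.array([[0.0, 0.0], [0.5, 0.1], [1.0, 0.0]]),
    "class_b": np.array([[0.0, 1.0], [0.5, 0.9], [1.0, 1.0]]),
}
label = pcc_classify(np.array([0.5, 0.3]), templates)
```

Because the distance is to a curve rather than to a single centroid, the rule follows elongated, curved class shapes, which is why the PCC suits data with low local intrinsic dimensionality.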
As we have published in the last few years, when the given input-output training vector pairs satisfy a PLI (positive linear independency) condition, a hard-limited neural network can be trained noniteratively to recognize untrained patterns, with very short training time and very robust recognition. The key feature of this novel pattern recognition system is the use of slack constants in solving the connection matrix when the PLI condition is satisfied. In general there are infinitely many ways of selecting the slack constants that meet the training-recognition goal, but there is only one way to select them if optimal robustness is sought in the recognition of untrained patterns. This particular way of selecting the slack constants gives the system some special physical properties: automatic feature extraction in the learning mode and automatic feature competition in the recognition mode. The physical significance as well as the mathematical analysis of these novel properties are explained in detail in this paper. Real-time experiments are presented in an unedited movie: the training of 4 hand-written characters is close to real time (< 0.1 sec), and the recognition of untrained hand-written characters is > 90% accurate.
HNC developed a unique context vector approach to image retrieval in the Image Content Addressable Retrieval System. The basis for this approach is the context vector representation of images. A context vector is a high-dimensional vector of real numbers derived from a set of features that are useful in discriminating between images in a particular domain. The image features are trained using a constrained 2D self-organizing learning law. The image context vector encodes both intra-image features and inter-image relationships. Similarity in the directions of the context vectors of a pair of images indicates similarity of their content. The context vector representation simplifies the image indexing and retrieval problem because simple Euclidean distance measurements between sets of context vectors serve as the measure of similarity.
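The retrieval rule described, similarity of context-vector directions, amounts to cosine ranking. A minimal sketch follows; the image names and three-dimensional vectors are made up for illustration (real context vectors are high-dimensional and learned).

```python
import numpy as np

def cosine_similarity(u, v):
    """Similarity of direction between two context vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def retrieve(query, database):
    """Rank image names by context-vector similarity to the query."""
    return sorted(database,
                  key=lambda name: -cosine_similarity(query, database[name]))

# Hypothetical context vectors for three indexed images.
database = {
    "tank.png":  np.array([0.9, 0.1, 0.0]),
    "truck.png": np.array([0.7, 0.3, 0.1]),
    "lake.png":  np.array([0.0, 0.1, 0.9]),
}
ranking = retrieve(np.array([1.0, 0.1, 0.0]), database)
```

Because similarity depends only on direction, the comparison is insensitive to the overall magnitude of a context vector, which is what makes a plain distance or dot-product index sufficient for retrieval.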
Automated annotation and analysis of video sequences requires efficient methods to abstract video information. The identification of shots in video sequences is an important step in summarizing the content of a video. In general, video shots need to be clustered to form more semantically significant units, such as scenes and sequences. In this paper, we describe a neural network based technique for automatic clustering of video frame signatures. The proposed technique utilizes a Self Organizing Map (SOM) and/or a Parallel Collision Control (PCC) network to automatically produce a set of prototype vectors useful in the subsequent process of scene segmentation. Results presented in this paper show that the SOM network performs efficiently, operating without requiring `a priori' knowledge of the number of shots present in the video. When segmentation of a video composed of similar shots is required, the PCC network is well suited because of its capability to preserve the acquired information.
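The SOM clustering step can be sketched with a minimal 1-D self-organizing map over frame signatures. The toy two-dimensional signatures, map size, and schedules below are assumptions, and the PCC network is not sketched here; the point is only that prototypes emerge without specifying the number of shots.

```python
import numpy as np

def train_som(data, n_units=4, iters=500, lr0=0.5, sigma0=1.5, seed=0):
    """Minimal 1-D self-organising map over frame signatures.

    Returns prototype vectors; neighbouring map units learn similar
    signatures, so the prototypes act as shot/scene representatives."""
    rng = np.random.default_rng(seed)
    W = data[rng.choice(len(data), n_units, replace=False)].astype(float)
    pos = np.arange(n_units)
    for t in range(iters):
        x = data[rng.integers(len(data))]          # random training frame
        bmu = int(np.argmin(((W - x) ** 2).sum(axis=1)))
        frac = 1.0 - t / iters                     # linear decay
        h = np.exp(-(pos - bmu) ** 2 / (2 * (sigma0 * frac + 1e-3) ** 2))
        W += lr0 * frac * h[:, None] * (x - W)     # pull BMU and neighbours
    return W

# Toy frame signatures from two visually distinct shots.
shots = np.array([[0.0, 0.0]] * 10 + [[1.0, 1.0]] * 10)
prototypes = train_som(shots)
```

After training, each frame can be assigned to its best-matching prototype, giving the shot clusters that the scene-segmentation stage then groups into larger units.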