A modular clutter-rejection technique that uses region-based principal component analysis (PCA) is proposed. A major problem in FLIR ATR is the poorly centered targets generated by the preprocessing stage. Our modular clutter-rejection system uses static as well as dynamic region of interest (ROI) extraction to overcome the problem of poorly centered targets. In static ROI extraction, the center of the representative ROI coincides with the center of the potential target image. In dynamic ROI extraction, a representative ROI is moved in several directions with respect to the center of the potential target image to extract a number of ROIs. Each module in the proposed system applies region-based PCA to generate the feature vectors, which are subsequently used to make a decision about the identity of the potential target. Region-based PCA uses topological features of the targets to reject false alarms. In this technique, a potential target is divided into several regions and a PCA is performed on each region to extract regional feature vectors. We propose using regional feature vectors of arbitrary shapes and dimensions that are optimized for the topology of a target in a particular region. These regional feature vectors are then used by a two-class classifier based on the learning vector quantization to decide whether a potential target is a false alarm or a real target. We also present experimental results using real-life data to evaluate and compare the performance of the clutter-rejection systems with static and dynamic ROI extraction.
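The region-based PCA step described above can be sketched as follows; this is a minimal illustration assuming rectangular ROIs and boolean region masks, with all names and parameter choices ours rather than the authors':

```python
import numpy as np

def region_pca_features(rois, regions, n_components=4):
    """Region-based PCA: fit a separate PCA on each region of the ROI
    and concatenate the per-region projections into one feature vector.

    rois    : (N, H, W) array of training ROIs
    regions : list of boolean (H, W) masks, one per region
    """
    feats = []
    for mask in regions:
        X = rois[:, mask]                      # (N, n_pixels_in_region)
        X = X - X.mean(axis=0)                 # center the region data
        # principal directions via SVD of the centered data
        _, _, Vt = np.linalg.svd(X, full_matrices=False)
        feats.append(X @ Vt[:n_components].T)  # (N, n_components)
    return np.hstack(feats)                    # (N, n_regions*n_components)

# toy example: 20 random 8x8 ROIs split into top and bottom halves
rng = np.random.default_rng(0)
rois = rng.normal(size=(20, 8, 8))
top = np.zeros((8, 8), bool); top[:4] = True
features = region_pca_features(rois, [top, ~top], n_components=4)
print(features.shape)  # (20, 8)
```

In the full system the concatenated regional projections would be the feature vectors handed to the two-class LVQ classifier.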
This paper develops some important ideas in image recognition using a neural network based on multi-valued neurons. We discuss the recognition of color images, which reduces to the recognition of gray-scale images. The approach we have developed is illustrated by simulation results. Recognition of distortion (blur) types and distortion parameters, and recognition of images with a distorted training set using the same neural network, are also considered; Gaussian blur and motion blur were taken as the distortions. This part of the work is likewise illustrated by simulation results.
A real-time hybrid distortion-invariant optical pattern recognition (OPR) system was established to perform 3D multiobject distortion-invariant automatic pattern recognition. A wavelet transform technique was used for digital preprocessing of the input scene, to suppress the noisy background and enhance the recognized object. A three-layer backpropagation artificial neural network was used in correlation-signal post-processing to perform multiobject distortion-invariant recognition and classification. The real-time processing ability of the C-80 and NOA, together with multithread programming technology, was used to perform high-speed parallel multitask processing and speed up the post-processing of ROIs. A reference filter library (RFL) was constructed for distorted versions of the 3D object model images, based on measuring the distortion-parameter tolerances in rotation, azimuth, and scale. Real-time optical correlation recognition testing of this OPR system demonstrates that, using the preprocessing, the post-processing, the nonlinear algorithm of optimum filtering, the RFL construction technique, and multithread programming technology, a high probability of recognition and a high recognition rate were obtained for the real-time multiobject distortion-invariant OPR system. The recognition reliability and rate were improved greatly. These techniques are very useful for automatic target recognition.
Image resizing is an important operation that is used extensively in document processing to magnify or reduce images. Standard approaches fit the original data with a continuous model and then resample this 2D function on a new sampling grid. These interpolation methods, however, apply an interpolation function indiscriminately to the whole image. The resulting document image suffers from objectionable moiré patterns, edge blurring, and aliasing. Therefore, document images must often be segmented before other document processing techniques, such as filtering, resizing, and compression, can be applied. In this paper, we present a new system to segment and label document images into text, halftone images, and background using feature extraction and unsupervised clustering. Once the segmentation is performed, a specific enhancement or interpolation kernel can be applied to each document component. We demonstrate the power of our approach to segment document images into text, halftone, and background. The proposed filtering and interpolation method results in a noticeable improvement in the enhanced and resized image.
Adaptive spline interpolation (which is equivalent to the use of a type of radial basis function neural network) is investigated for digital image interpolation (i.e., for resolution enhancement). Test image results indicate that adaptive spline interpolation of a low-resolution image is superior to non-adaptive interpolation if the adjustable parameters are chosen to yield the best match to a known object in a corresponding high-resolution image.
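As a point of reference, a non-adaptive separable linear interpolation (the kind of baseline the adaptive spline is compared against) can be sketched in a few lines; the function name and test image are illustrative:

```python
import numpy as np

def bilinear_upsample(img, factor):
    """Separable linear interpolation: a simple non-adaptive baseline.
    (The adaptive spline scheme would instead tune its parameters to
    best match a known object in a high-resolution image.)"""
    h, w = img.shape
    ys = np.linspace(0, h - 1, h * factor)
    xs = np.linspace(0, w - 1, w * factor)
    # interpolate along rows first, then along columns
    rows = np.stack([np.interp(xs, np.arange(w), img[i]) for i in range(h)])
    out = np.stack([np.interp(ys, np.arange(h), rows[:, j])
                    for j in range(rows.shape[1])], axis=1)
    return out

low = np.arange(16, dtype=float).reshape(4, 4)
high = bilinear_upsample(low, 2)
print(high.shape)  # (8, 8)
```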
For a raw picture data set in either binary or gray-scale digital form, we can first apply a pixel-quantization method to condense the picture to a much smaller file. We can then use a math-graphics program such as Microsoft Visual Basic to compute its center of mass (CM). From this CM, we can construct a polar coordinate system with M sectors and N rings. If we apply a normalized magnitude Fourier transform to these M sectors and a normalized Hankel transform to these N rings, we obtain two numerical series truncated at P and Q terms (e.g., P = Q = 16). We can then construct a (P+Q)-dimensional (e.g., 32-dimensional) analog vector. This vector may be used as the pre-processed image vector for feeding to any neural network (including the noniterative neural networks we have presented over the last 8 years) for training and learning.
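The recipe above can be sketched as follows; note that a plain magnitude FFT of the ring profile stands in here for the normalized Hankel transform, and all parameter choices are illustrative:

```python
import numpy as np

def polar_signature(img, M=16, N=16, P=8, Q=8):
    """Signature following the recipe: center of mass -> M angular
    sectors and N rings -> normalized magnitude spectra truncated to
    P and Q terms.  (A magnitude FFT of the radial profile stands in
    for the Hankel transform.)"""
    ys, xs = np.indices(img.shape)
    total = img.sum()
    cy, cx = (ys * img).sum() / total, (xs * img).sum() / total
    r = np.hypot(ys - cy, xs - cx)
    theta = np.arctan2(ys - cy, xs - cx)
    # assign every pixel to an angular sector and a radial ring
    sec = (np.digitize(theta, np.linspace(-np.pi, np.pi, M + 1)) - 1) % M
    ring = np.minimum((N * r / (r.max() + 1e-9)).astype(int), N - 1)
    sector_sum = np.bincount(sec.ravel(), img.ravel(), minlength=M)
    ring_sum = np.bincount(ring.ravel(), img.ravel(), minlength=N)
    a = np.abs(np.fft.fft(sector_sum))[:P]
    b = np.abs(np.fft.fft(ring_sum))[:Q]
    return np.concatenate([a / (a.max() + 1e-9), b / (b.max() + 1e-9)])

img = np.zeros((32, 32)); img[8:24, 8:24] = 1.0
v = polar_signature(img)
print(v.shape)  # (16,)
```

Using the magnitude spectrum of the sector sums makes the signature tolerant to rotations about the center of mass, which is the point of the polar construction.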
We propose a method for learning and generating image textures based on learning the weights of a recurrent Multiple Class Random Neural Network (MCRNN) from a color texture image. The network has one neuron for each image pixel, and the local connectivity of the neurons reflects the adjacency structure of neighboring pixels. The same trained recurrent network is then used to generate a synthetic texture that imitates the original one. The proposed texture learning technique is efficient and its computation time is small; texture generation is also fast. This work is a refinement and extension of our earlier work, in which we considered the learning of grey-level textures and the generation of grey-level or color textures. We have tested our method with different synthetic and natural textures. The experimental results show that the MCRNN can efficiently model a large category of homogeneous color microtextures. Statistical features extracted from the co-occurrence matrices of the original and MCRNN-generated textures are used to confirm the goodness of fit of our approach.
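The co-occurrence statistics used for validation can be sketched as follows (a minimal single-displacement GLCM with two Haralick-style statistics; the exact features used in the paper are not specified here):

```python
import numpy as np

def glcm_stats(img, levels=8):
    """Co-occurrence matrix for the (0, 1) displacement and two
    Haralick-style statistics (contrast, energy) for texture comparison."""
    q = np.clip((img * levels).astype(int), 0, levels - 1)
    pairs = np.stack([q[:, :-1].ravel(), q[:, 1:].ravel()])
    glcm = np.zeros((levels, levels))
    np.add.at(glcm, (pairs[0], pairs[1]), 1)   # count level pairs
    glcm /= glcm.sum()                          # normalize to probabilities
    i, j = np.indices(glcm.shape)
    contrast = ((i - j) ** 2 * glcm).sum()
    energy = (glcm ** 2).sum()
    return contrast, energy

rng = np.random.default_rng(1)
smooth = np.tile(np.linspace(0, 1, 16), (16, 1))   # smooth gradient
noisy = rng.uniform(size=(16, 16))                  # rough texture
print(glcm_stats(smooth)[0] < glcm_stats(noisy)[0])  # True: lower contrast
```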
Modern video encoding techniques generate variable bit rates, because they take advantage of different rates of motion in scenes, in addition to using lossy compression within individual frames. We have introduced a novel method for video compression based on temporal subsampling of video frames, and for video frame reconstruction using neural network based function approximations. In this paper we describe another method using wavelets for still image compression of frames, and function approximations for the reconstruction of subsampled frames. We evaluated the performance of the method in terms of observed traffic characteristics for the resulting compressed and subsampled frames, and in terms of quality versus compression ratio curves with real video image sequences. Comparisons are presented with other standard methods.
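Temporal subsampling and frame reconstruction can be sketched as follows, with plain linear interpolation standing in for the neural-network function approximation; the function name and demo sequence are illustrative:

```python
import numpy as np

def subsample_and_reconstruct(frames, k=2):
    """Transmit every k-th frame and rebuild the skipped ones at the
    receiver.  Linear interpolation stands in here for the paper's
    neural-network function approximation."""
    kept_idx = np.arange(0, len(frames), k)
    kept = frames[kept_idx]
    rec = np.empty_like(frames)
    for t in range(len(frames)):
        lo = min(t // k, len(kept) - 1)
        hi = min(lo + 1, len(kept) - 1)
        w = (t - kept_idx[lo]) / k            # fractional position
        rec[t] = (1 - w) * kept[lo] + w * kept[hi]
    return rec

# a 9-frame sequence whose brightness varies linearly in time
frames = np.linspace(0, 1, 9)[:, None, None] * np.ones((9, 4, 4))
rec = subsample_and_reconstruct(frames, k=2)
print(np.allclose(rec, frames))  # True for linearly varying content
```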
In this paper, the architecture of an all-round pattern recognition system is described, together with some of the results obtained when characterizing infected areas in retinal angiograms. Several tricky aspects of the implementation are discussed. In particular, special attention is given to the feature-extraction scheme and the pertinence of the learning database.
In 3D computer vision, a relevant problem is to match a `source' image dataset with a `target' image dataset, that is, to find the rule that controls the modification of the global characteristics of the source in such a way as to match the target. The matching problem can be addressed using a neural-net approach, where the nodes are related to the image voxels and the synapses to the voxel information, e.g. locations, grey values, gradients, and angles. This paper presents the `Volume-Matcher 3D' project, an approach for data-driven comparison and registration of 3D images. The approach proposes a neural network model derived from self-organizing maps and extended in order to match a full 3D data set of a `source volume' with the 3D data set of a `target window'. The algorithms developed have been tested on real cases of interest in medical imaging. The results have been evaluated on the basis of both the mean square error and visual analysis, performed by an expert, of the result volume. The software has been implemented on a high-performance PC using the AVS/Express software package for volume reconstruction; `polytri'-based algorithms have been used for this purpose.
The content of an image can be expressed in terms of different features such as color, texture, shape, or text annotations. Retrieval based on these features can vary according to how the feature values are combined. Most existing approaches assume a linear relationship between different features, and also require the user to assign weights to features directly. In particular, as the number of feature classes increases, intuition about how to pick relative weights among features is lost. While this linear combining approach established the basis of content-based image retrieval (CBIR), the usefulness of such systems has been limited by the difficulty of representing the subjectivity of human perception. In this paper, we introduce a Neural Network-based Image Retrieval system, a human-computer interaction approach to CBIR using a Radial Basis Function (RBF) network. This approach determines the nonlinear relationship between features, so that more accurate similarity comparisons between images can be supported, and it allows the user to submit a coarse initial query and continuously refine the information need via relevance feedback. The experimental results show that the proposed approach has superior retrieval performance to existing approaches such as the linear combining approach, the rank-based method, and the backpropagation-based method.
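A similarity score of this RBF flavor, one Gaussian hidden unit per feature class combined by learned weights, can be sketched as follows (the weights are fixed here, whereas in such a system they would be adapted from relevance feedback; all names are ours):

```python
import numpy as np

def rbf_similarity(q_feats, x_feats, weights, sigmas):
    """Similarity as an RBF network: each feature class feeds one
    Gaussian unit on the query/image feature distance, and the unit
    responses are combined by weights (fixed in this sketch)."""
    score = 0.0
    for w, s, fq, fx in zip(weights, sigmas, q_feats, x_feats):
        d2 = np.sum((fq - fx) ** 2)             # per-class distance
        score += w * np.exp(-d2 / (2 * s ** 2))  # Gaussian response
    return score

# two feature classes: a 2D color feature and a 1D texture feature
color_q, tex_q = np.array([0.2, 0.5]), np.array([0.9])
color_x, tex_x = np.array([0.2, 0.5]), np.array([0.9])
same = rbf_similarity([color_q, tex_q], [color_x, tex_x],
                      weights=[0.6, 0.4], sigmas=[1.0, 1.0])
print(round(same, 3))  # 1.0 for identical feature vectors
```

The nonlinearity of the Gaussian units is what distinguishes this from a plain weighted (linear) combination of feature distances.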
The need for content-based image retrieval is staggering given the information explosion. The advantage of systems for content-based retrieval is their effectiveness at finding relevant images when searching large collections. We believe that an improved technique for retrieval can be achieved by combining techniques, so we approach the problem in two stages, an organization phase and a retrieval phase, employing different techniques in each. The first phase classifies the images based on a preliminary feature-extraction process, which gathers a set of statistical parameters. After the extraction, the images are classified employing an ART2 neural network model with 12 inputs and 16 outputs. Adaptive resonance architectures are networks that self-organize stable pattern recognition codes in real time in response to arbitrary sequences of input patterns. The retrieval phase allows searching and relies on a wavelet matching process. This process works on representative images and has been successful in information retrieval tasks. The Haar wavelet transform employed can be computed quickly and tends to produce blocks in the image details. The wavelet process is applied in L2. A similarity assessment process with an associated measure is necessary for wavelet matching. We report the phases in detail, along with the preliminary results already achieved.
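One level of the Haar transform used in the retrieval phase can be sketched as follows (unnormalized average/difference convention; the orthonormal variant divides by sqrt(2) instead of 2):

```python
import numpy as np

def haar2d(img):
    """One level of the 2D Haar transform: pairwise averages and
    differences along rows, then along columns.  The top-left block
    holds the coarse approximation; the rest hold detail coefficients."""
    def step(a):  # averages then differences along the last axis
        return np.concatenate([(a[..., ::2] + a[..., 1::2]) / 2,
                               (a[..., ::2] - a[..., 1::2]) / 2], axis=-1)
    rows = step(img)
    return step(rows.swapaxes(-1, -2)).swapaxes(-1, -2)

img = np.ones((4, 4))
coeffs = haar2d(img)
print(coeffs[0, 0], coeffs.sum())  # 1.0 4.0 : flat image has no detail
```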
In this paper, a largely domain-independent approach is presented in which local features characterize multimedia data using Artificial Neural Networks (ANN) and Support Vector Machines (SVM). In our previous work, we showed that classification in content-based retrieval requires a nonlinear mapping of the feature space. This can normally be accomplished by ANN and SVM. However, they inherently lack the capability to deal with meaningful feature evaluation and large-dimensional feature spaces, in the sense that they are inaccurate and slow. These defects can be overcome by employing meaningful feature selection on the basis of the discriminative capacity of a feature. Experiments on a database consisting of real video sequences show that the speed and accuracy of SVM can be improved substantially using this technique, while execution time can be substantially reduced for ANN. The comparison also shows that the improved SVM turns out to be a better choice than ANN. Finally, it is shown that generalization in learning is not affected by reducing the dimension of the feature space with our method.
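Selecting features by discriminative capacity can be sketched with a Fisher-score criterion, our illustrative choice; the paper's exact measure may differ:

```python
import numpy as np

def fisher_scores(X, y):
    """Rank features by between-class over within-class variance
    (a Fisher score): higher means more discriminative."""
    classes = np.unique(y)
    mu = X.mean(axis=0)
    num = np.zeros(X.shape[1])
    den = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[y == c]
        num += len(Xc) * (Xc.mean(axis=0) - mu) ** 2  # between-class
        den += len(Xc) * Xc.var(axis=0)               # within-class
    return num / (den + 1e-12)

rng = np.random.default_rng(2)
y = np.repeat([0, 1], 50)
X = rng.normal(size=(100, 3))
X[:, 0] += 3 * y            # only feature 0 separates the classes
scores = fisher_scores(X, y)
print(scores.argmax())  # 0: the discriminative feature wins
```

Keeping only the top-scoring features shrinks the input dimension of the ANN or SVM, which is the source of the reported speedups.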
A new approach to the application of neural networks (NN) for local recognition of images with mixed noise is put forward. Although some pixels in images can be corrupted by spikes, the proposed technique makes it possible to eliminate the uncertainty observed in this case and to correctly recognize the pixels that, in fact, correspond to an edge, a homogeneous region, or a small-sized object. For this purpose, a procedure is proposed that recognizes one of four basic hypotheses and additionally determines spike properties within the scanning window. This recognition task is performed for two groups of outputs (classes) of one common NN. The problems of NN learning and structure selection for this case are discussed. The performance of the neural network classifier is analyzed for different input data types and for various characteristics of the noise. An improvement in correct recognition is shown for the proposed approach in comparison to previous work.
A neural-network-based system to identify images transmitted through a Coherent Fiber-optic Bundle (CFB) is presented. Patterns are generated in a computer, displayed on a Spatial Light Modulator, imaged onto the input face of the CFB, and recovered optically by a CCD sensor array for further processing. Input and output optical subsystems were designed and used to that end. The recognition step for the transmitted patterns is performed by a powerful, widely used neural network simulator running on the control PC. A complete PC-based interface was developed to control the different tasks involved in the system. An optical analysis of the system's capabilities was carried out prior to performing the recognition step. Several neural network topologies were tested, and the corresponding numerical results are also presented and discussed.
Morphological Neural Networks (MNN) have been proposed as an alternative neural computation paradigm. In this paper we explore the potential of Heteroassociative MNN (HMNN) for a practical vision-based task: self-localization in a vision-based navigation framework for mobile robots. HMNN have great potential for real-time application because their recall process is very fast. We present some experimental results that illustrate the proposed approach.
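The fast HMNN recall is a single max-plus matrix product; a minimal sketch follows (perfect recall is not guaranteed for arbitrary pattern sets, but it holds for this toy example):

```python
import numpy as np

def hmnn_store(xs, ys):
    """Heteroassociative morphological memory: the weight matrix is the
    elementwise minimum of morphological outer products y_k + (-x_k)^T."""
    return np.min([y[:, None] - x[None, :] for x, y in zip(xs, ys)], axis=0)

def hmnn_recall(W, x):
    """Max-plus product: y_i = max_j (W_ij + x_j).  One pass, no
    iteration -- which is why recall suits real-time use."""
    return (W + x[None, :]).max(axis=1)

xs = [np.array([1., 0.]), np.array([0., 1.])]
ys = [np.array([0., 1.]), np.array([1., 0.])]
W = hmnn_store(xs, ys)
print(hmnn_recall(W, xs[0]), hmnn_recall(W, xs[1]))  # recovers ys exactly
```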
This paper presents a novel optical-electronic shape recognition system based on synergetic associative memory. Our shape recognition system is composed of two parts: the first is a feature-extraction system; the second is a synergetic pattern recognition system. The Hough transform is proposed for feature extraction from the unrecognized object, with the effect of reducing dimensionality and filtering out object distortion and noise; a synergetic neural network is proposed to realize the associative memory in order to eliminate spurious states. We then adopt an optical-electronic realization of our system that satisfies the demands of real time, high speed, and parallelism. To obtain a fast algorithm, we replace the dynamic evolution circuit with a judging circuit according to the relationship between attention parameters and order parameters; we then implement the recognition of some simple images, and the validity of the approach is demonstrated.
The objective of this work is to generate a learning machine capable of finding solutions to complex image processing tasks with Cellular Neural Networks (CNNs). First, a general machine for automatic analog algorithm design, independent of the problem to be solved, is created; this is accomplished through an evolutionary strategy that is an extension of genetic programming. Second, this work introduces a suite of sub-mechanisms that increase the power of genetic programming and help to reduce the enormous search space while still producing a fruitful search. Some concepts in this section are related to AI theory, so that this work lies at the intersection of AI and image processing by CNNs.
We present a pyramidal wavelet coder implemented with a Cellular Neural Network (CNN) architecture, as an example of a CNN application: massive real-time processing is often highly desirable, and this kind of architecture fits such purposes very well. The pyramidal wavelet coder performs the image wavelet transform followed by threshold and quantization operations. The wavelet transform consists essentially of a bank of filters through which an image is passed repeatedly, with a subsampling operation after each filtering. Once the image has been filtered and subsampled according to the rules of the pyramidal image coder, the threshold operation is carried out, where we intend to keep only the most significant wavelet coefficients. Finally, a quantization operation is performed to translate the coefficient values to a discrete environment.
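The coder's last two stages can be sketched as follows (the threshold and step values are illustrative):

```python
import numpy as np

def threshold_and_quantize(coeffs, thresh, step):
    """The coder's final stages: zero out insignificant wavelet
    coefficients, then map the survivors onto a discrete grid."""
    kept = np.where(np.abs(coeffs) >= thresh, coeffs, 0.0)
    return np.round(kept / step).astype(int)

coeffs = np.array([0.05, -0.8, 0.3, 1.2, -0.02])
q = threshold_and_quantize(coeffs, thresh=0.1, step=0.25)
print(q)  # [ 0 -3  1  5  0]
```

Thresholding is what buys the compression: most wavelet coefficients of a natural image are small, so the quantized output is sparse and codes cheaply.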
In this paper we apply neural network techniques and physically based models to determine the surface shape of chips from scanning electron microscopy images. Deducing a specific feature's vertical cross-section within an integrated circuit from 2D top-down scanning electron microscope (SEM) images of the feature surface is a difficult `inverse problem' which arises in semiconductor fabrication. This paper refines our previous work on the reconstruction of semiconductor wafer surface shapes from top-down electron microscopy images. One of the approaches we have developed directly maps the CD-SEM intensity waveforms to line profiles. The other, novel method we describe is based on an approximate physical model, where we assume a simplified mathematical representation of the physical process that produces the SEM image from the electron beam's interaction with the feature surface. Our results are illustrated with a variety of real data sets.