This paper presents entropy constrained fuzzy clustering and learning vector quantization algorithms and their application in image compression. Entropy constrained fuzzy clustering (ECFC) algorithms were developed by minimizing an objective function incorporating the fuzzy partition entropy and the average distortion between the feature vectors, which represent the image data, and the prototypes, which represent the codevectors or codewords. The reformulation of fuzzy c-means (FCM) algorithms provided the basis for the development of fuzzy learning vector quantization (FLVQ) algorithms and essentially established a link between clustering and learning vector quantization. Minimization of the reformulation function that corresponds to ECFC algorithms using gradient descent results in entropy constrained learning vector quantization (ECLVQ) algorithms. These algorithms allow the gradual transition from a maximally fuzzy partition to a nearly crisp partition of the feature vectors during the learning process. This paper presents two alternative implementations of the proposed algorithms, which differ in terms of the strategy employed for updating the prototypes during learning. The proposed algorithms are tested and evaluated on the design of codebooks used for image data compression.
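The abstract does not reproduce the ECFC objective, but the general entropy-constrained clustering idea can be sketched: memberships that minimize average distortion plus a temperature-weighted partition-entropy term take a softmax form, and annealing the temperature moves the partition from maximally fuzzy to nearly crisp. The batch update schedule, the farthest-point initialization, and all parameter values below are illustrative assumptions, not the paper's exact ECFC/ECLVQ algorithms.

```python
import numpy as np

def memberships(X, V, T):
    # Entropy-constrained memberships: softmax over negative squared
    # distances between feature vectors X (n,d) and prototypes V (c,d)
    d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(-1)
    u = np.exp(-(d2 - d2.min(1, keepdims=True)) / T)
    return u / u.sum(1, keepdims=True)

def ecfc(X, c, T0=1.0, Tmin=1e-3, decay=0.7, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    # Farthest-point heuristic gives well-separated initial prototypes
    V = [X[rng.integers(len(X))]]
    for _ in range(c - 1):
        d2 = ((X[:, None, :] - np.asarray(V)[None, :, :]) ** 2).sum(-1).min(1)
        V.append(X[d2.argmax()])
    V = np.asarray(V, dtype=float)
    T = T0
    while T > Tmin:
        for _ in range(iters):
            U = memberships(X, V, T)
            V = (U.T @ X) / U.sum(0)[:, None]  # batch prototype update
        T *= decay  # anneal: maximally fuzzy -> nearly crisp partition
    return V, memberships(X, V, T)
```

At high temperature every feature vector belongs almost equally to all prototypes; as `T` is annealed the memberships sharpen toward a crisp partition, mirroring the gradual transition described above.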
In this paper we propose a new scalable predictive vector quantization (PVQ) technique for image and video compression. The technique is implemented using neural networks: a Kohonen self-organizing feature map implements the vector quantizer, while a multilayer perceptron implements the predictor. Simulation results demonstrate that the proposed technique provides a 5-10% improvement in coding performance over existing neural-network-based PVQ techniques.
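As a rough illustration of the closed-loop predictive VQ idea (not the paper's SOM/MLP implementation), the sketch below predicts each sample from the previous reconstructed sample with a fixed first-order predictor standing in for the multilayer perceptron, and quantizes the prediction residual against a given codebook standing in for the Kohonen map:

```python
import numpy as np

def pvq_encode(x, codebook, alpha=1.0):
    """Closed-loop predictive quantization of a 1-D signal: each sample is
    predicted from the previous *reconstructed* sample (alpha is a fixed
    first-order predictor coefficient), and the prediction residual is
    quantized against the codebook."""
    recon, indices = [0.0], []
    for s in x:
        pred = alpha * recon[-1]                          # predictor output
        k = int(np.argmin((codebook - (s - pred)) ** 2))  # nearest codevector
        indices.append(k)
        recon.append(pred + codebook[k])  # decoder-side reconstruction
    return indices, np.array(recon[1:])
```

Using the reconstructed (rather than original) sample in the predictor keeps the encoder and decoder in lockstep, which is the essential point of closed-loop PVQ.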
In this paper, we describe a genetic learning neural network system that vector quantizes images directly to achieve data compression. The genetic learning algorithm operates at two levels: at the level of codewords, each neural network is updated through reproduction every time an input vector is processed; at the level of codebooks, five neural networks are included in the gene pool. Extensive experiments on a group of image samples show that the genetic algorithm outperforms other vector quantization algorithms, including competitive learning, frequency-sensitive learning, and LBG.
This work proposes a method for using an identity-mapping backpropagation (IMBKP) neural network for binary image compression, aimed at reducing the dimension of the feature vector in a NN-based pattern recognition system. In the proposed method, the IMBKP network was trained with the objective of achieving good reconstruction quality together with a reasonable amount of image compression. This criterion is very important when using binary images as feature vectors. The proposed network was evaluated using 800 images of handwritten signatures. The lowest and highest reconstruction errors were 3.05 × 10⁻³% and 0.01%, respectively. The proposed network can thus reduce the dimension of the input vector to a NN-based pattern recognition system with almost no degradation and yet a good reduction in the number of input neurons.
The redundancy of the multiresolution representation has been clearly demonstrated in the case of fractal images, but has not been fully recognized and exploited for general images. This paper presents a new image coder in which the similarity among blocks of different subbands is exploited by neural-network-based block prediction. After a pyramid subband decomposition, the detail subbands are partitioned into a set of uniform non-overlapping blocks. To speed up the coding procedure and improve coding efficiency, a new classification criterion is presented: the blocks are classified into two sets, the simple-block set and the edge-block set. In the proposed method, the edge blocks are predicted from blocks in the lower-scale subband with the same orientation through a neural network. The simple blocks and the edge prediction-error blocks are coded with an arithmetic coder. Simulation results show that the presented method is a promising coding technique that is worth further research.
This paper evaluates the performance of a system which compresses digital mammograms. In digital mammograms, important diagnostic features such as microcalcifications appear in small clusters of a few pixels with relatively high intensity compared with their neighboring pixels. These image features can be preserved in a compression system that employs a suitable image transform which can localize the signal characteristics in the original and the transform domain. Image compression is achieved by first decomposing the mammograms into different subimages carrying different frequencies, and then employing vector quantization to encode these subimages. Multiresolution codebooks are designed by the Linde-Buzo-Gray (LBG) algorithm and a family of fuzzy algorithms for learning vector quantization (FALVQ). The main advantage of the proposed approach is the design of separate multiresolution codebooks for the different subbands of the decomposed image, which carry different orientation and frequency information. The experimental results confirm the viability of the proposed compression scheme on digital mammograms.
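The LBG codebook design used above is the generalized Lloyd algorithm: alternate nearest-neighbor partitioning of the training vectors with centroid updates, growing the codebook by splitting (sizes grow by powers of two). A minimal sketch, in which the splitting perturbation and stopping tolerance are illustrative choices:

```python
import numpy as np

def lbg(X, size, eps=1e-3):
    """LBG codebook design: start from the global centroid, split every
    codevector with a small perturbation, then run Lloyd iterations
    (nearest-neighbor partition + centroid update) until the mean
    distortion stops improving."""
    codebook = X.mean(0, keepdims=True)
    while len(codebook) < size:
        codebook = np.vstack([codebook + eps, codebook - eps])  # split step
        prev = np.inf
        while True:
            d2 = ((X[:, None] - codebook[None]) ** 2).sum(-1)
            idx, dist = d2.argmin(1), d2.min(1).mean()
            for k in range(len(codebook)):
                if np.any(idx == k):  # skip empty cells
                    codebook[k] = X[idx == k].mean(0)
            if prev - dist < 1e-9 * max(dist, 1.0):
                break
            prev = dist
    return codebook
```

In the paper's scheme a separate codebook of this kind would be trained on the vectors drawn from each subband.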
In the research described by this paper, we implemented and evaluated a linear self-organized feedforward neural network for image compression. Based on the generalized Hebbian learning algorithm (GHA), the neural network extracts the principal components from the autocorrelation matrix of the input images. To do so, an image is first divided into mutually exclusive square blocks of size m × m. Each block represents a feature vector of dimension m² in the feature space. The input dimension of the neural net is therefore m² and the output dimension is m. Training based on GHA for each block then yields a weight matrix of dimension m × m², whose rows are the eigenvectors of the autocorrelation matrix of the input image block. Projection of each image block onto the extracted eigenvectors yields m coefficients per block. Image compression is then accomplished by quantizing and coding the coefficients of each block. To evaluate the performance of the neural network, two experiments were conducted using standard IEEE images. First, the neural net was used to compress images at different bit rates using different block sizes. Second, to test the neural network's generalization capability, the set of principal components extracted from one image was used for compressing different but statistically similar images. The evaluation, based on both visual inspection and statistical measures (NMSE and SNR) of the reconstructed images, demonstrates that the network can yield satisfactory image compression performance and possesses good generalization capability.
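The GHA update underlying the network above is Sanger's rule, ΔW = η(yxᵀ − LT[yyᵀ]W) with y = Wx, where LT[·] keeps the lower-triangular part; the rows of W then converge to the leading eigenvectors of the input autocorrelation matrix. A minimal sketch (the decaying learning-rate schedule is an illustrative choice):

```python
import numpy as np

def gha(X, n_components, lr=0.01, epochs=100, seed=0):
    """Sanger's generalized Hebbian algorithm: the rows of W converge to
    the leading eigenvectors of the autocorrelation matrix of the rows of X."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.1, size=(n_components, X.shape[1]))
    for epoch in range(epochs):
        eta = lr / (1 + 0.1 * epoch)  # decaying rate aids convergence
        for x in X:
            y = W @ x
            # Sanger's rule: dW = eta * (y x^T - LT(y y^T) W)
            W += eta * (np.outer(y, x) - np.tril(np.outer(y, y)) @ W)
    return W
```

For the compression scheme described above, X would hold the m²-dimensional image-block vectors and the projections Wx would give the m coefficients per block.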
This paper compares ML parameter estimation and neural-network-based methods for classifying textures. The comparison is based on the percentage of correct classifications. Certain constraints have been imposed on the classifiers: both use the same sample size, the same number of features, and the same number of training and test feature vectors. The classifiers use the energy of the dominant channels of a tree-structured wavelet transform as features. Experiments are performed with textures from the Brodatz album. All the textured images are of size 256 × 256 pixels with 256 gray levels. The best feature set was selected using the 'leave one out' approach. The results indicate that the two classifiers give comparable performance. However, the governing factors for their choice are the number of training samples, the number of features, the computational complexity of each classifier, and, for the neural network in particular, the size of the network.
A modular neural network classifier has been applied to the problem of automatic target recognition (ATR) using forward-looking infrared (FLIR) imagery. The modular network classifier consists of several neural networks (expert networks) for classification. Each expert network receives distinct inputs, namely features extracted from only a local region of a target known as a receptive field, and is trained independently of the other expert networks. The classification decisions of the individual expert networks are combined to determine the final classification. Our experiments show that this modular network classifier is superior to a fully connected neural network classifier in terms of complexity (number of weights to be learned) and performance (probability of correct classification). The proposed classifier shows high immunity to clutter or target obscuration due to the independence of the individual neural networks in the modular network. Performance is further improved by the use of multi-resolution features and by the introduction of a higher-level neural network on top of the expert networks, a method known as stacked generalization.
Besides the variety of fonts, character recognition systems for the industrial world are confronted with specific problems such as the variety of supports (metal, wood, paper, ceramics, etc.), the variety of markings (printing, engraving, etc.), and the conditions of lighting. We present a system that is able to solve part of this problem. It implements a collaboration between two neural networks. The first network, specialized in vision, allows the system to extract a character from an image. In addition, we have equipped the system with characteristics allowing it to obtain an invariant model of the presented character: whatever the position, size, and orientation of the character during capture, the model presented to the input of the second network will be identical. The second network, thanks to a learning phase, yields a character recognition system independent of the type of font used. Furthermore, its generalization capability permits the recognition of degraded and/or distorted characters. A feedback loop between the two networks permits the first one to modify the quality of vision. The cooperation between these two networks allows us to recognize characters whatever the support and the marking.
The main objective of this work was to investigate the use of sensor-based real-time decision and control technology applied to actively control the arrestment of aircraft (manned or unmanned). The proposed method is to develop an adaptively controlled system that would locate the aircraft's extended tailhook, predict its position and speed at the time of arrestment, and adjust an arresting end effector to actively mate with the arresting hook and remove the aircraft's kinetic energy, thus minimizing the arresting distance and impact stresses. The focus of the work presented in this paper was to explore the use of a fuzzy adaptive resonance theory (fuzzy ART) neural network to form an MSI scheme which reduces image data to recognize the incoming aircraft and extended tailhook. Using inputs from several image sources, a single fused image was generated to give details about the range and tailhook characteristics of an F18 naval aircraft. The idea is to partition an image into cells and evaluate each cell using fuzzy ART. Once the incoming aircraft is located in a cell, that subimage is again divided into smaller cells. This image is evaluated to locate various parts of the aircraft (i.e., wings, tail, tailhook, etc.). The cell that contains the tailhook provides resolved position information. Multiple images from separate sensors provide the opportunity to generate range details over time.
In this paper, we apply a hybrid genetic and simulated annealing heuristic to encode optimal filters for optical pattern recognition. Simulated annealing, as a stochastic computational technique, allows near globally-minimum-cost solutions to be found via a cooling schedule. Using the advantages of a parallelizable genetic algorithm (GA) and a simulated annealing algorithm (SA), the optimum filters are designed and implemented. The filter, of size 128 × 128 pixels, consists of stepped phases that cause discrete phase delays. Its structure can be divided into rectangular cells such that each cell imparts a discrete phase delay of 0 to 2π [rad] to the incident wavefront. The eight-phase stepped filters we designed are compared with the phase-only matched filter and the cosine-binary phase-only filter. We focus on investigating the performance of the optimum filter in terms of recognition characteristics under translation, scale, and rotation variations of the image, and discrimination properties against similar images. With the GA/SA hybrid heuristic, the optimum filter is realized for high-efficiency optical reconstruction while reducing the number of iterations needed to encode it compared with the respective algorithms alone.
Reverse engineering is the process of generating accurate three-dimensional CAD models from measured surface data. The coordinate data is segmented and then approximated by numerous parametric surface patches for an economized CAD representation. Most parametric surface fitting techniques manipulate large non-square matrices in order to interpolate all points. Furthermore, the interpolation process often generates high-order polynomials that produce undesirable oscillations on the reconstructed surface. The Bernstein basis function (BBF) network is an adaptive approach to surface approximation that enables a Bezier surface to be reconstructed from measured data with a pre-determined degree of accuracy. The BBF network is a two-layer architecture that performs a weighted summation of Bernstein polynomial basis functions. Modifying the number of basis neurons is equivalent to changing the degree of the Bernstein polynomials. An increase in the number of neurons will improve the surface approximation; however, too many neurons will greatly diminish the network's ability to correctly interpolate the surface between the measured points. The weights of the network represent the control points of the defining polygon net used to generate the desired Bezier surface. The locations of the weights are determined by a least-mean-square (LMS) learning algorithm. Once the learning phase is complete, the weights can be used as control points for surface reconstruction by any CAD/CAM system that utilizes parametric modeling techniques.
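A one-dimensional analogue of the BBF network can be sketched directly from the description: the output is a weighted sum of Bernstein basis functions, the weights play the role of Bezier control points, and a per-sample LMS (delta-rule) update locates them. A full surface version would use a tensor product of two such bases; the learning rate and epoch count below are illustrative assumptions.

```python
import numpy as np
from math import comb

def bernstein_basis(n, t):
    """Bernstein polynomials B_{i,n}(t), i = 0..n, evaluated at t in [0, 1]."""
    return np.array([comb(n, i) * t**i * (1 - t)**(n - i) for i in range(n + 1)])

def fit_bezier_lms(ts, ys, degree, lr=0.5, epochs=500, seed=0):
    """LMS training of the control-point weights of a Bezier curve: the
    network output is a weighted sum of Bernstein basis functions."""
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.1, size=degree + 1)  # weights = control points
    for _ in range(epochs):
        for t, y in zip(ts, ys):
            phi = bernstein_basis(degree, t)
            w += lr * (y - w @ phi) * phi  # delta-rule (LMS) update
    return w
```

After training, `w` can be handed to any parametric modeler as the control polygon of the fitted Bezier curve, mirroring the hand-off to CAD/CAM systems described above.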
The determination of the regularization parameter is an important sub-problem in optimizing the performance of image restoration systems. The parameter controls the relative weighting of the data-conformance and model-conformance terms in the restoration cost function. A small parameter value would lead to noisy appearances in the smooth image regions due to over-emphasis of the data term, while a large parameter results in blurring of the textured regions due to dominance of the model term. Based on the principle of adopting small parameter values in the highly textured regions for detail emphasis while using large values for noise suppression in the smooth regions, a spatially adaptive regularization scheme was derived in this paper. An initial segmentation based on the local image activity was performed, and a distinct regularization parameter was associated with each segmented component. The regional value was estimated by viewing the parameter as a set of learnable neuronal weights in a model-based neural network. A stochastic gradient descent algorithm based on the regional spatial characteristics and the specific functional form of the neuronal weights was derived to optimize the regional parameter values. The efficacy of the algorithm was demonstrated by our observation of the emergence of small parameter values in textured regions and large values in smooth regions.
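The data-conformance/model-conformance trade-off can be illustrated on a 1-D signal with a single global parameter. The paper's scheme is spatially adaptive and learns the regional parameters with a neural network; this closed-form Tikhonov example only shows why small λ preserves detail while large λ smooths:

```python
import numpy as np

def restore(y, lam):
    """Minimize ||x - y||^2 + lam * ||D x||^2 for a 1-D signal y, where D is
    the first-difference operator (the model-conformance / smoothness term).
    Closed-form solution: x = (I + lam * D^T D)^{-1} y."""
    n = len(y)
    D = np.eye(n - 1, n, k=1) - np.eye(n - 1, n)  # first differences
    return np.linalg.solve(np.eye(n) + lam * D.T @ D, y)
```

Increasing `lam` strictly reduces the roughness of the restored signal, at the cost of conformance to the observed data.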
This paper describes a Markov random field (MRF) approach to image segmentation. Unlike most previous MRF techniques, which are based on pixel classification, this approach groups pixels that are similar. This removes the need to know the number of image classes. Mean field theory and multigrid processing are used in the subsequent optimization to find a good segmentation and to alleviate local-minimum problems. Variations of the MRF approach are investigated by incorporating features and schemes motivated by characteristics of the human visual system (HVS). Preliminary results are promising and indicate that multigrid and HVS-based features and schemes can significantly improve segmentation results.
It is proved analytically that, whenever the input-output mapping of a one-layered, hard-limited perceptron satisfies a positive linear independency (PLI) condition, the connection matrix A that meets this mapping can be obtained noniteratively, in one step, from an algebraic matrix equation containing an N × M input matrix U. Each column of U is a given standard pattern vector, and there are M standard patterns to be classified. It is also proved analytically that sorting out all nonsingular sub-matrices Uk of U can be used as an automatic feature extraction process in this noniterative-learning system. This paper reports the theoretical derivation, the design, and experiments of a superfast-learning, optimally robust neural network pattern recognition system utilizing this novel feature extraction process. An unedited video recording demonstrating the speed of learning and the robustness in recognition of this novel pattern recognition system is presented live. Comparison with other neural network pattern recognition systems is discussed.
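The paper's PLI-based construction is not reproduced here, but the generic noniterative one-step idea can be illustrated: when the columns of U (the standard pattern vectors) are linearly independent, a connection matrix satisfying AU = T for any target matrix T is obtained in a single algebraic step from the pseudoinverse, with no iterative training:

```python
import numpy as np

def one_step_weights(U, T):
    """Noniterative one-step learning: solve A U = T in the least-squares
    sense via A = T U^+, where the columns of U are the standard pattern
    vectors. When those columns are linearly independent, U^+ U = I and
    the mapping is met exactly."""
    return T @ np.linalg.pinv(U)
```

This sketch uses unconstrained linear targets rather than the hard-limited outputs and PLI condition of the paper; it only conveys why learning can be a single matrix computation.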
A practical approach to continuous-tone color image segmentation is proposed. Unlike traditional image segmentation algorithms, which tend to use threshold methods, we intend to show how a neural network technique can be successfully applied to this problem. A back-propagation network architecture was used in this work. It was assumed that each image pixel has its own color, which is somehow correlated with those of its nearest neighborhood. To describe the color properties of a given neighborhood, we suggest a nine-component feature vector for every image pixel. This set of feature components is applied to the network's input neurons. By this means, every image pixel is described by the following values: R, G, and B (color intensities); Mr, Mg, and Mb (average intensities of the nearest neighborhood); and σr, σg, and σb (r.m.s. deviations of the color intensities). To estimate the algorithm's efficiency, a scalar criterion was proposed. The results of a comparative experiment show that neural segmentation is more efficient than traditional threshold-based methods.
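The nine-component feature vector described above is straightforward to compute; a sketch in which the window radius and the border handling by clipping are illustrative choices:

```python
import numpy as np

def pixel_features(img, y, x, r=1):
    """Nine-component feature vector for pixel (y, x) of an RGB image:
    the pixel's R, G, B; the neighborhood means Mr, Mg, Mb; and the r.m.s.
    deviations sigma_r, sigma_g, sigma_b over a (2r+1) x (2r+1) window
    (clipped at the image border)."""
    patch = img[max(0, y - r):y + r + 1, max(0, x - r):x + r + 1]
    rgb = img[y, x].astype(float)
    flat = patch.reshape(-1, 3).astype(float)
    mean = flat.mean(0)                             # Mr, Mg, Mb
    rms = np.sqrt(((flat - mean) ** 2).mean(0))     # sigma_r, sigma_g, sigma_b
    return np.concatenate([rgb, mean, rms])
```

Each such vector would be presented to the nine input neurons of the back-propagation network.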