The need for hierarchical statistical tools for modeling and processing image data, as well as the success of Markov random fields (MRFs) in image processing, has recently given rise to significant research activity on hierarchical MRFs and their application to image analysis problems. Important contributions, relying on different models and optimization procedures, have thus been recorded in the literature. This paper presents a concise overview of available models and algorithms and attempts to clarify the vocabulary in this field. We propose to classify hierarchical MRF-based approaches into explicit and implicit methods, with appropriate subclasses. Each of these major classes is defined in the paper, and several specific examples of each class are described.
This paper presents an application of Markov random field theory to image coding. First, we use Markov random fields to model the correlation in image intensity fields. We then propose a noncausal predictive image coding scheme in which the estimate of the present pixel is based on both past and future neighboring pixels. A decoding algorithm is proposed to perfectly reconstruct the image from the estimation residuals at the decoder. Open-loop and closed-loop quantizer structures are implemented for noncausal prediction, and their performance is compared with conventional DPCM predictive coding.
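As a toy illustration of such a noncausal scheme (a sketch under our own simplifications, not the coder described above), consider a 1D signal in which each interior sample is predicted from the mean of its left and right neighbors; the decoder can then reconstruct the signal exactly from the residuals and the two boundary samples by solving the resulting tridiagonal system:

```python
def encode(x):
    # noncausal prediction: each interior sample is predicted
    # from the mean of its left and right neighbors
    return [x[i] - 0.5 * (x[i - 1] + x[i + 1]) for i in range(1, len(x) - 1)]

def decode(r, x0, xn):
    # recover interior samples by solving the tridiagonal system
    #   x[i] - 0.5*(x[i-1] + x[i+1]) = r[i]
    # with the Thomas algorithm (boundary samples x0, xn are known)
    n = len(r)
    a, b, c = -0.5, 1.0, -0.5          # sub-, main-, super-diagonal
    d = list(r)
    d[0] += 0.5 * x0                   # fold boundary values into the RHS
    d[-1] += 0.5 * xn
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = c / b, d[0] / b
    for i in range(1, n):              # forward elimination
        m = b - a * cp[i - 1]
        cp[i] = c / m
        dp[i] = (d[i] - a * dp[i - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):     # back substitution
        x[i] = dp[i] - cp[i] * x[i + 1]
    return [x0] + x + [xn]
```

Because the prediction is noncausal, decoding is not a simple recursive scan as in causal DPCM; here the linear system is solved in O(n).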
In this paper we address the intricate issue of recovering (long-range) velocity fields between consecutive frames of an image sequence. Within the Bayesian estimation framework, we design a global objective function to be minimized. This energy function is classically composed of two terms. The first reinforces the fragile modeling of the optical flow constraint equation by making use of robust estimators, while the second (a priori) term incorporates a discontinuity-preserving smoothness constraint. A multiresolution definition of this differential estimation method aims at capturing long-range displacements in a coarse-to-fine incremental way. The associated successive minimizations are carried out by a very efficient deterministic multigrid relaxation algorithm.
This paper presents a class of nonlinear hierarchical algorithms for the fusion of multiresolution image data in low-level vision. The approach combines nonlinear causal Markov models defined on hierarchical graph structures with standard Bayesian estimation theory. Two random processes defined on simple hierarchical graphs (quadtrees or 'ternary graphs') are introduced to represent the multiresolution observations at hand and the hidden labels to be estimated. Two optimal algorithms (inspired by Viterbi's algorithm) are developed on the quadtree structure. The first gives the exact solution for the MAP (maximum a posteriori) estimator. The second gives, on the same structure, the global minimum of an energy function more relevant than the MAP criterion. These algorithms are noniterative; estimates are obtained in two passes over the graph structure. They are compared to an extension of the multiscale algorithm proposed by Bouman et al., which is adapted here to multiresolution data fusion.
In this paper, a hierarchical version of Markov-random-field motion segmentation is used to detect moving objects in two successive images. The same hierarchical method, which guarantees convergence, is used both for the initialization of the image sequence and for the iterations between successive images. The model presented here is composed of two main parts: one defined at the label scale, i.e., the scale at which we want to estimate the final segmentation, and the other defined on regions. These two terms are optimized alternately, each using a relaxation method of the iterated-conditional-modes type.
The four classes of elementary operators of mathematical morphology (dilations, erosions, anti-dilations, and anti-erosions) have proved to be of fundamental importance for the decomposition/representation of any mapping between complete lattices. In this paper, we are concerned with the characterization of translation-invariant window elementary operators (with window W) that transform a gray-level image with finite range K1 into a gray-level image with a possibly different finite range K2. Three types of characterization are presented. In the first, called 'characterization by confrontation,' each elementary operator depends on a family of mappings from W to K1, called structuring elements. In the second, called 'characterization by selection,' each elementary operator depends on a family of mappings from W to K2, called impulse responses. Finally, in the third, called 'characterization by decomposition,' each elementary operator depends on a family of mappings from K1 to K2, called elementary look-up tables.
Computational mathematical morphology provides a zeta-function-based representation for windowed, translation-invariant image operators taking their values in a complete lattice. Image operators are induced via windowing by product-lattice operators and, in both the increasing and nonincreasing cases, these reduce to the classical logical representation for binary operators. The present paper presents the image-operator theory for increasing filters. In particular, it treats gray-to-binary and gray-to-gray morphological operators, as well as the representation of lattice-valued stack filters via threshold decomposition.
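The threshold-decomposition property for increasing (stack) filters mentioned above can be illustrated with a minimal 1D sketch (a window-3 median; function names are ours): filtering each binary threshold slice and summing the results reproduces the grayscale filter.

```python
def binary_median3(b):
    # binary window-3 median = majority vote; boundary samples kept
    out = b[:]
    for i in range(1, len(b) - 1):
        out[i] = 1 if b[i - 1] + b[i] + b[i + 1] >= 2 else 0
    return out

def gray_median3(x):
    # grayscale window-3 median; boundary samples kept
    out = x[:]
    for i in range(1, len(x) - 1):
        out[i] = sorted(x[i - 1:i + 2])[1]
    return out

def stack_median3(x, levels):
    # threshold decomposition: threshold at t = 1..levels-1,
    # filter each binary slice, then re-stack by summing
    slices = [binary_median3([1 if v >= t else 0 for v in x])
              for t in range(1, levels)]
    return [sum(s[i] for s in slices) for i in range(len(x))]
```

For integer samples in [0, levels-1], `stack_median3` and `gray_median3` agree sample for sample, which is exactly the stack-filter property of increasing operators.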
We propose a software architecture for picture processing that allows efficient memory management when algorithms with many operators are applied to large images, and that allows automated parallelization. This architecture relies on image tiling and on operators with a callback function that evaluates image tiles on demand. Several tiling strategies, with and without overlapping, are discussed. The complexity of this evaluation strategy is hidden from application programs, as shown with a sample program. The architecture is well suited to neighborhood operators such as convolutions and mathematical morphology operators.
Image extrema are often used for locating the structures present in an image. Their extraction and selection constitute a classic preprocessing problem in image segmentation. One of the most powerful morphological tools for selecting significant extrema in a grayscale image is their dynamics. However, a drawback of this technique is that minima and maxima (the dark and light structures they point out) are processed independently. We show in this paper that using the dynamics amounts to measuring the persistence of image minima (resp. maxima) when processing the image with an increasing (resp. decreasing) family of contrast filters. This principle can be generalized to any increasing family of morphological filters by reconstruction, and it leads to a general method for valuating image minima with respect to any criterion: size, shape, or contrast. The proposition still holds for families of alternating filters, whose main characteristic is their self-dual behavior. In this paper we concentrate on this point. A symmetrical equivalent of the dynamics is defined and an efficient computation technique is proposed. One of its key concepts is a merging tree of extrema. The usefulness of this notion in image segmentation applications is also illustrated.
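A rough 1D sketch of the dynamics of minima (assuming distinct sample values; names are of our choosing, not the paper's): the dynamics of a minimum is the least height one must climb along any path before reaching a strictly deeper value, and the global minimum is conventionally valued by the full signal range.

```python
def dynamics_of_minima(x):
    """Dynamics of each strict local minimum of a 1D signal x
    (assumes distinct sample values for simplicity)."""
    dyn = {}
    for i, v in enumerate(x):
        if (i > 0 and x[i - 1] <= v) or (i < len(x) - 1 and x[i + 1] <= v):
            continue                    # not a strict local minimum
        barriers = []
        for step in (-1, 1):            # climb left, then right
            peak, j = v, i + step
            while 0 <= j < len(x):
                peak = max(peak, x[j])
                if x[j] < v:            # reached a strictly deeper value
                    barriers.append(peak - v)
                    break
                j += step
        # global minimum: no deeper value exists in either direction
        dyn[i] = min(barriers) if barriers else max(x) - v
    return dyn
```

Minima with small dynamics are insignificant fluctuations; thresholding this value is the classical extrema-selection step mentioned above.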
The presence of diffuse far-infrared emission from interstellar dust, in the form of so-called 'Galactic cirrus,' has made the detection and flux determination of faint extragalactic sources difficult, especially near the Galactic plane. This effect is most serious at long infrared wavelengths around λ = 100 μm and is especially obvious in sky survey images made with the Infrared Astronomical Satellite (IRAS) at those wavelengths. We describe the development of a filter designed to remove the cirrus emission from IRAS images using classification and morphological operations. The technique, based upon 'sieving,' involves extracting the size information of the objects to form a growth cube and then classifying the growth information with the K-means method. This allows the cirrus emission to be distinguished from other forms of emission in the images. The growth characteristic of the cirrus is then used to remove the cirrus components from the growth, pixel by pixel for each field, making extragalactic infrared emission more observable. This filtering process was applied to various fields detected by IRAS, and the cirrus noise was filtered successfully.
It has been recently shown that morphological openings and closings can be viewed as consistent MAP estimators of morphologically smooth signals in i.i.d. noise. We revisit this viewpoint under a different set of assumptions, which allows the explicit incorporation of geometric and morphological constraints into the noise model, i.e., the noise may now exhibit geometric structure; surprisingly, it turns out that this affects neither the optimality nor the consistency of these filters.
The morphological top-hat transform is often used to locate bright peaks in a gray-scale image. The method can be problematic when there are two classes of peaks, one corresponding to valid objects and the other to noise. The present paper employs Bayesian estimation in conjunction with a multinomial distribution corresponding to levels of peak heights in the top-hat image to arrive at an optimal conditional-expectation estimator for the number of images in a random sample of images that contain a given number of valid peaks.
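For reference, the white top-hat itself can be sketched in 1D with a flat structuring element of odd length k (illustrative code, not the paper's Bayesian estimator): the transform subtracts the morphological opening from the signal, keeping only peaks narrower than the structuring element.

```python
def erode(x, k):
    # flat erosion: minimum over a window of length k (clipped at borders)
    h = k // 2
    return [min(x[max(0, i - h):i + h + 1]) for i in range(len(x))]

def dilate(x, k):
    # flat dilation: maximum over a window of length k (clipped at borders)
    h = k // 2
    return [max(x[max(0, i - h):i + h + 1]) for i in range(len(x))]

def white_tophat(x, k):
    # top-hat = signal minus its opening (erosion followed by dilation);
    # peaks narrower than k survive, wider structures are suppressed
    opening = dilate(erode(x, k), k)
    return [a - b for a, b in zip(x, opening)]
```

A narrow spike yields a nonzero top-hat response, while a plateau at least as wide as the structuring element is removed entirely; the estimator above then reasons about the heights of the surviving peaks.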
Assuming a random shape to be governed by a random parameter vector, a basic problem is to estimate the value of the parameter vector given some set of random features based on the random shape. The present paper considers this Bayesian estimation problem as one involving conditional densities of the random parameters conditioned by granulometric moments generated by linear granulometries. The conditional densities are interpreted as generalized functions and from these the optimal conditional-expectation estimates of the parameters given the granulometric moments are found.
This paper presents mathematical morphology tools for 3D image analysis, namely, the geodesic granulometries and the neck histogram. The family of openings which constitutes the geodesic granulometries is parameterized by the radius of the digital disks utilized as structuring elements. We demonstrate the validity of the granulometry thus obtained. The resulting granulometric distributions are determined by the underlying metric associated with the digital disks. Next we propose an algorithm to compute the neck histogram, an analysis tool that gives statistical information on the occurrence of constrictions in the object studied. Finally, we demonstrate the application of the proposed analysis tools to the characterization of a 3D experimental sample designed as a model for a porous medium.
In (linear or nonlinear) diffusive scale-space representations, local variations of the luminance field with respect to infinitesimal scale transitions are described via a first-order parabolic partial differential equation modeling a generalized diffusion process. A geometric characterization of the scale-space structure is then classically derived by analyzing the properties of the deformation flow induced by scale transitions along specific geometric structures embedded on the photometric surface. In particular, studying the simultaneous deformation of the dual families of curves consisting of isophotes and stream lines of the luminance field yields a Euclidean-invariant geometric description of generalized diffusion processes. In this paper, the generalized diffusion equation is interpreted within the framework of relativistic electromagnetic (EM) theory as a Lorentz gauge condition expressing the trace-invariance of an EM quadripotential with covariant scalar and contravariant vector components respectively related to luminance and geometric properties of the image. This gauge condition determines an EM quadrifield and quadricharge which satisfy Maxwell's equations. Deriving the general expressions of these quadrivectors as functions of Euclidean characteristics of isophotes and stream lines leads to identifying Lorentz invariants which synthesize, in an extremely compact form, intrinsic multiscale image properties. In addition, weak formulations of diffusive scale-spaces are consistently re-expressed in terms of EM energy density. The specific cases of linear scale-spaces, corresponding to purely electric fields, and of classical anisotropic diffusion models are studied in detail, providing significant insight into the deep structure of diffusive scale-spaces.
We describe a Bayesian estimator for simultaneous restoration and segmentation of images. The estimator is based on a pixel-line Markov random field and is computed by using an efficient approximation. The approximation is based on locality of interactions within the Markov random field. An example, the simultaneous restoration and segmentation of a medical tomographic image, is described.
A model-based method for reconstructing the 3D structure of icosahedrally-symmetric viruses from solution x-ray scattering is presented. An example of the reconstruction, for data from cowpea mosaic virus, is described. The major opportunity provided by solution x-ray scattering is the ability to study the dynamics of virus particles in solution, information that is not accessible to crystal x-ray diffraction experiments.
A VLSI-implementable, massively parallel, digital, stochastic neural network architecture with on-chip learning is described, which can be used to address video/image compression applications. Color information in images is usually encoded as a means to reduce the data needed to transmit high-resolution video images over a lower-bandwidth communication system. A neural net approach can be used for such a compression scheme, since it can easily be trained to map a set of patterns from a k-dimensional space to a 1D space. The training algorithm is implemented as a cross-correlation between previously calculated weights and global errors. These simple calculations are performed by each processing unit with information available in its local memory; thus, no transfer of information between neurons is necessary. This feature allows for the synchronous updating of all the weights, which exploits the inherent parallel nature of the neural network architecture, making the design an excellent candidate for VLSI implementation as an SIMD architecture. By incorporating the learning portion on-chip, an appreciable reduction in computing time is possible. The stochastic nature of the learning algorithm, together with a simulated 'annealing' process, allows it to converge to a global minimum. The architecture is synthesized from an HDL description using powerful design and analysis tools, which allow many performance trade-offs and fault-coverage analyses to be carried out by the compiler before the final design is sent for fabrication.
Approaches to imaging problems such as maximum likelihood estimation or Bayesian decision rules involve massive amounts of data. To make computer implementation attainable and not overly CPU-intensive, approximations to optimal solutions are often chosen, or nonoptimal solutions sought. In this paper we present a novel approach to computing a maximum a posteriori estimate that uses genetic algorithms to search the solution space, together with a new statistical model called partially ordered Markov models (POMMs). We apply the procedure to the problem of parameter fitting for stochastic texture models. POMMs are a subclass of Markov random fields that have been shown to offer computational advantages over general Markov random fields. POMMs are based on partial orderings of the lattice array. Among other properties, these models have an exact closed-form joint distribution. We show that POMMs can be used successfully for parameter fitting to texture data. A genetic algorithm is used to approximate the maximum likelihood estimate. We also show simulated textures representing samples of the solutions found.
This paper presents a comparative study of three deterministic unsupervised image segmentation algorithms. All three algorithms make use of a Markov random field (MRF) and try to obtain an approximate solution to the maximum likelihood or maximum a posteriori estimates. Although the three algorithms are based on the same stochastic image models, they adopt different ways of incorporating model parameter estimation into the iterative region-label updating procedure. The differences among the three algorithms are identified, and their convergence properties are compared both analytically and experimentally.
The Bayesian approach combined with Markov random fields provides a powerful and consistent mathematical framework for taking a priori knowledge into account and for regularizing ill-posed problems. Applied to 3D x-ray vascular reconstruction, such a combined approach requires a 3D object model describing the vascular tree. To take into account the characteristic features of blood vessels, the proposed model performs a form of shape analysis in order to estimate the non-stationary parameters of the Markovian model. The global energy function is then expressed as a weighted combination of an adaptive smoothing potential, which favors smoothing along the vessel direction; an enhancing potential, which increases the contrast of small vessels; and a data-dependent term based on the difference between the reprojection of the 3D reconstructed object and the observed projections.
Binary image analysis problems can be solved by set operators implemented as programs for a binary morphological machine (BMM). This is a very general and powerful approach to this type of problem. However, the design of these programs is not a task manageable by nonexperts in mathematical morphology. In order to overcome this difficulty, we have worked on tools that help users describe their goals at higher levels of abstraction and translate them into BMM programs. Some of these tools are based on representing the user's goals as a collection of input-output pairs of images and estimating the target operator from these data. PAC learning is a well-suited methodology for this task, since in this theory 'concepts' are represented as Boolean functions that are equivalent to set operators. In order to apply this technique in practice, we must have efficient learning algorithms. In this paper we introduce two PAC learning algorithms, both based on the minimal representation of Boolean functions, which has a straightforward translation to the canonical decomposition of set operators. The first algorithm is based on the classical Quine-McCluskey algorithm for the simplification of Boolean functions, and the second is based on a new idea for the construction of Boolean functions: the incremental splitting of intervals. We also present a comparative complexity analysis of the two algorithms. Finally, we give some application examples.
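The classical Quine-McCluskey simplification underlying the first algorithm can be sketched as follows (a minimal prime-implicant extraction over bit-string minterms, with '-' marking an eliminated variable; this is our own illustration, not the paper's implementation):

```python
def combine(a, b):
    # merge two implicants that differ in exactly one fully specified bit,
    # e.g. '00' and '01' -> '0-'; return None if they cannot merge
    diff = [i for i, (p, q) in enumerate(zip(a, b)) if p != q]
    if len(diff) == 1 and '-' not in (a[diff[0]], b[diff[0]]):
        i = diff[0]
        return a[:i] + '-' + a[i + 1:]
    return None

def prime_implicants(minterms):
    # iterate the merging step until no pair combines;
    # implicants that never merged are prime
    current, primes = set(minterms), set()
    while current:
        merged, used = set(), set()
        for a in current:
            for b in current:
                c = combine(a, b)
                if c:
                    merged.add(c)
                    used.update({a, b})
        primes |= current - used
        current = merged
    return primes
```

Each prime implicant corresponds, in morphological terms, to an interval of the canonical decomposition of the learned set operator.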
The pattern spectrum method of moments has been used by the author to estimate the shape distribution in random binary images with unknown grain size. Simultaneous estimation of the parameters of the grain size distribution and the mixture proportions of the specified shapes required the solution of simultaneous nonlinear equations and was demonstrated only for a small dimensionality. The generalization of the method permits a larger dimensionality by simplifying the estimation equations. Using an iterative approach for a normal grain size distribution, the estimation equations are shown to be: 1) linear in the proportions, 2) cubic in the mean μ, and 3) quadratic in the variance σ² at each stage. An example demonstrates that the iterative process converges rapidly to an accurate estimate of the parameters.
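For 1D binary signals, the pattern spectrum underlying this method of moments can be sketched as follows (opening by flat line segments; illustrative code, not the author's estimator): PS(k) is the area removed between the openings of sizes k and k+1, and granulometric moments follow from the normalized spectrum.

```python
def open_line(x, k):
    # opening of a binary sequence by a flat line segment of length k:
    # runs of 1s shorter than k are removed, longer runs are kept intact
    n = len(x)
    out = [0] * n
    i = 0
    while i < n:
        if x[i]:
            j = i
            while j < n and x[j]:
                j += 1                  # scan the whole run of 1s
            if j - i >= k:
                for t in range(i, j):
                    out[t] = 1
            i = j
        else:
            i += 1
    return out

def pattern_spectrum(x, kmax):
    # PS(k) = area removed between the openings of sizes k and k+1
    areas = [sum(open_line(x, k)) for k in range(1, kmax + 2)]
    return [areas[k] - areas[k + 1] for k in range(kmax)]
```

A run of length L contributes L to PS(L), so the spectrum directly encodes the grain-size distribution from which the moments in the estimation equations are computed.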
This paper discusses the application of high-order neural networks (HONNs) to image recognition and the enhancement of digitized images. A key property of neural networks is their ability to recognize invariances and extract essential parameters from complex high-dimensional data. The most significant advantage of the HONN over first-order networks is that invariances to geometric transformations can be incorporated into the network and need not be learned through iterative weight updates. A third-order HONN can be used to achieve translation-, scale-, and rotation-invariant recognition with a significant reduction in training time over other neural net paradigms such as the multilayer perceptron. We have developed a model based on a third-order net that can be trained with various images. Simulation results show that the model performs very well with images embedded in noise. It is also shown that this method outperforms the Hamming net. Our model has also been applied to another difficult and computationally complex problem: human face recognition. We put forth arguments for the use of isodensity information in the recognition algorithm. A method of image recognition that fuses isodensity information and neural networks is described, and its merits over other image recognition methods are expounded. It is shown that isodensity information coupled with an 'adaptive threshold' strategy yields a system that is to a high degree unperturbed by image contrast noise. Simulation results for these applications are presented in the paper.
In this paper we examine the use of geometric modeling in grouping and classification. A neural network approach is suggested to combine the completeness of the information provided by a geometric modeler with the uncertainty inherent in the grouping process. The utilization of highly connectionist systems in such an environment is investigated, and the advantages and shortcomings of these systems are discussed. A mechanism of cellular atrophy, as observed in biological systems, is proposed for highly connectionist systems, and the effect of the atrophy criterion on system size and efficiency is examined.
A neural network architecture capable of assisting in, and providing valuable information for, diagnosing faults in digital circuits is presented here. Once a digital circuit has been constructed, it must be tested to determine how well it works. The neural network architecture presented here is useful for a wide variety of circuits, both analog and digital, and is efficient for guiding the search and forming fault and test hypotheses in the diagnostic troubleshooting of electronic circuits.
Traditional neural networks such as backpropagation networks begin with a set of decision boundaries and optimize the network by moving those boundaries. The problem with this approach is that a large number of iterations is required and the network can easily get stuck in a local minimum. The algorithm presented here rapidly creates boundaries when necessary and destroys them when they become obsolete. Optimization is achieved by a 'survival of the fittest' approach to the boundaries. Since the individual boundaries are not optimized, the algorithm does not require iterations and trains the network very quickly. The algorithm is well suited to high-dimensional analog inputs and analog outputs.
Mathematical morphology (MM) is one of the most efficient tools in advanced digital image processing. Morphological techniques have been successfully applied to image analysis, smoothing, enhancement, edge detection, skeletonization, filtering, and segmentation (watershed algorithms). Two essential operations of MM, dilation and erosion, can be implemented in several different ways. In this paper we propose their effective implementation using a higher-order neural network (functional-link network) approach. The novel structure and its learning method are presented. Some other neural network methods for MM operations are described and compared with our approach.
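One common way to phrase these two operations in neural terms (a sketch under our own conventions, not necessarily the functional-link construction of the paper) is the max-plus/min-plus 'morphological neuron', in which the structuring element plays the role of the weight vector:

```python
def morph_neuron_dilate(x, w):
    # dilation as a max-plus 'neuron': out[i] = max_j (x[i+j-h] + w[j]),
    # where w is an odd-length structuring element centered at h
    h = len(w) // 2
    n = len(x)
    out = []
    for i in range(n):
        vals = [x[i + j - h] + w[j] for j in range(len(w))
                if 0 <= i + j - h < n]          # clip at the borders
        out.append(max(vals))
    return out

def morph_neuron_erode(x, w):
    # erosion as a min-plus 'neuron': out[i] = min_j (x[i+j-h] - w[j])
    h = len(w) // 2
    n = len(x)
    out = []
    for i in range(n):
        vals = [x[i + j - h] - w[j] for j in range(len(w))
                if 0 <= i + j - h < n]
        out.append(min(vals))
    return out
```

With a flat structuring element (all-zero weights), these reduce to sliding maximum and minimum filters; learning the weights w is then analogous to training an ordinary neuron, with (max, +) replacing (sum, ×).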
Nonlinear extrapolation of the image spectrum is presented as a consequence of new mathematical results on positive definite extension, or covariance extension. The main contribution of this method is its ability to incorporate a priori knowledge such as positivity or edges detected early in the image. The result is a gain in resolution when spatial or MRI images are considered.