Grayscale morphology has been widely used in image processing, especially in noise removal. In this paper, we find an optimal solution for designing a grayscale morphological filter. An adaptive algorithm is developed for determining, from a given class of grayscale morphological filters, the filter that minimizes the mean square error between its output and a desired process. The adaptation, using the conventional least mean square algorithm, optimizes the grayscale structuring element over a given search area. The noise-removal performance is compared with that of another class of nonlinear filters, namely adaptive and nonadaptive stack-based filters.
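As a minimal sketch (not the paper's algorithm), the 1-D gray-scale dilation and erosion over which such a structuring element would be adapted can be written as:

```python
def gray_dilate(f, g):
    """1-D gray-scale dilation: (f (+) g)(i) = max_k { f(i - k) + g(k) }."""
    n, m = len(f), len(g)
    return [max(f[i - k] + g[k] for k in range(m) if 0 <= i - k < n)
            for i in range(n)]

def gray_erode(f, g):
    """1-D gray-scale erosion: (f (-) g)(i) = min_k { f(i + k) - g(k) }."""
    n, m = len(f), len(g)
    return [min(f[i + k] - g[k] for k in range(m) if 0 <= i + k < n)
            for i in range(n)]
```

An LMS-style adaptation would then nudge each entry of `g` along the gradient of the squared output error; those details are illustrative assumptions, not the paper's procedure.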
This paper analyzes restoration of subtractive noise on a binary image by a single morphological operation, dilation. Restoration by dilation alone is appropriate under particular explicitly defined random noise models, based respectively on erosion, independent-pixel subtractive noise, and independent-pixel subtractive noise followed by dilation. Since in general it is not possible to restore subtractive noise perfectly, we use the Hausdorff metric to measure the residual error in restoration. This metric is the appropriate one because of its geometric interpretation in terms of set coverings. We describe a search procedure to find a structuring element for dilation that is optimal in the sense of minimizing the mean Hausdorff error. The search procedure's utility function is based on the calculation of certain probabilities related to the noise model, namely the probability of one set being a subset of another and some related probabilities.
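For illustration, the Hausdorff metric between two finite pixel sets can be computed directly; the choice of the Chebyshev ground distance here is an assumption, since the abstract does not fix one:

```python
def hausdorff(A, B):
    """Hausdorff distance between two finite point sets,
    using the Chebyshev (chessboard) ground distance."""
    def d(p, q):
        return max(abs(p[0] - q[0]), abs(p[1] - q[1]))
    def directed(X, Y):
        # Worst-case distance from a point of X to its nearest point of Y.
        return max(min(d(x, y) for y in Y) for x in X)
    return max(directed(A, B), directed(B, A))
```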
Serra calls a particular morphological operation on a binary image a thickening; it is simply the union of the image with a hit-or-miss transform. The definition can be extended to a gray-level image by applying the operation to each binary level in a threshold decomposition. A difficulty is that the operation is not increasing, so the resulting threshold decomposition consists of stacks that contain holes. In this sense, the thickening operator leaves an image that is a multivalued function. A single-valued function can be defined as a projection of the stacks onto the spatial dimension. This is called a projected thickening, and it is different from the traditional umbra representation of functions.
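A minimal sketch of threshold decomposition and stack reconstruction for a 1-D gray-level signal (illustrative, not the paper's construction):

```python
def threshold_decompose(f, levels):
    """Binary cross-sections X_t = {i : f(i) >= t} for t = 1..levels."""
    return [[1 if v >= t else 0 for v in f] for t in range(1, levels + 1)]

def stack_reconstruct(slices):
    """Summing the stacked binary slices recovers the gray-level signal
    when the stack has no holes (each column is 1s followed by 0s)."""
    return [sum(col) for col in zip(*slices)]
```

A non-increasing operator applied per level can break the no-holes property, which is exactly the difficulty the abstract describes.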
The binary hit-or-miss transform is applied to filter digital gray-scale signals. This is accomplished by applying a union of hit-or-miss transforms to an observed signal's umbra and then taking the surface of the filtered umbra as the estimate of the ideal signal. The hit-or-miss union is constructed to provide the optimal mean-absolute-error filter for both the ideal signal and its umbra. The method is developed in detail for thinning hit-or-miss filters and applies at once to the dual thickening filters. It requires that the output of the umbra filter be an umbra, which in general need not hold. A key aspect of the paper is the complete characterization of umbra-preserving union-of-hit-or-miss thinning and thickening filters. Taken together, the mean-absolute-error theory and the umbra-preservation characterization provide a full characterization of binary hit-or-miss filtering as applied to digital gray-scale signals via the three-dimensional binary hit-or-miss transform.
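A minimal sketch of the binary hit-or-miss transform, with the structuring element given as explicit "hit" and "miss" offset lists (an illustrative formulation, not the paper's):

```python
def hit_or_miss(img, hits, misses):
    """Binary hit-or-miss: a pixel fires iff the image is 1 at every
    'hit' offset and 0 at every 'miss' offset (out-of-bounds reads 0)."""
    h, w = len(img), len(img[0])
    def at(r, c):
        return img[r][c] if 0 <= r < h and 0 <= c < w else 0
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if (all(at(r + dr, c + dc) == 1 for dr, dc in hits) and
                    all(at(r + dr, c + dc) == 0 for dr, dc in misses)):
                out[r][c] = 1
    return out
```

A thickening then unions this output with the original image, and a thinning subtracts it.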
In this paper, we analyze the deterministic and statistical properties of soft morphological filters and their extensions. This analysis offers methods for designing well-performing soft morphological filters. We derive some detail-preservation properties and study the noise-attenuation properties of certain filters. Special attention is paid to the effect of varying parameters on the behavior of the filters. Understanding these effects is essential in designing optimal filters.
The classical Matheron representation of gray-scale filters considers images to possess infinite range, in particular so that they are range translation invariant. A key aspect of the theory is that binary morphology algebraically embeds into gray-scale with binary images possessing gray range minus infinity and zero. While this structure causes no algebraic problems, it does create both topological and probabilistic difficulties. In particular, the theory of optimal gray-scale filters does not contain the theory of optimal binary filters as a special case, and the optimal gray-scale filter takes finite-range images, say [0,M], and yields images with range [-M,2M]. These anomalies are mitigated by the theory of computational morphology. Here, morphological filters preserve the gray range and possess very simple Matheron-type representations. Besides range preservation for finite-range images, the key difference in computational morphology is that a filter possesses a vector of bases, not a single basis. The salient feature remains, that of the filter being represented in terms of erosions.
Mean-absolute-error-optimal, finite-observation, translation-invariant, binary-image filters have previously been characterized in terms of morphological representations: increasing filters as unions of erosions and nonincreasing filters as unions of hit-or-miss operators. Based upon these characterizations, (sub)optimal filters have been designed via image-process realizations. The present paper considers the precision of filter estimation via realizations. A key point: while precision deteriorates for both erosion and hit-or-miss filters as window size increases, the number of image realizations required to obtain good estimation in erosion-filter design can be much less than the number required for hit-or-miss-filter design. Thus, while in theory optimal hit-or-miss filtering is better because the unconstrained optimal hit-or-miss filter is the conditional expectation, owing to estimation error it is very possible that estimated optimal erosion filters are better than estimated optimal hit-or-miss filters.
We present a modification to the standard least-median-of-squares (LMedS) algorithm for the removal of outliers in range data that retains the merits of the conventional approach. The surface model generated around one point may be shared by other points in its neighborhood; this substantially reduces the time the algorithm needs to generate local surface models. Like the conventional LMedS approach, the algorithm is most applicable to data points from locally smooth surfaces.
In linear predictive coding, and especially in adaptive linear predictive coding, channel errors tend to propagate, destroying significant portions of an image. In this paper, an adaptive weighted median predictive coding scheme that is robust against channel errors is proposed. Neither overhead information nor error correction coding is needed. The scheme is applied to real images and is shown to provide much improved performance at high bit error rates (BER).
In this paper, an adaptive WMMR filter is introduced, which adaptively changes its window size to accommodate edge-width variations. We prove that iterative application of the adaptive WMMR filter to any given one-dimensional input signal converges to fixed points, which are PICO (piecewise constant) signals. An application of the filters to one-D (non-PICO) data and to images of printed circuit boards is then provided. Application to images in general is discussed.
In this paper, we introduce the value-and-criterion filter structure and give an example of a filter with this structure. The value-and-criterion filter structure is based on morphological opening (or closing), which is actually two filters applied sequentially: the first assigns values based on the original image values, and the second assigns values based on the results of the first. The value-and-criterion structure is similar, but includes an additional step in parallel to the first that computes a different set of values to use as criteria for selecting a final value. Value-and-criterion filters have a "value" function (V) and a "criterion" function (C), each operating separately on the original image, and a "selection" operator (S) acting on the output of C. The selection operator chooses a location from the output of C, and the output of V at that point is the output of the overall filter. The value-and-criterion structure allows the use of different linear and nonlinear elements in a single filter, while also providing the shape control of morphological filters. An example of a value-and-criterion filter is the mean of least variance (MLV) filter, which we define by taking V to be the mean, C the variance, and S the minimum. The MLV filter resembles several earlier edge-preserving smoothing filters, but performs better and is more flexible and efficient. The MLV filter smooths homogeneous regions and enhances edges, and is therefore useful in segmentation algorithms. We illustrate its response to various image features and compare it to the median filter on different biomedical images.
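A hedged 1-D sketch of the MLV filter (V = mean, C = variance, S = minimum); clipping the windows at the signal borders is an assumption:

```python
def mlv_filter(f, radius=1):
    """Mean of least variance, 1-D sketch: among all windows of width
    2*radius + 1 that contain sample i, output the mean (V) of the
    window whose variance (C) is smallest (S = minimum)."""
    n = len(f)
    out = []
    for i in range(n):
        best = None
        for s in range(i - 2 * radius, i + 1):   # every window start covering i
            w = f[max(s, 0):min(s + 2 * radius + 1, n)]
            m = sum(w) / len(w)
            var = sum((x - m) ** 2 for x in w) / len(w)
            if best is None or var < best[0]:
                best = (var, m)
        out.append(best[1])
    return out
```

On a step edge the lowest-variance window lies entirely on one side of the edge, which is why the filter smooths flat regions without blurring the step.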
Nonlinear digital filtering methods based on threshold decomposition and positive Boolean functions are becoming widely used, especially in image processing and biomedical signal processing, both of which usually involve large amounts of data. It is thus important to develop fast algorithms to realize weighted median, weighted order statistic, and more general stack filtering operations. The amounts of data are so large that fast algorithms are needed no matter how powerful the available computers are. In this paper we introduce new algorithms for some of these problems and discuss the use of logical matrix algorithms for the solution of nonlinear digital filtering problems.
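For reference, the weighted median itself can be computed directly; by the stacking property this agrees with applying a weighted-majority Boolean function at every threshold level (a sketch, not the paper's fast algorithm):

```python
def weighted_median(samples, weights):
    """Weighted median: the smallest sample x such that the total weight
    of samples <= x reaches half of the overall weight."""
    total = sum(weights)
    acc = 0
    for x, w in sorted(zip(samples, weights)):
        acc += w
        if 2 * acc >= total:
            return x
```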
Vector directional filters (VDF) for multichannel image processing are introduced and analyzed in this paper. These filters separate the processing of vector-valued signals into directional processing and magnitude processing. This provides a link between single-channel image processing, where essentially only magnitude processing is performed, and multichannel image processing, where both the direction and the magnitude of the image vectors play an important role in the resulting (processed) image. The intuitive motivation behind VDF is explained, and properties of these filters are derived. It is shown that these filters share many properties with the vector median filter (VMF). The specific case of color image filtering is studied as an important example of multichannel image processing. It is shown that VDF can achieve very good filtering results for various noise source models.
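A minimal sketch of the basic VDF selection rule, assuming nonzero input vectors (e.g., RGB triples in a filter window):

```python
import math

def vdf(vectors):
    """Basic vector directional filter: return the input vector whose
    direction has the least summed angular distance to all the others.
    Assumes every vector is nonzero."""
    def angle(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return math.acos(max(-1.0, min(1.0, dot / (nu * nv))))
    return min(vectors, key=lambda u: sum(angle(u, v) for v in vectors))
```

The output keeps its original magnitude; a separate magnitude-processing stage can then be applied, which is the separation the abstract describes.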
The proposed design is a circuit that determines the order statistics of an arbitrary-length string of N numbers as the string is acquired. The computational time complexity of the design is an optimal O(N). Order-statistic filtering, e.g., median filtering, is a valuable method for suppressing, and quite often eliminating, impulse noise while preserving the edges of objects in an image. Two other closely related methods of order-statistic filtering, stack filters and morphological algebra, are shown to be very limited with regard to the number of template or window elements that can be used in the filtering process and impractical for two-dimensional imagery. The comparator-stack circuit does not have this restriction. Also, the comparator-stack method operates at or near the same cost as a single convolution operation with the same template configuration, thus O(N).
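A software analogue of the comparator-stack idea (illustrative only; the paper describes a hardware circuit) maintains a sorted buffer as samples arrive and reads off the desired order statistic after each insertion:

```python
import bisect

def streaming_order_stat(stream, k):
    """After each incoming sample, emit the k-th order statistic
    (0-based, descending from the top once k values exist) of all
    samples seen so far; insertion costs O(N) comparisons at worst."""
    buf, out = [], []
    for x in stream:
        bisect.insort(buf, x)                  # keep buffer sorted
        out.append(buf[min(k, len(buf) - 1)])  # clamp until k+1 samples exist
    return out
```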
Morphological filtering is known for its flexibility in locally modifying geometrical features of three-dimensional data, or image functions. This paper addresses adaptive thresholding of multilevel image functions to extract application-specific features from grayscale images. By adaptive thresholding, we mean that the process of binarizing grayscale images is locally adjusted. The geometric features to be extracted are furnished by the application requirements; e.g., a binary version of a photo may be needed to extract the letters of a car license plate, such that the binarized image specifically represents the information about the letters on the plate while ignoring other background information. A contour function is used as the adaptive thresholding layer for the grayscale image. After the first thresholding, a binarized version is obtained, and local geometric parameters of the binary image are then measured through a skeletonization process. The parameters from skeletonization are compared with the feature descriptions, and the contour function is redefined and used to adaptively threshold the grayscale image again. A skeletonization process is then applied to the binarized image to extract local geometric parameters meeting the application-specific requirement. Applications of the developed adaptive thresholding algorithm include examples in text image binarization, binarization of object features against a surrounding background, and glass flaw detection.
A common problem with binary images generated by a segmentation algorithm is splitting the domains into different objects, or an object into different parts. While it is easy to do this interactively by drawing lines in the image, it is a much more difficult task to formulate rules for this operation in a computer language and thus automate the procedure. This paper presents a fast algorithm that yields results very similar to an interactive splitting procedure for the domains of a binary image. The algorithm is based on watershed segmentation using distance transformations. We let a pixel belong to a watershed line if at least two of its neighbors belong to differently labeled segments. We have criteria for relabeling segments that do not become large enough to form segments of their own. After all pixels have been labeled, we replace every watershed line with the lines of shortest distance. The algorithm preserves the shape and the number of segments with good accuracy and is also independent of how the domains are rotated in the image.
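The watershed-line membership rule stated above can be sketched directly; 4-connectivity and label 0 for background are assumptions:

```python
def watershed_pixels(labels):
    """Mark a pixel as lying on a watershed line if at least two of its
    4-neighbors carry different nonzero segment labels."""
    h, w = len(labels), len(labels[0])
    line = [[False] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            nb = {labels[r + dr][c + dc]
                  for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
                  if 0 <= r + dr < h and 0 <= c + dc < w
                  and labels[r + dr][c + dc] != 0}
            line[r][c] = len(nb) >= 2
    return line
```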
A line sketch of a 3-dimensional scene provides important information about objects in the scene and about their spatial relationships. Considerable effort has been reported in the literature describing methods of extracting line sketches from 2-dimensional projected images of scenes. In this paper a projection based approach is described for detecting boundaries in an image having linear segments. This method can be utilized for the recognition of polyhedral objects or for the identification of familiar objects for mobile robot location. The computational performance of this algorithm is greatly enhanced by operating directly on the gray-scale image and by starting with a coarse (low resolution) image and moving successively to higher resolutions at regions of interest (i.e., brightness discontinuities). A complete circular projection of the higher level image is computed and the discontinuities in this projection are located using the derivatives of the projection at fixed angles. Peaks in this function correspond to lines in the original image. A vertex extraction algorithm has been developed to establish the edge extent in the image. These methods have been implemented and tested with real and synthetic images and are believed to be better than methods that use edge enhancement/thresholding and then employ the Hough transform to isolate lines in the image.
Pattern theory is a combination of pattern recognition, machine learning, switching theory, and computational complexity technologies with the central theme that the pattern in a function can be found by minimizing the complexity of a particular generalized representation. The sense of `pattern' used in pattern theory has been demonstrated to be very robust. This paper develops a pattern theoretic approach to image restoration. We assume that an original, patterned, binary image has been corrupted by additive noise and is given as a gray-scale image. The decision theoretic approach to restoration would be simply to threshold the gray-scale image to regain a binary image. The pattern theoretic approach is to use two thresholds. These thresholds separate the pixels into three classes: pixels that were very probably white, pixels that were very probably black, and pixels that we are less certain about. We then use only those pixels that we are confident about and find the pattern based on those pixels. Finally, we use this pattern to extrapolate through the pixels that are uncertain. The amount of noise that can be abated depends on the strength of the underlying pattern. This relationship is developed for uniform and normal noise distributions.
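The two-threshold step can be sketched as follows; the threshold values in the usage example are hypothetical:

```python
def three_way_threshold(img, lo, hi):
    """Classify each pixel as confidently black (0), confidently
    white (1), or uncertain (None) using two thresholds lo < hi."""
    return [[0 if v <= lo else 1 if v >= hi else None for v in row]
            for row in img]
```

The pattern found from the 0/1 pixels is then used to fill in the `None` entries.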
A nonlinear filtering technique is developed for the extraction of quasi-linear features from images. The technique begins with a line segment image transform, which is a windowed version of the Radon transform. The original image is divided into overlapped subimages using a simple analysis filter, and a Radon transform is applied to each subimage, yielding a representation of the subimage in terms of line segments at varying angles and positions. The amplitude of each line segment is used to calculate the detection statistic for that line segment. A nonlinear filter is used to pass line segments whose detection statistics are above a specified threshold (it is shown that a threshold is an optimal Bayes estimate when the signal is intermittent). The filtered subimage is then reconstructed.
A gray tone image taken of a real scene will contain inherent ambiguities due to light dispersion on the physical surfaces. The neighboring pixels may have very different intensity values and yet represent the same surface region. In this paper, a fuzzy set theoretic approach to representing, processing, and quantitatively evaluating the ambiguity in gray tone images is presented. The gray tone digital image is mapped into a two-dimensional array of singletons called a fuzzy image. The value of each fuzzy singleton reflects the degree to which the intensity of the corresponding pixel is similar to the neighboring pixel intensities. The inherent ambiguity in the surface information can be modified by performing a variety of fuzzy mathematical operations on the singletons. Once the fuzzy image processing operations are complete, the modified fuzzy image can be converted back to a gray tone image representation. The ambiguity associated with the processed fuzzy image is quantitatively evaluated by measuring the uncertainty present both before and after processing. Computer simulations are presented in order to illustrate some of these notions.
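A hedged sketch of mapping a gray-tone image to fuzzy singletons; the particular similarity measure (distance to the 3x3 neighborhood mean, scaled by a hypothetical `delta`) is an assumption, since the paper's exact membership function is not given here:

```python
def to_fuzzy(img, delta=32):
    """Map each pixel to a membership value in [0, 1] that reflects
    how similar it is to its 3x3 neighborhood mean (clipped at 0)."""
    h, w = len(img), len(img[0])
    fz = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            nb = [img[rr][cc]
                  for rr in range(max(r - 1, 0), min(r + 2, h))
                  for cc in range(max(c - 1, 0), min(c + 2, w))]
            mean = sum(nb) / len(nb)
            fz[r][c] = max(0.0, 1.0 - abs(img[r][c] - mean) / delta)
    return fz
```

Fuzzy operations (intensification, union, complement, etc.) would then act on these singleton values before defuzzifying back to gray tones.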
This paper presents a procedure for detecting the local pattern orientation of fingerprint images, together with a reliable method for detecting their singular points. A fingerprint is a directional image consisting of many ridges in different directions, and its structural information lies in the position and direction of its constituent ridges. The core part of a fingerprint processing system is detecting local ridge orientations, which is done using directional operators. The area of support of these operators is a major concern: with a large area, details of flow deflection are not sensed, while with a small area of support, the direction is not detected correctly because of noise in the image. To take advantage of both small and large area sizes, we consider several area sizes in a hierarchical order. Starting with a large area size, the pattern orientation is detected at every block of the image and is used to validate and correct the directions of smaller subblocks at the next level of the hierarchy. This procedure continues until reliable pattern directions are obtained at sufficiently small blocks. We also address the problem of detecting the singular points of fingerprints and show that they are detected correctly and accurately using this hierarchical approach.
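A standard way to estimate a block's dominant orientation from its pixel gradients is the doubled-angle average; this is a common choice for fingerprint directional operators, though not necessarily the paper's exact operator:

```python
import math

def block_orientation(gx, gy):
    """Dominant orientation of a block from pixel gradient components
    (gx, gy), via the doubled-angle (structure-tensor) average.
    Returns an angle in radians in (-pi/2, pi/2]."""
    sxx = sum(x * x - y * y for x, y in zip(gx, gy))
    sxy = sum(2 * x * y for x, y in zip(gx, gy))
    return 0.5 * math.atan2(sxy, sxx)
```

Doubling the angles makes opposite gradient directions reinforce rather than cancel, which is what makes the average meaningful for ridge patterns.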
A fuzzy quantization technique is proposed for the color quantization problem. The proposed method efficiently exploits the human visual perception of color. The developed technique successfully selects a limited set of colors from the visual spectrum while maintaining excellent image quality when the color image is displayed on a graphics system with a small number of display colors. Experimental results are provided for comparison with traditional color quantization techniques.
An adaptive multi-dimensional interpolation technique for irregularly gridded data based on a regularized linear spline is described. The regularization process imposes a penalty or energy function which depends upon a sum of quadratic functions of the error at the data points and the gradient and curvature of the surface. The weighting of a given term in the penalty function is made to depend non-linearly on the first and second differences in the regularly gridded interpolation of the data. As a result the method is able to provide an interpolation which is sensitive to the local behavior of the data being interpolated. For example, data containing a discontinuity or crease can be smoothed to reduce noise without smoothing the discontinuity or crease. For the 2-D problem, the technique is analogous to a rectangular grid of stiff extensible rods defining an interpolation surface, with springs resisting the displacement of the surface from the known data values, the extension of the rods, and the bending of one rod with respect to another. The weights in the penalty function are equivalent to a non-linear spring characteristic for which the spring constant is reduced at large displacements. For a given set of weights, the penalty function is quadratic. This leads to a set of linear equations which can be solved efficiently using iterative techniques. Implementations of the technique for irregular 2-D and 3-D data are described and results are presented.
Both biological visual systems and image understanding systems are forced by resource limitations to reduce input data to its essential part, to keep the amount of data manageable at succeeding processing levels and to maintain a realistic chance of achieving real-time performance even for complex tasks. We present a new design and implementation of a visual attention control system (GOAL) for a significant reduction of data while maintaining salient information. The visual attention system is developed not for synthetic or simple static images but for complex dynamic real image sequences, with emphasis on arbitrary traffic scenes recorded from a camera built into a car. GOAL is part of the image understanding system MOVIE for real-time interpretation of traffic scenes and supports the model-based scene analysis (MOSAIC) by directing high-level vision processes to salient regions. Based on a model of human attention (the guided search model of Wolfe and Cave), requirements for the `visual attention' module of an image understanding system are derived. GOAL combines different knowledge sources (both motion- and shape-oriented) to achieve a robust, spatially restricting, and expectation-driven attention control system. The knowledge sources consist of four very basic image operations, namely (1) an enhanced difference-image method, (2) a direct depth method, (3) local symmetry detection, and (4) 2D-3D line movement. Each knowledge source contributes to an accumulated evidence for the existence of attention fields. The knowledge sources are temporally stabilized using a Kalman filter. The nonlinear combination of multiple knowledge sources makes the selection of attention fields much more robust than merely increasing computational power would. This is shown with results from various real image sequences.
Many computational vision routines can be regarded as recognition and retrieval of echoes in space or time. Cepstral analysis is a powerful nonlinear adaptive signal processing methodology widely used in many areas, such as echo retrieval and removal, speech processing and phoneme chunking, radar and sonar processing, seismology, medicine, image deblurring and restoration, and signal recovery. The aims of this paper are: (1) to provide a brief mathematical and historical review of cepstral techniques; (2) to introduce computational and performance improvements to the power and differential cepstrum for use in the detection of echoes, and to provide a comparison between these methods and the traditional cepstral techniques; (3) to apply the cepstrum to visual tasks such as motion analysis and trinocular vision; and (4) to draw a brief comparison between the cepstrum and other matching techniques. The computational and performance improvements introduced in this paper can be applied in other areas that frequently utilize the cepstrum.
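For reference, the power cepstrum can be sketched with a naive DFT (an illustration, not the paper's improved method); an echo at lag d produces a peak at quefrency d:

```python
import cmath, math

def power_cepstrum(x):
    """Power cepstrum via a naive O(n^2) DFT: inverse DFT of the log
    power spectrum. An echo at lag d appears as a peak at quefrency d."""
    n = len(x)
    def dft(seq, sign):
        return [sum(seq[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
                    for t in range(n)) for k in range(n)]
    spectrum = dft(x, -1)
    # Small floor guards against log(0) for spectral nulls.
    log_power = [math.log(abs(v) ** 2 + 1e-12) for v in spectrum]
    return [v.real / n for v in dft(log_power, +1)]
```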
In this paper, we introduce a method to design gray-scale composite morphological operators as fuzzy neural networks. In this structure, synaptic weights are represented by a gray-scale structuring element. The proposed method is a two-step procedure. First, a suitable neural topology is found through the basis functions of the composite operators. Second, a learning rule based on the average least mean square is applied, where each synaptic weight is found through a backpropagation algorithm. One-dimensional examples are shown. This scheme can be easily extended to two dimensions.
In an earlier paper, the authors introduced binary automata neural networks, which can be considered a further development of the Hopfield model and BAM. The Hopfield model is a one-state automata neural network with only one synaptic connection matrix in the alphabet; BAM is a special two-state automata neural network. In general, there can be any number of states and any number of synaptic connection matrices in the alphabet. In this paper, we first systematically introduce the automata network. The original automata network is called a union network in this paper. Several new types of automata networks are developed. We then study the stability problem of the automata networks for several cases.
We apply a novel optoelectronic neural network to recognize a set of characters from the alphabet. The network consists of a 15 x 1 binary input vector, two optoelectronic vector-matrix multiplication layers, and a 15 x 1 binary output layer. The network utilizes a pair of custom-fabricated spatial light modulators (SLMs) with 90 levels of gray scale per pixel. The SLMs realize the matrix weights. Previous networks of this type were hampered by limited levels of gray scale and the need to use two separate weight masks (matrices) per layer. We operate the weight masks in unipolar mode, which allows for both positive and negative weights from the same masks. We use a hard-limiting function for the network's nonlinearity. A modification of Widrow's little-known MR2 training algorithm is used to train the network. Furthermore, the network introduces a novel lens-free crossbar matrix-vector multiplier. We also show proposed networks of higher capacity that could be implemented for image processing.
This paper describes an approach to two areas of FLIR target recognition: (1) target isolation and (2) target classification. The method utilized for the isolation of potential target regions is based on localized texture information. The modality of the local gradient histogram is used both to define target regions and to segment these regions into subcomponents corresponding to the vehicle morphology (wheels, engine, armor, etc.). After the target regions are isolated, each region is fit with a metric (a parallelogram). Each subcomponent in the region is then classified based on its shape and location within this metric. The classification is made using several neural networks, each corresponding to a specific vehicular subcomponent. The classifications of these neural networks are then used as input to another network responsible for vehicle-type classification. This construct allows for azimuth and depression-angle robustness of the target region, the limitations of which are discussed.
This paper describes the operation and construction of a magneto-optical neural network image processing system, together with a discussion of the physical basis for its operation. We discuss the behavior of the model under simulated annealing in light of statistical physics. This paper also presents results of large-scale simulations of the physical system performed on the CM-2 Connection Machine. The system is capable of image recognition, reconstruction, and processing by use of massive parallelism in a physical thin film. A spin glass thin film material, in conjunction with magneto-optical control, implements a Boltzmann Machine-like neural network. The thin film provides the units and connective weights of the neural network, and the magneto-optical system controls the image learning and recall by accessing the units and weights and allowing their modification, using physical annealing in the film. Images are learned sequentially via stochastic minimization of the system energy, a function of all spin orientations and of interspin distances. Images can be recalled later when a similar, corrupted, or noisy version of a learned prototype image is presented. Our Monte Carlo style computer simulations of this system show its feasibility and practicality for real-time image recognition.
In digital halftoning, various local and global methods have been suggested. The structure of neural networks allows a new interpretation of these procedures. For global processing, the use of the Hopfield net was examined for the binarization problem. We show that the constraints of this model influence the binary result. The numerical description reveals a basic parallel to the well-known iterative Fourier transform algorithm (IFTA) applied in digital halftoning.
We consider the problem of linear object extraction from binary and gray-level images. For this purpose we use mathematical morphology and some related filters. They are efficiently implemented on the neighborhood-connected processor array and are based on Bresenham's approximation of an arbitrary linear element on a square discrete grid.
Determination of the depth of objects in a scene is based on interpretation of the visual cues that tell us how near or far away the objects are. Such cues can be binocular or monocular. Most existing algorithms are based on binocular cues and use a pair of stereo images of the scene to compute a depth map from the disparity between corresponding points in the two images, the geometry of the imaging system, and camera parameters. To solve the correspondence problem, certain simplifying assumptions are usually made. Here we propose a method based on the fact that the brain computes the approximate distance of an object from the viewer from the amount of defocus of its image on the retina. Given two images of a scene taken with different focal settings, we model one of the images as the convolution of a blur function with the other image and use the DFT of the two images to obtain an estimate of the blur at each pixel. A multilayer perceptron using backpropagation learning is used to infer the complex relationship between blur and depth, which also involves the imaging system parameters. Blur functions obtained from a set of images with objects at known depths are used to train the neural network. This approach avoids both the correspondence and camera calibration problems.