We have recently proposed the use of geometry in image processing by representing an image as a surface in 3-space. The linear variations in intensity (edges) were shown to have a nondivergent surface normal. Exploiting this feature we introduced a nonlinear adaptive filter that only averages the divergence in the direction of the surface normal. This led to an inhomogeneous diffusion (ID) that averages the mean curvature of the surface, rendering edges invariant while removing noise. This mean curvature diffusion (MCD) when applied to an isolated edge imbedded in additive Gaussian noise results in complete noise removal and edge enhancement with the edge location left intact. In this paper we introduce a new filter that will render corners (two intersecting edges), as well as edges, invariant to the diffusion process. Because many edges in images are not isolated the corner model better represents the image than the edge model. For this reason, this new filtering technique, while encompassing MCD, also outperforms it when applied to images. Many applications will benefit from this geometrical interpretation of image processing, and those discussed in this paper include image noise removal, edge and/or corner detection and enhancement, and perceptually transparent coding.
Changing video frame rate may require adding new frames. This can be done by interpolating the given frames over time. We describe a very fast algorithm for time interpolation of DCT encoded video. Given a sequence of DCT encoded frames, the algorithm produces the interpolated sequence of DCT encoded frames without computing DCT or inverse DCT. The idea is to exploit a simple shift property of DCT coefficients that holds for frame triplets. It is somewhat analogous to the shift property of Fourier coefficients that holds for frame pairs.
With the advent of High Definition Television, it will become desirable to convert existing video sequence data into higher-resolution formats. This conversion process already occurs within the human visual system to some extent, since the perceived spatial resolution of a sequence appears much higher than the actual spatial resolution of an individual frame. This paper addresses how to utilize both the spatial and temporal information present in a sequence in order to generate high-resolution video. A novel observation model based on motion compensated subsampling is proposed for a video sequence. Since the reconstruction problem is ill-posed, Bayesian restoration with a discontinuity-preserving prior image model is used to extract high-resolution video frames. Results on image sequences will be shown, with dramatic improvements provided over various single-frame interpolation methods.
The component-wise processing of color image data is performed in a variety of applications. These operations are typically carried out using Lookup Table (LUT) based processing techniques, making them well suited for digital implementation. A general exposition of this type of processing is provided, indicating its remarkable utility along with some of the practical issues that can arise. These motivate a call for the use of constraints in the types of operators that are used during the construction of LUTs. Several particularly useful classes of constrained operators are identified. These lead to an object-oriented approach generalized to operate in a variety of color spaces. The power of this type of framework is then demonstrated via several novel applications in the HSL color space.
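As a minimal sketch of the LUT-based, component-wise processing described above, one table can be built per component and applied pixel by pixel. The square-root tone curve, the monotonicity check used as an example constraint, and the names `build_lut`, `is_monotonic`, and `apply_lut` are illustrative assumptions, not the operators proposed in the paper.

```python
def build_lut(func, levels=256):
    """Tabulate func (defined on [0, 1]) over all input levels, clamped to range."""
    top = levels - 1
    return [min(top, max(0, round(func(v / top) * top))) for v in range(levels)]


def is_monotonic(lut):
    """Example constraint (an assumption): the table must never decrease."""
    return all(a <= b for a, b in zip(lut, lut[1:]))


def apply_lut(pixels, luts):
    """Apply one table per component to a list of 3-component pixels."""
    return [tuple(lut[c] for lut, c in zip(luts, px)) for px in pixels]


gamma = build_lut(lambda t: t ** 0.5)   # hypothetical tone curve
identity = build_lut(lambda t: t)
image = [(0, 128, 255), (64, 64, 64)]
result = apply_lut(image, [gamma, identity, identity])
```

Checking a constraint such as `is_monotonic` once, when the table is built, costs nothing per pixel at run time.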
Because the homomorphic filter can selectively enhance blurred images with poor contrast and nonuniform illumination, its application to RGB (24-bit true color) images has been studied. Moreover, the effects of different shapes for the linear filter employed by the process are discussed and illustrated using a classical high-pass Butterworth filter modified for more flexibility in the final enhancement. An image of poor quality in terms of blurring and nonuniform illumination was used to demonstrate the results of different stages of the filtering process. It is shown that homomorphic filtering is a viable tool for enhancing poor quality RGB images.
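For orientation, the stages of homomorphic filtering can be written out as follows. The illumination-reflectance model and the transfer function shown are the commonly used textbook form. The symbols $\gamma_L$, $\gamma_H$, $D_0$, and $n$ are assumptions and may differ from the modified Butterworth filter used in the paper.

```latex
\begin{align}
  f(x,y) &= i(x,y)\, r(x,y)
    && \text{illumination times reflectance} \\
  z(x,y) &= \ln f(x,y) = \ln i(x,y) + \ln r(x,y)
    && \text{logarithm separates the factors} \\
  S(u,v) &= H(u,v)\, \mathcal{F}\{z\}(u,v)
    && \text{linear filtering in the frequency domain} \\
  g(x,y) &= \exp\!\bigl( \mathcal{F}^{-1}\{S\}(x,y) \bigr)
    && \text{return to the intensity domain} \\
  H(u,v) &= \gamma_L + \frac{\gamma_H - \gamma_L}{1 + \bigl[ D_0 / D(u,v) \bigr]^{2n}}
    && \text{high-pass Butterworth with adjustable gains}
\end{align}
```

Here $D(u,v)$ is the distance from the frequency origin. Choosing $\gamma_L < 1 < \gamma_H$ attenuates the slowly varying illumination and boosts detail, and for an RGB image the same steps would be applied to each channel.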
Image data compression using vector quantization (VQ) has received a lot of attention in the last decade because of its simplicity and adaptability. The performance of encoding and decoding by VQ depends on the available codebook, so it is important to design an optimal codebook based on some training set. The codebook is optimal in the sense that it matches all the source data (the training set) as closely as possible. Hence, the design of an efficient and robust codebook is of prime importance in VQ. Neural networks (NNs) have also been shown to be a fast alternative approach to creating the codebook, and they appear to be particularly well suited to VQ applications. Most NN learning algorithms are adaptive and can be used to produce an effective scheme for training VQ.
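As a rough illustration of adaptive codebook training, the sketch below uses winner-take-all competitive learning, one of the simplest neural-network style rules. It is a stand-in chosen for brevity, not necessarily the learning algorithm the abstract has in mind. The training vectors, learning rate, and function names are hypothetical.

```python
def nearest(codebook, x):
    """Index of the codeword closest to x (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda k: sum((c - v) ** 2 for c, v in zip(codebook[k], x)))


def train_competitive(training_set, codebook, rate=0.1, epochs=50):
    """Winner-take-all learning: only the winning codeword moves toward a sample."""
    codebook = [list(c) for c in codebook]
    for _ in range(epochs):
        for x in training_set:
            winner = codebook[nearest(codebook, x)]
            for i, v in enumerate(x):
                winner[i] += rate * (v - winner[i])
    return codebook


training_set = [(0, 0), (0, 1), (1, 0), (1, 1),
                (10, 10), (10, 11), (11, 10), (11, 11)]
codebook = train_competitive(training_set, codebook=[(0, 0), (10, 10)])
indices = [nearest(codebook, x) for x in training_set]  # encoder output
decoded = [codebook[k] for k in indices]                # decoder output
```

The encoder transmits only `indices`, and the decoder recovers an approximation of each vector by table lookup in the shared codebook.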
A method to restore faded color materials by digital image processing is presented. The algorithms used for the reconstruction are based on photographic experiments, i.e., on accelerated fading tests of various photographic materials. The densities of the original and the faded materials were measured. Based on these data, fading can be modeled mathematically by a linear equation. The faded image is digitized using a scanner of high spatial and photometric resolution. For good spectral resolution, channel separations are done with narrow-band interference filters. The original colors are reconstructed by applying the inverse of the fading equation. The corrected image is exposed with a high-resolution film recorder on color film. The method shows good results for color slides, prints, and 16 mm movies.
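To make the inversion step concrete, here is a minimal sketch that assumes the linear fading model acts independently on each channel density as faded = gain * original + offset. The per-channel form and all coefficient values are hypothetical. The abstract states only that the model is linear.

```python
# Hypothetical per-channel fading coefficients for three dye densities.
GAIN = (0.9, 0.7, 0.5)
OFFSET = (0.02, 0.05, 0.10)


def fade(density, gain=GAIN, offset=OFFSET):
    """Forward model (assumed form): linear loss of density in each channel."""
    return tuple(g * d + o for d, g, o in zip(density, gain, offset))


def restore(faded, gain=GAIN, offset=OFFSET):
    """Inverse of the assumed fading equation, applied to one pixel."""
    return tuple((f - o) / g for f, g, o in zip(faded, gain, offset))


original = (1.20, 0.80, 0.60)
faded = fade(original)
recovered = restore(faded)
```

If the dyes interact, the scalar gains would become a 3x3 matrix and the restoration step would apply its inverse instead.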
The discrepancy between the diagnostic importance of nuclear medicine images and their quality, together with the predominantly visual interpretation of the images, demands quality improvements. Digital image restoration procedures are known to be capable of solving this problem by addressing both noise and blurring. In this paper we propose the application of a modified 2D Kalman filter to nuclear medicine images (SPECT). In addition to its special capability of processing nonstationary signals, the Kalman filter offers a convenient way of controlling the filtering effect. The Kalman filter is based on a state space approach that subdivides the imaging process into image generation and image degradation processes. The tested filter operates adaptively by continuous identification of the image generation model. For adaptation to the human visual system, the filtering effect is modified depending on the image quality, represented by the mean information density, and on the local image contents, represented by structure information. The appropriate filtering effect is determined by modifying filter parameters with a predefined piecewise linear characteristic curve. Both the smoothing and the structure-enhancing effects of our Kalman filtering approach are demonstrated in numerous tests performed with SPECT phantom images and brain SPECT studies.
We present an approach to the annotation of natural scene images that uses a simple color segmenter to indicate the presence of skin, sky and grass in an image. The results of the color-based segmentation are then used to determine higher-level descriptions of images as 'People' or 'Outdoors'. The images are analyzed in a luminance-chrominance color space in order to reduce the influence of the scene illumination on the segmentation. The distribution of the chrominance components of the pixels belonging to each object class is modeled as a Gaussian PDF, allowing an image-adaptive setting of the object-class thresholds.
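The Gaussian chrominance model can be sketched as follows: fit a mean and a 2x2 covariance to the chrominance pairs of one object class, then accept a pixel when its squared Mahalanobis distance falls below a threshold. The sample values, the threshold of 9.0, and the function names are assumptions made for illustration.

```python
def fit_gaussian(samples):
    """Mean and 2x2 covariance of (c1, c2) chrominance samples."""
    n = len(samples)
    mu = [sum(s[i] for s in samples) / n for i in range(2)]
    cov = [[sum((s[i] - mu[i]) * (s[j] - mu[j]) for s in samples) / n
            for j in range(2)] for i in range(2)]
    return mu, cov


def mahalanobis_sq(x, mu, cov):
    """Squared Mahalanobis distance, using the closed-form 2x2 inverse."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = x[0] - mu[0], x[1] - mu[1]
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det


def belongs(x, mu, cov, threshold=9.0):
    """Class membership test; the threshold value is a hypothetical choice."""
    return mahalanobis_sq(x, mu, cov) <= threshold


# Hypothetical chrominance samples for a 'skin' class.
skin_samples = [(110, 150), (112, 153), (108, 149), (111, 155), (109, 148)]
mu, cov = fit_gaussian(skin_samples)
```

Because the threshold is expressed in units of the fitted covariance, refitting `mu` and `cov` changes the decision region without retuning the threshold by hand.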
This paper presents an approach for multiscale image segmentation suitable for applications such as multiscale object recognition. The multiscale segmentation of the input image is obtained by segmenting the scale-space image in a bottom up fashion (i.e., from fine to coarse scale). The segmentation method used combines a Gaussian texture model and Gibbs-Markov contour model to produce an image segmentation which corresponds closely to the objects in the scale-space image. In order to obtain an accurate segmentation at multiple scales, the region labels from the preceding fine scales are propagated as initial conditions for the succeeding coarse scales. Results demonstrate that in general, there is a close similarity between the behavior of the contours derived by this segmentation method, and the behavior of edges found in conventional scale-space approaches. An advantage to this new technique is that the resulting contours are closed, as required by many machine vision algorithms. This is not guaranteed in conventional scale-space methods.
Edge detection is one of the most important image processing steps towards image understanding. The resultant isolated regions or segments must be completely separated from their neighbors; that is, edges must be continuous. Images must first be smoothed to remove noise. In this paper, a novel fuzzy smoother algorithm is presented which removes camera noise and enhances edge contrast. A fuzzy edge detection algorithm is then presented which is applied to the smoothed image. In this algorithm, normalized hue in the HSI space and color contrast in the RGB space are combined using an aggregate operator. Almost-local maxima are then found for all edge directions and the results are combined. A further local-maximum search produces promising results.
We propose a nonuniform frequency sampling method for 2D FIR filter design based on the concept of the nonuniform discrete Fourier transform (NDFT). The NDFT of a 2D sequence is defined as a sequence of samples of its z-transform taken at distinct points located arbitrarily in the (z1, z2) space. In our design method, we determine the independent filter coefficients by taking samples of the desired frequency response at points located nonuniformly on the unit bi-disc, and then solving the linear equations given by the NDFT formulation. The choice of sample values and locations depends on the shape of the 2D filter. Best results are obtained when samples are placed on contour lines that match the desired passband shape. The proposed method produces nonseparable filters with good passband shapes and low peak ripples. In this paper, we consider the design of square- and diamond-shaped filters. Extensive comparisons with filters designed by other methods demonstrate the effectiveness of the proposed method. We also investigate the performance of the designed filters by applying them as prefilters and postfilters in schemes for rectangular and quincunx downsampling of images. Examples show that the filters designed by our method produce output images which are sharper and have a higher PSNR than those produced by other filters.
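The NDFT definition and the resulting design equations can be written compactly as below. The index ranges, the matrix name $\mathbf{D}$, and the assumption that the number of samples equals the number of independent coefficients are notational choices made here, not taken from the paper.

```latex
\begin{align}
  \hat{X}(z_{1k}, z_{2k})
    &= \sum_{n_1=0}^{N_1-1} \sum_{n_2=0}^{N_2-1}
       x[n_1, n_2]\, z_{1k}^{-n_1}\, z_{2k}^{-n_2},
    \qquad k = 0, 1, \dots, N_1 N_2 - 1, \\
  \mathbf{D}\,\mathbf{h} &= \mathbf{d}
    \quad\Longrightarrow\quad
    \mathbf{h} = \mathbf{D}^{-1}\mathbf{d}.
\end{align}
```

Each row of $\mathbf{D}$ holds the products $z_{1k}^{-n_1} z_{2k}^{-n_2}$ for one sample point, $\mathbf{d}$ holds the desired frequency-response samples, and $\mathbf{h}$ holds the unknown filter coefficients. The system has a unique solution only when the sample points are chosen so that $\mathbf{D}$ is nonsingular, which is not guaranteed for arbitrary 2D point sets.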
Real-time image-processing differs from 'ordinary' image-processing in that the logical correctness of the system requires both correct and timely outputs. And, although special real-time imaging architectures are available or can theoretically be constructed, there are no standard programming languages to support these architectures. Moreover, many applications cannot or do not take advantage of any special imaging architectures; they use standard processors instead. In this case, optimization of run-time code is necessarily done with hand-tuning or by trial-and-error. In this paper these optimization techniques are formalized and it is shown how they can be applied to imaging algorithms in real-time to improve run-time performance.
Projections Onto Convex Sets (POCS) is an important algorithm for many image processing and video processing applications. Slow convergence is one of its limitations. In this paper, an acceleration algorithm for POCS is presented. The algorithm is based on the observation that the trajectory of iterations can be approximated as a straight line in the vicinity of the convergence point. As a result, a fast convergence algorithm can be derived. The proposed algorithm has a quadratic convergence rate, as compared to the linear rate of the standard POCS.
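The straight-line observation can be illustrated with a toy problem: two intersecting lines in the plane serve as the convex sets, and the limit is estimated by geometric extrapolation along the line through successive iterates. This Aitken-style extrapolation is a stand-in used to show the idea. It is not the accelerated algorithm of the paper, and the sets, starting point, and function names are hypothetical.

```python
import math

THETA = 0.3                                # angle between the two lines
P0 = (1.0, 0.0)                            # intersection of sets A and B
D = (math.cos(THETA), math.sin(THETA))     # unit direction of line B


def project_a(v):
    """Projection onto set A, the line y = 0."""
    return (v[0], 0.0)


def project_b(v):
    """Projection onto set B, the line through P0 with direction D."""
    t = (v[0] - P0[0]) * D[0] + (v[1] - P0[1]) * D[1]
    return (P0[0] + t * D[0], P0[1] + t * D[1])


def sweep(v):
    """One standard POCS iteration: project onto B, then onto A."""
    return project_a(project_b(v))


def extrapolate(x0, x1, x2):
    """Assume collinear iterates with a constant step ratio and jump to the limit."""
    d1 = math.hypot(x1[0] - x0[0], x1[1] - x0[1])
    d2 = math.hypot(x2[0] - x1[0], x2[1] - x1[1])
    scale = (d2 / d1) / (1.0 - d2 / d1)
    return (x2[0] + scale * (x2[0] - x1[0]), x2[1] + scale * (x2[1] - x1[1]))


x0 = sweep((5.0, 3.0))
x1 = sweep(x0)
x2 = sweep(x1)
accelerated = extrapolate(x0, x1, x2)
```

For two lines the iterates are exactly collinear with a constant contraction ratio, so the extrapolated point lands on the intersection `P0` up to rounding, while `x2` is still far from it. For general convex sets the straight-line model holds only approximately near the solution.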
The demand for multimedia applications has spurred significant interest in the area of video compression. Yet, as the complexity of compression algorithms increases, the design and optimization of video applications have become both formidable and time consuming. In this paper, we outline an object-oriented C++ video toolkit and illustrate its usage in an R&D setting. This toolkit enables the user to rapidly construct complex video algorithms using familiar objects and operations. A set of statistics-gathering tools and MPEG extensions which perform MPEG decoding and encoding are also provided. After describing these components, we demonstrate the use of the toolkit to design a complex application called the MPEGEditor, which performs standard editing operations such as splicing and fading on MPEG sequences through an X Windows/Motif graphical user interface. As a research tool, we illustrate how to expand the toolkit to incorporate new compression techniques. In particular, we show how to extend the MPEG encoding algorithm to include a number of prioritization techniques which are not part of the MPEG standard.
This paper develops and presents methods for the detection of features in high-resolution digital mammograms using anisotropic diffusion techniques. The automated or semiautomated analysis of digital mammograms for the purpose of detecting suspicious changes in normal tissue structure is an exceedingly important and elusive goal confronting researchers in digital mammography. The nature of the changes can be quite variable, but often the quality of the periphery of suspect lesions contains strong cues regarding the nature of the lesion. Thus, it is of interest to consider processing paradigms that analyze lesion boundary information, both to isolate suspect lesions from normal tissue and to aid in the differentiation of benign vs. malignant lesions. In this paper a modified version of the Malik-Perona nonlinear diffusion model is adopted that provides superior boundary detection capability while simultaneously strongly rejecting noise or irrelevant image artifacts. The algorithm provides a multiscale family of smoothed images that display the important property of intra-region smoothing without smoothing across boundaries. Thus, the features extracted do not suffer from the unnecessary blurring arising from conventional smoothing-differentiation edge detectors, while retaining the highly desirable property of noise elimination. In other words, the anisotropic diffusion method performs a piecewise smoothing of the mammographic data image. These properties make it possible to achieve high-quality segmentations of mammographic images. The output of the algorithm is a binary representation containing detailed structural information for the potentially interesting features in the mammogram. Thus, lesions containing spiculations or with associated microcalcifications can be represented with a high resolution, and subjected to further processing towards attaining the difficult goals of detection and diagnosis.
The results of this technique as applied to digitized mammograms are presented, using mammogram X-rays digitized to 100 micron spatial resolution and 12 bits of gray-scale resolution.
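To illustrate intra-region smoothing without smoothing across boundaries, the following is a minimal sketch of the standard Perona-Malik scheme with an exponential conduction function. It is not the modified model adopted in the paper. The test image, `kappa`, `step`, and the iteration count are assumptions.

```python
import math


def perona_malik(image, iterations=20, kappa=10.0, step=0.15):
    """Explicit 4-neighbour Perona-Malik diffusion on a list-of-lists image."""
    rows, cols = len(image), len(image[0])
    img = [[float(v) for v in row] for row in image]
    for _ in range(iterations):
        new = [row[:] for row in img]
        for y in range(rows):
            for x in range(cols):
                flux = 0.0
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < rows and 0 <= nx < cols:
                        diff = img[ny][nx] - img[y][x]
                        # Conduction drops toward zero across strong edges.
                        flux += math.exp(-(diff / kappa) ** 2) * diff
                new[y][x] = img[y][x] + step * flux
        img = new
    return img


# Step edge (0 | 100) with a +/-2 checkerboard standing in for noise.
noisy = [[(0 if x < 4 else 100) + (2 if (x + y) % 2 == 0 else -2)
          for x in range(8)] for y in range(6)]
smoothed = perona_malik(noisy)
left = [row[x] for row in smoothed for x in range(4)]
right = [row[x] for row in smoothed for x in range(4, 8)]
```

With `kappa` well below the edge height, the checkerboard is flattened inside each half while the step between the halves is left essentially untouched. Thresholding the remaining gradient would then give a binary boundary map.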
To create solid models of irregularly shaped objects, a mesh of curves consisting of two orthogonal sets of planar contours that define both transverse and longitudinal cross sections is usually required. By interpolating the in-slice contour-point list one can fairly easily obtain one set of such planar contours, but the question arises of how to produce the other orthogonal set of contours. In the approach reported here, the object contour data in slice images are first processed by an optimal triangulation algorithm of a surface modeler, and the output triangles are used as the initial guess of possible inter-slice point matches. Local contour features are computed and the prominent points of each 2D contour line are identified and classified (e.g. local curvature maxima or inflections). This information will guide our optimal vertical path search algorithm, and in the searching process the dominant points are given priorities for possible connections. This approach aims at retaining structures with rotational displacement, which are commonly seen in bone anatomy. A 2D bicubic spline interpolation is then employed to produce an isotropic mesh of curves. From the mesh of curves so obtained, an isotropic three dimensional (3D) bone object is created by an automatic filling and labeling algorithm developed by the author to permit volume rendering. Initial results of the visualization of bone solid models have been encouraging using AVS (Advanced Visual Systems Inc., Waltham, MA) as well as our own GUI. The results also showed advantages of our system over surface modeling techniques when used to visualize certain geometrical properties with fine resolution, such as proximity map, and to make more accurate volume and distance measurements.
The scene partitioning problem is to delineate regions in an image that correspond to the same object according to some underlying object model. Examples include partitioning an intensity image into piecewise constant intensities or identifying separate planar regions in a disparity map. A fast, general algorithm for solving the partitioning problem in cases of linear models and multiband images is presented. The algorithm uses a statistical test in a region growing formalism. The algorithm relies on the assumption that the correct image partition is connected in image coordinates. Experiments are performed on a series of models with a range of state dimensions.
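A region-growing pass with a simple acceptance test might look like the sketch below. It assumes a piecewise-constant intensity model and accepts a neighbouring pixel when it lies within `k` standard deviations of the current region mean. The noise level `sigma`, the factor `k`, and the test itself are illustrative assumptions rather than the statistical test used in the paper.

```python
from collections import deque


def grow_region(image, seed, sigma=2.0, k=3.0):
    """Grow a 4-connected region from seed under a constant-intensity model."""
    rows, cols = len(image), len(image[0])
    region = {seed}
    total = float(image[seed[0]][seed[1]])
    frontier = deque([seed])
    while frontier:
        y, x = frontier.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < rows and 0 <= nx < cols and (ny, nx) not in region:
                mean = total / len(region)
                # Acceptance test: is the pixel consistent with the region model?
                if abs(image[ny][nx] - mean) <= k * sigma:
                    region.add((ny, nx))
                    total += image[ny][nx]
                    frontier.append((ny, nx))
    return region


image = [[10, 11, 50, 52],
         [9, 10, 51, 49],
         [10, 12, 50, 50]]
region = grow_region(image, seed=(0, 0))
```

Growing only through neighbouring pixels is what encodes the connectivity assumption stated in the abstract. A linear model with more state dimensions would replace the running mean with a fitted parameter vector.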
An image processing and object tracking approach is proposed for the design of a video-based freeway traffic monitoring system. Proper estimation of the traffic speed in different lanes of a freeway allows for timely detection of possible congestion. The proposed method consists of a road modeling stage and a vehicle tracking stage. In the first stage, a three-dimensional model of the background road image is generated using several initial frames. In the tracking stage, each car in the scene is isolated and tracked over many frames, and by mapping its coordinates to the road's 3D model, an estimate of its velocity on the road is produced. The algorithm runs in real-time on a workstation. Experimental results on frame sequences taken from Santa Ana area freeways will be presented.
This paper is concerned with rapid detection of the runway in an image sequence captured from a landing aircraft. During the critical section of landing maneuvers, a vision system which can continuously detect the runway is very useful to both enhance the landing safety and reduce the pilots' workload. Such a system requires fast and reliable recognition strategies in order to meet the real-time flight control demands. In this paper, a method based on defining the regions of interest (ROIs) in the image plane is described for achieving such real-time performance. This method utilizes the approximate information about the aircraft/camera position and orientation to obtain a two-dimensional (2D) projection of the runway on the image plane. This 2D projection is represented by a set of regions called the ROIs, with each ROI corresponding to one feature of the three-dimensional (3D) runway model. We show that using such a technique, the computational complexity of the recognition process can be significantly reduced.
Invariant methods for object representation and model matching develop relationships among object and image features that are independent of the quantitative values of the camera parameters or object orientation, hence the term invariant. Three-dimensional models of objects or scenes can be reconstructed and transferred to new images, given a minimum of two reference images and a sufficient number of corresponding points in the images. By using multiple reference images, redundancy can be exploited to increase robustness of the procedure to pixel measurement errors and systematic errors (i.e., discrepancies in the camera model). We present a general method for deriving invariant relationships based on two or more images. Simulations of model transfer and reconstruction demonstrate the positive effect of additional reference images on the robustness of invariant procedures. Pixel measurement error is simulated by adding random noise to coordinate values of the features in the reference images.
Many techniques in machine vision require tangent estimation. In many implementations, the acquired tangent estimates are sensitive to the coding direction of the curves of interest. Therefore, it is common practice to enforce a certain coding scheme; for example, a boundary is traced in the counterclockwise manner. However, this scheme is guaranteed to work only for closed curves. For open curves, an inverse operator seems to be a must. In this paper, we propose a new tangent representation scheme named direction-dependent tangent (DDT). DDT makes explicit the direction of curve following and incorporates concavity information into the tangent orientation. The scheme is a simple but powerful enhancement to the standard tangent notation. It facilitates shape-matching tasks by removing the need for either a predefined coding direction or an inverse operator.
A preliminary image-processing procedure for extracting cytological objects from a background is considered; it helps to decrease the number of facets and to reduce the computational expense. A model of the mechanical properties of cell membranes and the corresponding form of their contour energy E are considered. To minimize E, the scheme of a global facet algorithm for cell contour detection is discussed.
The proposed decomposition algorithm follows the divide-and-conquer approach. Specifically, the operands of the discrete convolution operation are decomposed into smaller units, computed separately, and then combined for the final result. The decomposition of the operands is based on integer modular arithmetic from number theory. Operands are treated as ordered sets, and integer modular arithmetic is used to partition these sets into congruent subsets. It is basically a decimation-by-p operation, where p is a common factor of the operands' sizes. Since the proposed decomposition algorithm is an isomorphism, the decomposed convolution operation is equivalent to the original one. Processing speed is increased by implementing these decomposed convolution operations in parallel. The proposed algorithm is similar to the well-known block convolution except that it is more suitable for parallel implementation. Because the decomposed operations are highly regular and independent, it is also suitable for VLSI implementation.
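The decimation-by-p idea can be checked with a short sketch: each operand is split into p subsequences by index modulo p, the p*p small convolutions are computed independently, and each partial result is written back at stride p with offset r + s. This is a generic polyphase-style decomposition written to match the description above. The function names and test sequences are hypothetical.

```python
def convolve(a, b):
    """Direct linear convolution of two sequences."""
    out = [0] * (len(a) + len(b) - 1)
    for i, av in enumerate(a):
        for j, bv in enumerate(b):
            out[i + j] += av * bv
    return out


def decomposed_convolve(x, h, p):
    """Convolve via congruence classes of the indices modulo p."""
    out = [0] * (len(x) + len(h) - 1)
    for r in range(p):              # subset of x with index = r (mod p)
        for s in range(p):          # subset of h with index = s (mod p)
            partial = convolve(x[r::p], h[s::p])   # independent sub-problem
            for n, v in enumerate(partial):
                out[p * n + r + s] += v            # recombination step
    return out


x = [1, 2, 3, 4, 5, 6]
h = [1, -1, 2, 0]
direct = convolve(x, h)
decomposed = decomposed_convolve(x, h, p=2)
```

The p*p calls to `convolve` share no intermediate data, so they could be dispatched to separate processing elements and merged afterwards.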
Many image processing tasks exhibit a high degree of data locality and parallelism and map quite readily to specialized massively parallel computing hardware. However, as network technologies continue to mature, workstation clusters are becoming a viable and economical parallel computing resource, so it is important to understand how to use these environments for parallel image processing as well. In this paper we discuss our implementation of a parallel image processing software library (the Parallel Image Processing Toolkit). The Toolkit uses a message-passing model of parallelism designed around the Message Passing Interface (MPI) standard. Experimental results are presented to demonstrate the parallel speedup obtained with the Parallel Image Processing Toolkit in a typical workstation cluster over a wide variety of image processing tasks. We also discuss load balancing and the potential for parallelizing portions of image processing tasks that seem to be inherently sequential, such as visualization and data I/O.
Efficient algorithms with various levels of parallelism are proposed for the recently introduced class of Binary Polynomial Transforms (BPT), including Walsh and conjunctive (Reed-Muller) transforms. A unique generic algorithm is proposed for the class of BPT. For each level of parallelism, this algorithm is optimal with regard to speedup factor for most BPTs, including the Walsh-Hadamard and conjunctive transforms. A family of processors realizing the proposed algorithms is also suggested. The processors can be implemented using a varying number of processor elements of unified architecture. They are universal, i.e., a class of binary polynomial transforms is effectively realized in the processor. Although the processors are universal, their area-time complexities are comparable with the complexities of known Hadamard processors. Processors from the proposed family can be included as blocks in the construction of a signal/image processing system.
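For reference, one member of this class, the Walsh-Hadamard transform, has the well-known butterfly form sketched below (natural Hadamard ordering, no normalization). It is the textbook fast algorithm, not the generic BPT algorithm proposed in the paper, and it is shown only to make the stage structure visible.

```python
def fwht(values):
    """Fast Walsh-Hadamard transform, Hadamard order, length a power of two."""
    data = list(values)
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    half = 1
    while half < n:                       # log2(n) stages
        for start in range(0, n, 2 * half):
            for i in range(start, start + half):
                # Butterflies within a stage are mutually independent.
                a, b = data[i], data[i + half]
                data[i], data[i + half] = a + b, a - b
        half *= 2
    return data


x = [3, 1, 4, 1, 5, 9, 2, 6]
spectrum = fwht(x)
roundtrip = [v // len(x) for v in fwht(spectrum)]   # transform is its own inverse up to n
```

Each stage contains n/2 independent butterflies, so the number of processor elements assigned to a stage sets the level of parallelism.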
In this work we introduce a new, wide family of 'unbounded' discrete orthogonal transforms (DOTs) based on parametric representations of transform matrices. This family contains the generalized Haar transform. A computational model corresponding to linear MISD type (pipelined) algorithms is introduced. Lower bounds are found for the complexity of linear transforms relative to the proposed model. Unified pipelined-parallel algorithms with various levels of parallelism, which can be implemented on MISD systems to compute unbounded DOTs, are developed. It is shown that the proposed algorithms are asymptotically optimal, i.e., the order of the upper bounds coincides with the order of the lower bounds. A unified processor architecture realizing the proposed algorithms is developed. Each processor is universal for a family of unbounded DOTs, meaning that each transform of the family is effectively realized in the processors. The processors can be implemented using different numbers of processor elements based on the same architecture. Although the processors are universal, their area-time complexities are comparable with the complexities of known Haar processors.
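As a point of reference for this family, the ordinary orthonormal Haar transform can be computed stage by stage as sketched below. This is the classical transform, not the generalized or 'unbounded' variants introduced in the paper, and the function name is hypothetical.

```python
import math


def haar(values):
    """Orthonormal Haar transform of a sequence whose length is a power of two."""
    data = [float(v) for v in values]
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    root2 = math.sqrt(2.0)
    details = []
    while len(data) > 1:
        # One stage: scaled pairwise sums feed the next stage, differences are output.
        sums = [(data[i] + data[i + 1]) / root2 for i in range(0, len(data), 2)]
        diffs = [(data[i] - data[i + 1]) / root2 for i in range(0, len(data), 2)]
        details = diffs + details
        data = sums
    return data + details


x = [3, 1, 4, 1, 5, 9, 2, 6]
coefficients = haar(x)
```

Each stage halves the working length and passes its sums to the next stage, which is the kind of structure that maps naturally onto a pipeline.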
Various architectures have been suggested for highly efficient image processing, including parallel processors of SIMD type, multiprocessor systems, pipelined processors, systolic arrays and pyramid machines. However, the maximal speed of algorithm execution can be reached only by a specialized processor implemented on a custom chip. Combining the properties of a specialized processor with the possibility of reprogramming in one approach should therefore give a satisfactory result. The paper suggests a new approach to developing the architecture of a high-speed parallel system for low-level image processing. The homogeneous computing structure (HCS) and homogeneous storing structure (HSS) are the basic elements of this approach. The high speed of the system is provided by a structural method of organizing the computing process, which is based on the hardware realization of all the nodes of the information graph and their interconnections. The size of the HCS matrix allows each program instruction to use its own group of processor elements. The program is loaded once before starting to solve the problem, and the information streams are processed without storing intermediate results. The data streams applied to the information inputs of the processor elements are processed in accordance with the program, moving from one element to another in the matrix of the HCS. Examples of the execution of image filtering algorithms on the system are presented.
Hierarchical structures permit distributed computing and multitasking for processing partitionable image data. A two-level hierarchical multiprocessor employing a 68000 master processor, three 8085 slave processors, shared memory mapping, a VME backplane bus and a dedicated operating system is presented in this paper. Task generation, scheduling and interprocessor communication are under OS control. Image processing algorithms such as edge detection, segmentation, smoothing and compression were performed by the master and slaves simultaneously.