In this paper, we present a novel method for fingerprint verification. A unique characteristic of our method
is the use of direction images and local features in the matching process. A direction center is computed
from the direction image and used as a reference point for aligning fingerprints. Fingerprint matching is
performed in two stages. In the first stage, we compute the correlation between the direction images of the
two fingerprints. In the second stage, we compare various features derived from fingerprint minutiae. The
first stage acts as a filtering procedure that rejects fingerprints based on the global directional patterns of
the ridges. The second stage verifies the local characteristics of the fingerprint minutiae. The two-stage
matching process results in a robust procedure that minimizes verification errors.
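The filter-then-verify flow described above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the correlation measure or the minutiae features, so `direction_similarity` (a mean of cos(2Δθ) over blocks), the thresholds, and the `minutiae_score` callback are all hypothetical stand-ins, and the direction images are assumed to be already aligned at the direction center.

```python
import math

def direction_similarity(dir_a, dir_b):
    """Average agreement between two block-wise direction images.

    Ridge directions are in radians and only defined modulo pi, so
    cos(2 * difference) is used: 1 for equal, -1 for perpendicular.
    (Assumed measure, chosen for illustration only.)
    """
    total, count = 0.0, 0
    for row_a, row_b in zip(dir_a, dir_b):
        for a, b in zip(row_a, row_b):
            total += math.cos(2.0 * (a - b))
            count += 1
    return total / count if count else 0.0

def two_stage_match(dir_a, dir_b, minutiae_score,
                    dir_threshold=0.8, minutiae_threshold=0.5):
    """Return True only if both stages accept the pair."""
    # Stage 1: reject early on global ridge-direction disagreement.
    if direction_similarity(dir_a, dir_b) < dir_threshold:
        return False
    # Stage 2: the (hypothetical) minutiae comparison runs only for survivors.
    return minutiae_score() >= minutiae_threshold
```

Passing the second stage as a callable keeps the expensive local comparison from running on pairs that the global filter has already rejected.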
With the proliferation of digital media such as images, audio, and video, robust digital watermarking and data hiding techniques are needed for copyright protection, copy control, annotation, and authentication. While many techniques have been proposed for digital color and grayscale images, not all of them can be directly applied to binary document images. The difficulty lies in the fact that changing pixel values in a binary document could introduce irregularities that are very visually noticeable. Over the last few years, we have seen a growing but limited number of papers proposing new techniques and ideas for document image watermarking and data hiding. In this paper, we present an overview and summary of recent developments on this important topic, and discuss important issues such as robustness and data hiding capacity of the different techniques.
With the proliferation of digital media such as digital images, digital audio, and digital video, robust digital watermarking and data hiding techniques are needed for copyright protection, copy control, annotation, and authentication. While many techniques have been proposed for digital color and grayscale images, not all of them can be directly applied to binary text images. The difficulty lies in the fact that changing pixel values in a binary document could introduce irregularities that are very visually noticeable. We propose a new method for data hiding in binary text documents by embedding data in the 8-connected boundary of a character. We have identified a fixed set of pairs of five-pixel-long boundary patterns for embedding data. One of the patterns in a pair requires deletion of the center foreground pixel, whereas the other requires the addition of a foreground pixel. A unique property of the proposed method is that the two patterns in each pair are duals of each other: changing the pixel value of one pattern at the center position would result in the other. This property allows easy detection of the embedded data without referring to the original document, and without using any special enforcing techniques for detecting embedded data.
We propose a framework for video content classification using a knowledge-based approach. This approach is motivated by the fact that videos are rich in semantic content, which can best be interpreted and analyzed by human experts. We demonstrate the concept by implementing a prototype video classification system using the rule-based programming language CLIPS 6.05. Knowledge for video classification is encoded as a set of rules in the rule base. The left-hand sides of the rules contain high-level and low-level features, while the right-hand sides contain intermediate results or conclusions. Our current implementation includes features computed from motion, color, and text extracted from video frames. Our current rule set allows us to classify input video into one of five classes: news, weather reporting, commercial, basketball, and football. We use MYCIN's inexact reasoning method to combine evidence and to handle the uncertainties in the features and in the classification results. A preliminary experiment gave good results, supporting the validity of the proposed approach.
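As a rough illustration of combining evidence with certainty factors, the sketch below uses the commonly cited MYCIN-style combination rule for two certainty factors in [-1, 1]. It is written in Python rather than CLIPS, and the rule outputs, class evidence, and the `classify` helper are hypothetical; the abstract does not give the actual rules or their certainty values.

```python
from functools import reduce

def combine_cf(cf1, cf2):
    """Combine two certainty factors in [-1, 1] (MYCIN-style rule)."""
    if cf1 >= 0 and cf2 >= 0:
        return cf1 + cf2 * (1 - cf1)      # two supporting pieces of evidence
    if cf1 <= 0 and cf2 <= 0:
        return cf1 + cf2 * (1 + cf1)      # two opposing pieces of evidence
    denom = 1 - min(abs(cf1), abs(cf2))   # conflicting evidence
    # Assumption: total conflict (+1 against -1) is treated as "unknown".
    return (cf1 + cf2) / denom if denom else 0.0

def classify(evidence):
    """Fold the certainty factors for each class, then pick the best class.

    `evidence` maps a class label to the certainty factors contributed by
    the rules that fired for it (hypothetical input format).
    """
    combined = {label: reduce(combine_cf, cfs, 0.0)
                for label, cfs in evidence.items()}
    return max(combined, key=combined.get), combined
```

For example, two rules supporting "news" with factors 0.6 and 0.4 combine to 0.6 + 0.4 × (1 − 0.6) = 0.76, so agreeing evidence reinforces a conclusion without ever exceeding 1.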
Shape is a popular feature used for content-based image retrieval. In this paper, we propose a new method for image retrieval using a shape boundary represented in scale-space. The proposed method is suggested by the notion of 'dynamic shape', where all 2D boundary representations evolve from a single, primeval, featureless shape: a circle. Shape is represented by linearizing the boundary based on the polar coordinates of boundary points relative to the object's centroid. Points on the shape boundary are mapped to a primeval circle, and two functions, the Radius Difference Function and the Angle Difference Function, are defined and smoothed through scale-space to devolve the shape. Maxima and minima of the Radius Difference Function are extracted and used to calculate similarity between objects. Similarity is calculated using Euclidean distance. Other scale-space approaches to shape representation use various techniques to maintain a constant boundary arc length, which may otherwise change in non-intuitive ways over scale. We introduce the contour-stability-over-scale property, which states that the perceived boundary length should not change significantly over scale. Experiments show that significant similarity computation may be saved by using coarser scales without appreciably reducing retrieval performance.
The image histogram is an image feature widely used in content-based image retrieval and video segmentation. It is simple to compute, yet very effective as a feature in detecting image-to-image similarity or frame-to-frame dissimilarity. While the image histogram captures the global distribution of different intensities or colors well, it does not contain any information about the spatial distribution of pixels. In this paper, we propose to incorporate spatial information into the image histogram by computing features from the spatial distances between pixels belonging to the same intensity or color. In addition to the frequency count of the intensity or color, the mean, variance, and entropy of the distances are computed to form an augmented image histogram. Using the new feature, we performed experiments on a set of color images and a color video sequence. Experimental results demonstrate that the augmented image histogram performs significantly better than the conventional color histogram in both image retrieval and video shot segmentation.
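A minimal sketch of such an augmented histogram is given below. Several details are assumptions, since the abstract does not fix them: distances are taken between all pairs of same-valued pixels, the variance is the population variance, and the entropy is computed over distances rounded to whole pixels. The exhaustive pairwise loop is only practical for small images.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations

def augmented_histogram(image):
    """Map each pixel value to (count, mean, variance, entropy) of the
    pairwise Euclidean distances between pixels sharing that value."""
    positions = defaultdict(list)
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            positions[value].append((r, c))

    features = {}
    for value, pts in positions.items():
        dists = [math.hypot(r1 - r2, c1 - c2)
                 for (r1, c1), (r2, c2) in combinations(pts, 2)]
        if dists:
            n = len(dists)
            mean = sum(dists) / n
            var = sum((d - mean) ** 2 for d in dists) / n
            # Assumption: entropy of distances binned to whole pixels.
            bins = Counter(round(d) for d in dists)
            entropy = sum(-(k / n) * math.log2(k / n) for k in bins.values())
        else:
            mean = var = entropy = 0.0   # a single pixel has no distances
        features[value] = (len(pts), mean, var, entropy)
    return features
```

Two images with identical conventional histograms but different spatial layouts would differ in the last three components, which is the extra discriminating information the abstract refers to.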
Vast amounts of inexpensive storage and cost-effective input devices have promoted a rapid increase in the amount of stored digital images and video. Retrieving desired images from image databases is a challenging problem, not easily solved by existing database methods. Shape is one popular feature used to automate retrieval of images by content. Example shape features include the Fourier descriptor for describing shape boundaries, and area or compactness for describing shape regions, among others. We present a new shape-representation method based on extracting an image surface ridge-line called the Most Prominent Ridge-Line (MPRL) in scale-space. The MPRL is extracted by minimizing the second spatial derivative orthogonal to the ridge-line direction. The scale of the MPRL point is proportional to the image object's width. Matching query and database image MPRLs is shown to be an effective method for retrieving images based on shape. A unique feature of the proposed method is that the scale dimension may be weighted to allow for any desired amount of shape details in image retrieval.
In the analysis of grayscale images, straight line segments are usually extracted by using the Hough Transform method. Straight line detection using the Hough Transform has the disadvantage that detecting peaks in the accumulator array is not always a reliable process. Thus, a significant amount of error may result. In this paper, we propose to extract straight line segments from binary images using binary morphological operations. In addition to the endpoint coordinates, the width of a line segment can also be reliably computed in the process. In the proposed approach, a set of line-shaped fixed-length structuring elements with orientations ranging from 0 to 180 degrees is used to extract line segments of all orientations. The algorithm is flexible for different applications. Line segments of different thickness, length, and orientation can be extracted with high precision. Experimental results on engineering map drawings demonstrate the good performance of the algorithm.
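The idea of probing a binary image with oriented line-shaped structuring elements can be sketched with a plain morphological opening (erosion followed by dilation). This is not the authors' algorithm: the element construction, the odd-length restriction, and the representation of the image as a set of (row, column) foreground coordinates are assumptions made to keep the example short.

```python
import math

def line_element(length, angle_deg):
    """Offsets (d_row, d_col) of a centered line-shaped structuring element.

    `length` is assumed odd; the angle is measured counterclockwise from the
    positive column axis, with rows increasing downward.
    """
    theta = math.radians(angle_deg)
    half = length // 2
    return {(round(-t * math.sin(theta)), round(t * math.cos(theta)))
            for t in range(-half, half + 1)}

def erode(pixels, element):
    # Keep a pixel only if the whole element, centered there, fits inside.
    return {(r, c) for (r, c) in pixels
            if all((r + dr, c + dc) in pixels for dr, dc in element)}

def dilate(pixels, element):
    # The element is symmetric about its center, so no reflection is needed.
    return {(r + dr, c + dc) for (r, c) in pixels for dr, dc in element}

def extract_lines(pixels, length, angle_deg):
    """Opening: keeps only foreground covered by a fitting line element."""
    element = line_element(length, angle_deg)
    return dilate(erode(pixels, element), element)
```

Sweeping `angle_deg` over 0 to 180 degrees and taking, for each orientation, the surviving pixels gives candidate segments; endpoints could then be read off as the extreme coordinates of each connected group.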
Image thinning methods can be divided into two categories based on the type of image they are designed to thin: binary image thinning and grayscale image thinning. Typically, grayscale images are thresholded to allow binary image thinning methods to be applied. However, thresholding grayscale images may introduce uneven object contours that are difficult for binary methods to handle. The scale-space approach to image thinning includes scale as an additional dimension, where images at scale t are derived from the original image at scale zero by applying the Gaussian filter. As scale increases, finer image structure is suppressed. By treating the image as a 3D surface with intensity as the third dimension, the most prominent ridge-line (MPRL) is the union of topographical features (peak, ridge, and saddle point) such that each has the greatest contrast with its surroundings. The MPRL is computed by minimizing its second spatial derivative over scale. The result forms a trajectory in scale-space. The thinned image is the projection of the MPRL on the base level. The MPRL has been implemented using the image pyramid data structure, and has been applied to binary and grayscale images of printed characters. Experimental results show that the method is less sensitive to contour unevenness. It also offers the option of choosing different levels of fine structure to include.
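The effect of scale on fine structure can be illustrated on a 1-D intensity profile. The sketch below is not the MPRL computation; it only shows Gaussian smoothing at increasing scale. The parameterization by the standard deviation `sigma` (rather than the scale t), the kernel truncation at three standard deviations, and the edge clamping are assumptions.

```python
import math

def gaussian_kernel(sigma):
    """Normalized, truncated 1-D Gaussian kernel (radius about 3 * sigma)."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(x * x) / (2.0 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(signal, sigma):
    """Convolve with the Gaussian, clamping indices at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    last = len(signal) - 1
    return [sum(w * signal[min(max(i + k - radius, 0), last)]
                for k, w in enumerate(kernel))
            for i in range(len(signal))]

def count_local_maxima(signal):
    """Number of strict interior peaks, a crude measure of fine structure."""
    return sum(1 for i in range(1, len(signal) - 1)
               if signal[i - 1] < signal[i] > signal[i + 1])
```

Applying `smooth` with growing `sigma` and tracking `count_local_maxima` typically shows small peaks merging or vanishing while broad ones persist, which is the behavior the scale-space approach relies on.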
In this paper, we develop an improved optimization algorithm based on a genetic algorithm (GA) approach for the bandwidth allocation of ATM networks. The ATM switches can be connected with multiples of DS3 trunks via digital cross connect systems (DCS). One of the advantages of DCS is its ability to reconfigure a customer network dynamically. We utilize this advantage in the design and dynamic reconfiguration of ATM networks. The problem is formulated as a network optimization problem where a congestion measure based on the average packet delay is minimized, subject to capacity constraints posed by the underlying facility trunks. We choose the traffic routing on the express pipes and the allocation of the bandwidth on these pipes as the variables in this problem. The previous GA-based algorithm is not practical because (1) the number of traffic distribution patterns is huge, and (2) the values of offered traffic are continuous. A new representation of the chromosome, Net-Chro, and the reproduction operator are presented. We show that the previous algorithm cannot guarantee full usage of trunk capacities in the solutions it generates. We also discuss open-loop control to overcome the congestion caused by a trunk failure.
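One way to picture a delay-based congestion measure as a GA fitness function is the classical average-delay expression for a network of independent M/M/1 links. This is an assumed model, not necessarily the one used in the paper; the function name and the flat list representation of pipe flows and capacities are hypothetical.

```python
def average_delay(flows, capacities):
    """Average packet delay under an assumed M/M/1 model for each pipe.

    T = (1 / gamma) * sum_i f_i / (C_i - f_i), where f_i is the flow on
    pipe i, C_i its allocated capacity (same units), and gamma the total
    offered traffic. Returns infinity when any pipe is saturated, so an
    infeasible allocation is never preferred by a minimizer.
    """
    gamma = sum(flows)
    if gamma <= 0:
        return 0.0
    total = 0.0
    for f, c in zip(flows, capacities):
        if f >= c:
            return float("inf")   # capacity constraint violated
        total += f / (c - f)
    return total / gamma
```

A GA would evaluate each chromosome (a routing and bandwidth allocation) by decoding it into `flows` and `capacities` and ranking candidates by this value, lower being better.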
An emerging trend in the banking industry is to digitize the check storage, processing, and transmission process. One bottleneck in this process is the extremely large size of digitized checks. A check image usually consists of a foreground overlaid on top of a background. For most banking functions, only the foreground carries useful information and should be specified accurately. The background either does not need to be retained, or can be represented with less precision, depending on the underlying banking requirements and procedures. Recognizing this special characteristic of check images, we propose a layered coding approach. The first layer consists of the binary foreground map. The second layer contains the gray or color values of the foreground pixels. The third layer retains a coarse representation of the background. The fourth layer comprises the error image between the original and the image decompressed from the first three layers. The methods for segmenting the foreground and for coding the different layers are presented. The proposed layered coding scheme can yield a more accurate representation of a check image, especially the foreground, than the JPEG baseline algorithm under the same compression ratio. Furthermore, it facilitates progressive retrieval or transmission of check images in compressed formats.
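The four-layer decomposition can be sketched on a small grayscale image stored as nested lists. The segmentation rule (pixels darker than a fixed threshold are foreground) and the single mean value used as the coarse background are simplifying assumptions for illustration; the paper's segmentation and per-layer coding methods are not reproduced here.

```python
def decode_layers(mask, fg_values, bg_level, residual=None):
    """Rebuild an image from layers 1-3, optionally adding layer 4."""
    values = iter(fg_values)
    image = [[next(values) if m else bg_level for m in mask_row]
             for mask_row in mask]
    if residual is not None:
        image = [[p + e for p, e in zip(row, err_row)]
                 for row, err_row in zip(image, residual)]
    return image

def encode_layers(image, threshold=128):
    """Split an image into (mask, foreground values, background, residual)."""
    # Layer 1: binary foreground map (assumption: dark ink on light paper).
    mask = [[1 if p < threshold else 0 for p in row] for row in image]
    pairs = [(p, m) for row, mask_row in zip(image, mask)
             for p, m in zip(row, mask_row)]
    # Layer 2: exact values of the foreground pixels, in raster order.
    fg_values = [p for p, m in pairs if m]
    # Layer 3: coarse background, here a single mean level.
    bg = [p for p, m in pairs if not m]
    bg_level = sum(bg) // len(bg) if bg else 0
    # Layer 4: error between the original and the three-layer reconstruction.
    base = decode_layers(mask, fg_values, bg_level)
    residual = [[p - b for p, b in zip(row, base_row)]
                for row, base_row in zip(image, base)]
    return mask, fg_values, bg_level, residual
```

Decoding with only the first three layers already reproduces the foreground exactly, and adding the residual restores the original, which mirrors the progressive retrieval mentioned above.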
A fuzzy hierarchical FLIR ATR is proposed which more closely models the fuzziness in the FLIR data and the human decision process than the traditional ATR methods. The target and its internal hot spots are segmented out from the background by use of an iterative volume-based morphological contrast peak extraction routine. The segmented regions are then represented by a set of silhouettes for each segmented blob rather than just the one 'best' silhouette. For the target or foundation segment, the primary recognition feature, silhouette shape, is captured by the low frequencies of the 2-D DFT of each member of the set. The hot spots are represented both by the shape features (DFT) and by positional features. The first level of this hierarchical classification system uses a Euclidean distance figure of merit for the foundation's silhouette to assign a fuzzy classification to the target. This initial guess is then adjusted based on the internal features.
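A small sketch of describing a silhouette by its low-frequency 2-D DFT coefficients and comparing two silhouettes by Euclidean distance is given below. The use of coefficient magnitudes (which makes the descriptor insensitive to translation), the choice of a k-by-k block of frequencies, and the direct summation instead of an FFT are assumptions for illustration; the fuzzy classification step is not covered.

```python
import cmath
import math

def low_freq_dft(image, k=3):
    """Magnitudes of the k-by-k lowest-frequency 2-D DFT coefficients.

    `image` is a binary silhouette as nested lists. Direct summation is
    used, which is adequate only for small images.
    """
    rows, cols = len(image), len(image[0])
    feats = []
    for u in range(k):
        for v in range(k):
            acc = 0j
            for r in range(rows):
                for c in range(cols):
                    if image[r][c]:
                        phase = -2j * math.pi * (u * r / rows + v * c / cols)
                        acc += image[r][c] * cmath.exp(phase)
            feats.append(abs(acc))
    return feats

def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

With this descriptor, a silhouette and a shifted copy of it yield a distance near zero, while silhouettes of equal area but different shape are separated by the non-DC coefficients.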