In order to fulfill the potential of fingerprint templates as the basis for authentication schemes, one needs to design a hash function for fingerprints that achieves acceptable matching accuracy and simultaneously has provable security guarantees, especially for parameter regimes that are needed to match fingerprints in practice. While existing matching algorithms can achieve impressive matching accuracy, they have no security guarantees. On the other hand, provably secure hash functions have poor matching accuracy and/or do not guarantee security when parameters are set to practical values. In this work, we present a secure hash function that has the best known tradeoff between security guarantees and matching accuracy. At a high level, our hash function is simple: we apply an off-the-shelf hash function on certain collections of minutia points (in particular, triplets of minutiae, or "minutia triangles"). However, to realize the potential of this scheme, we have to overcome certain theoretical and practical hurdles. In addition to the novel idea of combining clustering ideas from matching algorithms with ideas from the provable security of hash functions, we also apply an intermediate translation-invariant but rotation-variant map to the minutia points before applying the hash function. This latter idea helps improve the tradeoff between matching accuracy and matching efficiency.
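As a rough illustration of the idea of hashing minutia triangles after a translation-invariant map, the following minimal sketch may help. It is not the scheme from the paper: the function name `triangle_hashes`, the quantization cell size `cell`, the use of SHA-256, and the choice of "offsets relative to the first vertex" as the translation-invariant map are all assumptions made for illustration.

```python
import hashlib
from itertools import combinations

def triangle_hashes(minutiae, cell=8):
    """Hash every triplet of (x, y) minutiae after removing translation.

    `cell` is a hypothetical quantization step in pixels.
    """
    hashes = set()
    # Sorting fixes a vertex order that a pure translation does not change.
    for tri in combinations(sorted(minutiae), 3):
        x0, y0 = tri[0]
        # Offsets relative to the first vertex are unchanged by translation
        # but do change under rotation.
        rel = [((x - x0) // cell, (y - y0) // cell) for x, y in tri[1:]]
        hashes.add(hashlib.sha256(repr(rel).encode()).hexdigest())
    return hashes

# Hypothetical minutia coordinates and a translated copy of the same print.
points = [(10, 20), (40, 25), (30, 60), (70, 80)]
shifted = [(x + 5, y - 3) for x, y in points]

# Number of triangle hashes the two templates have in common.
common = len(triangle_hashes(points) & triangle_hashes(shifted))
```

In this sketch two templates would be compared by counting shared hashes. A practical scheme would also need tolerance to small distortions, which this exact-match sketch does not provide.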
We propose a Bayesian framework for keyword spotting in handwritten documents. This work is an extension of our previous work, in which we proposed a dynamic background model (DBM) for keyword spotting that takes into account the local character level scores and global word level scores to learn a logistic regression classifier to separate keywords from non-keywords. In this work, we add a Bayesian layer on top of the DBM, called the variational dynamic background model (VDBM). The logistic regression classifier uses the sigmoid function to separate keywords from non-keywords. Because the sigmoid function is neither convex nor concave, exact inference in the VDBM becomes intractable. An expectation maximization step is proposed to do approximate inference. The advantage of the VDBM over the DBM is multi-fold. Firstly, being Bayesian, it prevents over-fitting of data. Secondly, it provides better modeling of data and an improved prediction of unseen data. The VDBM is evaluated on the IAM dataset, and the results show that it outperforms our prior work and other state of the art line based word spotting systems.
In this work we place some of the traditional biometrics work on fingerprint verification via the fuzzy vault scheme within a cryptographic framework. We show that the breaking of a fuzzy vault leads to decoding of Reed-Solomon codes from random errors, which has been proposed as a hard problem in the cryptography community. We provide a security parameter for the fuzzy vault in terms of the decoding problem, which gives context for the breaking of the fuzzy vault, whereas most of the existing literature measures the strength of the fuzzy vault in terms of its resistance to pre-defined attacks or by the entropy of the vault. We keep track of our security parameter, and provide it alongside ROC statistics. We also aim to be more aware of the nature of the fingerprints when placing them in the fuzzy vault, noting that the distribution of minutiae is far from uniformly random. The results we show provide additional support that the fuzzy vault can be a viable scheme for secure fingerprint verification.
We propose a segmentation free word spotting framework using a Dynamic Background Model. The proposed
approach is an extension of our previous work, where the dynamic background model was introduced and integrated
with a segmentation based recognizer for keyword spotting. The dynamic background model uses the local
character matching scores and global word level hypotheses scores to separate keywords from non-keywords. We
integrate and evaluate this model on a Hidden Markov Model (HMM) based segmentation free recognizer which
works at line level without any need for word segmentation. We outperform the state of the art line level word
spotting system on the IAM dataset.
Handwriting styles are constantly changing over time. We approach the novel problem of estimating the approximate
age of historical handwritten documents using handwriting styles. This system will have many
applications in handwritten document processing engines where specialized processing techniques can be applied
based on the estimated age of the document. We propose to learn a distribution over styles across centuries
using Topic Models and to apply a classifier over weights learned in order to estimate the approximate age of
the documents. We present a comparison of different distance metrics such as Euclidean Distance and Hellinger
Distance within this application.
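To make the two distance metrics concrete, here is a minimal sketch that compares the Euclidean and Hellinger distances between two topic-weight vectors. The vectors `doc_a` and `doc_b` are made-up values for illustration, not data from the paper.

```python
import math

def euclidean(p, q):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def hellinger(p, q):
    """Hellinger distance between two discrete probability distributions.

    The result lies in [0, 1]; 0 means identical, 1 means disjoint support.
    """
    s = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(s) / math.sqrt(2)

# Hypothetical topic weights for two documents (each vector sums to 1).
doc_a = [0.70, 0.20, 0.10]
doc_b = [0.10, 0.20, 0.70]

d_euc = euclidean(doc_a, doc_b)
d_hel = hellinger(doc_a, doc_b)
```

Hellinger distance is defined on probability distributions, which is one reason it is a natural candidate when documents are represented by topic weights.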
State-of-the-art techniques for writer identification have been centered primarily on enhancing the performance
of the system for writer identification. Machine learning algorithms have been used extensively to improve
the accuracy of such systems, assuming a sufficient amount of data is available for training. Little attention has
been paid to the prospect of harnessing the information tapped in a large amount of un-annotated data. This
paper focuses on co-training based framework that can be used for iterative labeling of the unlabeled data
set exploiting the independence between the multiple views (features) of the data. This paradigm relaxes the
assumption of sufficiency of the data available and tries to generate labeled data from unlabeled data set along
with improving the accuracy of the system. However, performance of co-training based framework is dependent
on the effectiveness of the algorithm used for the selection of data points to be added in the labeled set. We
propose an Oracle based approach for data selection that learns the patterns in the score distribution of classes
for labeled data points and then predicts the labels (writers) of the unlabeled data point. This method for
selection statistically learns the class distribution and predicts the most probable class unlike traditional selection
algorithms which were based on heuristic approaches. We conducted experiments on the publicly available IAM
dataset and illustrate the efficacy of the proposed approach.
Document binarization is one of the initial and critical steps for many document analysis systems. Nowadays,
with the success and popularity of hand-held devices, large efforts are motivated to convert documents into
digital format by using hand-held cameras. In this paper, we propose a Bayesian based maximum a posteriori
(MAP) estimation algorithm to binarize the camera-captured document images. A novel adaptive segmentation
surface estimation and normalization method is proposed as the preprocessing step in our work, followed by
a Markov Random Field based refinement procedure to remove noise and smooth the binarized result. Experimental
results show that our method has better performance than other algorithms on bad or uneven illumination
Biomedical images are often referenced for clinical decision support (CDS), educational purposes, and research. They
appear in specialized databases or in biomedical publications and are not meaningfully retrievable using primarily text-based
retrieval systems. The task of automatically finding the images in an article that are most useful for the purpose of
determining relevance to a clinical situation is quite challenging. An approach is to automatically annotate images
extracted from scientific publications with respect to their usefulness for CDS. As an important step toward achieving
the goal, we proposed figure image analysis for localizing pointers (arrows, symbols) to extract regions of interest (ROI)
that can then be used to obtain meaningful local image content. Content-based image retrieval (CBIR) techniques can
then associate local image ROIs with identified biomedical concepts in figure captions for improved hybrid (text and
image) retrieval of biomedical articles.
In this work we present methods that make robust our previous Markov random field (MRF)-based approach for pointer
recognition and ROI extraction. These include use of Active Shape Models (ASM) to overcome problems in recognizing
distorted pointer shapes and a region segmentation method for ROI extraction.
We measure the performance of our methods on two criteria: (i) effectiveness in recognizing pointers in images, and (ii)
improved document retrieval through use of extracted ROIs. Evaluation on three test sets shows 87% accuracy in the
first criterion. Further, the quality of document retrieval using local visual features and text is shown to be better than
using visual features alone.
In this paper we present a prototype for an automated deception detection system. Similar to polygraph examinations, we
attempt to take advantage of the theory that false answers will produce distinctive measurements in certain physiological
manifestations. We investigate the role of dynamic eye-based features such as eye closure/blinking and lateral movements
of the iris in detecting deceit. The features are recorded both when the test subjects are having non-threatening conversations
as well as when they are being interrogated about a crime they might have committed. The rates of the behavioral
changes are blindly clustered into two groups. Examining the clusters and their characteristics, we observe that the dynamic
features selected for deception detection show promising results with an overall deceptive/non-deceptive prediction
rate of 71.43% from a study consisting of 28 subjects.
Biomedical images are invaluable in establishing diagnosis, acquiring technical skills, and implementing best practices in
many areas of medicine. At present, images needed for instructional purposes or in support of clinical decisions appear in
specialized databases and in biomedical articles, and are often not easily accessible to retrieval tools. Our goal is to
automatically annotate images extracted from scientific publications with respect to their usefulness for clinical decision
support and instructional purposes, and project the annotations onto images stored in databases by linking images
through content-based image similarity.
Authors often use text labels and pointers overlaid on figures and illustrations in the articles to highlight regions of
interest (ROI). These annotations are then referenced in the caption text or figure citations in the article text. In previous
research we have developed two methods (a heuristic and dynamic time warping-based methods) for localizing and
recognizing such pointers on biomedical images. In this work, we add robustness to our previous efforts by using a
machine learning based approach to localizing and recognizing the pointers. Identifying these can assist in extracting
relevant image content at regions within the image that are likely to be highly relevant to the discussion in the article
text. Image regions can then be annotated using biomedical concepts from extracted snippets of text pertaining to images
in scientific biomedical articles that are identified using National Library of Medicine's Unified Medical Language
System® (UMLS) Metathesaurus. The resulting regional annotation and extracted image content are then used as indices
for biomedical article retrieval using the multimodal features and region-based content-based image retrieval (CBIR)
techniques. The hypothesis that such an approach would improve biomedical document retrieval is validated through
experiments on an expert-marked biomedical article dataset.
This paper describes a system for script identification of handwritten
word images. The system is divided into two main
phases, training and testing. The training phase performs a
moment based feature extraction on the training word images
and generates their corresponding feature vectors. The testing
phase extracts moment features from a test word image
and classifies it into one of the candidate script classes using
information from the trained feature vectors. Experiments
are reported on handwritten word images from three scripts:
Latin, Devanagari and Arabic. Three different classifiers are
evaluated over a dataset consisting of 12000 word images in
training set and 7942 word images in testing set. Results show
significant strength in the approach with all the classifiers having
a consistent accuracy of over 97%.
Homeland security requires technologies capable of positive and reliable identification of humans for law enforcement,
government, and commercial applications. As artificially intelligent agents improve in their abilities and become a part
of our everyday life, the possibility of using such programs for undermining homeland security increases. Virtual
assistants, shopping bots, and game playing programs are used daily by millions of people. We propose applying
statistical behavior modeling techniques developed by us for recognition of humans to the identification and verification
of intelligent and potentially malicious software agents. Our experimental results demonstrate feasibility of such methods
for both artificial agent verification and even for recognition purposes.
For the domain of strategy-based behavioral biometrics we propose the concept of profiles enhanced with spatial, temporal and contextual information. Inclusion of such information leads to a more stable baseline profile and, as a result, more secure systems. Such enhanced data is not always readily available and is often time consuming and expensive to acquire. One solution to this problem is the use of artificially generated data. In this paper a novel methodology for the creation of feature-level synthetic biometric data is presented. Specifically, generation of behavioral biometric data represented by game playing strategies is demonstrated. Data validation methods are described and encouraging results are obtained, with the possibility of extending the proposed methodologies to the generation of artificial data in domains other than behavioral biometrics.
Statistical modeling of biometric systems at the score level is extremely important. It is the foundation of the
performance assessment of biometric systems including determination of confidence intervals and test sample
size for simulations, and performance prediction of real world systems. Statistical modeling of multimodal
biometric systems allows the development of a methodology to integrate information from multiple biometric
sources. We present a novel approach for estimating the marginal biometric matching score distributions by
using extreme value theory in conjunction with non-parametric methods. Extreme Value Theory (EVT) is based
on the modeling of extreme events represented by data which has abnormally low or high values in the tails of the
distributions. Our motivation stems from the observation that the tails of the biometric score distributions are
often difficult to estimate using other methods due to lack of sufficient numbers of training samples. However,
good estimates of the tails of biometric distributions are essential for defining the decision boundaries. We
present EVT based novel procedures for fitting a score distribution curve. A general non-parametric method is
used for fitting the majority part of the distribution curve, and a parametric EVT model - the generalized Pareto
distribution - is used for fitting the tails of the curve. We also demonstrate the advantage of applying the EVT
Rather than use arbitrary matching threshold values and a heuristic set of features while comparing minutiae points
during the fingerprint verification process, we develop a system which considers only the optimal features, which
contain the highest discriminative power, from a predefined feature set. For this, we use a feature selection algorithm
which adds features, one at a time, till it arrives at an optimal feature set of the target size. The classifier is trained on this
feature set, on a two class problem representing pairs of matched minutiae points belonging to fingerprints of same and
different users. During the test phase, the system generates a number of candidate matched minutiae pairs; features from
each of them are extracted and given to the classifier. Those that are incorrectly matched are eliminated from the scoring
algorithm. We have developed a set of seven candidate features, and tested our system using the FVC 2002 DB1
fingerprint database. We study how feature sets of different sizes affect the accuracy of the system, and observe how
additional features would not necessarily improve the performance of a classifier. This is illustrated by how using a 3
feature set gives us the most accurate system, while using bigger feature sets causes a slight drop in accuracy.
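The one-feature-at-a-time selection described above can be sketched as a greedy forward search. This is a minimal sketch, not the authors' implementation: the feature names, the `toy_score` function and its numbers are all hypothetical stand-ins for a real classifier's validation accuracy.

```python
def forward_select(features, score, target_size):
    """Greedy sequential forward selection.

    features: candidate feature names.
    score: callable mapping a list of feature names to a number
           (higher is better), e.g. validation accuracy.
    target_size: number of features to select.
    Returns a list of (subset, score) pairs, one per added feature.
    """
    selected = []
    remaining = list(features)
    history = []
    while remaining and len(selected) < target_size:
        # Try each remaining feature and keep the one that helps most.
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
        history.append((list(selected), score(selected)))
    return history

# Toy scoring function (an assumption, not measured data): three features
# carry signal, and every other feature slightly hurts the score.
USEFUL = {"distance": 0.30, "angle": 0.20, "ridge_count": 0.10}

def toy_score(subset):
    gain = sum(USEFUL.get(f, 0.0) for f in subset)
    penalty = 0.02 * sum(1 for f in subset if f not in USEFUL)
    return 0.5 + gain - penalty

candidates = ["distance", "angle", "ridge_count", "f4", "f5"]
history = forward_select(candidates, toy_score, target_size=5)
best_subset, best_score = max(history, key=lambda h: h[1])
```

Recording the score after each addition makes it possible to see where extra features stop helping, which is the kind of effect the abstract reports for feature sets larger than three.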
The problem of form classification is to assign a single-page form image to one of a set of predefined form types or classes.
We classify the form images using low level pixel density information from the binary images of the documents. In this
paper, we solve the form classification problem with a classifier based on the k-means algorithm, supported by adaptive
boosting. Our classification method is tested on the NIST scanned tax forms databases (special forms databases 2 and 6)
which include machine-typed and handwritten documents. Our method improves the performance over published results
on the same databases, while still using a simple set of image features.
Transcript mapping or text alignment with handwritten documents is the automatic alignment of words in a text file with word images in a handwritten document. Such a mapping has several applications in fields ranging from machine learning where large quantities of truth data are required for evaluating handwriting recognition algorithms, to data mining where word image indexes are used in ranked retrieval of scanned documents in a digital library. The alignment also aids "writer identity" verification algorithms. Interfaces which display scanned handwritten documents may use this alignment to highlight manuscript tokens when a person examines the corresponding transcript word. We propose an adaptation of the True DTW dynamic programming algorithm for English handwritten documents. The integration of the dissimilarity scores from a word-model word recognizer and Levenshtein distance between the recognized word and
lexicon word, as a cost metric in the DTW algorithm, leading to a fast and accurate alignment, is our primary contribution. The results provided confirm the effectiveness of our approach.
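The following minimal sketch shows how an edit distance can serve as the local cost inside a dynamic time warping alignment of transcript words to recognized words. It uses plain DTW with Levenshtein distance only; the True DTW adaptation and the recognizer dissimilarity scores mentioned above are not modeled, and the function names are illustrative.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (unit costs)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def dtw_align(transcript, recognized):
    """Align two word sequences with classic DTW.

    Returns (total_cost, path), where path is a list of (i, j) pairs
    of indices into transcript and recognized.
    """
    n, m = len(transcript), len(recognized)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = levenshtein(transcript[i - 1], recognized[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1],
                                 cost[i - 1][j],
                                 cost[i][j - 1])
    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path

# Hypothetical transcript and recognizer output with one misread word.
total, path = dtw_align(["the", "quick", "brown", "fox"],
                        ["the", "quick", "brwn", "fox"])
```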
This paper describes an OCR-based technique for word
spotting in Devanagari printed documents. The system
accepts a Devanagari word as input and returns a sequence
of word images that are ranked according to their
similarity with the input query. The methodology involves
line and word separation, pre-processing document
words, word recognition using OCR and similarity
matching. We demonstrate a Block Adjacency Graph
(BAG) based document cleanup in the pre-processing
phase. During word recognition, multiple recognition hypotheses
are generated for each document word using a
font-independent Devanagari OCR. The similarity matching
phase uses a cost based model to match the word
input by a user and the OCR results. Experiments are
conducted on document images from the publicly available
ILT and Million Book Project dataset. The technique
achieves an average precision of 80% for 10 queries and
67% for 20 queries for a set of 64 documents containing
5780 word images. The paper also presents a comparison
of our method with template-based word spotting techniques.
To compensate for the different orientations of two fingerprint images, matching systems use a reference point and a set
of transformation parameters. Fingerprint minutiae are compared on their positions relative to the reference points, using
a set of thresholds for the various matching features. However, a pair of minutiae might have similar values for some of
the features compensated by dissimilar values for others; this tradeoff cannot be modeled by arbitrary thresholds, and
might lead to a number of false matches. Instead, given a list of potential correspondences of minutiae points, we could
use a static classifier, such as a support vector machine (SVM) to eliminate some of the false matches. A 2-class model is
built using sets of minutiae correspondences from fingerprint pairs known to belong to the same and different users. For
a test pair of fingerprints, a similar set of minutiae correspondences is extracted and given to the recognizer, using only
those classified as genuine matches to calculate the similarity score, and thus, the matching result. We have built
recognizers using different combinations of fingerprint features and have tested them against the FVC 2002 database.
Using this recognizer reduces the number of false minutiae matches by 19%, while only 5% of the minutiae pairs
corresponding to fingerprints of the same user are rejected. We study the effect of such a reduction on the final error rate,
using different scoring schemes.
Quality of a biometric system is directly related to the performance of the dissimilarity measure function. Frequently a
generalized dissimilarity measure function such as Mahalanobis distance is applied to the task of matching biometric
feature vectors. However, often accuracy of a biometric system can be greatly improved by introducing a customized
matching algorithm optimized for a particular biometric. In this paper we investigate two tailored similarity measure
functions for behavioral biometric systems based on the expert knowledge of the data in the domain. We compare
performance of proposed matching algorithms to that of other well known similarity distance functions and demonstrate
superiority of one of the new algorithms with respect to the chosen domain.
Handwriting recognition research requires large databases of word images each of which is labeled with the word it contains. Full images scanned in, however, usually contain sentences or paragraphs of writing. The creation of labeled databases of images of isolated words is usually tedious, requiring a person to drag a rectangle around each word in the full image and type in the label. Transcript mapping is the automatic alignment of words in a text file with word locations in the full image. It can ease the creation of databases for research. We propose the first transcript mapping method for handwritten Arabic documents. Our approach is based on Dynamic Time Warping (DTW) and offers two primary algorithmic contributions. First is an extension to DTW that uses true distances when mapping multiple entries from one series to a single entry in the second series. Second is a method to concurrently map elements of a partially aligned third series within the main alignment. Preliminary results are provided.
Behavior based intrusion detection is a frequently used approach for ensuring network security. We extend the behavior based intrusion detection approach to the new domain of game networks. Specifically, our research shows that a unique behavioral biometric can be generated based on the strategy used by an individual to play a game. We wrote software capable of automatically extracting behavioral profiles for each player in a game of Poker. Once a behavioral signature is generated for a player, it is continuously compared against the player's current actions. Any significant deviations in behavior are reported to the game server administrator as potential security breaches. Our algorithm addresses a well-known problem of user verification and can be re-applied to fields beyond game networks, such as operating systems and non-game network security.
Biometric identification has emerged as a reliable means of controlling access to both physical and virtual spaces. Fingerprints, face and voice biometrics are being increasingly used as alternatives to passwords, PINs and visual verification. In spite of the rapid proliferation of large-scale databases, the research has thus far been focused only on accuracy within small databases. In larger applications, response time and retrieval efficiency also become important in addition to accuracy. Unlike structured information such as text or numeric data that can be sorted, biometric data does not have any natural sorting order. Therefore indexing and binning of biometric databases represents a challenging problem. We present results using parallel combination of multiple biometrics to bin the database. Using hand geometry and signature features we show that the search space can be reduced to just 5% of the entire database.
Most current on-line signature verification systems are pen-based,
i.e., the signature is written by a pen on a digital tablet. To
apply signature verification techniques to Internet applications,
one desirable way is to use mouse as the input device. In this
paper, a mouse based on-line signature verification system is
presented. The signature data is represented in a compact and
secure way. The enrollment is convenient for users and the
verification can be real-time. The system architecture is a simple
client-server structure that suits the common internet
applications. The client side posts a sequence X+Y to the
server. That is, the raw signature sequence [X,Y] is
represented by X+Y (the dot sum of the X- and Y-coordinate
sequences). Experimental results show that X+Y is effective for
verification. Even if the sequence X+Y is intercepted during
transmission over the Internet, the original shape of the signature
[X,Y] cannot be recovered. To make the system easy to implement, the
client side does nothing complex except posting X+Y to the
server. The server side preprocesses the sequence and saves it to
the database during enrollment or matches it against the claimed
template during verification. The matching technique adopted here
is the similarity measure ER². Simulation results show that the mouse based verification system is reliable and secure for internet applications.
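A small sketch of the X+Y representation, assuming that "dot sum" means the element-wise sum of the two coordinate sequences. The stroke coordinates below are made up. One property of this representation is that distinct coordinate sequences can share the same sum, so the sum alone does not determine the original points.

```python
def xy_sum(points):
    """Collapse a list of (x, y) positions into the 1-D sequence x + y."""
    return [x + y for x, y in points]

# Two different hypothetical mouse strokes.
stroke_a = [(10, 5), (12, 7), (15, 9)]
stroke_b = [(5, 10), (9, 10), (20, 4)]

seq_a = xy_sum(stroke_a)
seq_b = xy_sum(stroke_b)

# Different shapes, identical transmitted sequence.
same_sequence = seq_a == seq_b
```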
This paper presents a new document binarization algorithm for camera
images of historical documents, which are especially found in The
Library of Congress of the United States. The algorithm uses a
background light intensity normalization algorithm to enhance an
image before a local adaptive binarization algorithm is applied. The
image normalization algorithm uses an adaptive linear or non-linear
function to approximate the uneven background of the image due to
the uneven surface of the document paper, aged color or uneven light
source of the cameras for image lifting. Our algorithm adaptively
captures the background of a document image with a "best fit"
approximation. The document image is then normalized with respect to
the approximation before a thresholding algorithm is applied. The
technique works for both gray scale and color historical handwritten
document images with significant improvement in readability for both
humans and OCR.
Most challenges posed by handwritten text are either more severe than, or not encountered in, machine-printed text. In contrast to the traditional role of handwriting recognition in various applications, we explore a different perspective inspired by these challenges and introduce new applications based on security systems and HIP. Human Interactive Proofs (HIP) emerged as a very active research area that has focused on defending online services against abusive attacks. The approach uses a set of security protocols based on automatic reverse Turing tests, which virtually all humans can pass but current computer programs cannot. In our paper we explore the fact that some recognition tasks are significantly harder for machines than for humans and describe a HIP algorithm that exploits the gap in ability between humans and computers in reading handwritten text images. We also present several promising applications of HIP for cyber security.
Despite advances in fingerprint identification techniques, matching incomplete or partial fingerprints still poses a difficult challenge. While the introduction of compact silicon chip-based sensors that capture only a part of the fingerprint area has made this problem important from a commercial perspective, there is also considerable interest in the topic for processing partial and latent fingerprints obtained at crime scenes. Attempts to match partial fingerprints using singular ridge structures-based alignment techniques fail when the partial print does not include such structures (e.g., core or delta). We present a multi-path fingerprint matching approach that utilizes localized secondary features derived using only the relative information of minutiae. Since the minutia-based fingerprint representation is an ANSI-NIST standard, our approach has the advantage of being directly applicable to already existing databases. We also analyze the vulnerability of partial fingerprint identification systems to brute force attacks. The described matching approach has been tested on FVC2002’s DB1 database. The experimental results show that our approach achieves an equal error rate of 1.25% and a total error rate of 1.8% (with FAR at 0.2% and FRR at 1.6%).
The performance of any fingerprint recognizer depends heavily on the fingerprint image quality. Different types of noise in fingerprint images pose difficulties for recognizers. Most Automatic Fingerprint Identification Systems (AFIS) use some form of image enhancement. Although several methods have been described in the literature, there is still scope for improvement. In particular, an effective methodology for cleaning the valleys between the ridge contours is lacking. We observe that noisy valley pixels and the pixels in the interrupted ridge flow gap are "impulse noises". Therefore, this paper describes a new approach to fingerprint image enhancement, which is based on the integration of an anisotropic filter and a directional median filter (DMF). Gaussian-distributed noise is reduced effectively by the anisotropic filter, while "impulse noises" are reduced efficiently by the DMF. The traditional median filter is usually the most effective method for removing pepper-and-salt noise and other small artifacts; the proposed DMF can not only perform these original tasks but can also join broken fingerprint ridges, fill holes in fingerprint images, smooth irregular ridges, and remove some annoying small artifacts between ridges. The enhancement algorithm has been implemented and tested on fingerprint images from FVC2002. Images of varying quality have been used to evaluate the performance of our approach. We have compared our method with other methods described in the literature in terms of matched minutiae, missed minutiae, spurious minutiae, and flipped minutiae (between end points and bifurcation points). Experimental results show our method to be superior to those described in the literature.
A new multiple level classification method is introduced. With an available feature set, classification can be done in several steps. After the first step of classification using the full feature set, a high confidence recognition result ends the recognition process. Otherwise, a secondary classification, designed using a partial feature set and the information available from the earlier classification step, helps classify the input further. In comparison with existing methods, our method is aimed at increasing recognition accuracy and reliability. A feature selection mechanism based on genetic algorithms is employed to select important features that provide maximum separability between the classes under consideration. These features are then used to get a sharper decision on fewer classes in the secondary classification. The full feature set is still used in the earlier classification to retain complete information. No features are discarded, as they would be in the feature selection methods described in most related publications.
Foreign language materials on the web are growing at a faster rate than English language materials and it has been predicted that by the end of 1999 the amount of non-English resources on the internet will exceed English resources. A significant portion of the non-English material is in the form of images.
Determining the readability of documents is an important task. Human readability pertains to the scenario in which a document image is ultimately presented to a human to read. Machine readability pertains to the scenario in which the document is subjected to an OCR process. In either case, poor image quality might render a document unreadable. A document image that is human readable is often not machine readable. It is often advisable to filter out documents of poor image quality before sending them to either a machine or a human for reading. This paper is about the design of such a filter. We describe various factors that affect document image quality, and the accuracy with which metrics based on document image quality can predict the extent of human and machine readability. We illustrate the interdependence of image quality measurement and enhancement by means of two applications that have been implemented: (1) reading handwritten addresses on mailpieces and (2) reading handwritten U.S. Census forms. We also illustrate the degradation of OCR performance as a function of image quality. On an experimental test set of 517 document images, the image quality metric (measuring fragmentation due to poor binarization) correctly predicted 90% of the time that certain documents were of poor quality (fragmented characters) and hence not machine readable.
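One simple way to quantify fragmentation in a binarized image is sketched below. The definition used here (the fraction of 8-connected foreground components smaller than `min_size` pixels) is an assumption chosen for illustration; the abstract does not specify how the actual metric is computed, and the function names are hypothetical.

```python
from collections import deque


def component_sizes(binary):
    """Sizes of the 8-connected foreground (value 1) components of an image.

    `binary` is a list of rows of 0/1 ints.
    """
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    sizes = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            seen[y][x] = True
            queue = deque([(y, x)])
            size = 0
            while queue:
                cy, cx = queue.popleft()
                size += 1
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            sizes.append(size)
    return sizes


def fragmentation_score(binary, min_size=4):
    """Fraction of components smaller than `min_size`; 0.0 for an empty image."""
    sizes = component_sizes(binary)
    if not sizes:
        return 0.0
    return sum(s < min_size for s in sizes) / len(sizes)


clean = [[1, 1, 1, 0, 0],
         [1, 1, 1, 0, 0],
         [0, 0, 0, 0, 0]]
broken = [[1, 0, 1, 0, 1],
          [0, 0, 0, 0, 0],
          [1, 0, 1, 0, 1]]
```

A filter of the kind described could then reject images whose score exceeds a threshold tuned on labelled data; that threshold is not given in the source.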
The holistic paradigm in HWR has been applied to recognition scenarios involving small, static lexicons, such as the check amount recognition task. In this paper, we explore the possibility of using holistic information for lexicon reduction when the lexicons are large or dynamic, and training, in the traditional sense of learning decision surfaces from training samples of each class, is not viable. Two experimental lexicon reduction methods are described. The first uses perceptual features such as ascenders, descenders and length and achieves consistent reduction performance with cursive, discrete and mixed writing styles. A heuristic feature-synthesis algorithm is used to 'predict' holistic features of lexicon entries, which are matched against image features using a constrained bipartite graph matching scheme. With essentially unconstrained handwritten words, this system achieves reduction of 50% with less than 2% error. More effective reduction can be achieved if the problem can be constrained by making assumptions about the nature of input. The second classifier described operates on pure cursive script and achieves effective reduction of large lexicons of the order of 20,000 entries. Downstrokes are extracted from the contour representation of cursive words by grouping local extrema using a small set of heuristic rules. The relative heights of downstrokes are captured in a string descriptor that is syntactically matched with lexicon entries using a set of production rules. In initial tests, the system achieved high reduction (99%) at the expense of accuracy (75%).
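The first reduction method can be illustrated with a deliberately simplified sketch. It replaces the constrained bipartite graph matching of the paper with a plain tolerance test on counts, and the letter sets, the tolerances `tol` and `len_tol`, and the function names are assumptions made for illustration (for instance, a cursive "f" may also show a descender, and capitals are ignored here).

```python
# Assumed letter classes for lower-case print-style writing.
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")


def predict_features(word):
    """Predict (ascender count, descender count, length) for a lexicon entry."""
    w = word.lower()
    return (
        sum(ch in ASCENDERS for ch in w),
        sum(ch in DESCENDERS for ch in w),
        len(w),
    )


def reduce_lexicon(image_features, lexicon, tol=1, len_tol=2):
    """Keep lexicon entries whose predicted features are close to the image's.

    `image_features` is the (ascenders, descenders, length) triple estimated
    from the word image by some upstream feature extractor (not shown).
    """
    asc, desc, length = image_features
    kept = []
    for word in lexicon:
        a, d, n = predict_features(word)
        if abs(a - asc) <= tol and abs(d - desc) <= tol and abs(n - length) <= len_tol:
            kept.append(word)
    return kept


lexicon = ["buffalo", "amherst", "erie", "happy"]
exact = reduce_lexicon((4, 0, 7), lexicon)   # features measured without error
noisy = reduce_lexicon((3, 0, 8), lexicon)   # one ascender missed, length overestimated
```

The tolerances trade reduction against error: widening them keeps the true word more often but prunes fewer entries, which mirrors the reduction/accuracy tradeoff reported in the abstract.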
Efficient image handling in handwritten document recognition is an important research issue in real-time applications. This paper presents image manipulation procedures for a fast handwritten word recognizer, including pre-processing, segmentation, and feature extraction, all implemented using the chain code representation. Pre-processing includes noise removal, slant correction, and smoothing of contours. The slant angle is estimated by averaging the orientation angles of vertical strokes. Smoothing removes jaggedness on contours. Segmentation points are determined using ligatures and concavity features. The average stroke width of an image is used in an adaptive fashion to locate ligatures. Concavities are located by examining slope changes in contours. Feature extraction efficiently converts a segment into feature vectors. Experimental results demonstrate the efficiency of the algorithms developed. Three thousand word images captured from real mail pieces, with an average size of 217 by 82, are used in the experiments. The average processing times for pre-processing, segmentation, and feature extraction are 10, 15, and 34 msec, respectively, on a single Sparc 10.
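The slant estimation step can be sketched as below. The sketch assumes that candidate strokes have already been extracted as Freeman 8-direction chain codes, that a stroke counts as "vertical" when it deviates from the vertical by less than `max_dev_deg`, and that an unweighted mean is used; none of these details is stated in the abstract, and `estimate_slant` is a hypothetical name.

```python
import math

# Freeman 8-direction chain code steps as (dx, dy), with y pointing up:
# 0 = east, 1 = north-east, 2 = north, ..., 7 = south-east.
FREEMAN = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]


def estimate_slant(strokes, max_dev_deg=45.0):
    """Mean slant in degrees from vertical (positive = leaning right).

    `strokes` is a list of chain-code sequences. Strokes that are not
    roughly vertical are ignored; returns 0.0 if none qualify.
    """
    angles = []
    for codes in strokes:
        dx = sum(FREEMAN[c][0] for c in codes)
        dy = sum(FREEMAN[c][1] for c in codes)
        if dy < 0:
            # Orient every stroke upward so direction of travel does not matter.
            dx, dy = -dx, -dy
        if dx == 0 and dy == 0:
            continue
        angle = math.degrees(math.atan2(dx, dy))
        if abs(angle) < max_dev_deg:
            angles.append(angle)
    return sum(angles) / len(angles) if angles else 0.0


upright = estimate_slant([[2, 2, 2, 2]])
leaning = estimate_slant([[1, 2, 1, 2]])   # net displacement (2, 4)
```

Slant correction would then shear the image by the negative of the estimated angle; working on chain codes rather than pixels keeps this step cheap, which is consistent with the efficiency focus of the paper.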