A nonparametric inferential statistical data analysis is presented. The utility of this method is
demonstrated through analyzing results from minutiae exchange with two-finger fusion. The
analysis focused on high-accuracy vendors and two modes of matching standard fingerprint templates: 1) Native Matching - where the same vendor provides both the templates and the matcher,
and 2) Scenario 1 Interoperability - where vendor A's enrollment template is matched to vendor B's authentication template using vendor B's matcher. The purpose of this analysis is to make inferences about the underlying population from sample data, which provide insights at an
aggregate level. This is very different from the data analysis presented in the MINEX04 report
in which vendors are individually ranked and compared. Using the nonparametric bootstrap
bias-corrected and accelerated (BCa) method, 95 % confidence intervals are computed for each
mean error rate. Nonparametric significance tests are then applied to further determine if the
difference between two underlying populations is real or due to chance, with a stated probability. Results from this method show that, at a greater-than-95 % confidence level, there is a significant degradation in the accuracy of Scenario 1 Interoperability with respect to Native Matching; on average, the difference can amount to a two-fold increase in False Non-Match Rate. Additionally, it is shown why two-finger fusion using the sum rule is more accurate than single-finger matching under the same conditions. Results of a simulation are also presented to show the significance of confidence intervals derived from small sample sizes, as few as six error rates in some of our cases.
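As an illustration of the interval-estimation step described above, the following is a minimal sketch of a BCa bootstrap confidence interval for a mean error rate. It uses scipy.stats.bootstrap and purely hypothetical FNMR values; neither the library choice nor the numbers come from the MINEX04 analysis itself.

```python
# Minimal sketch: BCa bootstrap confidence interval for a mean error rate.
# Uses scipy.stats.bootstrap (SciPy >= 1.7); the sample values below are
# illustrative placeholders, not data from the MINEX04 analysis.
import numpy as np
from scipy import stats

# Hypothetical False Non-Match Rates from six matchers (one per vendor pairing).
fnmr_samples = np.array([0.011, 0.014, 0.009, 0.021, 0.016, 0.012])

res = stats.bootstrap(
    (fnmr_samples,),          # data passed as a one-element sequence
    np.mean,                  # statistic of interest: the mean error rate
    confidence_level=0.95,
    n_resamples=9999,
    method="BCa",             # bias-corrected and accelerated interval
    random_state=0,
)

print(f"mean FNMR: {fnmr_samples.mean():.4f}")
print(f"95% BCa CI: [{res.confidence_interval.low:.4f}, "
      f"{res.confidence_interval.high:.4f}]")
```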
This paper presents an R&D framework used by the National Institute of Standards and Technology (NIST) for biometric technology testing and evaluation. The focus of this paper is on fingerprint-based verification and identification. Since 9-11, the NIST Image Group has been mandated by Congress to run a program for biometric technology assessment and biometric systems certification. Four essential areas of activity are discussed: 1) developing test datasets, 2) conducting performance assessments, 3) developing technology, and 4) participating in standards. A description of activities and accomplishments is provided for each of these areas. In the process, methods of performance testing are described and results from specific biometric technology evaluations are presented. This framework is anticipated to have broad applicability to other technology and application domains.
This paper discusses survey data collected as a result of planning a project to evaluate document recognition and information retrieval technologies. In the process of establishing the project, a Request for Comment (RFC) was widely distributed throughout the document recognition and information retrieval research and development (R&D) communities, and based on the responses, the project was discontinued. The purpose of this paper is to present `real' data collected from the R&D communities regarding a `real' project, so that we may all form our own conclusions about where we are, where we are heading, and how we are going to get there. Background on the project is provided and responses to the RFC are summarized.
A new, fully-automated process has been developed at NIST to derive ground truth for document images. The method involves matching optical character recognition (OCR) results from a page with typesetting files for an entire book. Public domain software used to derive the ground truth is provided in the form of Perl scripts and C source code, and includes new, more efficient string alignment technology and a word-level scoring package. With this ground truthing technology, it is now feasible to produce much larger data sets, at much lower cost, than was ever possible with previous labor-intensive, manual data collection projects. Using this method, NIST has produced a new document image database for evaluating Document Analysis and Recognition technologies and Information Retrieval systems. The database produced contains scanned images, SGML-tagged ground truth text, commercial OCR results, and image quality assessment results for pages published in the 1994 Federal Register. These data files are useful in a wide variety of experiments and research. There were roughly 250 issues, comprised of nearly 69,000 pages, published in the Federal Register in 1994. This volume of the database contains the pages of 20 books published in January of that year. In all, there are 4711 page images provided, with 4519 of them having corresponding ground truth. This volume is distributed on two ISO-9660 CD-ROMs. Future volumes may be released, depending on the level of interest.
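The ground-truthing method above hinges on aligning OCR output against typesetting text. The sketch below is an illustrative word-level dynamic-programming alignment in Python; it is not the NIST Perl/C implementation, and the example strings are invented.

```python
# Minimal sketch of word-level string alignment, in the spirit of aligning
# OCR output against typesetting text to derive ground truth.  This is an
# illustrative dynamic-programming alignment, not the NIST Perl/C code.
def align_words(ocr, truth):
    """Return a list of (ocr_word, truth_word, op) tuples, where op is
    'match', 'sub', 'ins' (extra OCR word), or 'del' (missed truth word)."""
    n, m = len(ocr), len(truth)
    # cost[i][j] = minimum edit cost aligning ocr[:i] with truth[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = cost[i - 1][j - 1] + (ocr[i - 1] != truth[j - 1])
            cost[i][j] = min(sub, cost[i - 1][j] + 1, cost[i][j - 1] + 1)
    # Trace back to recover the alignment.
    out, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i-1][j-1] + (ocr[i-1] != truth[j-1]):
            out.append((ocr[i-1], truth[j-1], "match" if ocr[i-1] == truth[j-1] else "sub"))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i-1][j] + 1:
            out.append((ocr[i-1], None, "ins"))
            i -= 1
        else:
            out.append((None, truth[j-1], "del"))
            j -= 1
    return out[::-1]

# Example: one OCR substitution ("Federa1" vs "Federal") and one missed word.
print(align_words("the Federa1 Register".split(), "the Federal Register 1994".split()))
```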
Building upon the utility of connected components, NIST has designed a new character segmentor based on statistically modeling the style of a person's handwriting. Simple spatial features capture the characteristics of a particular writer's style of handprint, enabling the new method to maintain a traditional character-level segmentation philosophy without the integration of recognition or the use of oversegmentation and linguistic postprocessing. Estimates for stroke width and character height are used to compute aspect ratio and standard stroke count features that adapt to the writer's style at the field level. The new method has been developed with a predetermined set of fuzzy rules making the segmentor much less fragile and much more adaptive, and the new method successfully reconstructs fragmented characters as well as splits touching characters. The new segmentor was integrated into the NIST public domain form-based handprint recognition systems and then tested on a set of 490 handwriting sample forms found in NIST Special Database 19. When compared to a simple component-based segmentor, the new adaptable method improved the overall recognition of handprinted digits by 3.4 percent and field level recognition by 6.9 percent, while effectively reducing deletion errors by 82 percent. The same program code and set of parameters successfully segments sequences of uppercase and lowercase characters without any context-based tuning. While not as dramatic as digits, the recognition of uppercase and lowercase characters improved by 1.7 percent and 1.3 percent respectively. The segmentor maintains a relatively straightforward and logical process flow avoiding convolutions of encoded exceptions as is common in expert systems. As a result, the new segmentor operates very efficiently, and throughput as high as 362 characters per second can be achieved. Letters and numbers are constructed from a predetermined configuration of a relatively small number of strokes. Results in this paper show that capitalizing on this knowledge through the use of simple adaptable features can significantly improve segmentation, whereas recognition-based and oversegmentation methods fail to take advantage of these intrinsic qualities of handprinted characters.
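To make the field-level adaptation concrete, here is a rough sketch of how per-writer statistics over connected components could drive segmentation decisions. The scipy.ndimage calls, the fragment/touching thresholds (0.4 and 1.6), and the crude stroke-width estimate are all illustrative assumptions, not the features or rules of the NIST segmentor.

```python
# Illustrative sketch (not the NIST segmentor): use field-level statistics of
# connected components to decide which boxes look like fragments, single
# characters, or touching characters.
import numpy as np
from scipy import ndimage

def classify_components(field_image):
    """field_image: 2-D binary array (1 = ink).  Returns (slices, labels, stats)."""
    labeled, n = ndimage.label(field_image)
    slices = ndimage.find_objects(labeled)
    heights = np.array([s[0].stop - s[0].start for s in slices])
    widths = np.array([s[1].stop - s[1].start for s in slices])
    # Field-level estimates adapt to this writer's style.
    char_height = np.median(heights)
    # Crude stroke-width estimate: ink area divided by the component's extent.
    areas = ndimage.sum_labels(field_image, labeled, index=range(1, n + 1))
    stroke_width = np.median(areas / np.maximum(heights + widths, 1))
    labels = []
    for h, w in zip(heights, widths):
        if h < 0.4 * char_height and w < 0.4 * char_height:
            labels.append("fragment")   # likely a broken piece of a character
        elif w > 1.6 * char_height:
            labels.append("touching")   # wider than one character: split it
        else:
            labels.append("single")
    return slices, labels, (char_height, stroke_width)
```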
A public domain optical character recognition (OCR) system has been developed by the National Institute of Standards and Technology (NIST). This standard reference form-based handprint recognition system is designed to provide a baseline of performance
on an open application. The system's source code, training data, performance assessment tools, and type of forms processed are all publicly available. The system is modular, allowing for system component
testing and comparisons, and it can be used to validate training and testing sets in an end-to-end application. The system's source code is written in C and will run on virtually any UNIX-based computer. The presented functional components of the system are divided into three levels of processing: (1) form-level processing includes the tasks of form registration and form removal; (2) field-level processing includes the tasks of field isolation, line trajectory reconstruction, and field segmentation; and (3) character-level processing includes character normalization, feature extraction, character classification, and dictionary-based postprocessing. The system contains a number of significant contributions to OCR technology, including an optimized probabilistic neural network (PNN) classifier that operates a factor of 20 times faster than traditional software implementations of the algorithm. Provided in the system are a host
of data structures and low-level utilities for computing spatial histograms, least-squares fitting, spatial zooming, connected components, Karhunen-Loève feature extraction, optimized PNN classification,
and dynamic string alignment. Any portion of this standard reference OCR system can be used in commercial products without restrictions.
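As a point of reference for the PNN classifier mentioned above, the following is a minimal NumPy sketch of a probabilistic neural network: a Gaussian Parzen-window density estimate per class, with the class of highest average kernel response winning. It is only a schematic illustration; the optimized NIST implementation and its smoothing parameter are not reproduced here.

```python
# Minimal sketch of a probabilistic neural network (PNN) classifier: a
# Parzen-window density estimate per class with a Gaussian kernel.
import numpy as np

def pnn_classify(train_x, train_y, test_x, sigma=1.0):
    """train_x: (n, d) prototype feature vectors (e.g., Karhunen-Loeve features),
    train_y: (n,) integer class labels, test_x: (m, d).  Returns (m,) labels."""
    classes = np.unique(train_y)
    # Squared distances from every test vector to every training prototype.
    d2 = ((test_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=-1)
    kernel = np.exp(-d2 / (2.0 * sigma ** 2))            # pattern-layer activations
    # Summation layer: average kernel response over each class's prototypes.
    scores = np.stack([kernel[:, train_y == c].mean(axis=1) for c in classes], axis=1)
    return classes[np.argmax(scores, axis=1)]             # decision layer
```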
A new technique for intelligent form removal has been developed along with a new method for evaluating its impact on optical character recognition (OCR). All the dominant lines in the image are automatically detected using the Hough line transform and intelligently erased while simultaneously preserving overlapping character strokes by computing line width statistics and keying off certain visual cues. This new method of form removal operates on loosely defined zones with no image deskewing. Any field in which the writer is provided a horizontal line to enter a response can be processed by this method. Several examples of processed fields are provided, including a comparison of results between the new method and a commercially available forms removal package. Even if this new form removal method did not improve character recognition accuracy, it would still be a significant improvement to the technology because the requirement of a priori knowledge of the form's geometric details has been greatly reduced. This relaxes the recognition system's dependence on rigid form design, printing, and reproduction by automatically detecting and removing some of the physical structures (lines) on the form. Using the National Institute of Standards and Technology (NIST) public domain form-based handprint recognition system, the technique was tested on a large number of fields containing randomly ordered handprinted lowercase alphabets, as these letters (especially those with descenders) frequently touch and extend through the line along which they are written. Preserving character strokes improves overall lowercase recognition performance by 3%, which is a net improvement, but a single performance number like this does not communicate how the recognition process was really influenced. There are expected to be trade-offs with the introduction of any new technique into a complex recognition system. To understand both the improvements and the trade-offs, a new analysis was designed to compare the statistical distributions of individual confusion pairs between two systems. As OCR technology continues to improve, sophisticated analyses like this are necessary to reduce the errors remaining in complex recognition problems.
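Below is a rough sketch of the line-detection-and-erasure idea, using OpenCV's probabilistic Hough transform and a simple run-length heuristic to spare character strokes that cross a line. The cv2 parameters, the fixed line_thickness, and the 2x run-length cutoff are placeholder assumptions; the actual NIST method relies on line width statistics and visual cues not modeled here.

```python
# Illustrative sketch of Hough-based line removal with a crude stroke-
# preservation heuristic.  Thresholds are placeholders, not NIST parameters.
import cv2
import numpy as np

def remove_lines(binary, line_thickness=3):
    """binary: uint8 image with ink = 255 on a black background."""
    lines = cv2.HoughLinesP(binary, rho=1, theta=np.pi / 180, threshold=200,
                            minLineLength=binary.shape[1] // 3, maxLineGap=5)
    cleaned = binary.copy()
    if lines is None:
        return cleaned
    for x1, y1, x2, y2 in lines[:, 0]:
        if x2 < x1:                       # ensure left-to-right traversal
            x1, y1, x2, y2 = x2, y2, x1, y1
        for x in range(x1, x2 + 1):
            y = int(round(y1 + (y2 - y1) * (x - x1) / max(x2 - x1, 1)))
            # Measure the vertical ink run through (x, y); a run much taller
            # than the line is assumed to be a character stroke and is kept.
            top, bottom = y, y
            while top > 0 and cleaned[top - 1, x]:
                top -= 1
            while bottom < cleaned.shape[0] - 1 and cleaned[bottom + 1, x]:
                bottom += 1
            if bottom - top + 1 <= 2 * line_thickness:
                cleaned[top:bottom + 1, x] = 0
    return cleaned
```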
A public domain document processing system has been developed by the National Institute of Standards and Technology (NIST). The system is a standard reference form-based handprint recognition system for evaluating optical character recognition (OCR), and it is intended to provide a baseline of performance on an open application. The system's source code, training data, performance assessment tools, and type of forms processed are all publicly available. The system recognizes the handprint entered on handwriting sample forms like the ones distributed with NIST Special Database 1. From these forms, the system reads hand-printed numeric fields, upper and lowercase alphabetic fields, and unconstrained text paragraphs comprised of words from a limited-size dictionary. The modular design of the system makes it useful for component evaluation and comparison, training and testing set validation, and multiple system voting schemes. The system contains a number of significant contributions to OCR technology, including an optimized probabilistic neural network (PNN) classifier that operates a factor of 20 times faster than traditional software implementations of the algorithm. The source code for the recognition system is written in C and is organized into 11 libraries. In all, there are approximately 19,000 lines of code supporting more than 550 subroutines. Source code is provided for form registration, form removal, field isolation, field segmentation, character normalization, feature extraction, character classification, and dictionary-based postprocessing. The recognition system has been successfully compiled and tested on a host of UNIX workstations. This paper gives an overview of the recognition system's software architecture, including descriptions of the various system components along with timing and accuracy statistics.
A word recognition system has been developed at NIST to read free-formatted text paragraphs containing handprinted characters. The system has been developed and tested using binary images containing 2,100 different writers' printings of the Preamble to the U.S. Constitution. Each writer was asked to print these sentences in an empty 70 mm by 175 mm box. The Constitution box contains no guidelines for the placement and spacing of the handprinted text, nor are there guidelines to instruct the writer where to stop printing one line and to begin the next. While the layout of the handprint in these paragraphs is unconstrained, a limited-size lexicon may be applied to reduce the complexity of the recognition application. The system's four components have been combined into an end-to-end hybrid system that executes across a UNIX file server and a massively parallel SIMD computer. The recognition system achieves a word error rate of 49% across all 2,100 printings of the Preamble (109,096 words). This performance is achieved with a neural network character classifier that has a substitution error rate of 14% on its 22,823 training patterns.
NIST needed a large set of segmented characters for use as a test set for the First Census Optical Character Recognition (OCR) Systems Conference. A machine-assisted human classification system was developed to expedite the process. The testing set consists of 58,000 digits and 10,000 upper and lower case characters entered on forms by high school students and is distributed as Testdata 1. A machine system was able to recognize a majority of the characters but all system decisions required human verification. The NIST recognition system was augmented with human verification to produce the testing database. This augmented system consists of several parts: the recognition system, a checking pass, a correcting pass, and a clean-up pass. The recognition system was developed at NIST. The checking pass verifies that an image is in the correct class. The correcting pass allows classes to be changed. The clean-up pass forces the system to stabilize by making all images accepted with verified classifications or rejected. In developing the testing set we discovered that segmented characters can be ambiguous even without context bias. This ambiguity can be caused by over-segmentation or by the way a person writes. For instance, it is possible to create four ambiguous characters to represent all ten digits. This means that a quoted accuracy rate for a set of segmented characters is meaningless without reference to human performance on the same set of characters. This is different from the case of isolated fields where most of the ambiguity can be overcome by using context which is available in the non-segmented image. For instance, in the First Census OCR Conference, one system achieved a forced decision error rate for digits of 1.6% while 21 other systems achieved error rates of 3.2% to 5.1%. This statement cannot be evaluated until human performance on the same set of characters presented one at a time without context has been measured.
Two reject mechanisms are compared using a massively parallel character recognition system implemented at NIST. The recognition system was designed to study the feasibility of automatically recognizing hand-printed text in a loosely constrained environment. The first method is a simple scalar threshold on the output activation of the winning neurode from the character classifier network. The second method uses an additional neural network trained on all outputs from the character classifier network to accept or reject assigned classifications. The neural network rejection method was expected to perform with greater accuracy than the scalar threshold method, but this was not supported by the test results presented. The scalar threshold method, even though arbitrary, is shown to be a viable reject mechanism for use with neural network character classifiers. Upon studying the performance of the neural network rejection method, analyses show that the two neural networks, the character classifier network and the rejection network, perform very similarly. This can be explained by the strong non-linear function of the character classifier network which effectively removes most of the correlation between character accuracy and all activations other than the winning activation. This suggests that any effective rejection network must receive information from the system which has not been filtered through the non-linear classifier.
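For concreteness, the scalar-threshold reject mechanism amounts to a one-line decision rule on the winning activation. The sketch below illustrates it with an arbitrary cutoff of 0.80, which is an assumed value rather than the threshold used in the NIST experiments.

```python
# Minimal sketch of the scalar-threshold reject mechanism: a classification
# is accepted only if the winning output activation clears a fixed cutoff.
import numpy as np

def classify_with_reject(activations, threshold=0.80):
    """activations: (n_chars, n_classes) network outputs.
    Returns predicted class indices, with -1 marking rejected characters."""
    winners = activations.argmax(axis=1)
    confidence = activations.max(axis=1)
    return np.where(confidence >= threshold, winners, -1)

# Example: the second character falls below the cutoff and is rejected.
acts = np.array([[0.05, 0.92, 0.03],
                 [0.40, 0.35, 0.25]])
print(classify_with_reject(acts))   # -> [ 1 -1]
```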
NIST has developed a massively parallel hand-print recognition system that allows components to be interchanged. Using this system, three different character segmentation algorithms have been developed and studied. They are blob coloring, histogramming, and a hybrid of the two. The blob coloring method uses connected components to isolate characters. The histogramming method locates linear spaces, which may be slanted, to segment characters. The hybrid method is an augmented histogramming method that incorporates statistically adaptive rules to decide when a histogrammed item is too large and applies blob coloring to further segment the difficult item. The hardware configuration is a serial host computer with a 1024 processor SIMD machine attached to it. The data used in this comparison is `NIST Special Database 1' which contains 2100 forms from different writers where each form contains 130 digit characters distributed across 28 fields. This gives a potential 273,000 characters to be segmented. Running the massively parallel system across the 2100 forms, blob coloring required 2.1 seconds per form with an accuracy of 97.5%, histogramming required 14.4 seconds with an accuracy of 95.3%, and the hybrid method required 13.2 seconds with an accuracy of 95.4%. The results of this comparison show that the blob coloring method on a SIMD architecture is superior.
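The blob coloring approach is essentially connected-component labeling followed by left-to-right ordering of the resulting bounding boxes. The sketch below shows that idea with scipy.ndimage on a single field image; it is a serial illustration, not the SIMD implementation benchmarked above.

```python
# Illustrative sketch of blob-coloring segmentation via connected components.
import numpy as np
from scipy import ndimage

def blob_segment(field_image):
    """field_image: 2-D binary array (1 = ink).
    Returns character sub-images ordered left to right."""
    labeled, n = ndimage.label(field_image)      # "color" each blob
    boxes = ndimage.find_objects(labeled)
    # Order blobs by the left edge of their bounding boxes.
    order = sorted(range(n), key=lambda k: boxes[k][1].start)
    return [field_image[boxes[k]] for k in order]
```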
A massively parallel character recognition system has been implemented. The system is designed to study the feasibility of the recognition of handprinted text in a loosely constrained environment. The NIST handprint database, NIST Special Database 1, is used to provide test data for the recognition system. The system consists of eight functional components. The loading of the image into the system and storing the recognition results from the system are I/O components. In between are components responsible for image processing and recognition. The first image processing component is responsible for image correction for scale and rotation, data field isolation, and character data location within each field; the second performs character segmentation; and the third does character normalization. Three recognition components are responsible for feature extraction and character reconstruction, neural network-based character recognition, and low-confidence classification rejection. The image processing to load and isolate 34 fields on a scientific workstation takes 900 seconds. The same processing takes only 11 seconds using a massively parallel array processor. The image processing components, including the time to load the image data, use 94% of the system time. The segmentation time is 15 ms/character and segmentation accuracy is 89% for handprinted digits and alphas. Character recognition accuracy for medium quality machine print is 99.8%. On handprinted digits, the recognition accuracy is 96% and recognition speeds of 10,100 characters/second can be realized. The limiting factor in the recognition portion of the system is feature extraction, which occurs at 806 characters/second. Through the use of a massively parallel machine and neural recognition algorithms, significant improvements in both accuracy and speed have been achieved, making this technology effective as a replacement for key data entry in existing data capture systems.
Developers of large-scale document processing and image recognition systems are in need of a dynamically robust character segmentation component. Without this essential module, potential turn-key products will remain in the laboratory indefinitely. An experiment in evolving a biologically based neural image processing system that has the ability to isolate characters within an unstructured text image is presented. In this study, organisms are simulated using a genetic algorithm with the goal of learning the intelligent behavior required for locating and consuming text image characters. Each artificial life-form is defined by a genotype containing a list of interdependent control parameters which contribute to specific functions of the organism. Control functions include vision, consumption, and movement. Using asexual reproduction in conjunction with random mutation, a domain independent solution for text segmentation is sought. For this experiment, an organism's vision system utilizes a rectangular receptor field with signals accumulated using Gabor functions. The optimal subset of Gabor kernel functions for conducting character segmentation is determined through the process of evolution. From the results, two analyses are presented. A study of performance over evolved generations shows that qualifiers for the natural selection of dominant organisms increased 62%. The second analysis visually compares and discusses the variations of dominant genotypes from the first generation to the uniform genotypes resulting from the final generation.
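To make the evolutionary mechanics concrete, here is a schematic generation loop with elitist selection, asexual reproduction, and Gaussian mutation over a genotype of control parameters. The genome length, population size, mutation rate, and the fitness callable are all stand-ins; the Gabor-based vision and consumption behaviors of the simulated organisms are not modeled.

```python
# Schematic sketch of the evolutionary loop described above: asexual
# reproduction with random mutation over genotypes of control parameters.
# The fitness function is a placeholder for "characters consumed correctly".
import random

def evolve(fitness, genome_len=12, pop_size=40, generations=100,
           survivors=8, mutation_rate=0.1):
    population = [[random.uniform(0, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Natural selection: keep the dominant organisms of this generation.
        population.sort(key=fitness, reverse=True)
        parents = population[:survivors]
        # Asexual reproduction: each child is a mutated copy of one parent.
        children = []
        while len(children) < pop_size - survivors:
            child = list(random.choice(parents))
            for i in range(genome_len):
                if random.random() < mutation_rate:
                    child[i] = min(1.0, max(0.0, child[i] + random.gauss(0, 0.1)))
            children.append(child)
        population = parents + children
    return max(population, key=fitness)
```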