Paper
14 August 2002 Automated document analysis system
Jeffrey D. Black, Robert Dietzel, David Hartnett
Author Affiliations +
Abstract
A software application has been developed to aid law enforcement and government intelligence gathering organizations in the translation and analysis of foreign language documents with potential intelligence content. The Automated Document Analysis System (ADAS) provides the capability to search (data or text mine) documents in English and the most commonly encountered foreign languages, including Arabic. Hardcopy documents are scanned by a high-speed scanner and are optical character recognized (OCR). Documents obtained in an electronic format bypass the OCR and are copied directly to a working directory. For translation and analysis, the script and the language of the documents are first determined. If the document is not in English, the document is machine translated to English. The documents are searched for keywords and key features in either the native language or translated English. The user can quickly review the document to determine if it has any intelligence content and whether detailed, verbatim human translation is required. The documents and document content are cataloged for potential future analysis. The system allows non-linguists to evaluate foreign language documents and allows for the quick analysis of a large quantity of documents. All document processing can be performed manually or automatically on a single document or a batch of documents.
© (2002) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Jeffrey D. Black, Robert Dietzel, and David Hartnett "Automated document analysis system", Proc. SPIE 4708, Sensors, and Command, Control, Communications, and Intelligence (C3I) Technologies for Homeland Defense and Law Enforcement, (14 August 2002); https://doi.org/10.1117/12.479296
Advertisement
Advertisement
RIGHTS & PERMISSIONS
Get copyright permission  Get copyright permission on Copyright Marketplace
KEYWORDS
Optical character recognition

Image processing

Human-machine interfaces

Commercial off the shelf technology

Mining

Process control

Data mining

Back to Top