24 March 2014 Two-stage approach to keyword spotting in handwritten documents
Separating keywords from non-keywords is the main problem in keyword spotting systems, and it has traditionally been approached with simplistic methods such as thresholding of recognition scores. In this paper, we analyze this problem from a machine learning perspective and study several standard machine learning algorithms specifically in the context of non-keyword rejection. We propose a two-stage approach to keyword spotting and provide a theoretical analysis of the system's performance, which gives insight into how to design the classifier in order to maximize overall performance in terms of F-measure.
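For orientation, the baseline the abstract mentions (thresholding of recognition scores) and the F-measure it is evaluated with can be sketched as follows. This is a minimal illustration, not the two-stage method proposed in the paper; the function names, the toy scores, and the choice to pick the threshold by exhaustive search are all assumptions made for the example.

```python
def f_measure(truth, predicted):
    """F-measure (harmonic mean of precision and recall) for boolean labels."""
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are 0 or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def threshold_reject(scores, threshold):
    """Baseline rejection: accept a candidate as a keyword if its score is high enough."""
    return [s >= threshold for s in scores]


def best_threshold(scores, truth):
    """Pick the threshold (among observed scores) that maximizes F-measure."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: f_measure(truth, threshold_reject(scores, t)))


# Hypothetical recognition scores and ground-truth keyword labels.
scores = [0.9, 0.8, 0.4, 0.3]
truth = [True, True, False, False]
chosen = best_threshold(scores, truth)
```

In this toy data the two classes are perfectly separable by score, so a single threshold suffices; the abstract's point is that real recognition scores are not, which motivates a learned rejection stage.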
© (2014) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Mehdi Haji, Mohammad R. Ameri, Tien D. Bui, Ching Y. Suen, Dominique Ponson, "Two-stage approach to keyword spotting in handwritten documents", Proc. SPIE 9021, Document Recognition and Retrieval XXI, 90210P (24 March 2014); doi: 10.1117/12.2042265; https://doi.org/10.1117/12.2042265


