24 January 2011 Keyword and image-based retrieval of mathematical expressions
Richard Zanibbi, Bo Yuan
Proceedings Volume 7874, Document Recognition and Retrieval XVIII; 78740I (2011) https://doi.org/10.1117/12.873312
Event: IS&T/SPIE Electronic Imaging, 2011, San Francisco Airport, California, United States
Abstract
Two new methods for retrieving mathematical expressions using conventional keyword search and expression images are presented. An expression-level TF-IDF (term frequency-inverse document frequency) approach is used for keyword search, where queries and indexed expressions are represented by keywords taken from LaTeX strings. TF-IDF is computed at the level of individual expressions rather than documents to increase the precision of matching. The second retrieval technique is a form of Content-Based Image Retrieval (CBIR). Expressions are segmented into connected components, and then components in the query expression and each expression in the collection are matched using contour and density features, aspect ratios, and relative positions. In an experiment using ten randomly sampled queries from a corpus of over 22,000 expressions, precision-at-k (k = 20) for the keyword-based approach was higher (keyword: μ = 84.0, σ = 19.0; image-based: μ = 32.0, σ = 30.7), but for a few of the queries better results were obtained using a combination of the two techniques.
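The keyword-based method and the evaluation measure can be illustrated with a minimal Python sketch. It treats every expression as its own "document" when computing TF-IDF, ranks expressions by cosine similarity to the query, and computes precision-at-k as a fraction. The abstract does not specify the tokenizer, the IDF formula, or the similarity measure, so the regular-expression tokenizer, the smoothed IDF, the cosine ranking, and all function names here are assumptions made for illustration, not the authors' implementation.

```python
import math
import re
from collections import Counter


def tokenize(latex):
    """Split a LaTeX string into keywords (hypothetical tokenizer).

    Yields LaTeX commands (e.g. \\frac), runs of letters, runs of digits,
    and single symbols; grouping braces and whitespace are dropped.
    """
    return re.findall(r"\\[A-Za-z]+|[A-Za-z]+|\d+|[^\sA-Za-z0-9{}]", latex)


def weight(counts, idf):
    """Turn raw term counts into a unit-length TF-IDF vector (a dict)."""
    vec = {t: tf * idf[t] for t, tf in counts.items() if t in idf}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def build_index(expressions):
    """Index a collection where each expression is one 'document'."""
    docs = [Counter(tokenize(e)) for e in expressions]
    n = len(docs)
    # Expression frequency: number of expressions containing each term.
    df = Counter(t for d in docs for t in d)
    # Smoothed IDF (an assumption; the abstract gives no formula).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return idf, [weight(d, idf) for d in docs]


def search(query, idf, vectors, k=20):
    """Return indices of up to k expressions, ranked by cosine similarity."""
    q = weight(Counter(tokenize(query)), idf)
    scored = [
        (sum(w * v.get(t, 0.0) for t, w in q.items()), i)
        for i, v in enumerate(vectors)
    ]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [i for score, i in scored[:k] if score > 0]


def precision_at_k(ranked, relevant, k=20):
    """Fraction of the top-k ranked results that are judged relevant."""
    return sum(1 for i in ranked[:k] if i in relevant) / k
```

For example, after `idf, vectors = build_index(corpus)`, a call such as `search(r"\frac{a}{b}", idf, vectors)` returns the indices of the best-matching expressions in `corpus`, and `precision_at_k` scores that ranking against a set of indices judged relevant. Indexing per expression rather than per document means a rare symbol counts as rare across expressions, which is the granularity the abstract describes.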
© (2011) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Richard Zanibbi and Bo Yuan "Keyword and image-based retrieval of mathematical expressions", Proc. SPIE 7874, Document Recognition and Retrieval XVIII, 78740I (24 January 2011); https://doi.org/10.1117/12.873312
CITATIONS
Cited by 23 scholarly publications.
KEYWORDS
LaTeX
Visualization
Mathematics
Feature extraction
Image segmentation
Binary data
Content based image retrieval
