Presentation video retrieval using automatically recovered slide and spoken text
7 March 2013
Abstract
Video is becoming a prevalent medium for e-learning. Lecture videos contain text information in both the presentation slides and the lecturer's speech. This paper examines the relative utility of text automatically recovered from these two sources for lecture video retrieval. To extract the visual information, we automatically detect slides within the videos and apply optical character recognition to obtain their text. Similarly, we use automatic speech recognition to extract spoken text from the recorded audio. We perform controlled experiments with manually created ground truth for both the slide and spoken text from more than 60 hours of lecture video. We compare the automatically extracted slide and spoken text in terms of accuracy relative to ground truth, overlap with one another, and utility for video retrieval. Results reveal that automatically recovered slide text and spoken text contain different content with varying error profiles. Experiments demonstrate that automatically extracted slide text enables higher-precision video retrieval than automatically recovered spoken text.
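The two text-level comparisons named above (accuracy relative to ground truth, and overlap between the slide and spoken text) can be illustrated with a minimal sketch. The abstract does not state which measures the paper uses, so the choices here are assumptions: word error rate as the accuracy measure and Jaccard similarity of the two vocabularies as the overlap measure. The function names `tokenize`, `word_error_rate`, and `vocabulary_overlap` are hypothetical and not taken from the paper.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length.

    Assumption: this is a common way to score OCR or ASR output against
    a manual transcript; the paper may use a different accuracy measure.
    """
    ref, hyp = tokenize(reference), tokenize(hypothesis)
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def vocabulary_overlap(text_a, text_b):
    """Jaccard similarity between the sets of distinct words in two texts.

    Assumption: a stand-in for the slide/speech overlap analysis, which
    the abstract mentions without defining.
    """
    vocab_a, vocab_b = set(tokenize(text_a)), set(tokenize(text_b))
    union = vocab_a | vocab_b
    return len(vocab_a & vocab_b) / len(union) if union else 0.0
```

In use, `word_error_rate` would be called once per source with the manual transcript as the reference and the OCR or ASR output as the hypothesis, and `vocabulary_overlap` would be called on the slide text and spoken text of the same video. Both functions ignore case and punctuation, which is a simplification.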
© (2013) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Matthew Cooper, "Presentation video retrieval using automatically recovered slide and spoken text", Proc. SPIE 8667, Multimedia Content and Mobile Devices, 86670E (7 March 2013); doi: 10.1117/12.2008433; https://doi.org/10.1117/12.2008433
PROCEEDINGS
7 PAGES

