31 July 2006 Automatic video caption detection and extraction in the DCT compressed domain
Proceedings Volume 5960, Visual Communications and Image Processing 2005; 59602N (2006) https://doi.org/10.1117/12.631588
Event: Visual Communications and Image Processing 2005, 2005, Beijing, China
The text in a video frame can help us understand the semantics of the video content directly. Although many approaches can automatically detect and localize text in a video, most of them use the original pixels of an image to find the text regions. In this paper, we present an approach to automatically localize captions in MPEG compressed videos. Caption regions are segmented from the background by using their distinguishing texture characteristics. Unlike previously published methods, which either fully decompress the video sequence before extracting the caption regions or extract text regions only in Intra- (I-) frames, our approach detects and localizes caption regions directly in the DCT compressed domain. Therefore, only a very small amount of decoding is required. Experiments show that a good caption detection rate can be obtained: the average recalls of Intra- and Inter-frame detection are 97.77% and 97.84%, respectively.
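The idea of separating caption blocks from background by texture in the DCT domain can be illustrated with a minimal sketch. The abstract does not say which texture features the authors use, so the feature here (the sum of absolute AC coefficients per 8x8 block) and the names `block_ac_energy`, `candidate_caption_blocks`, and `threshold` are assumptions made for illustration, not the paper's method. A real compressed-domain detector would read the coefficients from the MPEG bitstream; this sketch recomputes them from pixels only so that it is self-contained.

```python
import numpy as np

BLOCK = 8  # MPEG codes luminance in 8x8 DCT blocks


def dct_matrix(n=BLOCK):
    """Return the orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)  # frequency index
    x = np.arange(n).reshape(1, -1)  # sample index
    m = np.cos(np.pi * (2 * x + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    m[0, :] = np.sqrt(1.0 / n)
    return m


def block_ac_energy(frame):
    """Sum of |AC coefficients| for each 8x8 block of a grayscale frame.

    Assumed texture feature: blocks containing text tend to have strong
    high-frequency content, so their AC energy is large.
    """
    h, w = frame.shape
    h, w = h - h % BLOCK, w - w % BLOCK  # drop partial blocks at the edges
    blocks = (
        frame[:h, :w]
        .astype(np.float64)
        .reshape(h // BLOCK, BLOCK, w // BLOCK, BLOCK)
        .transpose(0, 2, 1, 3)  # shape: (block rows, block cols, 8, 8)
    )
    d = dct_matrix()
    coeffs = d @ blocks @ d.T  # 2-D DCT applied to every block
    total = np.abs(coeffs).sum(axis=(2, 3))
    return total - np.abs(coeffs[:, :, 0, 0])  # remove the DC term


def candidate_caption_blocks(frame, threshold):
    """Boolean mask of blocks whose AC energy exceeds a (hypothetical) threshold."""
    return block_ac_energy(frame) > threshold
```

The threshold would have to be tuned on real data, and a practical detector would also need to group neighbouring blocks into regions and handle Inter-coded frames, neither of which this sketch attempts.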
© (2006) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Chin-Fu Tsao, Yu-Hao Chen, Jin-Hau Kuo, Chia-wei Lin, Ja-Ling Wu, "Automatic video caption detection and extraction in the DCT compressed domain", Proc. SPIE 5960, Visual Communications and Image Processing 2005, 59602N (31 July 2006); doi: 10.1117/12.631588; https://doi.org/10.1117/12.631588


