Text in a video frame can help us understand the semantics of the video content directly. Although many approaches can automatically detect and localize text in a video, most of them operate on the original pixels of an image to find the text regions. In this paper, we present an approach to automatically localize captions in MPEG compressed videos. Caption regions are segmented from the background by using their distinguishing texture characteristics. Unlike previously published methods, which either fully decompress the video sequence before extracting the caption regions or extract text regions only in Intra (I-) frames, our approach detects and localizes caption regions directly in the DCT compressed domain. Therefore, only a very small amount of decoding is required. Experiments show that a good caption detection rate can be obtained: the average recalls of Intra- and Inter-frame detection are 97.77% and 97.84%, respectively.
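The idea of separating caption blocks from background by texture in the DCT domain can be illustrated with a minimal sketch. The abstract does not state which texture feature or threshold is used, so the AC-energy measure, the function names, and the `threshold` parameter below are all assumptions made for illustration only.

```python
def ac_energy(block):
    """Sum of absolute AC coefficients of one 8x8 DCT block.

    The DC term at block[0][0] is skipped because it encodes average
    brightness, not texture. (Assumed feature, not the paper's.)
    """
    total = 0.0
    for u in range(8):
        for v in range(8):
            if u == 0 and v == 0:
                continue
            total += abs(block[u][v])
    return total


def caption_candidates(dct_blocks, threshold):
    """Mark blocks whose AC energy exceeds a (hypothetical) threshold.

    dct_blocks is a 2D grid (list of rows) of 8x8 coefficient blocks.
    Returns a grid of 0/1 flags with the same layout.
    """
    return [[1 if ac_energy(b) > threshold else 0 for b in row]
            for row in dct_blocks]
```

Because the measure is computed from coefficients that are already present in the bitstream after entropy decoding, no inverse DCT is needed, which is consistent with the small decoding cost the abstract describes.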
Mathematical morphology provides a systematic approach to analyzing the geometric characteristics of signals or images,
and has been widely applied to tasks such as edge detection, object segmentation, and noise suppression. In this
paper, a supervised, morphology-based video segmentation system is proposed. To indicate where a semantic object resides,
the user clicks the mouse near the boundary of the object in the first frame of a video to give its rough
definition, shape, and location. The proposed system then automatically segments the first frame by first locating the
search area and then classifying the units in it into object and non-object parts to find the continuous contour
by means of a multi-valued watershed algorithm using a hierarchical queue. An adaptive morphological operator based
on edge strength, which is computed by a multi-scale morphological gradient algorithm, is proposed to reduce the error
introduced by user assistance so that the search area is created correctly. To extend the method to video object segmentation, a fast
video tracking technique is applied. Under the assumption of small motion, the object can be segmented in real time.
Moreover, an accuracy evaluation mechanism is proposed to ensure the robustness of the segmentation.
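As a rough illustration of the edge-strength computation mentioned above, the sketch below averages morphological gradients (dilation minus erosion) over several square structuring-element sizes, eroding each by the next smaller size. This is one common formulation of a multi-scale morphological gradient; the abstract does not give the exact operator, so the structuring elements, the `scales` parameter, and all function names are assumptions.

```python
def _window_op(img, r, op):
    """Apply op (min or max) over a (2r+1)x(2r+1) window clipped at borders."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - r), min(h, y + r + 1))
                    for i in range(max(0, x - r), min(w, x + r + 1))]
            out[y][x] = op(vals)
    return out


def dilate(img, r):
    return _window_op(img, r, max)


def erode(img, r):
    return _window_op(img, r, min)


def morph_gradient(img, r=1):
    """Single-scale morphological gradient: dilation minus erosion."""
    d, e = dilate(img, r), erode(img, r)
    return [[d[y][x] - e[y][x] for x in range(len(img[0]))]
            for y in range(len(img))]


def multiscale_gradient(img, scales=(1, 2, 3)):
    """Average of per-scale gradients, each eroded by radius r - 1."""
    h, w = len(img), len(img[0])
    acc = [[0.0] * w for _ in range(h)]
    for r in scales:
        g = erode(morph_gradient(img, r), r - 1)
        for y in range(h):
            for x in range(w):
                acc[y][x] += g[y][x]
    return [[v / len(scales) for v in row] for row in acc]
```

Larger scales respond to blurred edges, while the erosion step keeps the response from spreading too far from the true boundary; this is why a multi-scale gradient is often preferred as input to a watershed stage.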
KEYWORDS: Computer programming, Video, Scalable video coding, Visualization, Video coding, Distortion, Quantization, Matrices, Video processing, Signal to noise ratio
Universal scalability, which integrates different types of scalability and consequently provides a large scaling range for each parameter, is of great interest to applications in today's heterogeneous environments. In our previous work, we presented an MPEG-4 universal scalable codec based on a layered path-tree structure [1,2], in which a video layer and the coding order of two consecutive layers are interpreted as a node and a parent-to-child relationship in a path-tree, respectively. Since individual video layers can be coded separately using different coding tools in the MPEG-4 simple scalable profile (SSP) [3] and fine-granularity scalable profile (FGS) [4], the proposed scalable video coder may include spatial, temporal, and SNR enhancements simultaneously. In this paper, based on visual observations, we first address several encoding strategies for universal scalable coding, including the spatial-temporal quality tradeoff, region sensitivity, and frequency weighting. These strategies take the content characteristics into consideration and help determine better coding parameters. As a result, bit allocation becomes more sensitive to the perceptually important parts of the spatial, temporal, and SNR enhancements. Next, a batch encoding process is conducted to generate universal scalable streams automatically, with all of the above encoding strategies fully integrated. Preliminary experiments show that better visual quality can be obtained across the full bitrate range.
In this paper, we propose a content-based multimedia analysis/retrieval system based mainly on the features defined in MPEG-7. Some new and specific features are also included to support the various requirements of different applications. The proposed system is capable of extracting a variety of features from different kinds of media data, such as video, audio, and images. The system design is well modularized, so that system development and feature construction can be carried out independently. In other words, a new feature can be added without modifying the structure of the system. In addition, a user can input hints about what he or she is looking for, and the system will search the database and return the best-matching candidates. On the basis of this well-modularized system, several interesting applications have been investigated and realized in our lab to show the effectiveness and flexibility of the proposed work.
Due to the increasing amount of information carried by video, video analysis that segments a video into scenes or key-frames has become essential for efficient video indexing. In this paper, we propose a compressed-domain scene change detection and camera motion characterization algorithm. We believe that the most vital inherent information hidden in the MPEG bitstream, which can aid shot and sub-shot detection, is the motion vectors and the macroblock type statistics. We evaluate the results of scene change detection and camera motion characterization to obtain accurate shot and sub-shot locations.
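To make the role of macroblock type statistics concrete, the following sketch flags a P-frame as a likely scene change when most of its macroblocks are intra-coded, since prediction from the previous shot tends to fail at a cut. The abstract does not give the actual decision rule, so the per-frame record layout, the field names, and the `intra_ratio_threshold` value are hypothetical.

```python
def detect_scene_changes(frames, intra_ratio_threshold=0.6):
    """Return indices of P-frames whose intra-coded macroblock ratio is high.

    frames: list of dicts with keys 'type' ('I', 'P' or 'B'),
    'intra_mb' and 'total_mb' (assumed layout of parsed bitstream
    statistics). I-frames are skipped because all of their
    macroblocks are intra-coded by definition.
    """
    cuts = []
    for idx, frame in enumerate(frames):
        if frame["type"] != "P" or frame["total_mb"] == 0:
            continue
        ratio = frame["intra_mb"] / frame["total_mb"]
        if ratio > intra_ratio_threshold:
            cuts.append(idx)
    return cuts
```

A fuller detector would also examine the forward and backward prediction counts in B-frames and the motion vector field for camera motion, which this sketch leaves out.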