Microcalcification detection in mammograms is an effective method for identifying early-stage breast tumors. In particular, computer-aided diagnosis (CAD) improves the performance of radiologists and doctors by offering efficient microcalcification detection. In this paper, we propose a microcalcification detection system consisting of three modules: coarse detection, clustering, and fine detection. The coarse detection module finds candidate pixels in the entire mammogram that are suspected to be part of a microcalcification; it extracts two median-contrast features and two contrast-to-noise-ratio features and classifies the candidate pixels with a linear-kernel SVM. The clustering module then groups the candidate pixels into regions of interest (ROIs) using a region growing algorithm. The fine detection module decides whether each clustered region is a microcalcification: eleven features, including distribution, variance, gradient, and various edge components, are extracted from the ROIs and fed into a radial-basis-function SVM classifier. To verify the effectiveness of the proposed system, experiments are performed on full-field digital mammograms (FFDM), and its detection performance is compared with that of an ANN-based detection system.
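The per-pixel coarse detection step can be sketched in simplified form. The window sizes, linear weights, and bias below are illustrative assumptions standing in for the trained linear-kernel SVM; the sketch only shows how median-contrast features feed a linear decision for each pixel.

```python
import numpy as np

def median_contrast_features(image, sizes=(9, 15)):
    """Per-pixel median-contrast features: pixel value minus the median of a
    surrounding window, computed for two window sizes (sizes are assumptions)."""
    h, w = image.shape
    feats = np.zeros((h, w, len(sizes)))
    for k, s in enumerate(sizes):
        r = s // 2
        padded = np.pad(image, r, mode="edge")
        for i in range(h):
            for j in range(w):
                window = padded[i:i + s, j:j + s]
                feats[i, j, k] = image[i, j] - np.median(window)
    return feats

def coarse_detect(image, weights, bias):
    """Linear decision (stand-in for the trained linear-kernel SVM):
    a pixel is a candidate when w . x + b > 0."""
    feats = median_contrast_features(image)
    scores = feats @ weights + bias
    return scores > 0
```

A pixel much brighter than its local median background scores high on both features and is flagged as a candidate, which is the behavior the coarse detection module relies on before clustering.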
In a mobile consumption environment, users want not only to preview video content through highlights but also to consume attractive segments of a video rather than the whole video. Thus, a condensed representation that captures the whole video content and its structure is in demand. In this paper, we propose a video content authoring system that allows content authors to filter the video structure and to compose content and metadata efficiently and effectively. The proposed authoring system consists of two modules: a video analyzer and a metadata generator. The video analyzer detects shot boundaries and scenes and establishes temporal segmentation metadata, including shot and scene boundary information. The shot detection adopts adaptive thresholding with multiple windows of different sizes to segment the raw video into shots. The segmented shots are grouped and merged depending on the similarity between adjacent shots. To minimize the computation time of shot clustering, we use a span, defined as an aggregation of successive shots, as the unit of computation. The metadata generator allows authors to edit the video metadata in addition to the temporal segmentation metadata detected by the video analyzer. The video metadata supports hierarchical representation of individual shots and scenes.
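The adaptive-thresholding idea for shot detection can be illustrated with a minimal sketch. The window size and the standard-deviation multiplier below are assumptions; the sketch only shows the core principle of comparing each frame dissimilarity against a locally computed threshold rather than a global one.

```python
import numpy as np

def detect_shot_boundaries(frame_diffs, window=5, k=2.0):
    """Adaptive thresholding over a sliding window: frame i is declared a
    shot boundary when its dissimilarity exceeds the local mean by k local
    standard deviations (window size and k are illustrative assumptions)."""
    diffs = np.asarray(frame_diffs, dtype=float)
    boundaries = []
    for i in range(len(diffs)):
        lo, hi = max(0, i - window), min(len(diffs), i + window + 1)
        local = np.delete(diffs[lo:hi], i - lo)  # exclude the frame itself
        thresh = local.mean() + k * local.std()
        if diffs[i] > thresh:
            boundaries.append(i)
    return boundaries
```

Because the threshold adapts to the local dissimilarity statistics, a cut in a high-motion passage and a cut in a static passage are judged against different baselines, which is what fixed global thresholds get wrong.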
H.264/AVC Scalable Video Coding (SVC) is an emerging video coding standard developed by the Joint Video Team (JVT) that supports multiple scalability features. With these scalabilities, SVC video data can easily be adapted to the characteristics of heterogeneous networks and various devices. Furthermore, SVC achieves coding efficiency competitive with, or better than, single-layer H.264/AVC. Motion prediction at the level of Fine Grain Scalability (FGS) enhancement layers was proposed to improve coding efficiency, as was inter-layer motion prediction. However, removing the FGS enhancement layer used for inter-layer motion prediction causes significant visual errors due to encoder-decoder mismatches of motion vectors and MB modes. In this paper, we analyze these visual errors to identify their cause and propose a method for reducing them. Experimental results show that the proposed method allows SVC bitstreams to be decoded with reduced visual errors, even when the FGS enhancement layer used for inter-layer motion prediction is removed.
In this paper, we propose a storage format that binds digital broadcasts with related data such as TV-Anytime metadata, additional multimedia resources, and personal viewing history. The goal of the proposed format is to enable personalized content consumption after broadcast content has been recorded to storage devices such as HD-DVD and Blu-ray Disc. To achieve this, we adopt the MPEG-4 file format as a container and apply the Binary Format for Scenes (BIFS) to represent and render personal viewing history. In addition, TV-Anytime metadata is used to describe broadcasts and to refer to the additional multimedia resources, e.g., images, audio clips, and short video clips. To demonstrate the usefulness of the proposed format, we introduce an application scenario and test the format on it.
In this paper, we propose an archiving method for broadcasts on TV terminals, including set-top boxes (STBs) and personal video recorders (PVRs). Our goal is to effectively cluster and retrieve semantic video scenes obtained by real-time broadcasting content filtering, for re-use or transmission. For TV terminals, we generate new video archiving formats that combine broadcasting media resources with the related metadata and auxiliary media data. In addition, we implement an archiving system to decode and retrieve the media resources and metadata within the format. Experiments show that the proposed format makes it possible to retrieve or browse media data and metadata on the TV terminal effectively and that it is compatible with portable devices.
In a modern digital broadcasting environment, broadcasting content filtering provides a useful function: a TV viewer can find or store desired scenes from programs on multiple channels, even while watching a program on another channel. To achieve this filtering function in live broadcasts, real-time processing is essential. In this paper, a broadcasting content filtering algorithm is proposed, and the system requirements for real-time content filtering are analyzed. To achieve real-time content processing, a buffer control algorithm is proposed as well. The usefulness of broadcasting content filtering is demonstrated with experiments on a test-bed system.
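The role of buffer control between the live stream and the filtering stage can be sketched minimally. The capacity and the drop-oldest policy below are assumptions, not the paper's algorithm; the sketch only illustrates the real-time constraint that the filtering stage must never block the incoming broadcast.

```python
from collections import deque

class FilterBuffer:
    """Bounded buffer between the broadcast receiver (producer) and the
    content-filtering stage (consumer). Capacity and the drop-oldest policy
    are illustrative assumptions: when the filter falls behind, the oldest
    queued unit is discarded so the live stream is never stalled."""
    def __init__(self, capacity=64):
        self.capacity = capacity
        self.queue = deque()
        self.dropped = 0

    def push(self, unit):
        if len(self.queue) == self.capacity:
            self.queue.popleft()      # drop the oldest unit to stay real-time
            self.dropped += 1
        self.queue.append(unit)

    def pop(self):
        return self.queue.popleft() if self.queue else None
```

Counting dropped units (`dropped`) gives a direct measure of whether the chosen capacity meets the real-time requirement for a given filtering workload.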
We propose a video genre classification method using multimodal features. The proposed method is applied as preprocessing for automatic video summarization and for the retrieval and classification of broadcast video content. Through a statistical analysis of low-level and mid-level audio-visual features, the proposed method achieves good performance in classifying broadcast genres such as cartoon, drama, music video, news, and sports. In this paper, we adopt MPEG-7 audio-visual descriptors as multimodal features of video content and evaluate classification performance by feeding the features into a decision-tree-based classifier trained with CART. The experimental results show that the proposed method recognizes several broadcast video genres with high accuracy and that classification with multimodal features is superior to classification with unimodal features.
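The core CART training step, finding the split that minimizes Gini impurity, can be sketched for a single node. The toy feature matrix in the usage below is an assumption; a full tree would apply this search recursively to each child.

```python
import numpy as np

def gini(labels):
    """Gini impurity of a label array: 1 - sum of squared class proportions."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def best_split(X, y):
    """One CART step: exhaustively search (feature, threshold) pairs and
    return the split minimizing the weighted Gini impurity of the children.
    A full tree repeats this recursively; shown here as a single node."""
    n, d = X.shape
    best = (None, None, float("inf"))
    for f in range(d):
        for t in np.unique(X[:, f]):
            left, right = y[X[:, f] <= t], y[X[:, f] > t]
            if len(left) == 0 or len(right) == 0:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (f, t, score)
    return best
```

With MPEG-7 audio-visual descriptors as the columns of `X` and genre labels as `y`, repeated application of this split search yields the decision-tree genre classifier.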
We propose a robust video segmentation algorithm for video summarization. Exact shot boundary detection and segmentation of video into meaningful scenes are essential for automatic video summarization. In this paper, we present a shot boundary detection method using audio and visual features defined in MPEG-7, which provides a standard for multimedia description. By using a Hidden Markov Model classifier based on the statistics of the audio and visual features, exact shot boundaries are detected, and over-segmentation, a common problem in automatic video segmentation, is reduced.
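The HMM formulation can be sketched with a two-state model decoded by the Viterbi algorithm. The transition and emission probabilities below are illustrative assumptions; in the method described above they would be estimated from the statistics of the MPEG-7 audio-visual features, and the observations would be richer than the binary dissimilarities used here.

```python
import numpy as np

def viterbi_boundaries(obs, p_boundary=0.1):
    """Two-state HMM (0 = within-shot, 1 = boundary) decoded with Viterbi.
    Observations are quantized frame dissimilarities (0 = low, 1 = high).
    All probabilities are illustrative assumptions."""
    # P(next state | state): boundaries are rare and last one frame
    trans = np.array([[1 - p_boundary, p_boundary],
                      [0.99, 0.01]])
    # P(observation | state): high dissimilarity mostly at boundaries
    emit = np.array([[0.95, 0.05],   # within-shot: usually low dissimilarity
                     [0.10, 0.90]])  # boundary: usually high dissimilarity
    n = len(obs)
    logv = np.log(np.array([0.99, 0.01]) * emit[:, obs[0]])
    back = np.zeros((n, 2), dtype=int)
    for t in range(1, n):
        scores = logv[:, None] + np.log(trans) + np.log(emit[:, obs[t]])[None, :]
        back[t] = scores.argmax(axis=0)
        logv = scores.max(axis=0)
    path = [int(logv.argmax())]
    for t in range(n - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    path.reverse()
    return [t for t, s in enumerate(path) if s == 1]
```

Because the decoded state sequence must pay a transition cost to enter the boundary state, isolated noise in the dissimilarity signal is absorbed by the within-shot state, which is how the HMM reduces over-segmentation compared with per-frame thresholding.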
We propose a multi-agent platform for an interactive broadcasting system based on the MPEG-7 and TV-Anytime standards. In the system, an intelligent agent technique is adopted from FIPA, which provides software standards for interaction among heterogeneous agents. Using the Multimedia Description Scheme (MDS) of MPEG-7 and the TV-Anytime standards, interactive functions such as user preferences and audio/video content summaries are supported. In this paper, we propose a technique that uses multimodal features as well as multiple MPEG-7 features, and we evaluate MPEG-7-based video summarization and filtering on top of the intelligent agent platform.
We propose a technique for generating MPEG-7 descriptors from compressed image and video data. Image processing in the transform domain has recently attracted much interest because compressed image and video data are becoming widely available in formats such as MPEG and JPEG. In general, processing in the transform domain requires less data and lower computational complexity than processing in the spatial domain. In this paper, we propose an algorithm for generating MPEG-7 metadata in the compressed domain; specifically, we have developed an algorithm to extract the homogeneous texture descriptor in the compressed domain.
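The compressed-domain idea can be illustrated with a minimal sketch: texture-energy features are computed directly from the DCT coefficients that JPEG/MPEG already store, without an inverse transform. The band partition below is a simplified proxy of our own devising, not the MPEG-7 homogeneous texture descriptor itself (which is defined on a Gabor filter bank).

```python
import numpy as np

def dct2(block):
    """2D DCT-II of an NxN block (the transform used by JPEG/MPEG)."""
    n = block.shape[0]
    k = np.arange(n)
    basis = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    scale = np.full(n, np.sqrt(2 / n))
    scale[0] = np.sqrt(1 / n)
    c = scale[:, None] * basis
    return c @ block @ c.T

def band_energies(coeffs, n_bands=3):
    """Texture-energy features from DCT coefficient bands: AC coefficients
    are grouped by distance from DC into low/mid/high-frequency bands, and
    the mean squared magnitude of each band is returned. A simplified
    compressed-domain proxy for texture description, illustrating feature
    extraction from transform coefficients without full decoding."""
    n = coeffs.shape[0]
    i, j = np.indices(coeffs.shape)
    band = np.minimum(((i + j) * n_bands) // (2 * n - 1), n_bands - 1)
    band[0, 0] = -1  # exclude the DC coefficient (mean brightness)
    return [float(np.mean(coeffs[band == b] ** 2)) for b in range(n_bands)]
```

In a real decoder the `dct2` step is unnecessary, since the entropy decoder yields the DCT coefficients directly; that omission is exactly the computational saving of compressed-domain processing.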