Environments for the delivery and consumption of multimedia are often very heterogeneous, due to the use of various
terminals in varying network conditions. One example of such an environment is a wireless network providing
connectivity to a plethora of mobile devices. H.264/AVC Scalable Video Coding (SVC) can be utilized to deal with
diverse usage environments. However, in order to optimally tailor scalable video content along the temporal, spatial, or
perceptual quality axes, a quality metric is needed that reliably models subjective quality. The major contribution of this
paper is the development of a novel quality metric for scalable video bit streams having a low spatial resolution,
targeting consumption in wireless video applications. The proposed quality metric allows modeling the temporal, spatial,
and perceptual quality characteristics of SVC bit streams. This is realized by taking into account several properties of the
compressed bit streams, such as the temporal and spatial variation of the video content, the frame rate, and PSNR values.
An extensive set of subjective experiments was conducted to construct our quality
metric and to verify its reliability. The experimental results show that the proposed quality metric reliably reflects subjective quality.
Moreover, the performance of the quality metric is uniformly high for video sequences with different temporal and spatial characteristics.
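The abstract above names the ingredients of the metric (PSNR, frame rate, and the temporal and spatial variation of the content) without giving its formula. The following is a minimal illustrative sketch of a metric of that general shape; the functional form and every weight in it are assumptions for illustration, not the paper's model.

```python
import math

def scalable_quality(psnr_db, frame_rate, temporal_var, spatial_var,
                     full_rate=30.0):
    """Hypothetical quality score in [0, 1] combining the factors the
    metric is built from: PSNR, frame rate, and content variation.
    The weighting below is illustrative, not the paper's formula."""
    # Map PSNR (roughly 20-50 dB) onto [0, 1] with a logistic curve.
    q_snr = 1.0 / (1.0 + math.exp(-(psnr_db - 35.0) / 3.0))
    # Penalise low frame rates more for high-motion (high temporal
    # variation) content, as subjective studies typically show.
    motion_weight = 1.0 + temporal_var
    q_temp = (frame_rate / full_rate) ** (0.3 * motion_weight)
    # Spatially detailed content is assumed more sensitive to PSNR loss.
    detail_weight = 0.5 + 0.5 * spatial_var
    return q_snr ** detail_weight * q_temp

# Halving the frame rate hurts a high-motion clip more than a static one:
low_motion = scalable_quality(38.0, 15.0, temporal_var=0.1, spatial_var=0.5)
high_motion = scalable_quality(38.0, 15.0, temporal_var=0.9, spatial_var=0.5)
```

The key property such a model must reproduce is the interaction between frame rate and motion: at the same PSNR and frame rate, the high-motion clip scores lower.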
Scalable Video Coding (SVC) is a promising technique for ensuring Quality of Service (QoS) in multimedia
communication over heterogeneous networks. SVC compresses raw video into multiple bitstreams, composed of a
base bitstream and enhancement bitstreams, to support multiple forms of scalability: SNR, temporal, and spatial. It is
therefore possible to extract an appropriate bitstream from the original coded bitstream, without re-encoding, to adapt a video to the user's
environment. In this flexible environment, QoS has emerged as an important issue for service acceptability, and
there is consequently a need to measure video quality in order to guarantee the quality of video streaming services.
Existing studies on video quality metrics have mainly focused on temporal and SNR scalability.
In this paper, we propose an efficient quality metric that accounts for spatial scalability as well as temporal and SNR
scalability. To this end, we study the effects of frame rate, SNR, spatial scalability, and motion characteristics through
subjective quality assessment, and then propose a new video quality metric supporting full scalability. Experimental
results show that this quality metric correlates highly with subjective quality. Because the proposed metric can
measure video quality as the scalability parameters vary, it can play an important role at the extraction
point in determining the quality of SVC bit streams.
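The role of such a metric at the extraction point can be sketched as a small selection routine: among the candidate SVC operating points that fit a bitrate budget, pick the one the quality model predicts to be best. The field names and the toy quality model below are assumptions for illustration.

```python
def pick_operating_point(points, bitrate_budget, predict_quality):
    """Choose the SVC extraction point with the highest predicted
    quality that fits the bitrate budget. `points` is a list of dicts
    with hypothetical fields: 'bitrate', plus whatever the quality
    model needs (here frame_rate and psnr)."""
    feasible = [p for p in points if p["bitrate"] <= bitrate_budget]
    if not feasible:
        return None
    return max(feasible, key=predict_quality)

# Three hypothetical operating points along the temporal/SNR axes:
points = [
    {"bitrate": 128, "frame_rate": 7.5, "psnr": 30.0},
    {"bitrate": 256, "frame_rate": 15.0, "psnr": 33.0},
    {"bitrate": 512, "frame_rate": 30.0, "psnr": 36.0},
]
# Toy quality model: weighted sum of normalised frame rate and PSNR.
quality = lambda p: 0.4 * p["frame_rate"] / 30.0 + 0.6 * p["psnr"] / 40.0
best = pick_operating_point(points, bitrate_budget=300, predict_quality=quality)
```

With a 300 kbit/s budget the 512 kbit/s point is infeasible, so the routine settles on the 256 kbit/s point; swapping in a better quality model changes only `predict_quality`, not the selection logic.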
Quality is an essential factor in multimedia communication, especially in compression and adaptation. Quality metrics can be divided into three categories: within-modality quality, cross-modality quality, and multi-modality quality. Most research so far has focused on within-modality quality. Moreover, quality is normally considered only from the perceptual perspective. In practice, content may be drastically adapted, or even converted to another modality; in that case, quality should also be considered from the semantic perspective. In this work, we investigate multi-modality quality from the semantic perspective. To model semantic quality, we apply the concept of a "conceptual graph", which consists of semantic nodes and the relations between them. As a typical multi-modality example, we focus on an audiovisual streaming service. Specifically, we evaluate the amount of information conveyed by audiovisual content in which both the video and audio channels may be strongly degraded, and the audio may even be converted to text. In the experiments, we also consider a perceptual quality model of audiovisual content, in order to examine how it differs from the semantic quality model.
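Since the conceptual-graph idea in this abstract lends itself to a concrete reading, here is a minimal sketch of a semantic-quality score as the fraction of graph nodes and relations that survive adaptation. The graph encoding and the equal weighting are assumptions for illustration, not the paper's exact model.

```python
def semantic_quality(original, adapted):
    """Hypothetical semantic-quality score: the fraction of conceptual-
    graph nodes and relations of the original content that survive
    adaptation. A graph is a pair (nodes, relations), with relations
    given as (subject, predicate, object) triples."""
    o_nodes, o_rels = original
    a_nodes, a_rels = adapted
    kept_nodes = len(o_nodes & a_nodes) / len(o_nodes) if o_nodes else 1.0
    kept_rels = len(o_rels & a_rels) / len(o_rels) if o_rels else 1.0
    # Equal weighting of nodes and relations; an illustrative choice.
    return 0.5 * kept_nodes + 0.5 * kept_rels

video = ({"goalkeeper", "ball", "goal"},
         {("goalkeeper", "catches", "ball"), ("ball", "enters", "goal")})
# Audio converted to text: the commentary still names every concept
# but conveys only one of the two relations.
text = ({"goalkeeper", "ball", "goal"},
        {("ball", "enters", "goal")})
score = semantic_quality(video, text)  # 0.5 * 1.0 + 0.5 * 0.5 = 0.75
```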
As color is increasingly used to carry visual information in multimedia content, the ability to perceive color plays a crucial role in accessing that information. Regardless of color vision variations, everyone should have equal access to visual information. This paper proposes an adaptation technique for color vision variations within MPEG-21 Digital Item Adaptation (DIA). DIA is performed separately for severe color vision deficiency (dichromats) and for mild color vision deficiency (anomalous trichromats), according to the description of user characteristics for color vision variations. The adapted images are tested with a simulation program for color vision variations, in order to assess how the adapted images appear to color-deficient vision. Experimental results show that the proposed adaptation technique works well in the MPEG-21 framework.
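As a rough intuition for what such an adaptation does, the toy sketch below re-injects the red-green contrast that a protanope (one class of dichromat) cannot perceive into the blue channel, where it remains visible. This is a deliberately simplified daltonization-style step, not the paper's MPEG-21 DIA technique; the 0.7 gain and plain RGB arithmetic are illustrative assumptions.

```python
def adapt_for_protanopia(rgb):
    """Toy daltonization step, illustrative only: estimate the
    red-green difference lost to a protanope and shift part of it
    into the blue channel. Channels are floats in [0, 1]."""
    r, g, b = rgb
    # Red-green opponent signal that is invisible to a protanope.
    lost = r - g
    # Re-inject part of the lost contrast along the blue-yellow axis,
    # clamping to the valid range.
    b2 = min(1.0, max(0.0, b + 0.7 * lost))
    return (r, g, b2)

red = adapt_for_protanopia((0.9, 0.1, 0.1))    # strong red gains blue
green = adapt_for_protanopia((0.1, 0.9, 0.1))  # strong green loses blue
```

After adaptation, the formerly confusable red and green patches differ strongly in the blue channel, which survives protanopic vision.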
We propose a semantic event detection method using MPEG-7. In the proposed method, the content description techniques of MPEG-7 are adopted in the detection algorithm to extract, represent, reuse, and interoperate low-level features. We also use multiple descriptors to improve efficiency. In this paper, shots and key frames provide cues for semantic event detection through predefined inference. Each shot is assigned a semantic meaning using MPEG-7 descriptors together with an example image or image sequence, and events are then detected by segmenting the shots.
We propose a technique for generating MPEG-7 descriptors from compressed image/video data. Image processing in the transform domain has recently become an area of considerable interest, because compressed image and video data in formats such as MPEG and JPEG are widely available. In general, processing in the transform domain requires smaller data quantities and lower computational complexity than processing in the spatial domain. In this paper, we propose an algorithm for generating MPEG-7 metadata in the compressed domain; in particular, we have developed an algorithm that computes the homogeneous texture descriptor directly in the compressed domain.
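To make the compressed-domain idea concrete, the sketch below derives a crude texture feature straight from the DCT coefficients of an 8x8 block: mean absolute AC energy in three frequency bands. This is a stand-in for the general approach of building an MPEG-7-style texture descriptor from transform coefficients, not the actual homogeneous texture descriptor; the band boundaries are illustrative.

```python
def dct_band_energies(dct_block):
    """Illustrative compressed-domain texture feature: mean absolute
    AC coefficient magnitude of an 8x8 DCT block in three frequency
    bands (low, mid, high), grouped by u + v."""
    bands = [[], [], []]
    for u in range(8):
        for v in range(8):
            if u == 0 and v == 0:
                continue  # skip the DC coefficient
            band = 0 if u + v <= 3 else (1 if u + v <= 7 else 2)
            bands[band].append(abs(dct_block[u][v]))
    return [sum(b) / len(b) for b in bands]

# A block whose energy sits in the low frequencies (smooth texture):
smooth = [[0.0] * 8 for _ in range(8)]
smooth[0][1], smooth[1][0], smooth[1][1] = 40.0, 35.0, 20.0
low, mid, high = dct_band_energies(smooth)
```

Because the coefficients come straight from the entropy-decoded bitstream, no inverse transform to the spatial domain is needed, which is the source of the complexity saving the abstract mentions.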
We propose a content-based summary generation method using MPEG-7 metadata. In this paper, the important events of a video are defined, and shot boundary detection is carried out. We then analyze the video content within each shot using multiple content features described by multiple MPEG-7 descriptors. In experiments with a golf video, we combined motion activity, edge histogram, and homogeneous texture descriptors for event detection. Further, the extracted segments and key frames of each event are described in an XML document. Experimental results show that the proposed method yields reliable summary generation with robust event detection.
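The pipeline above starts from shot boundary detection. A minimal sketch of that first step, assuming normalised per-frame feature histograms (e.g. color or edge histograms) and an illustrative L1-distance threshold, looks like this; the threshold value and distance measure are assumptions, not the paper's settings.

```python
def shot_boundaries(frame_hists, threshold=0.5):
    """Minimal shot-boundary detector: flag a boundary wherever the
    L1 distance between consecutive normalised frame histograms
    exceeds the threshold."""
    cuts = []
    for i in range(1, len(frame_hists)):
        d = sum(abs(a - b) for a, b in zip(frame_hists[i - 1],
                                           frame_hists[i]))
        if d > threshold:
            cuts.append(i)  # shot change occurs at frame i
    return cuts

# Two similar frames of one shot, then an abrupt change to a new shot:
hists = [
    [0.70, 0.20, 0.10],
    [0.68, 0.22, 0.10],
    [0.10, 0.20, 0.70],
]
cuts = shot_boundaries(hists)  # boundary detected at frame 2
```

Each detected segment would then be scored against the event definitions using the per-shot MPEG-7 features, and the surviving segments and key frames serialized to XML.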