In scalable video coding (SVC), video quality can be controlled by bit-stream extraction, that is, by extracting the network abstraction layer (NAL) units needed for the required quality. Because a scalable bit-stream is packed in NAL units, quality control of SVC video depends closely on the NAL unit extraction policy at a given bit-rate. An effective extraction policy is therefore required to provide SVC video with optimal quality. In this paper, an SVC bit-stream extraction method based on perceptual quality is proposed. The main goal of this work is to find the optimal extraction policy, at a given bit-rate, for SVC bit-streams that contain spatial, temporal, and SNR scalability. Also, because perceptual quality depends on video characteristics, the video segments are classified into different classes, namely action, crowd, dialog, scenery, and text&graphic. A subjective test on the classified video scenes reveals a consistent characteristic of perceptual quality preference. Based on this, a quality information table (QIT), which guides the bit-stream extraction process, has been determined for each class. The QIT is applied to SVC bit-stream extraction according to the class to which a video segment belongs. In the experiment, the proposed extraction scheme is applied to an SVC bit-stream belonging to the action class, using the QIT for the action class to maximize perceptual quality. The extracted video and the multi-dimensional scalability resulting from the proposed scheme are also described.
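The QIT-guided extraction described above can be illustrated with a minimal sketch. It assumes that a QIT lists, for each class, candidate operating points (spatial, temporal, and SNR layer) with a bit-rate and a perceptual score. The table values, the names `OperatingPoint`, `NalUnit`, `select_operating_point`, and `extract`, and the simple layer filter are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class OperatingPoint:
    dependency_id: int       # spatial layer
    temporal_id: int         # temporal layer
    quality_id: int          # SNR (quality) layer
    bitrate_kbps: float      # bit-rate needed for this point (invented value)
    perceptual_score: float  # subjective score (invented value)


@dataclass(frozen=True)
class NalUnit:
    dependency_id: int
    temporal_id: int
    quality_id: int
    payload: bytes = b""


# Hypothetical QIT: every number here is made up for illustration only.
QIT = {
    "action": [
        OperatingPoint(0, 0, 0, 100.0, 1.5),
        OperatingPoint(0, 2, 0, 180.0, 3.0),
        OperatingPoint(1, 1, 0, 300.0, 2.8),
        OperatingPoint(1, 2, 0, 420.0, 4.0),
        OperatingPoint(1, 2, 1, 600.0, 4.5),
    ],
}


def select_operating_point(video_class: str,
                           budget_kbps: float) -> Optional[OperatingPoint]:
    """Return the best-scoring point of the class that fits the budget."""
    feasible = [p for p in QIT[video_class] if p.bitrate_kbps <= budget_kbps]
    if not feasible:
        return None
    return max(feasible, key=lambda p: p.perceptual_score)


def extract(nal_units: Iterable[NalUnit],
            point: OperatingPoint) -> List[NalUnit]:
    """Keep NAL units at or below the chosen layers.

    Simplification: a real extractor must also respect inter-layer
    prediction dependencies, which this filter ignores.
    """
    return [
        n for n in nal_units
        if n.dependency_id <= point.dependency_id
        and n.temporal_id <= point.temporal_id
        and n.quality_id <= point.quality_id
    ]
```

With this invented table, `select_operating_point("action", 350.0)` returns the point with `dependency_id=0` and `temporal_id=2`, since it has the highest score among the points that fit within 350 kbps. Choosing by score rather than by highest bit-rate is the point of the sketch: the point that uses the most of the budget is not necessarily the one viewers prefer.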
Quality is an essential factor in multimedia communication, especially in compression and adaptation. Quality metrics can be divided into three categories: within-modality quality, cross-modality quality, and multi-modality quality. Most research has so far focused on within-modality quality. Moreover, quality is normally considered only from the perceptual perspective. In practice, content may be drastically adapted, or even converted to another modality. In this case, quality should be considered from the semantic perspective as well. In this work, we investigate multi-modality quality from the semantic perspective. To model semantic quality, we apply the concept of a "conceptual graph", which consists of semantic nodes and relations between the nodes. As a typical multi-modality example, we focus on an audiovisual streaming service. Specifically, we evaluate the amount of information conveyed by audiovisual content in which both the video and audio channels may be strongly degraded, and the audio may even be converted to text. In the experiments, we also consider a perceptual quality model of audiovisual content, so as to see how it differs from the semantic quality model.
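To make the conceptual-graph idea concrete, the following is a minimal sketch that assumes a graph can be represented as a set of concept labels plus a set of (source, relation, target) triples. The scoring rule, which is the fraction of original concepts and relations still present after adaptation, is an illustrative assumption and not the semantic quality model of the paper. The names `ConceptualGraph` and `conveyed_fraction` and the example content are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

# (source concept, relation label, target concept)
Relation = Tuple[str, str, str]


@dataclass(frozen=True)
class ConceptualGraph:
    concepts: FrozenSet[str]
    relations: FrozenSet[Relation]


def conveyed_fraction(original: ConceptualGraph,
                      adapted: ConceptualGraph) -> float:
    """Share of the original concepts and relations kept after adaptation.

    Assumed scoring rule for illustration: each concept and each relation
    counts equally.
    """
    total = len(original.concepts) + len(original.relations)
    if total == 0:
        return 1.0  # nothing to convey, so nothing is lost
    kept = (len(original.concepts & adapted.concepts)
            + len(original.relations & adapted.relations))
    return kept / total


# Hypothetical example: the audio is converted to text and the video is
# degraded, so the concept "team" and one relation are lost.
original = ConceptualGraph(
    concepts=frozenset({"reporter", "goal", "team"}),
    relations=frozenset({
        ("reporter", "announces", "goal"),
        ("team", "scores", "goal"),
    }),
)
adapted = ConceptualGraph(
    concepts=frozenset({"reporter", "goal"}),
    relations=frozenset({("reporter", "announces", "goal")}),
)
score = conveyed_fraction(original, adapted)
```

Here three of the five original elements survive, so `score` is 0.6. A measure of this kind is independent of perceptual fidelity: heavily degraded content can still score high if its concepts and relations remain recognizable.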