In this paper we present a novel image quality assessment technique for evaluating virtual synthesized views in the context of multi-view video. In particular, Free Viewpoint Videos are generated from uncompressed color views and their compressed associated depth maps by means of the View Synthesis Reference Software, provided by MPEG. Prior to the synthesis step, the original depth maps are encoded with different coding algorithms thus leading to the creation of additional artifacts in the synthesized views. The core of proposed wavelet-based metric is in the registration procedure performed to align the synthesized view and the original one, and in the skin detection that has been applied considering that the same distortion is more annoying if visible on human subjects rather than on other parts of the scene. The effectiveness of the metric is evaluated by analyzing the correlation of the scores obtained with the proposed metric with Mean Opinion Scores collected by means of subjective tests. The achieved results are also compared against those of well known objective quality metrics. The experimental results confirm the effectiveness of the proposed metric.
Multi-view Video plus Depth (MVD) data refer to a set of conventional color video sequences and an associated set of depth video sequences, all acquired at slightly different viewpoints. This huge amount of data necessitates a reliable compression method. However, there is no standardized compression method for MVD sequences. H.264/MVC compression method, which was standardized for Multi-View-Video representation (MVV), has been the subject of many adaptations to MVD. However, it has been shown that MVC is not well adapted to encode multi-view depth data. We propose a novel option as for compression of MVD data. Its main purpose is to preserve joint color/depth consistency. The originality of the proposed method relies on the use of the decoded color data as a prior for the associated depth compression. This is meant to ensure consistency in both types of data after decoding. Our strategy is motivated by previous studies of artifacts occurring in synthesized views: most annoying distortions are located around strong depth discontinuities and these distortions are due to misalignment of depth and color edges in decoded images. Thus the method is meant to preserve edges and to ensure cosistent localization of color edges and depth edges. To ensure compatibility, colored sequences are encoded with H.264. Depth maps compression is based on a 2D still image codec, namely LAR (locally adapted resolution). It consists in a quad-tree representation of the images. The quad-tree representation contributes in the preservation of edges in both color and depth data. The adopted strategy is meant to be more perceptually driven than state-of-the-art methods. The proposed approach is compared to H.264 encoding of depth images. Objective metrics scores are similar with H.264 and with the proposed method, and visual quality of synthesized views is improved with the proposed approach.
This paper considers the reliability of usual assessment methods when evaluating virtual synthesized views in the multiview
video context. Virtual views are generated from Depth Image Based Rendering (DIBR) algorithms. Because DIBR
algorithms involve geometric transformations, new types of artifacts come up. The question regards the ability of
commonly used methods to deal with such artifacts. This paper investigates how correlated usual metrics are to human
judgment. The experiments consist in assessing seven different view synthesis algorithms by subjective and objective
methods. Three different 3D video sequences are used in the tests. Resulting virtual synthesized sequences are assessed
through objective metrics and subjective protocols. Results show that usual objective metrics can fail assessing
synthesized views, in the sense of human judgment.