Most efficient objective image and video quality metrics are based on properties and models of the Human Visual System (HVS). This paper addresses two major difficulties that arise when such HVS properties are used in metrics operating in the DWT domain: subband decomposition and the masking effect. The multi-channel behavior of the HVS can be emulated by applying a perceptual subband decomposition. Ideally, this decomposition is performed in the Fourier domain, but its computational cost is too high for many applications. A spatial transform such as the DWT is a good alternative that reduces the computational effort, but the correspondence between the perceptual subbands and the usual wavelet subbands is not straightforward. The advantages and limitations of the DWT are discussed and compared with models based on the DFT. Visual masking is a sensitive issue, and several models exist in the literature. The simplest models can only predict visibility thresholds for very simple cues, whereas natural images call for more complex approaches such as entropy masking. The main issue lies in finding a revealing measure of the surround influences and an adaptation: should one use the spatial activity, the entropy, the type of texture, etc.? In this paper, different visual masking models using the DWT are discussed and compared.
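To make the two ingredients concrete, the following is a minimal sketch of a DWT-domain error measure with a masking term. It is an illustration under stated assumptions, not one of the models compared in the paper: the wavelet is a one-level orthonormal Haar transform, the masking rule is a simple power-law threshold elevation driven by the magnitude of the reference coefficient, and the names `haar_dwt2`, `masked_threshold`, `masked_visible_error` and the parameters `base_threshold`, `exponent` and `beta` are all hypothetical.

```python
import numpy as np

def haar_dwt2(img):
    """One-level 2-D orthonormal Haar DWT; returns (LL, LH, HL, HH)."""
    x = np.asarray(img, dtype=float)
    if x.shape[0] % 2 or x.shape[1] % 2:
        raise ValueError("image sides must be even")
    # Horizontal filtering: combine pairs of adjacent columns.
    lo = (x[:, 0::2] + x[:, 1::2]) / np.sqrt(2)
    hi = (x[:, 0::2] - x[:, 1::2]) / np.sqrt(2)
    # Vertical filtering: combine pairs of adjacent rows.
    ll = (lo[0::2, :] + lo[1::2, :]) / np.sqrt(2)  # low-pass in both directions
    lh = (lo[0::2, :] - lo[1::2, :]) / np.sqrt(2)  # horizontal low, vertical high
    hl = (hi[0::2, :] + hi[1::2, :]) / np.sqrt(2)  # horizontal high, vertical low
    hh = (hi[0::2, :] - hi[1::2, :]) / np.sqrt(2)  # high-pass in both directions
    return ll, lh, hl, hh

def masked_threshold(subband, base_threshold=1.0, exponent=0.6):
    """Assumed toy masking rule: the visibility threshold equals
    base_threshold for weak coefficients and rises as a power law
    of the coefficient magnitude for strong ones."""
    ratio = np.abs(subband) / base_threshold
    return base_threshold * np.maximum(1.0, ratio) ** exponent

def masked_visible_error(ref, dist, base_threshold=1.0, exponent=0.6, beta=2.0):
    """Minkowski pooling of threshold-normalized errors over the detail subbands."""
    ref_bands = haar_dwt2(ref)[1:]    # skip LL, keep LH, HL, HH
    dist_bands = haar_dwt2(dist)[1:]
    ratios = [np.abs(d - r) / masked_threshold(r, base_threshold, exponent)
              for r, d in zip(ref_bands, dist_bands)]
    pooled = np.concatenate([q.ravel() for q in ratios])
    return float(np.mean(pooled ** beta) ** (1.0 / beta))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    flat = np.full((16, 16), 128.0)
    texture = flat + rng.normal(0.0, 40.0, flat.shape)
    noise = rng.normal(0.0, 2.0, flat.shape)
    # The same noise is scored as less visible on the textured image.
    print(masked_visible_error(flat, flat + noise))
    print(masked_visible_error(texture, texture + noise))
```

In this sketch the surround influence is reduced to the magnitude of the co-located reference coefficient. Replacing `masked_threshold` with a neighborhood activity or entropy measure would correspond to the more elaborate masking models mentioned above.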
Given the significant constraints of subjective quality assessment, objective image quality assessment has recently been studied extensively. Objective metrics are usually of three kinds: Full Reference (FR), Reduced Reference (RR), or No Reference (NR). We focus here on a technique that recently appeared in the quality assessment context: the data-hiding-based image quality metric. In terms of the amount of data to be transmitted for quality assessment, watermarking-based techniques are considered pseudo no-reference metrics: the embedded watermark adds only a small overhead to the image. Unlike most existing techniques, the proposed embedding method exploits an advanced perceptual model to optimize both data embedding and extraction. A perceptually weighted watermark is embedded into the host image, and evaluating this watermark makes it possible to assess the host image's quality. In this context, the watermark robustness is crucial: it must be sufficiently robust to be detected after very strong distortions, but also sufficiently fragile to be degraded along with the host image. In other words, the watermark distortion must be proportional to the image's distortion. Our work is compared with existing standard RR and NR metrics in terms of both correlation with subjective assessment and the data overhead induced by the mark.
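The principle of a data-hiding-based quality metric can be sketched as follows. This is not the proposed perceptually weighted embedding: as a stand-in, the sketch uses plain quantization index modulation (QIM) in the pixel domain with a uniform step, and it measures watermark degradation as a bit error rate. The function names, the `key` seed and the `step` value are assumptions made for illustration only.

```python
import numpy as np

def key_bits(key, shape):
    """Pseudo-random watermark bits that sender and receiver derive from a shared key."""
    return np.random.default_rng(key).integers(0, 2, size=shape)

def embed_qim(img, bits, step=8.0):
    """Move each sample to the nearest value step * (2k + bit), k an integer."""
    x = np.asarray(img, dtype=float)
    return 2.0 * step * np.round((x - step * bits) / (2.0 * step)) + step * bits

def extract_qim(img, step=8.0):
    """Recover each bit as the parity of the nearest multiple of step."""
    x = np.asarray(img, dtype=float)
    return np.mod(np.round(x / step), 2).astype(int)

def watermark_ber(received, key, step=8.0):
    """Bit error rate of the extracted mark; only the key is needed,
    not the original image."""
    expected = key_bits(key, np.shape(received))
    return float(np.mean(extract_qim(received, step) != expected))

if __name__ == "__main__":
    host = np.random.default_rng(1).uniform(0.0, 255.0, (64, 64))
    marked = embed_qim(host, key_bits(7, host.shape))
    for sigma in (0.0, 2.0, 6.0):
        noisy = marked + np.random.default_rng(2).normal(0.0, sigma, host.shape)
        # The error rate of the mark grows with the strength of the distortion.
        print(sigma, watermark_ber(noisy, key=7))
```

The `step` parameter controls the trade-off described above: a larger step makes the mark survive stronger distortions but degrades the host more, while a smaller step makes the mark more fragile. A perceptual model would, in effect, choose this strength locally rather than uniformly.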