In picture quality assessment, the amount of distortion perceived by a human observer varies from one region to another according to the local content. This subjective perception can be explained and predicted by considering some simple psychovisual properties (masking) of the Human Visual System (HVS). We have implemented an HVS model based on a pyramid decomposition for extracting the spatial frequencies, associated with a multi-resolution motion representation. The visibility of the errors in the decoded pictures is then computed using Kelly's spatio-velocity contrast sensitivity model. The resulting data is called a 'Quality-map'. Special attention has been paid to temporal (motion) effects since, in video sequences, motion strongly influences subjective quality assessment. The quality of the motion information is therefore critical. In the second part, two possible uses of these psychovisual properties for improving MPEG video encoding performance are described: (1) Pre-processing of the pictures to remove non-visible information using motion-adapted filtering. This process is efficient in terms of bits saved, and the degradation is not significant, especially on consumer electronic TV sets. (2) A perceptual quantizer based on a local adaptation scheme that makes Quality-maps as uniform as possible (homogeneous perceived distortion) at constant bit-rate. Further improvements have been considered, especially for the case where the viewer is tracking a moving object in the scene.
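As an illustration of the visibility computation, the following minimal sketch evaluates the commonly cited closed form of Kelly's (1979) spatio-velocity contrast sensitivity fit. The abstract does not give the formula or its parameters, so the constants (6.1, 7.3, 3, 45.9) are taken from that published fit and are an assumption about what the model uses. The function names and the visibility rule (an error is visible when its contrast exceeds the reciprocal of the sensitivity) are hypothetical and chosen only for this example.

```python
import math


def kelly_csf(spatial_freq_cpd: float, velocity_deg_s: float) -> float:
    """Contrast sensitivity for a pattern of the given spatial frequency
    (cycles/degree) moving at the given retinal velocity (degrees/second).

    Assumed form (Kelly's 1979 fit, not stated in the abstract):
        S = [6.1 + 7.3 |log10(v / 3)|^3] * v * a^2 * exp(-2 a (v + 2) / 45.9)
    with a = 2 * pi * spatial frequency.
    """
    if spatial_freq_cpd <= 0 or velocity_deg_s <= 0:
        raise ValueError("frequency and velocity must be positive")
    a = 2.0 * math.pi * spatial_freq_cpd
    k = 6.1 + 7.3 * abs(math.log10(velocity_deg_s / 3.0)) ** 3
    return k * velocity_deg_s * a ** 2 * math.exp(
        -2.0 * a * (velocity_deg_s + 2.0) / 45.9
    )


def error_is_visible(error_contrast: float,
                     spatial_freq_cpd: float,
                     velocity_deg_s: float) -> bool:
    """Hypothetical rule: an error is visible when its contrast reaches the
    detection threshold, taken here as 1 / sensitivity."""
    return error_contrast * kelly_csf(spatial_freq_cpd, velocity_deg_s) >= 1.0
```

Under these assumptions, an error of 1% contrast at 1 cycle/degree moving at 3 degrees/second is flagged as visible, while the same contrast at 8 cycles/degree and 30 degrees/second is not. This matches the idea that fast-moving fine detail can be removed or quantized more coarsely.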
"Perceptually adapted MPEG video encoding", Proc. SPIE 3959, Human Vision and Electronic Imaging V, (2 June 2000); doi: 10.1117/12.387153; https://doi.org/10.1117/12.387153