We investigate the impact of overt visual attention and perceived interest on the prediction performance of image quality metrics. To this end, we performed two experiments to capture these mechanisms: an eye gaze tracking experiment and a region-of-interest selection experiment. Perceptual relevance maps were created from both experiments and integrated into the design of the image quality metrics. Correlation analysis shows that integrating these perceptual relevance maps does add value. We reveal that the improvement in prediction accuracy is not statistically different between fixation density maps from eye gaze tracking data and region-of-interest maps, thus indicating the robustness of different perceptual relevance maps for the performance gain of image quality metrics. Interestingly, however, we found that thresholding region-of-interest maps into binary maps significantly reduces the prediction performance gain for image quality metrics. We provide a detailed analysis and discussion of the results as well as the conceptual and methodological differences between capturing overt visual attention and perceived interest.
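As a rough illustration of what integrating a perceptual relevance map into a metric can look like, the sketch below pools a local quality map using the relevance map as per-pixel weights, and also thresholds the map into a binary one. This is a minimal sketch under stated assumptions, not the method used in the paper: the function names `weighted_pool` and `binarize`, the weighted-mean pooling rule, and the toy values are all hypothetical.

```python
def weighted_pool(quality_map, relevance_map):
    """Pool a local quality map into one score, weighting each pixel by
    its perceptual relevance (assumed pooling rule: weighted mean)."""
    num = 0.0
    den = 0.0
    count = 0
    total = 0.0
    for q_row, w_row in zip(quality_map, relevance_map):
        for q, w in zip(q_row, w_row):
            num += w * q
            den += w
            total += q
            count += 1
    if den == 0:
        # No relevance anywhere: fall back to the unweighted mean.
        return total / count
    return num / den


def binarize(relevance_map, threshold):
    """Threshold a graded relevance map into a binary (0/1) map."""
    return [[1.0 if w >= threshold else 0.0 for w in row]
            for row in relevance_map]


if __name__ == "__main__":
    # Toy 2x2 example (invented values): one low-quality pixel that also
    # carries most of the relevance.
    quality = [[0.9, 0.9], [0.9, 0.2]]
    relevance = [[0.1, 0.1], [0.2, 0.6]]
    print("graded map:", weighted_pool(quality, relevance))
    print("binary map:", weighted_pool(quality, binarize(relevance, 0.5)))
```

With a binary map, every pixel below the threshold is ignored entirely, so the graded information in the relevance map is lost.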
Performing psychophysical experiments to investigate lighting perception can be expensive and time-consuming
if complex lighting systems need to be implemented. In this paper, display-based experiments are explored
as a cost-effective and less time-consuming alternative to real-world experiments. The aim of this work is to
better understand the upper limit of prediction accuracy that can be achieved when presenting an image on a
display rather than the real-world scene. We compare the predictive value of photographs and physically-based
renderings on a number of perceptual lighting attributes. It is shown that the photographs convey statistically
the same lighting perception as in a real-world scenario. Initial renderings have an inferior performance, but are
shown to converge towards the performance of the photographs through iterative improvements.
In this paper, distortions caused by packet loss during video transmission are evaluated with respect to their
perceived annoyance. In this respect, the impact of visual saliency on the level of annoyance is of particular
interest, as regions and objects in a video frame are typically not of equal importance to the viewer. For this
purpose, gaze patterns from a task-free eye tracking experiment were utilised to identify salient regions in a
number of videos. Packet loss was then introduced into the bit stream such that the corresponding distortions
appear either in a salient region or in a non-salient region. A subjective experiment was then conducted in which
human observers rated the annoyance of the distortions in the videos. The outcomes show a strong tendency
for distortions in a salient region to be perceived as much more annoying than distortions in
a non-salient region. The saliency of the distorted image content was further found to have a larger impact on
the perceived annoyance than the distortion duration. The findings of this work are considered to be
of great use to improve prediction performance of video quality metrics in the context of transmission errors.
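The step of identifying salient regions from gaze patterns can be sketched as follows. This is only an illustration under assumptions not stated in the abstract: fixations are taken to be (x, y) pixel positions, the density map is built by summing isotropic Gaussians, and the names `fixation_density_map` and `region_mean` as well as the `sigma` parameter are hypothetical.

```python
import math


def fixation_density_map(fixations, width, height, sigma=1.0):
    """Accumulate one Gaussian per fixation and normalise the peak to 1."""
    fdm = [[0.0] * width for _ in range(height)]
    for fx, fy in fixations:
        for y in range(height):
            for x in range(width):
                d2 = (x - fx) ** 2 + (y - fy) ** 2
                fdm[y][x] += math.exp(-d2 / (2.0 * sigma ** 2))
    peak = max(max(row) for row in fdm)
    if peak > 0:
        fdm = [[v / peak for v in row] for row in fdm]
    return fdm


def region_mean(fdm, x0, y0, x1, y1):
    """Mean density inside the half-open rectangle [x0, x1) x [y0, y1)."""
    vals = [fdm[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    return sum(vals) / len(vals)


if __name__ == "__main__":
    # Toy fixations (invented) clustered in the top-left of an 8x8 frame.
    fdm = fixation_density_map([(1, 1), (2, 1), (1, 2)], 8, 8)
    print("top-left region:", region_mean(fdm, 0, 0, 4, 4))
    print("bottom-right region:", region_mean(fdm, 4, 4, 8, 8))
```

A region with a high mean density would then be a candidate salient region for placing distortions, and a region with a low mean density a candidate non-salient one.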
Images usually exhibit regions that particularly attract the viewer's attention. These regions are typically referred to as regions of interest (ROI), and the underlying phenomenon in the human visual system is known as visual attention (VA). In the context of image quality, one can expect that distortions occurring in the ROI are perceived as being more annoying than distortions in the background. However, VA is seldom taken into account in existing image quality metrics. In this work, we provide a VA framework to extend existing image quality metrics with a simple VA model. The performance of the framework is evaluated on three contemporary image quality metrics. We further consider the context of wireless imaging, where a broad range of artifacts can be observed. To facilitate the VA-based metric design, we conduct subjective experiments both to obtain a ground truth for the subjective quality of a set of test images and to identify ROI in the corresponding reference images. A methodology is further discussed to optimize the VA metrics with respect to quality prediction accuracy and generalization ability. It is shown that the quality prediction performance of the three considered metrics can be significantly improved by deploying the proposed framework.
Visual content typically exhibits regions that particularly attract the viewer's attention, usually referred to as
regions-of-interest (ROI). In the context of visual quality one may expect that distortions occurring in the ROI
are perceived as more annoying than distortions in the background (BG). This is especially true given that the
human visual system is highly space variant in sampling visual signals. However, this phenomenon of visual
attention is seldom taken into account in visual quality metric design. In this paper, we thus provide a
framework for incorporating visual attention into the design of an objective quality metric by means of region-based
segmentation of the image. To support the metric design, we conducted subjective experiments both to
quantify the subjective quality of a set of distorted images and to identify ROI in a set of reference images.
Multiobjective optimization is then applied to find the optimal weighting of the ROI and BG quality metrics. It
is shown that the ROI-based metric design increases the quality prediction performance of the considered
metric and also of two other contemporary quality metrics.
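The idea of weighting ROI and BG quality scores can be sketched as below. Note the substitution: the abstract describes multiobjective optimization, whereas this sketch uses a plain single-objective grid search that maximises the Pearson correlation with subjective scores. The convex combination `w * q_roi + (1 - w) * q_bg`, the function names, and the toy scores are all assumptions for illustration only.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient of two equal-length lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def combine(q_roi, q_bg, w):
    """Assumed combination rule: convex mix of ROI and BG quality."""
    return w * q_roi + (1.0 - w) * q_bg


def best_weight(roi_scores, bg_scores, subjective, steps=11):
    """Grid-search the ROI weight in [0, 1] that maximises correlation
    between the combined metric and the subjective scores."""
    best_w, best_r = None, -2.0
    for i in range(steps):
        w = i / (steps - 1)
        pred = [combine(r, b, w) for r, b in zip(roi_scores, bg_scores)]
        r = pearson(pred, subjective)
        if r > best_r:
            best_w, best_r = w, r
    return best_w, best_r


if __name__ == "__main__":
    # Invented toy data: per-image ROI and BG metric values, plus
    # hypothetical subjective scores for the same four images.
    roi = [0.2, 0.8, 0.5, 0.9]
    bg = [0.7, 0.3, 0.6, 0.4]
    mos = [0.35, 0.65, 0.53, 0.75]
    print(best_weight(roi, bg, mos))
```

A real design would also need to guard against overfitting the weight to one image set, which is one reason a multiobjective formulation (accuracy together with generalization) may be preferred over this single-objective search.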
Conference Committee Involvement (2)
Human Vision and Electronic Imaging XX
9 February 2015 | San Francisco, California, United States
Human Vision and Electronic Imaging XIX
3 February 2014 | San Francisco, California, United States