An established way of validating and testing new image quality assessment (IQA) algorithms has been to compare how
well they correlate with subjective data on various image databases. The most common approach is to calculate the
linear correlation coefficient (LCC) and Spearman’s rank order correlation coefficient (SROCC) against the subjective
mean opinion score (MOS). Recently, databases with multiply distorted images have emerged [1,2]. However, with
multidimensional stimuli, there is more disagreement between observers as the task is more preferential than that of
distortion detection. This reduces the number of statistically significant differences between image pairs. If the subjects cannot distinguish a
difference between some of the image pairs, should we demand any better performance from IQA algorithms? This paper
proposes alternative performance measures for the evaluation of IQA algorithms on the CID2013 database. One proposed
measure is the root-mean-square error (RMSE) of the subjective data as a function of the
number of observers. The other is the number of statistically significant differences between image
pairs. This study shows that beyond 12 subjects the RMSE saturates at around three, meaning that the target
RMSE for an IQA algorithm on the CID2013 database should be three. In addition, this study shows that the state-of-the-art IQA algorithms identified the better image of a pair with a probability of 0.85 when only the image pairs with
statistically significant differences were taken into account.
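The measures named above can be illustrated with a minimal sketch. It computes LCC and SROCC between algorithm predictions and MOS, and estimates how the RMSE of the subjective data shrinks as observers are added. The abstract does not say how the RMSE was defined, so the sketch assumes one plausible reading: the RMSE between the MOS of a random subset of n observers and the MOS of the full panel. The function names, the `ratings[observer][image]` layout, and the synthetic data are all hypothetical and are not taken from CID2013.

```python
import math
import random


def pearson(x, y):
    """Linear correlation coefficient (LCC) of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """SROCC: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


def subsample_rmse(ratings, n, trials=200, seed=0):
    """Mean RMSE between the MOS of n random observers and the full-panel MOS.

    Assumed definition (not stated in the abstract); ratings[o][i] is the
    score that observer o gave to image i.
    """
    rng = random.Random(seed)
    n_obs, n_img = len(ratings), len(ratings[0])
    full = [sum(r[i] for r in ratings) / n_obs for i in range(n_img)]
    total = 0.0
    for _ in range(trials):
        subset = rng.sample(ratings, n)
        sub = [sum(r[i] for r in subset) / n for i in range(n_img)]
        total += math.sqrt(sum((s - f) ** 2 for s, f in zip(sub, full)) / n_img)
    return total / trials


if __name__ == "__main__":
    rng = random.Random(1)
    true_quality = [20, 35, 50, 65, 80]  # synthetic scores, not real data
    ratings = [[q + rng.gauss(0, 10) for q in true_quality] for _ in range(24)]
    mos = [sum(r[i] for r in ratings) / len(ratings) for i in range(5)]
    predicted = [0.2, 0.3, 0.55, 0.6, 0.9]  # hypothetical IQA outputs
    print("LCC:", pearson(predicted, mos), "SROCC:", spearman(predicted, mos))
    for n in (2, 6, 12, 24):
        print(n, "observers -> RMSE", round(subsample_rmse(ratings, n), 2))
```

Under this assumed definition the RMSE falls to zero when the subset equals the full panel, so a saturation level such as the one reported in the abstract would have to come from a different reference, for example an independent group of observers.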
To understand the viewing strategies employed in a quality estimation task, we compared two visual tasks—quality estimation and difference estimation. The estimation was done for a pair of natural images having small global changes in quality. Two groups of observers estimated the same set of images, but with different instructions. One group estimated the difference in quality and the other the difference between image pairs. The results demonstrated the use of different visual strategies in the tasks. The quality estimation was found to include more visual planning during the first fixation than the difference estimation, but afterward needed only a few long fixations on the semantically important areas of the image. The difference estimation used many short fixations. Salient image areas were mainly attended to when these areas were also semantically important. The results support the hypothesis that these tasks’ general characteristics (evaluation time, number of fixations, area fixated on) show differences in processing, but also suggest that examining only single fixations when comparing tasks is too narrow a view. When planning a subjective experiment, one must remember that a small change in the instructions might lead to a noticeable change in viewing strategy.
The most common tasks in subjective image estimation are change detection (a detection task) and image quality
estimation (a preference task). We examined how the task influences the gaze behavior when comparing detection and
preference tasks. The eye movements of 16 naïve observers were recorded, with 8 observers in each task. The setting
was a flicker paradigm, in which the observers saw a non-manipulated image, a manipulated version of the image, and again
the non-manipulated image, and then estimated the difference they perceived between them. The stimuli were photographs
with different image distortions and contents. To examine the spatial distribution of fixations, we defined the regions of
interest using a memory task and calculated information entropy to estimate how concentrated the fixations were on the
image plane. The quality task was faster and needed fewer fixations, and its first eight fixations were more concentrated
on certain image areas than those in the change detection task. The bottom-up influences of the image also caused more variation
in gaze behavior in the quality estimation task than in the change detection task. The results show that quality
estimation is faster and emphasizes the regions of interest more on certain images, whereas change
detection is a scan task in which the whole image is always thoroughly examined. In conclusion, the choice of task deserves
careful consideration in subjective image estimation studies.
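The entropy measure mentioned above can be sketched as follows. The study defined its regions of interest with a memory task; this sketch substitutes a uniform grid over the image plane, which is an assumption made only for illustration, and the function name and parameters are hypothetical. Lower entropy means the fixations are concentrated in fewer cells.

```python
import math
from collections import Counter


def fixation_entropy(fixations, width, height, cols=4, rows=4):
    """Shannon entropy in bits of fixation counts over a cols x rows grid.

    fixations is a sequence of (x, y) pixel positions. The uniform grid is a
    stand-in for the regions of interest used in the study (assumption).
    """
    counts = Counter()
    for x, y in fixations:
        # Clamp so that points on the right or bottom edge fall in the last cell.
        col = min(int(x / width * cols), cols - 1)
        row = min(int(y / height * rows), rows - 1)
        counts[(row, col)] += 1
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    concentrated = [(100, 100), (110, 95), (105, 120), (98, 101)]
    scattered = [(100, 100), (700, 100), (100, 500), (700, 500)]
    print(fixation_entropy(concentrated, 800, 600))
    print(fixation_entropy(scattered, 800, 600))
```

With a 4 x 4 grid the value ranges from 0 bits (all fixations in one cell) to 4 bits (fixations spread evenly over all 16 cells), so tasks can be compared on a common scale as long as the grid is the same.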
A subjective quality rating does not reflect the properties of the image directly; it is the outcome of a quality decision-making
process, which includes the quantification of subjective quality experience. Such rich subjective content is often
ignored. We conducted two experiments (with 28 and 20 observers) to study the effect of paper grade on the image
quality experience of ink-jet prints. Image quality experience was studied using a grouping task and a quality rating
task. Both tasks included an interview, but in the latter task we examined the relations between different subjective attributes in
this experience. We found that the observers use an attribute hierarchy, in which the high-level attributes are more
experiential, general and abstract, while the low-level attributes are more detailed and concrete. This may reflect the
hierarchy of the human visual system. We also noticed that while the observers show variable subjective criteria for image quality (IQ),
the reliability of average subjective estimates is high: when two different observer groups estimated the same images in
the two experiments, correlations between the mean ratings were between .986 and .994, depending on the image
content.
This study presents a methodology for forming contextually valid scales for subjective video quality measurement. Any
single value of quality, e.g. the Mean Opinion Score (MOS), can have multiple underlying causes. Hence, this kind of
quality measure is not enough, for example, to describe the performance of a video capturing device. By applying
the Interpretation-Based Quality (IBQ) method as a qualitative/quantitative approach, we collected attributes that are familiar to the end user and extracted directly from the observers' comments. Based on these
findings we formed contextually valid assessment scales from the typically used quality attributes. A large set of data
was collected from 138 observers to generate the video quality vocabulary. Video material was shot with three types of
video cameras: digital video cameras (4), digital still cameras (9) and mobile phone cameras (9). From the quality
vocabulary, we formed 8 unipolar 11-point scales to gain better insight into video quality. Viewing conditions were adjusted
to meet the ITU-T Rec. P.910 requirements. It is suggested that the applied qualitative/quantitative approach is especially
efficient for finding image quality differences in video material where the quality variations are multidimensional in
nature and especially when image quality is rather high.
The subjective quality of an image is a non-linear product of several, simultaneously contributing subjective factors such
as the experienced naturalness, colorfulness, lightness, and clarity. We have studied subjective image quality by using a
hybrid qualitative/quantitative method in order to disclose the attributes relevant to experienced image quality. We describe
our approach to mapping the image quality attribute space in three cases: a still studio image, video clips of a talking head
and moving objects, and the use of image processing pipes for 15 still image contents. Naive observers participated in
three image quality research contexts in which they were asked to freely and spontaneously describe the quality of the
presented test images. Standard viewing conditions were used. The data show which attributes are most relevant for
each test context, and how they differentiate between the selected image contents and processing systems. The role of
non-HVS based image quality analysis is discussed.
Stereoscopic technologies have developed significantly in recent years. These advances also require a better understanding
of the experiential dimensions of stereoscopic contents. In this article we describe experiments in which we explore the
experiences that viewers have when they view stereoscopic contents. We used eight different contents that were shown
to the participants in a paired comparison experiment where the task of the participants was to compare the same content
in stereoscopic and non-stereoscopic form. The participants indicated their preference but were also interviewed about
the arguments they used when making the decision. By conducting a qualitative analysis of the interview texts, we
categorized the significant experiential factors related to viewing stereoscopic material. Our results indicate that reality-likeness
as well as artificiality were often used as arguments in comparing the stereoscopic materials. Also, there were
more emotional terms in the descriptions of the stereoscopic films, which might indicate that the stereoscopic projection
technique enhances the emotions conveyed by the film material. Finally, the participants indicated that the three-dimensional
material required longer presentation time, as there were more interesting details to see.
We present an effective method for comparing subjective audiovisual quality and the features related to the quality
changes of different video cameras. Both quantitative estimation of overall quality and qualitative description of critical
quality features are achieved by the method. The aim was to combine two image quality evaluation methods, the
quantitative Absolute Category Rating (ACR) method with hidden reference removal and the qualitative
Interpretation-Based Quality (IBQ) method, in order to see how they complement each other in audiovisual quality estimation tasks.
Twenty-six observers estimated the audiovisual quality of six different cameras, mainly mobile phone video cameras. In order to
achieve an efficient subjective estimation of audiovisual quality, only two contents with different quality requirements
were recorded with each camera. The results show that the subjectively important quality features were more related to
the overall estimations of cameras' visual video quality than to the features related to sound. The data demonstrated two
significant quality dimensions related to visual quality: darkness and sharpness. We conclude that the qualitative
methodology can also complement quantitative quality estimations with audiovisual material. The IBQ approach is
especially valuable when the induced quality changes are multidimensional.
Image evaluation schemes must fulfill both objective and subjective requirements. Objective image quality evaluation models are often preferred over subjective quality evaluation because of their speed and cost-effectiveness. However, the correlation between subjective and objective estimations is often poor. One of the key reasons for this is that it is not known what image features subjects use when they evaluate image quality. We have studied subjective image quality evaluation in the case of image sharpness. We used an Interpretation-Based Quality (IBQ) approach, which combines qualitative and quantitative approaches to probe the observer's quality experience. Here we examine how naive subjects experienced and classified natural images whose sharpness was varied. Together, the psychometric and qualitative information obtained makes it possible to relate the quantitative evaluation data to their underlying subjective attribute sets. This offers guidelines to product designers and developers who are responsible for image quality. Combining these methods makes the end-user experience approachable and offers new ways to improve objective image quality evaluation schemes.