Video quality assessment plays an important role in video processing and communication applications. We propose a full-reference video quality metric that combines a content-weighted spatial pooling strategy with a temporal pooling strategy. All pixels in a frame are classified into edge, texture, and smooth regions, and the structural similarity (SSIM) index map is divided into increasing and saturated regions according to the curve of its SSIM values; a content-weighting method is then applied to the increasing regions to obtain a score for each frame. Finally, a temporal pooling method combines the frame scores into an overall video quality score. Experimental results on the LIVE and IVP video quality databases show that the proposed method matches subjective scores well.
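The abstract does not specify the temporal pooling method. As a hedged illustration, the sketch below pools per-frame SSIM scores with a common worst-case strategy (averaging the lowest-scoring fraction of frames); the `worst_fraction` parameter and the example frame scores are assumptions, not the paper's values.

```python
import numpy as np

def temporal_pool(frame_scores, worst_fraction=0.2):
    """Pool per-frame quality scores into a single video score.

    This is only one common strategy (average of the worst frames,
    since viewers are sensitive to the poorest moments); the paper's
    actual temporal pooling method is not specified here.
    """
    scores = np.sort(np.asarray(frame_scores, dtype=float))
    k = max(1, int(len(scores) * worst_fraction))
    return scores[:k].mean()

# Ten hypothetical frame-level SSIM scores for one video.
frame_scores = [0.92, 0.95, 0.90, 0.60, 0.88, 0.93, 0.65, 0.91, 0.94, 0.89]
video_score = temporal_pool(frame_scores)  # averages the two worst frames
```

With `worst_fraction=0.2`, the two lowest frame scores (0.60 and 0.65) dominate the video score, reflecting the intuition that brief quality drops are highly visible.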
Most current state-of-the-art blind image quality assessment (IQA) algorithms require a training or learning process. Here, we have developed a completely blind IQA model that uses features derived from an image’s contourlet transform and singular-value decomposition. The model is used to build algorithms that can predict image quality without any training or any prior knowledge of the images or their distortions. The new method consists of three steps: first, the contourlet transform is applied to the image to obtain detailed high-frequency structural information; second, the singular values of the resulting “structural image” are computed; and finally, two new universal blind IQA indices are constructed from the area and slope of the truncated singular-value curves of the “structural image.” Experimental results on three open databases show that the proposed algorithms deliver quality predictions that correlate highly with human subjective judgments and are highly competitive with the state-of-the-art.
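The three steps above can be sketched as follows. Note the heavy assumption: the paper extracts the "structural image" with a contourlet transform, which has no standard library implementation, so this sketch substitutes a simple Laplacian high-pass filter as a stand-in; the truncation fraction `keep` is also an illustrative choice, not the paper's.

```python
import numpy as np
from scipy import ndimage

def svd_curve_indices(image, keep=0.25):
    """Two blind-IQA-style indices from the truncated singular-value
    curve of a high-frequency "structural image".

    NOTE: the described method uses a contourlet transform; as a
    stand-in this sketch uses a Laplacian high-pass filter. `keep`
    (fraction of singular values retained) is an assumed parameter.
    """
    structural = ndimage.laplace(image.astype(float))
    s = np.linalg.svd(structural, compute_uv=False)  # descending order
    k = max(2, int(len(s) * keep))
    s = s[:k] / (s[0] + 1e-12)        # normalized truncated curve
    area = s.sum()                    # index 1: area under the curve
    slope = (s[-1] - s[0]) / (k - 1)  # index 2: average slope
    return area, slope

rng = np.random.default_rng(0)
area, slope = svd_curve_indices(rng.random((64, 64)))
```

Intuitively, distortions that flatten or steepen the singular-value spectrum of the structural image shift both the area and the slope, which is why the two indices can serve as quality predictors without training.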
We propose a universal no-reference (NR) image quality assessment (QA) index that does not require training on human opinion scores. The new index utilizes perceptually relevant image features extracted from the distorted image. These include the mean phase congruency (PC) of the image, the entropy of the PC image, the entropy of the distorted image, and the mean gradient magnitude of the distorted image. Image quality prediction is accomplished by using a simple functional relationship of these features. The experimental results show that the new index accords closely with human subjective judgments of diverse distorted images.
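Two of the four features above (image entropy and mean gradient magnitude) can be computed with standard tools, as sketched below. The phase-congruency features are omitted here because PC requires a dedicated filter bank, and the "simple functional relationship" combining the features is not specified in the abstract, so this sketch stops at feature extraction.

```python
import numpy as np
from scipy import ndimage

def nr_features(image):
    """Two of the perceptual features used by the index: Shannon
    entropy of the grey-level histogram and mean gradient magnitude.
    (The phase-congruency features are not computed in this sketch.)
    Assumes `image` has intensities in [0, 1].
    """
    img = image.astype(float)
    # Shannon entropy of the 256-bin grey-level histogram.
    hist, _ = np.histogram(img, bins=256, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]
    entropy = -(p * np.log2(p)).sum()
    # Mean gradient magnitude via Sobel filters.
    gx = ndimage.sobel(img, axis=1)
    gy = ndimage.sobel(img, axis=0)
    grad_mean = np.hypot(gx, gy).mean()
    return entropy, grad_mean

rng = np.random.default_rng(0)
entropy, grad_mean = nr_features(rng.random((64, 64)))
```

Both features fall as an image is blurred (fewer distinct grey levels, weaker gradients), which is the kind of monotone behaviour a simple functional combination can exploit for quality prediction.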
Objective image and video quality measures play important roles in numerous image and video processing applications. In this work, we propose a new content-weighted method for full-reference (FR) video quality assessment using a three-component image model. Using the idea that different image regions have different perceptual significance relative to quality, we deploy a model that classifies local image regions according to their gradient properties, then apply variable weights to structural similarity (SSIM) index [and peak signal-to-noise ratio (PSNR)] scores according to region. A frame-based video quality assessment algorithm is thereby derived. Experimental results on the Video Quality Experts Group (VQEG) FR-TV Phase 1 test dataset show that the proposed algorithm outperforms existing video quality assessment methods.
The assessment of image quality is very important for numerous image processing applications, where the goal of
image quality assessment (IQA) algorithms is to automatically assess the quality of images in a manner that is consistent
with human visual judgment. Two prominent examples, the Structural Similarity (SSIM) index and Multi-scale
Structural Similarity (MS-SSIM), operate under the assumption that human visual perception is highly adapted for
extracting structural information from a scene. Results in large human studies have shown that these quality indices
perform very well relative to other methods. However, SSIM and other IQA algorithms are less
effective when used to rate blurred and noisy images. We address this defect by considering a three-component
image model, leading to the development of modified versions of SSIM and MS-SSIM, which we call three component
SSIM (3-SSIM) and three component MS-SSIM (3-MS-SSIM).
A three-component image model was proposed by Ran and Farvardin, wherein an image was decomposed into
edges, textures and smooth regions. Different image regions have different importance for visual perception; accordingly, we
apply different weights to the SSIM scores according to the region in which they are computed. Four steps are executed:
(1) Calculate the SSIM (or MS-SSIM) map. (2) Segment the original (reference) image into three categories of regions
(edges, textures and smooth regions). Edge regions are found where a gradient magnitude estimate is large, while smooth
regions are determined where the gradient magnitude estimate is small. Textured regions are taken to fall between these
two thresholds. (3) Apply non-uniform weights to the SSIM (or MS-SSIM) values over the three regions. The weight for
edge regions was fixed at 0.5, for textured regions it was fixed at 0.25, and at 0.25 for smooth regions. (4) Pool the
weighted SSIM (or MS-SSIM) values, typically by taking their weighted average, thus defining a single quality index for
the image (3-SSIM or 3-MS-SSIM).
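The four steps above can be sketched as follows, assuming greyscale images with intensities in [0, 1]. The SSIM map here uses a simple 7×7 box window rather than the Gaussian window of the original SSIM, and the gradient thresholds separating edge, texture, and smooth regions are illustrative fractions of the maximum gradient, not the paper's values; the region weights (0.5, 0.25, 0.25) are as stated above.

```python
import numpy as np
from scipy import ndimage

def ssim_map(ref, dist, c1=0.01**2, c2=0.03**2):
    """Local SSIM map with a 7x7 box window (a common simplification
    of the Gaussian window used in the original SSIM)."""
    def f(x):
        return ndimage.uniform_filter(x, size=7)
    mu_r, mu_d = f(ref), f(dist)
    var_r = f(ref * ref) - mu_r**2
    var_d = f(dist * dist) - mu_d**2
    cov = f(ref * dist) - mu_r * mu_d
    return ((2 * mu_r * mu_d + c1) * (2 * cov + c2)) / (
        (mu_r**2 + mu_d**2 + c1) * (var_r + var_d + c2))

def three_ssim(ref, dist, w=(0.5, 0.25, 0.25)):
    """3-SSIM-style region-weighted pooling of the SSIM map.
    The gradient thresholds below are illustrative assumptions."""
    s = ssim_map(ref, dist)
    # Step 2: segment the reference image by gradient magnitude.
    gx = ndimage.sobel(ref, axis=1)
    gy = ndimage.sobel(ref, axis=0)
    g = np.hypot(gx, gy)
    t_high, t_low = 0.12 * g.max(), 0.06 * g.max()
    edge, smooth = g > t_high, g < t_low
    texture = ~edge & ~smooth
    # Steps 3-4: weight per region, then pool by weighted average.
    regions = (edge, texture, smooth)
    num = sum(wi * s[m].sum() for wi, m in zip(w, regions))
    den = sum(wi * m.sum() for wi, m in zip(w, regions))
    return num / den

rng = np.random.default_rng(1)
ref = rng.random((64, 64))
noisy = np.clip(ref + 0.2 * rng.standard_normal((64, 64)), 0.0, 1.0)
score = three_ssim(ref, noisy)
```

For an undistorted image `three_ssim(ref, ref)` returns 1, and any added noise or blur lowers the score, with degradations on edge pixels penalized twice as heavily as those in textured or smooth regions.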
Our experimental results show that 3-SSIM (and 3-MS-SSIM) yield quality judgments consistent with human subjective
ratings of blurred and noisy images, and also deliver better performance than SSIM (and MS-SSIM) on
five types of distorted images from the LIVE Image Quality Assessment Database.