In recent years, there have been efforts to define the statistical properties of aesthetic photographs and artworks using
computer vision techniques. However, it is still an open question how to distinguish aesthetic from non-aesthetic images
with a high recognition rate. This is possibly because aesthetic perception is also influenced by a large number of cultural
variables. Nevertheless, the search for statistical properties of aesthetic images has not been futile. For example, we have
shown that the radially averaged power spectrum of monochrome artworks of Western and Eastern provenance falls off
according to a power law with increasing spatial frequency (1/f² characteristics). This finding implies that this particular
subset of artworks possesses a Fourier power spectrum that is self-similar across different scales of spatial resolution.
Other types of aesthetic images, such as cartoons, comics and mangas also display this type of self-similarity, as do
photographs of complex natural scenes. Since the human visual system is adapted to encode images of natural scenes in a
particularly efficient way, we have argued that artists imitate these statistics in their artworks. In support of this notion, we
presented results showing that artists portray human faces with the self-similar Fourier statistics of complex natural scenes,
although real-world photographs of faces are not self-similar.
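The 1/f² behaviour described above can be checked numerically by fitting a line to the radially averaged power spectrum in log-log coordinates; a slope near -2 corresponds to the power law. The following is only a minimal sketch using NumPy, not the analysis pipeline of the cited work. The function names, the ring-binning by rounded radius, and the synthetic test image are illustrative assumptions.

```python
import numpy as np

def radial_power_spectrum(image):
    """Return (frequencies, radially averaged power) for a square 2-D array."""
    img = np.asarray(image, dtype=float)
    img = img - img.mean()                      # remove the DC component
    power = np.abs(np.fft.fftshift(np.fft.fft2(img))) ** 2
    n = img.shape[0]
    y, x = np.indices(power.shape)
    # Integer ring index = rounded distance from the spectrum centre (assumption).
    r = np.hypot(x - n // 2, y - n // 2).round().astype(int)
    sums = np.bincount(r.ravel(), weights=power.ravel())
    counts = np.bincount(r.ravel())
    radial = sums / np.maximum(counts, 1)
    freqs = np.arange(len(radial))
    keep = slice(1, n // 2)                     # skip DC, stay below Nyquist
    return freqs[keep], radial[keep]

def loglog_slope(freqs, power):
    """Slope of a straight-line fit to log10(power) versus log10(frequency)."""
    slope, _ = np.polyfit(np.log10(freqs), np.log10(power), 1)
    return slope

def synthetic_power_law_image(n=128, exponent=2.0, seed=0):
    """Hypothetical test image: random phases, power falling off as 1/f**exponent."""
    rng = np.random.default_rng(seed)
    fy = np.fft.fftfreq(n)[:, None]
    fx = np.fft.fftfreq(n)[None, :]
    f = np.hypot(fx, fy)
    f[0, 0] = 1.0                               # avoid division by zero at DC
    amplitude = f ** (-exponent / 2.0)
    amplitude[0, 0] = 0.0
    phase = np.exp(2j * np.pi * rng.random((n, n)))
    return np.real(np.fft.ifft2(amplitude * phase))
```

For the synthetic image, `loglog_slope(*radial_power_spectrum(synthetic_power_law_image()))` should come out close to -2, whereas white noise should give a slope close to 0.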
In view of these previous findings, we investigated other statistical measures of self-similarity to characterize aesthetic
and non-aesthetic images. In the present work, we propose a novel measure of self-similarity that is based on the
Pyramid Histogram of Oriented Gradients (PHOG). For every image, we first calculate PHOG up to pyramid level 3.
The histogram of each section at a given level is then compared with the histogram of its parent section at the
previous level (or with the histogram at the ground level) to obtain a similarity value.
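The comparison of section histograms across pyramid levels can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation used in this work: the abstract does not name the similarity measure, so histogram intersection is assumed here; the number of orientation bins (8), the magnitude weighting, and the median aggregation per level are likewise illustrative choices, and `phog_self_similarity` is a hypothetical name.

```python
import numpy as np

def orientation_histogram(mag, ang, bins=8):
    """Magnitude-weighted histogram of gradient orientations, normalised to sum 1."""
    hist, _ = np.histogram(ang, bins=bins, range=(0.0, np.pi), weights=mag)
    total = hist.sum()
    # Flat regions have no gradient energy; fall back to a uniform histogram.
    return hist / total if total > 0 else np.full(bins, 1.0 / bins)

def phog_self_similarity(image, levels=3, bins=8, compare_to="parent"):
    """Median similarity of section histograms per pyramid level.

    compare_to="parent" compares each section with its parent section at the
    previous level; any other value compares with the ground-level histogram.
    """
    img = np.asarray(image, dtype=float)
    gy, gx = np.gradient(img)
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)     # unsigned orientation in [0, pi)

    def hist_at(level, i, j):
        cells = 2 ** level                      # level L splits the image into 2^L x 2^L sections
        rows = np.linspace(0, img.shape[0], cells + 1).astype(int)
        cols = np.linspace(0, img.shape[1], cells + 1).astype(int)
        sl = (slice(rows[i], rows[i + 1]), slice(cols[j], cols[j + 1]))
        return orientation_histogram(mag[sl], ang[sl], bins)

    scores = {}
    for level in range(1, levels + 1):
        cells = 2 ** level
        sims = []
        for i in range(cells):
            for j in range(cells):
                child = hist_at(level, i, j)
                if compare_to == "parent":
                    ref = hist_at(level - 1, i // 2, j // 2)
                else:
                    ref = hist_at(0, 0, 0)
                # Histogram intersection (assumed measure): 1.0 means identical.
                sims.append(np.minimum(child, ref).sum())
        scores[level] = float(np.median(sims))
    return scores
```

With this sketch, an image whose gradient orientation is the same everywhere yields a similarity of 1 at every level, while an image whose halves have different dominant orientations yields lower values.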
The proposed approach is tested on datasets of aesthetic and non-aesthetic categories of monochrome images. The
aesthetic image datasets comprise a large variety of artworks of Western provenance. Other man-made aesthetically
pleasing images, such as comics, cartoons and mangas, are also studied. For comparison, a database of natural scene
photographs is used, as well as datasets of photographs of plants, simple objects and faces that are in general of low
As expected, natural scenes exhibit the highest degree of PHOG self-similarity. Images of artworks also show high self-similarity
values, followed by cartoons, comics and mangas. On average, other (non-aesthetic) image categories are less
self-similar in the PHOG analysis. A measure of scale-invariant self-similarity (PHOG) allows a good separation of the
different aesthetic and non-aesthetic image categories.
Our results provide further support for the notion that, like complex natural scenes, images of artworks display a higher
degree of self-similarity across different scales of resolution than other image categories. Whether the high degree of
self-similarity is the basis for the perception of beauty in both complex natural scenery and artworks remains to be