A key aspect of image effectiveness is how well the image visually communicates the main subject. In consumer images,
two important features that impact viewer appreciation of the main subject are the amount of clutter and the main subject
placement within the image. Two subjective experiments were conducted to assess the relationships among aesthetic
quality, technical quality, and perception of clutter and image center. In each experiment, 30 participants evaluated the same
70 images on 0-to-100-point scales for aesthetic and technical quality. In the clutter experiment, participants also
rated the images on 0-to-100-point scales for amount of clutter and main subject emphasis. For the center
experiment, participants pointed directly onto the image to mark the center of interest. Results indicate that aesthetic
quality, technical quality, amount of clutter, and main subject emphasis are strongly correlated. Based on 95%
confidence ellipses and mean-shift clustering, expert main subject maps are consistent with observer identification of
main subject location. Further, the distribution of the observer identification of the center of interest is related to the
object class (e.g., person, scenery). Additional features related to image composition can be used to explain clusters formed by patterns of mean ratings.
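The abstract above reports that observer-marked centers of interest were grouped with mean-shift clustering. As an illustration only (not the authors' implementation), the following sketch applies a flat-kernel mean-shift to hypothetical 2-D click coordinates, where the bandwidth, point counts, and cluster centers are all invented for the example:

```python
import numpy as np

def mean_shift(points, bandwidth=50.0, iters=50):
    """Flat-kernel mean-shift: repeatedly move each point to the mean of
    the original points within `bandwidth`, then merge nearby modes."""
    shifted = points.astype(float).copy()
    for _ in range(iters):
        for i, p in enumerate(shifted):
            dists = np.linalg.norm(points - p, axis=1)
            shifted[i] = points[dists <= bandwidth].mean(axis=0)
    # modes that converged within one bandwidth of each other share a cluster
    modes, labels = [], np.empty(len(shifted), dtype=int)
    for i, p in enumerate(shifted):
        for j, m in enumerate(modes):
            if np.linalg.norm(p - m) < bandwidth:
                labels[i] = j
                break
        else:
            modes.append(p)
            labels[i] = len(modes) - 1
    return np.array(modes), labels

# Hypothetical click data: two groups of observer-marked centers of interest.
rng = np.random.default_rng(0)
clicks = np.vstack([rng.normal((100, 80), 5, (20, 2)),
                    rng.normal((300, 200), 5, (20, 2))])
modes, labels = mean_shift(clicks, bandwidth=50.0)
```

With well-separated click groups, the recovered modes approximate the cluster means; the number of modes, rather than being fixed in advance, falls out of the bandwidth choice.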
The primary goal of the current research was to develop image categorization algorithms that are more consistent with users' search strategies for their personal image collections. Other goals were to provide users with the option of correcting and labeling these image groups and to understand user behaviors and needs while they are using an automated image-organization system. The main focus of this paper is to provide automatic organization of images by two of the most important semantic classes in the consumer domain: events and people. Methods are described for automatically producing meaningful groups of images, each depicting an event, as well as clusters of similar faces in users' collections. Given that the proposed system envisions user interaction and is intended for organizing and searching personal collections, a usability study focused on consumers was conducted to gauge the performance of the system.
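The abstract does not specify the event-grouping algorithm, but the general idea of segmenting a collection into events can be illustrated by the common heuristic of splitting a chronologically sorted sequence of capture times wherever the gap to the previous photo exceeds a threshold. The function name and the two-hour threshold below are assumptions for illustration:

```python
from datetime import datetime, timedelta

def group_into_events(timestamps, gap=timedelta(hours=2)):
    """Split capture times into events: a new event begins whenever the
    time since the previous photo exceeds `gap`."""
    events = []
    for t in sorted(timestamps):
        if events and t - events[-1][-1] <= gap:
            events[-1].append(t)   # continue the current event
        else:
            events.append([t])     # start a new event
    return events

# Hypothetical collection: a morning outing, then an afternoon gathering.
times = [datetime(2020, 1, 1, 10, 0), datetime(2020, 1, 1, 10, 5),
         datetime(2020, 1, 1, 10, 20), datetime(2020, 1, 1, 15, 0),
         datetime(2020, 1, 1, 15, 30)]
events = group_into_events(times)  # two events: 3 photos, then 2
```

A fixed gap is the simplest variant; adaptive thresholds based on the collection's own gap statistics are a natural refinement.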
In stereoscopic display systems, there is always a balance between creating a “wow factor,” using large horizontal disparities, and providing a comfortable viewing environment for the user. In this paper, we explore the range of horizontal disparities that can be fused by a human observer as a function of the viewing distance and the field of view of the display. Two studies were conducted to evaluate the performance of human observers in a stereoscopic viewing environment. The viewing distance was varied in the first study using a CRT with shutter glasses. The second study employed a large field-of-view display with infinity focus, and the simulated field of view was varied. The recorded responses included fusion/no fusion, fusion time, and degree of convergence. The results show that viewing distance has a small impact on the angular fusional range. In contrast, the field of view has a much stronger impact on the angular fusional range. A link between the degree of convergence and the fusional range is demonstrated. This link suggests that the capability of the human observer to perform eye vergence movements to achieve stereoscopic fusion may be the limiting factor in fusing large horizontal disparities presented in stereoscopic displays.
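The angular fusional range discussed above depends on converting an on-screen horizontal disparity into an angular disparity, which is why the same physical disparity is easier to fuse at a greater viewing distance. The conversion follows from vergence geometry: the angular disparity is the difference between the vergence angle at the fused point and the vergence angle at the screen plane. The function below is an illustrative sketch (not from the paper), assuming symmetric crossed disparity and a nominal 6.3 cm inter-pupillary distance:

```python
import math

def angular_disparity_deg(h_cm, dist_cm, ipd_cm=6.3):
    """Angular disparity (degrees) produced by a crossed on-screen
    horizontal disparity h_cm viewed at dist_cm: the vergence angle at
    the fused point minus the vergence angle at the screen plane."""
    verg_point = 2 * math.atan((ipd_cm + h_cm) / (2 * dist_cm))
    verg_screen = 2 * math.atan(ipd_cm / (2 * dist_cm))
    return math.degrees(verg_point - verg_screen)

# The same 1 cm screen disparity halves in angular terms when the
# viewing distance doubles (close to the small-angle estimate h/D).
near = angular_disparity_deg(1.0, 50.0)   # ~1.14 deg
far = angular_disparity_deg(1.0, 100.0)   # ~0.57 deg
```

For small disparities this reduces to the familiar small-angle approximation h/D radians, which is why reporting fusional limits in angular units decouples them from the physical screen geometry.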
Prior publications have shown that ideal observer models provide a good estimate of measured d' values for varying noise amplitude and target strength after allowing for observer internal noise and human efficiency. To provide a consistent estimate of visual performance in general applications, the internal noise and human efficiency should either be fixed values or calculable based on experimental conditions. In the current study, we test observer models for several sizes of three types of targets (rectangular, Gaussian, or Gabor) at two uniform background luminances and three levels of added Gaussian noise. The ideal observer predictions for each individual experimental condition are well correlated with measured d' values (r2 > 0.90 in most cases); however, the required internal noise and human efficiency vary substantially with target and luminance. A modified ideal observer, which includes a luminance-dependent eye filter and Gabor channels, is developed to simultaneously account for the measured d' values in all experimental conditions with r2 = 0.88. This observer model can be used to estimate general target detectability in flat two-dimensional image areas.
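For a target of known profile added to white Gaussian noise, the ideal observer's detectability has the standard closed form d' = sqrt(target energy) / noise standard deviation, and human performance is conventionally modeled by scaling the ideal d' by the square root of an efficiency factor and adding internal noise in quadrature to the external noise. The sketch below illustrates those two textbook relations only; the specific efficiency and internal-noise values are placeholders, and it does not include the paper's luminance-dependent eye filter or Gabor channels:

```python
import numpy as np

def ideal_observer_dprime(target, noise_sigma):
    """Ideal observer d' for a signal-known-exactly target in white
    Gaussian noise: sqrt(sum of squared target values) / noise std."""
    return np.sqrt(np.sum(target.astype(float) ** 2)) / noise_sigma

def human_dprime(target, noise_sigma, efficiency=0.3, internal_sigma=0.0):
    """Degrade the ideal d' by an efficiency factor (sqrt scaling) and an
    internal noise term combined in quadrature with the external noise."""
    total_sigma = np.hypot(noise_sigma, internal_sigma)
    return np.sqrt(efficiency) * ideal_observer_dprime(target, total_sigma)

# Hypothetical flat 8x8 target of amplitude 0.5 in noise of sigma = 2:
# energy = 64 * 0.25 = 16, so ideal d' = sqrt(16) / 2 = 2.
target = np.full((8, 8), 0.5)
d_ideal = ideal_observer_dprime(target, noise_sigma=2.0)
d_human = human_dprime(target, noise_sigma=2.0, efficiency=0.25)
```

The abstract's point is that fixed efficiency and internal-noise values do not hold across targets and luminances, which motivates the modified observer with an eye filter and channels.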