KEYWORDS: Target detection, Sensors, Visualization, Data modeling, Visual system, Electronic components, Mobile devices, Cell phones, Personal digital assistants, Visual process modeling
The issue of reading on electronic devices is becoming increasingly important as the popularity of mobile devices, such as cell phones and
PDAs, increases. In this study, we used the spatial summation paradigm to measure the spatial constraints on text detection.
Four types of stimuli (real characters, non-characters, Jiagu characters, and scrambled lines) were used in the experiments. All
characters we used had two components in a left-right configuration. A non-character was constructed by swapping the positions of the left
and right components of a real character, rendering it unpronounceable. The Jiagu characters were ancient texts
that have the same left-right configuration as modern Chinese characters but contain no familiar components. Thus,
the non-characters keep the components while destroying the spatial configuration between them, and the Jiagu characters
have no familiar components while keeping the spatial configuration intact. The detection thresholds for the same stimulus size
and the same eccentricity were the same for all types of stimuli. At small text sizes, the detection threshold of a
character decreased as its size increased, with a slope of -1/2 on log-log coordinates, up to a critical size, at all
eccentricities and for all stimulus types. The sensitivity for all types of stimuli increased from peripheral to central
vision. In conclusion, detectability is based on local feature analysis regardless of character type. The cortical
magnification parameter, E2, is 0.82 degrees of visual angle. With this information, we can estimate the detectability of a character from its
size and eccentricity.
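The size-and-eccentricity account above can be sketched as a small model. In this hedged Python sketch, the -1/2 log-log slope below a critical size and E2 = 0.82 deg come from the abstract; the linear form of the critical-size scaling, the foveal critical size `s0`, and the baseline threshold `t_crit` are illustrative assumptions, not fitted values from the study.

```python
E2 = 0.82  # eccentricity (deg) at which the critical size doubles (from the abstract)

def critical_size(ecc_deg, s0=0.5):
    """Assumed critical size, scaling with eccentricity via cortical magnification."""
    return s0 * (1.0 + ecc_deg / E2)

def detection_threshold(size_deg, ecc_deg, t_crit=0.01):
    """Contrast threshold for a character of a given size at a given eccentricity.

    Below the critical size the threshold falls with size at a log-log slope
    of -1/2; above it the threshold is flat (no further spatial summation).
    """
    sc = critical_size(ecc_deg)
    if size_deg >= sc:
        return t_crit
    return t_crit * (size_deg / sc) ** -0.5
```

Doubling the character size below the critical size thus lowers the threshold by a factor of sqrt(2), and at E2 = 0.82 deg eccentricity the critical size is twice its foveal value.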
The human visual system is sensitive to both first-order and second-order variations in an image.
The latter is especially important for digital image processing, as it allows human observers
to perceive the envelope of the pixel intensities as a smooth surface rather than as discrete pixels. Here
we used a pattern masking paradigm to measure the detection threshold of contrast modulated (CM)
stimuli, in which the contrast of a horizontal grating is modulated by a vertical Gabor
function, at different modulation depths of the CM stimuli. The threshold function showed a
typical dipper shape: the threshold decreased with modulation depth (facilitation) at low pedestal
modulation depths and then increased (suppression) at high pedestal modulation depths. The data were well
explained by a modified divisive inhibition model that operated on both the depth modulation and the
carrier contrast in the input images. Hence divisive inhibition, determined by both the first- and
the second-order information in the stimuli, is necessary to explain the discrimination between two
second-order stimuli.
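The dipper-shaped threshold function described above can be illustrated with a minimal divisive-inhibition sketch. Here the response to modulation depth m is taken as g*m^p / (m^q + z), and the discrimination threshold is the depth increment that raises the response by one unit; the exponents, gain, and constant z are illustrative assumptions, not the fitted values from the study.

```python
def response(m, g=100.0, p=2.4, q=2.0, z=0.0025):
    """Assumed divisive-inhibition response to modulation depth m."""
    return g * m ** p / (m ** q + z)

def increment_threshold(pedestal):
    """Smallest depth increment dm with response(pedestal + dm) - response(pedestal) >= 1.

    Found by bisection; the response is monotonically increasing in m
    because the excitation exponent p exceeds the inhibition exponent q.
    """
    target = response(pedestal) + 1.0
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if response(pedestal + mid) < target:
            lo = mid
        else:
            hi = mid
    return hi
```

With these illustrative parameters the model reproduces the dipper: thresholds fall below the no-pedestal value at near-threshold pedestal depths (facilitation) and rise above it at high pedestal depths (suppression).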
We collected 6604 images of 30 models in eight types of facial expression: happiness, anger, sadness, disgust, fear,
surprise, contempt and neutral. Among them, the 406 most representative images from 12 models were rated by more than
200 human raters for perceived emotion category and intensity. Such a large number of emotion categories, models and
raters is sufficient for most serious expression recognition research, both in psychology and in computer science. All the
models and raters are of Asian background; hence, this database can also be used when cultural background is a
concern. In addition, 43 landmarks on each of the 291 rated frontal-view images were identified and recorded. This
information should facilitate feature-based research on facial expression. Overall, the diversity of the images and the richness of the
information should make our database and norms useful for a wide range of research.
In this study, we measured blur discrimination thresholds at different blur levels. We found that the discrimination
threshold first decreased and then increased again as the reference edge blur increased. This dipper shape of the blur
discrimination threshold vs. reference width (TvW) function can be explained by a divisive inhibition model.
The first stage of the model contains a linear operator whose excitation is the inner product of the image and the
sensitivity profile of the operator. The response of the blur discrimination mechanism is a power function of the
excitation of the linear operator divided by the sum of the divisive inhibition and an additive factor. Changing the mean
luminance of the edge has little effect on blur discrimination except at very low luminance. When luminance is low,
blur discrimination thresholds at small reference blurs were higher than those measured at medium to high luminance. This difference
diminished at large reference blurs. This luminance effect can be explained by a change in the additive factor in the
model. Reducing the contrast of the edge shifted the whole TvW function up vertically, an effect that can be explained by a
decrease of the gain factors in the linear operator. With these results, we constructed a metric for blur perception from the
divisive inhibition model we proposed and tested in this study.
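The model stages described above can be sketched numerically. In this hedged Python sketch, a blurred edge is an error-function luminance profile, the linear operator is assumed to be a derivative-of-Gaussian sensitivity profile, the excitation is their inner product, and the response is a power of the excitation divided by divisive inhibition plus an additive factor, as in the abstract; the operator scale, gain, exponents, and additive factor are illustrative assumptions.

```python
import math

S = 0.5         # assumed scale (deg) of the derivative-of-Gaussian operator
GAIN = 100.0    # assumed gain of the linear operator
P, Q = 2.4, 2.0 # assumed excitation and inhibition exponents
ADDITIVE = 1.0  # assumed additive factor in the denominator

def excitation(blur, contrast=1.0, lo=-10.0, hi=10.0, n=4000):
    """Inner product of a blurred edge (blur > 0) with the operator profile."""
    dx = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * dx
        edge = contrast * 0.5 * (1.0 + math.erf(x / (blur * math.sqrt(2.0))))
        # minus the derivative of a unit-area Gaussian of scale S
        profile = (x / S ** 2) * math.exp(-x ** 2 / (2 * S ** 2)) / (S * math.sqrt(2 * math.pi))
        total += edge * profile * dx
    return total

def response(blur, contrast=1.0):
    e = GAIN * excitation(blur, contrast)
    return e ** P / (e ** Q + ADDITIVE)

def blur_threshold(ref_blur, contrast=1.0):
    """Blur increment that lowers the response by one unit (bisection)."""
    target = response(ref_blur, contrast) - 1.0
    lo_b, hi_b = 0.0, 20.0
    for _ in range(50):
        mid = 0.5 * (lo_b + hi_b)
        if response(ref_blur + mid, contrast) > target:
            lo_b = mid
        else:
            hi_b = mid
    return hi_b
```

For a step edge blurred by a Gaussian of width b, this excitation equals 1/sqrt(2*pi*(b^2 + S^2)) analytically, so it falls monotonically with blur, and thresholds grow at large reference blurs, matching the rising arm of the TvW function.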
We investigated the spatial summation effect on pedestals with different luminances. The targets were luminance modulations defined by Gaussian functions. The size of the Gaussian spot was determined by the scale parameter (standard deviation, σ), which ranged from 0.13° to 1.04°. The local luminance pedestal (2° radius) had a mean luminance ranging from 2.9 to 29 cd/m2. The no-pedestal condition had a mean luminance of 58 cd/m2. We used the QUEST adaptive threshold-seeking procedure and a 2AFC paradigm to measure the target contrast threshold at different target sizes (spatial summation curve) and pedestal luminances. The target threshold decreased as the target spatial extent increased, with a slope of -0.5 on log-log coordinates. However, if the target size was large enough (σ > 0.3°), there was little, if any, threshold reduction as the target size increased further. The spatial summation curve had the same shape at all pedestal luminance levels; the effect of the pedestal was to shift the summation curve vertically on log-log coordinates. Hence, the size and luminance effects on target detection are separable. The visibility of the Gaussian spot can be modeled by a function of the form f(L)*g(σ), where f(L) is a function of local luminance and g(σ) is a function of size.
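The separable form f(L)*g(σ) can be written out directly. In this hedged sketch, the -0.5 log-log summation slope up to σ ≈ 0.3° comes from the abstract; the particular luminance function `f`, its half-saturation constant, and the overall scale `k` are illustrative assumptions.

```python
SIGMA_CRIT = 0.3  # deg; spatial summation saturates beyond this size (from the abstract)

def g(sigma):
    """Size term: log-log slope of -0.5 up to the critical size, then flat."""
    return min(sigma, SIGMA_CRIT) ** -0.5

def f(luminance, l_half=10.0):
    """Assumed luminance term: thresholds rise as pedestal luminance falls."""
    return 1.0 + l_half / luminance

def contrast_threshold(luminance, sigma, k=0.01):
    """Separable visibility model: threshold = k * f(L) * g(sigma)."""
    return k * f(luminance) * g(sigma)
```

Because the two factors multiply, changing the pedestal luminance shifts the whole summation curve vertically on log-log axes without changing its shape, which is the separability the abstract reports.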
Models that predict human performance on narrow classes of visual stimuli abound in the vision science literature. However, the vision and the applied imaging communities need robust general-purpose, rather than narrow, computational human visual system (HVS) models to evaluate image fidelity and quality and ultimately improve imaging algorithms. Of the general-purpose early HVS models that currently exist, direct model comparisons on the same data sets are rarely made. The Modelfest group was formed several years ago to solve these and other vision modeling issues. The group has developed a database of static spatial test images with threshold data that is posted on the Web for modellers to use in HVS model design and testing. The first phase of data collection was limited to detection thresholds for static gray scale 2D images. The current effort will extend the database to include thresholds for selected grayscale 2D spatio-temporal image sequences. In future years, the database will be extended to include discrimination (masking) for dynamic, color and gray scale image sequences. The purpose of this presentation is to invite the Electronic Imaging community to participate in this effort and to inform them of the developing data set, which is available to all interested researchers. This paper presents the display specifications, psychophysical methods and stimulus definitions for the second phase of the project, spatio-temporal detection. The threshold data will be collected by each of the authors over the next year and presented on the Web along with the stimuli.
KEYWORDS: Linear filtering, Data modeling, Target detection, Image quality, Modulation, Image filtering, Visualization, Modulation transfer functions, Visual process modeling
Contrast discrimination provides important information for establishing image quality metrics based on human vision. We used a dual-masking paradigm to study how contrast discrimination can be influenced by the presence of adjacent stimuli. In a dual-masking paradigm, the observer's task is to detect a target superimposed on a pedestal in the presence of flankers. The flankers (1) reduce the target threshold at zero pedestal contrast; (2) reduce the size of pedestal facilitation at low pedestal contrasts; and (3) shift the TvC (target threshold vs. pedestal contrast) function horizontally to the left on a log-log plot at high pedestal contrasts. The horizontal shift at high pedestal contrasts suggests that the flanker effect is a multiplicative factor that cannot be explained by previous models of contrast discrimination. We extended a divisive inhibition model of contrast discrimination by implementing the flanker effect as a multiplicative sensitivity modulation factor, which accounts for the data well.
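The multiplicative flanker factor described above can be sketched as follows: the flanker scales the effective contrast of both target and pedestal by a factor s before a standard divisive-inhibition stage, which shifts the whole TvC function horizontally on log-log axes. The gain, exponents, and constant below are illustrative assumptions, not the model's fitted parameters.

```python
def response(c, g=100.0, p=2.4, q=2.0, z=1.0):
    """Assumed divisive-inhibition response to effective contrast c."""
    e = g * c
    return e ** p / (e ** q + z)

def threshold(pedestal, s=1.0):
    """Contrast increment raising the response by one unit, with flanker
    sensitivity-modulation factor s applied to target and pedestal alike."""
    target = response(s * pedestal) + 1.0
    lo, hi = 0.0, 1.0
    for _ in range(60):  # bisection on the monotonic response
        mid = 0.5 * (lo + hi)
        if response(s * (pedestal + mid)) < target:
            lo = mid
        else:
            hi = mid
    return hi
```

Because s multiplies every contrast, the model satisfies threshold(c, s) = threshold(s*c, 1)/s exactly: a pure horizontal (and rigid) shift of the TvC function, and a lower threshold at zero pedestal when s > 1, matching the flanker effects listed in the abstract.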
KEYWORDS: Data modeling, Visual process modeling, Spatial frequencies, Human vision and color perception, Databases, Video compression, Composites, Visualization, Performance modeling, Image compression
A robust model of the human visual system (HVS) would have a major practical impact on the difficult technological problems of transmitting and storing digital images. Although most HVS models exhibit similarities, they may have significant differences in predicting performance. Different HVS models are rarely compared using the same set of psychophysical measurements, so their relative efficacy is unclear. The Modelfest organization was formed to solve this problem and accelerate the development of robust new models of human vision. Members of Modelfest have gathered psychophysical threshold data on the year one stimuli described at last year's SPIE meeting. Modelfest is an exciting new approach to modeling involving the sharing of resources, learning from each other's modeling successes and providing a method to cross-validate proposed HVS models. The purpose of this presentation is to invite the Electronic Imaging community to participate in this effort and inform them of the developing database, which is available to all researchers interested in modeling human vision. In future years, the database will be extended to other domains such as visual masking, and temporal processing. This Modelfest progress report summarizes the stimulus definitions and data collection methods used, but focuses on the results of the phase one data collection effort. Each of the authors has provided at least one dataset from their respective laboratories. These data and data collected subsequent to the submission of this paper are posted on the WWW for further analysis and future modeling efforts.
Do all parts of the face contribute equally to face detection, or are some parts more detectable than others? The task was to detect the presence of normalized frontal-face images within aperture windows of varying extent. We performed such a face summation study using two-alternative forced-choice psychophysics. The face stimuli were scaled to equal eye-to-chin distance and foveated on the bridge of the nose. The images were windowed by a fourth-power Gaussian envelope ranging from the center of the nose to the full face width. Eight faces (4 male and 4 female) were presented in randomized order, intermixed with 8 control stimuli consisting of phase-scrambled versions of the images with equal Fourier energy. The integration functions for detection of the phase-scrambled images did not deviate significantly from a log-log slope of -0.5, suggesting the operation of a set of ideal integrators with probability summation over all aperture sizes. The data for face detection showed that observers were not ideal integrators of the information in the face images, but integrated linearly up to some small size and failed to gain any improvement from information beyond some larger size. This performance suggests the operation of a specialized face template filter at detection threshold, differing in extent among observers.
In Year One of the Modelfest project, several laboratories collaborated to collect threshold data from human observers on 45 pattern stimuli. In this preliminary study, we used a principal component analysis (PCA) and a confirmatory factor analysis of the variations among observers to explore the underlying visual mechanisms for detecting the Modelfest stimuli. This analysis is based on the assumption that the observers share a common set of channels that differ only in sensitivity level. We found three principal components. Assuming that each principal component represents a single mechanism, we computed the sensitivity profile of each mechanism as the sum of the test stimuli weighted by the factor loadings on each component. The first mechanism is a spot detector. The second mechanism is dominated by a horizontal periodic pattern around 4 c/deg, and the third may be characterized as a narrow bar detector.
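The logic of this analysis can be demonstrated on synthetic data: if each observer's sensitivities across stimuli are a weighted sum of a few shared mechanism profiles (the abstract's assumption), PCA on the observer-by-stimulus matrix recovers that number of components. The mechanism count, matrix sizes, and random profiles below are illustrative, not the Modelfest data.

```python
import numpy as np

rng = np.random.default_rng(0)
n_stimuli, n_observers, n_mechanisms = 45, 16, 3

# Synthetic shared mechanisms: each row is one mechanism's sensitivity
# profile over the stimuli; observers differ only in their weights.
profiles = rng.normal(size=(n_mechanisms, n_stimuli))
weights = rng.uniform(0.5, 1.5, size=(n_observers, n_mechanisms))
data = weights @ profiles  # observers x stimuli sensitivity matrix

# PCA on the variation among observers.
centered = data - data.mean(axis=0)
cov = centered.T @ centered / (n_observers - 1)
eigvals = np.linalg.eigvalsh(cov)[::-1]  # descending order
explained = eigvals[:n_mechanisms].sum() / eigvals.sum()
```

With three underlying mechanisms, the first three principal components account for essentially all of the between-observer variance, which is the signature the abstract's analysis looks for in the real data.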
KEYWORDS: Visual process modeling, Data modeling, Spatial frequencies, Databases, Visualization, Image quality, Image compression, Human vision and color perception, Performance modeling, Linear filtering
Models that predict human performance on narrow classes of visual stimuli abound in the vision science literature. However, the vision and the applied imaging communities need robust general-purpose, rather than narrow, computational human visual system models to evaluate image fidelity and quality and ultimately improve imaging algorithms. Psychophysical measures of image quality are too costly and time consuming to gather to evaluate the impact each algorithm modification might have on image quality.
KEYWORDS: Data modeling, Modulation, Visual process modeling, Colorimetry, Target detection, Visual system, Visualization, Nonlinear dynamics, Current controlled current source, Color vision
We studied the detection of chromoluminance patterns in the presence of chromoluminance pedestals. We examined how thresholds depend on the color directions of the target and the pedestal. Both targets and pedestals were spatial Gabor patterns, spatially modulated in color, luminance or both. Equidiscrimination contours describe the contrast thresholds for targets in different color directions on the same pedestal. We measured the equidiscrimination contours on green/red and blue/yellow pedestals. The equidiscrimination contours change with the contrast and the color directions of the pedestals. We applied a model with three pairs of mechanisms, which we proposed earlier, to these data. Each mechanism consists of a linear, receptive-field-like color-spatial operator followed by a nonlinear process. The nonlinear process takes two inputs: the excitation comes directly from the linear operator, and the divisive inhibition is a nonlinear sum of all linear operator responses. Two linear operator pairs are color opponent while the third is non-opponent. The detection variable is computed from the outputs of the nonlinear processes, combined by Quick's pooling rule.
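The final pooling stage named above has a compact form: Quick's rule combines mechanism outputs with a Minkowski sum. This sketch shows the rule itself; the exponent beta = 4 is a common illustrative choice, not the value fitted in the study.

```python
def quick_pool(responses, beta=4.0):
    """Quick's (Minkowski) pooling rule: d = (sum_i |R_i|^beta)^(1/beta)."""
    return sum(abs(r) ** beta for r in responses) ** (1.0 / beta)
```

The pooled detection variable always at least equals the largest single mechanism response, and as beta grows it approaches a pure winner-take-all rule.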