Embodied cognition is a concept that provides a deeper understanding of the aesthetics of art images. This study considers the role of embodied cognition in the appreciation of 3D pictorial space, 4D action space, its extension through mirror reflection to embodied self-cognition, and its relation to the neuroanatomical organization of the aesthetic response.
This paper focuses on the interpretation of material properties of reflectivity and specularity assessed by the visual system under illumination consisting of both a focal and a diffuse component (the ‘sun-and-sky’ illumination assumption). This assumption provides for four kinds of luminance gradients: gradients of incident illumination, gradients of reflectivity, gradients of secondary self-illumination and gradients of shadowing. The analysis considers the dissociation of the material properties carried by specularity from the geometric properties of object shape, taking the sinusoidal surface as a canonical shape exemplar.
Binocular eye movements form a finely-tuned system that requires accurate coordination of the oculomotor dynamics
and supports the vergence movements for tracking the fine binocular disparities required for 3D vision, and are
particularly susceptible to disruption by brain injury and other neural dysfunctions. Saccadic dynamics for a population
of 84 diverse participants show tight coefficients of variation of 2-10% of the mean value of each parameter.
Significantly slower dynamics were seen for vertical upward saccades. Binocular coordination of saccades was accurate
to within 1-4%, implying the operation of brainstem coordination mechanisms rather than independent cortical control of
the two eyes. A new principle of oculomotor control, reciprocal binocular inhibition, is introduced to complement
Sherrington’s and Hering’s Laws. This new law accounts for the fact that symmetrical vergence responses are about five
times slower than saccades of the same amplitude, although a comprehensive analysis of asymmetrical vergence
responses revealed unexpected variety in vergence dynamics. This analysis of the variety of human vergence responses
thus contributes substantially to the understanding of the oculomotor control mechanisms underlying the generation of
vergence movements and of the deficits in the oculomotor control resulting from mild traumatic brain injury.
One of the enduring mysteries in the history of the Renaissance is the adult appearance of the archetypical "Renaissance Man," Leonardo da Vinci. His only acknowledged self-portrait is from an advanced age, and various candidate images of younger men are difficult to assess given the absence of documentary evidence. One clue about Leonardo's appearance comes from the remark of the contemporary historian, Vasari, that the sculpture of David by Leonardo's master, Andrea del Verrocchio, was based on the appearance of Leonardo
when he was an apprentice. Taking a cue from this statement, we suggest that the more mature sculpture of St. Thomas, also by Verrocchio, might also have been a portrait of Leonardo. We tested the possibility Leonardo was the subject for Verrocchio's sculpture by a novel computational technique for the comparison of
three-dimensional facial configurations. Based on quantitative measures of similarities, we also assess whether
another pair of candidate two-dimensional images are plausibly attributable as being portraits of Leonardo as
a young adult. Our results are consistent with the claim Leonardo is indeed the subject in these works, but
we need comparisons with images in a larger corpus of candidate artworks before our results achieve statistical significance.
The two major aspects of camera misalignment that cause visual discomfort when viewing images on a 3D display
are vertical and torsional disparities. While vertical disparities are uniform throughout the image, torsional rotations
introduce a range of disparities that depend on the location in the image. The goal of this study was to determine
the discomfort ranges for the kinds of natural image that people are likely to take with 3D cameras rather than the
artificial line and dot stimuli typically used for laboratory studies. We therefore assessed visual discomfort on a
five-point scale from 'none' to 'severe' for artificial misalignment disparities applied to a set of full-resolution
images of indoor scenes.
For viewing times of 2 s, discomfort ratings for vertical disparity in both 2D and 3D images rose rapidly toward
the discomfort level of 4 ('severe') by about 60 arcmin of vertical disparity. Discomfort ratings for torsional
disparity in the same image rose only gradually, reaching only the discomfort level of 3 ('strong') by about 50 deg
of torsional disparity. These data were modeled with a second-order hyperbolic compression function incorporating
a term for the basic discomfort of the 3D display in the absence of any misalignments through a Minkowski norm.
These fits showed that, at a criterion discomfort level of 2 ('moderate'), acceptable levels of vertical disparity were
about 15 arcmin. The corresponding values for the torsional disparity were about 30 deg of relative orientation.
Contrast has always been appreciated as a significant factor in image quality, but it is less widely recognized that
it is a key factor in the representation of depth, solidity and three-dimensionality in images in general, and in
paintings in particular. This aspect of contrast was a key factor in the introduction of oil paint as a painting
medium at the beginning of the fifteenth century, as a practical means of contrast enhancement. However, recent
conservatorship efforts have established that the first oil paintings were not, as commonly supposed, by van Eyck
in Flanders in the 1430s, but by Masolino da Panicale in Italy in the 1420s. These developments led to the use of
chiaroscuro technique in various forms, all of which are techniques for enhanced shadowing.
Some eighteen portraits are now recognized of Leonardo in old age, consolidating the impression from his best-established
self-portrait of an old man with long white hair and beard. However, his appearance when younger is
generally regarded as unknown, although he was described as very beautiful as a youth. Application of the principles of
metric iconography, the study of the quantitative analysis of the painted images, provides an avenue for the
identification of other portraits that may be proposed as valid portraits of Leonardo during various stages of his life, by
himself and by his contemporaries. Overall, this approach identifies portraits of Leonardo by Verrocchio, Raphael,
Botticelli, and others. Beyond this physiognomic analysis, Leonardo's first known drawing provides further insight into
his core motivations. Topographic considerations make clear that the drawing is of the hills behind Vinci with a view
overlooking the rocky promontory of the town and the plain stretching out before it. The outcroppings in the
foreground bear a striking resemblance to those of his unique composition, 'The Virgin of the Rocks', suggesting a deep
childhood appreciation of this wild terrain, and an identification with that religious man of the mountains, John the
Baptist, who was also the topic of Leonardo's last known painting. Following this trail leads to a line of possible self-portraits
continuing the age-regression concept back to a self view at about two years of age.
To reveal the cortical network underlying figure/ground perception and to understand its neural dynamics, we
developed a novel paradigm that creates distinct and prolonged percepts of spatial structures by instantaneous refreshes
in random dot fields. Three different forms of spatial configuration were generated by: (i) updating the whole stimulus
field, (ii) updating the ground region only (negative-figure), and (iii) updating the figure and ground regions in brief
temporal asynchrony. fMRI responses were measured throughout the brain. As expected, activation by the
homogeneous whole-field update was focused onto the posterior part of the brain, but distinct networks extending
beyond the occipital lobe into the parietal and frontal cortex were activated by the figure/ground and by the negative-figure
configurations. The instantaneous stimulus paradigm generated a wide variety of BOLD waveforms and
corresponding neural response estimates throughout the network. Such expressly different responses evoked by
differential stimulation of the identical cortical regions assure that the differences could be securely attributed to the
neural dynamics, not to spatial variations in the HRF. The activation pattern for figure/ground implies a widely
distributed neural architecture, distinct from the control conditions. Even where activations are partially overlapping, an
integrated analysis of the BOLD response properties will enable the functional specificity of the cortical areas to be
distinguished.
Linear perspective is constructed for a particular viewing location with respect to the scene being viewed and, importantly, the location of the canvas between the viewer and the scene. Conversely, both the scene and the center of projection may be reconstructed with some knowledge of the structure of the scene. For example, if it is known that the objects depicted have symmetrical features, such as equiangular corners, the center of projection is constrained to a single line (or point) in space. For one-point perspective (with a single vanishing point for all lines that are not parallel to the canvas plane), the constraint line runs from the vanishing point perpendicular to the canvas. For two-point perspective, in which the objects depicted are oblique to the canvas, the constraint line is a semicircle joining the two vanishing points. A viewer located at any point on the circle will see the depicted objects as rectangular and symmetric, and will have no grounds for knowing that the perspective was not constructed for this viewing location (unless there are objects that are known to be square, i.e., a further symmetry constraint on the object structures). This semi-circular line of rectangular validity forms a kind of horopter for two-point perspective. Moving around this semi-circular line for an architectural scene gives the viewer the odd impression of the architecture reforming itself in credible fashion to form an array of equally plausible structures.
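The semicircular constraint for two-point perspective can be checked numerically. The sketch below works in plan view (looking down on the canvas) and relies on the standard fact that the direction from the eye to a vanishing point is the direction of the scene lines converging there. The coordinates of the vanishing points and the function names are illustrative assumptions, not taken from the paper.

```python
import math

def corner_angle_deg(eye, v1, v2):
    """Angle at the eye between the directions to the two vanishing points.

    Points are (x, z) pairs in plan view: x runs along the canvas,
    z is the distance in front of the canvas plane.
    """
    ax, az = v1[0] - eye[0], v1[1] - eye[1]
    bx, bz = v2[0] - eye[0], v2[1] - eye[1]
    cos_angle = (ax * bx + az * bz) / (math.hypot(ax, az) * math.hypot(bx, bz))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))

def semicircle_point(v1, v2, t):
    """Point on the semicircle whose diameter joins v1 and v2 (0 < t < pi)."""
    center_x = (v1[0] + v2[0]) / 2.0
    radius = abs(v2[0] - v1[0]) / 2.0
    return (center_x + radius * math.cos(t), radius * math.sin(t))

# Hypothetical vanishing points on the canvas (z = 0).
V1, V2 = (-3.0, 0.0), (5.0, 0.0)

for t in (0.3, 1.0, 2.5):
    eye = semicircle_point(V1, V2, t)
    # Every viewpoint on the semicircle sees the depicted corners as right angles.
    print(round(corner_angle_deg(eye, V1, V2), 6))
```

A viewpoint off the semicircle, such as (1.0, 2.0) for these vanishing points, yields an angle other than 90 degrees, so the depicted corners would no longer be interpretable as rectangular from there.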
Previously we studied the effect of spatiotemporal pattern of transients on perceptual organization. Transient synchrony/asynchrony was critical in novel illusions of contextual motion (Likova & Tyler, 2002, 2003a, b). We found that strong image segmentation can be generated from transient asynchronies in fields of homogeneous visual noise, a phenomenon that we term 'Structure-from-Transients' (SfT). Here we used fMRI to reveal cortical mechanisms involved in SfT. The stimuli were random dot fields of 30 x 40°, replaced by uncorrelated dots every 500 ms. Asynchronous updates in subregions of the random-dot fields result in SfT. Exp.1: Figure/ground organization was generated in the test stimuli by transient-asynchrony between a figure area (a horizontal noise strip 8 x 40°) and its surround. The transient changes in the null stimuli, however, were synchronized, generating no SfT. Thus the global percepts switched from figure/ground (test) to a homogeneous field (null) every 9 s, in 36 blocks per scan. Exp.2: Figure/ground organization was eliminated by segmentation of the field into equal horizontal SfT stripes. We found dramatic reorganization of the cortical activation pattern with manipulation of the perceptual SfT organization. Exp.1 revealed excitation of hMT/V5+ and figure/ground-specific top-down suppression of the background region in V1. Both were abolished by eliminating the figure/ground organization with multiple SfT stripes, which instead activated the higher dorsal and ventral tier retinotopic areas. The results support a view of a recurrent architecture with functional feedback loops, exhibiting complex spatiotemporal behavior in the case of a figure/ground organization extracted from its specific 'generator'. Our study reveals that on a global level the brain makes an important use of asynchrony as a relation structuring the spatiotemporal visual input.
The cyclopean paradigm introduced by Bela Julesz remains one of the richest probes into the neural organization of sensory processing, by virtue of both its specificity for purely stereoscopic form and the sophistication of the processing required to retrieve it. The introduction of the sinusoidal stereograting showed that the perceptual limitations of human depth processing are very different from those for monocular form. The use of stereogratings has also revealed the existence of hypercyclopean form channels selective for specific aspects of the monocularly invisible depth form. The natural extension of stereogratings to patches of stereoGabor ripple has allowed the measurement of the summation properties for depth structure, which is specific for narrow horizontal bars in depth. Consideration of the apparent motion between two cyclopean depth structures reveals the existence of a novel surface correspondence problem operating for cyclopean surfaces over time after the binocular correspondence has been solved. Such concepts imply that much remains to be discovered about cyclopean stereopsis and its relationship to 3D form perception from other depth cues.
A new evaluation of the local structure of sustained spatial channels with local stimuli in peripheral retina employs the
masking sensitivity approach to minimize analytic assumptions. The stimuli were designed to address predominantly the
sustained response system at 5 deg eccentricity. Under these conditions, the lowest spatial-frequency channel peaked at
about 2 cycle/deg, 4 times higher than previous estimates, with a bandwidth of 1.5-2 octaves. The highest spatial-frequency
channel peaked at 5-6 cycle/deg with about a 1 octave bandwidth. The data are consistent with there being
only one channel tuned between these extremes, although they do not exclude a more continuous channel structure. Our
analysis shows that there are no sustained channels tuned below 2 cycle/deg but there may be channels above the
highest-frequency channel measured if tested with more selective stimuli than employed in our study. For local
sustained stimuli, human peripheral spatial processing therefore appears to be based on a simpler channel structure than is
often supposed.
Binocular disparity is one of the most powerful sources of depth information. Stereomotion is motion-in-depth generated by disparity changes. This study is focused on the hMT+/V5 complex, which is known to support both motion and disparity processing in primates. Does the motion-complex process stereomotion as well? BOLD functional magnetic resonance imaging (fMRI) was used. The fMRI contrasts of stereomotion vs stationary stimuli, as well as lateral non-stereoscopic motion vs stationary stimuli, showed strong fMRI activation of the motion complex. Direct contrasts of stereomotion vs different types of lateral-motion also revealed differential activity but in a restricted subregion of the motion complex, suggesting a distinct
stereomotion-selective neuronal subpopulation within it. No consistent activation was found for the stimuli viewed
non-stereoscopically. The stereomotion-specific locus revealed within the hMT+/V5 complex contributes to the understanding of stereomotion perception and of interactions between motion and stereo
mechanisms, as well as to the understanding of the organization of overlapping functionally distinct neuronal subpopulations.
Motion capture is one of the basic effects of a moving surround. To explore the existence of motion capture in the stereo domain, we designed dynamic autostereograms with a target-surround configuration. The images consisted of several horizontal lines of disparate disks with the central line of disks specified as a target. The surround was set in stereomotion by alternating between two disparities, while the target did not change in disparity, but suddenly disappeared in phase with one of the two surround depth-planes, reappearing in phase with the other surround plane. The target was vividly perceived as moving in depth despite its lack of any disparity change or even a paired location in the second of the two alternating frames. The motion capture hypothesis predicts that the target should be seen to move in synchrony and in the same direction as the surround. However, surprisingly, the data showed that the target was always perceived disappearing in a backward direction and reappearing in a forward direction irrespective of the surround direction, thus suggesting that the reported illusory depth-motion of a stationary target is an independent perceptual phenomenon that has no relation to the expected capture paradigm.
Models that predict human performance on narrow classes of visual stimuli abound in the vision science literature. However, the vision and the applied imaging communities need robust general-purpose, rather than narrow, computational human visual system (HVS) models to evaluate image fidelity and quality and ultimately improve imaging algorithms. Of the general-purpose early HVS models that currently exist, direct model comparisons on the same data sets are rarely made. The Modelfest group was formed several years ago to solve these and other vision modeling issues. The group has developed a database of static spatial test images with threshold data that is posted on the WEB for modellers to use in HVS model design and testing. The first phase of data collection was limited to detection thresholds for static gray scale 2D images. The current effort will extend the database to include thresholds for selected grayscale 2D spatio-temporal image sequences. In future years, the database will be extended to include discrimination (masking) for dynamic, color and gray scale image sequences. The purpose of this presentation is to invite the Electronic Imaging community to participate in this effort and to inform them of the developing data set, which is available to all interested researchers. This paper presents the display specifications, psychophysical methods and stimulus definitions for the second phase of the project, spatio-temporal detection. The threshold data will be collected by each of the authors over the next year and presented on the WEB along with the stimuli.
A variety of developments in twentieth century painting have expanded the depiction of space beyond the direct representation of optical space. This paper analyzes some of the artistic explorations of spatial concepts, giving particular attention to: (1) spatial composition, (2) spatial density and optical impressions, and (3) the deconstruction of visual space.
KEYWORDS: Linear filtering, Data modeling, Target detection, Image quality, Modulation, Image filtering, Visualization, Modulation transfer functions, Visual process modeling, Chlorine
Contrast discrimination is an important type of information for establishing image quality metrics based on human vision. We used a dual-masking paradigm to study how contrast discrimination can be influenced by the presence of adjacent stimuli. In a dual-masking paradigm, the observer's task is to detect a target superimposed on a pedestal in the presence of flankers. The flankers (1) reduce the target threshold at zero pedestal contrast; (2) reduce the size of pedestal facilitation at low pedestal contrasts; and (3) shift the TvC (Target threshold vs. pedestal contrast) function horizontally to the left on a log-log plot at high pedestal contrasts. The horizontal shift at high pedestal contrasts suggests that the flanker effect is a multiplicative factor that cannot be explained by previous models of contrast discrimination. We extended a divisive inhibition model of contrast discrimination by implementing the flanker effect as a multiplicative sensitivity modulation factor that accounts for the data well.
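As a rough illustration of how a divisive inhibition model produces a TvC function, the sketch below uses a generic form in which an excitatory power of contrast is divided by an inhibitory power plus a constant. The functional form, the exponents, the gain, and the multiplicative flanker factors `ke` and `ki` are all assumptions chosen for illustration; the abstract does not give the fitted model or its parameters.

```python
def response(contrast, ke=1.0, ki=1.0, p=2.4, q=2.0, z=1.0, gain=10.0):
    """Generic divisive-inhibition response (illustrative parameters).

    ke and ki are hypothetical multiplicative sensitivity factors standing in
    for a flanker effect on the excitatory and inhibitory terms.
    """
    excitation = ke * contrast ** p
    inhibition = ki * contrast ** q
    return gain * excitation / (inhibition + z)

def increment_threshold(pedestal, ke=1.0, ki=1.0, criterion=1.0):
    """Smallest target contrast that raises the response by `criterion`.

    Found by bisection; the response grows monotonically with contrast
    for the default exponents (p > q).
    """
    base = response(pedestal, ke, ki)
    lo, hi = 0.0, 1000.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if response(pedestal + mid, ke, ki) - base < criterion:
            lo = mid
        else:
            hi = mid
    return hi

detection = increment_threshold(0.0)   # no pedestal
dipper = increment_threshold(0.4)      # low-contrast pedestal
masked = increment_threshold(10.0)     # high-contrast pedestal
print(dipper < detection < masked)
```

With these illustrative values the threshold first falls below the detection threshold at a low pedestal contrast (facilitation) and then rises above it at a high pedestal contrast (masking). Setting `ke` above 1 lowers the zero-pedestal threshold, which is one way a multiplicative factor could mimic the first flanker effect listed above.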
Do all parts of the face contribute equally to face detection or are some parts more detectable than others? The task was to detect the presence of normalized frontal-face images within aperture windows of varying extent. We performed such a face summation study using two-alternative forced-choice psychophysics. The face stimuli were scaled to equal eye-to-chin distance, foveated on the bridge of the nose. The images were windowed by a fourth-power Gaussian envelope ranging from the center of the nose to the full face width. Eight faces (4 male and 4 female) were presented in randomized order, intermixed with 8 control stimuli consisting of phase-scrambled versions of the images with equal Fourier energy. The integration functions for detection of random images did not deviate significantly from a log-log slope of -0.5, suggesting the operation of a set of ideal integrators with probability summation over all aperture sizes. The data for face detection showed that observers were not ideal integrators for the information in the face images, but integrated linearly up to some small size and failed to gain any improvement for information beyond some larger size. This performance suggested the operation of a specialized face template filter at detection threshold, differing in extent among the observers.
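The slope comparison described above can be sketched with a least-squares fit in log-log coordinates. The threshold values below are synthetic, the horizontal axis is assumed to be aperture area, and the clipped-template function collapses the two size limits in the abstract into a single hypothetical limit; none of this reproduces the measured data.

```python
import math

def loglog_slope(xs, ys):
    """Least-squares slope of log10(y) against log10(x)."""
    lx = [math.log10(x) for x in xs]
    ly = [math.log10(y) for y in ys]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    den = sum((a - mean_x) ** 2 for a in lx)
    return num / den

def ideal_threshold(area, k=10.0):
    """Synthetic threshold falling with the square root of aperture area."""
    return k * area ** -0.5

def template_threshold(area, limit=8.0, k=10.0):
    """Synthetic threshold for linear integration up to a hypothetical limit."""
    return k / min(area, limit)

areas = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
small = [a for a in areas if a <= 8.0]
large = [a for a in areas if a >= 8.0]

print(loglog_slope(areas, [ideal_threshold(a) for a in areas]))
print(loglog_slope(small, [template_threshold(a) for a in small]))
print(loglog_slope(large, [template_threshold(a) for a in large]))
```

The first fit recovers the slope of -0.5 reported for the scrambled control images, while the template function gives a slope near -1 below its limit and near 0 above it, which is the qualitative pattern the abstract describes for faces.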
In Year One of the Modelfest project, several laboratories collaborated to collect threshold data of human observers on 45 pattern stimuli. In this preliminary study, we used a principal component analysis (PCA) and a confirmatory factor analysis on the variations among observers to explore the underlying visual mechanisms for detecting Modelfest Stimuli. This analysis is based on the assumption that there are channels in common among observers that are represented with variations in sensitivity level only. We found three principal components. Assuming that each principal component represents a single mechanism, we compute the sensitivity profile of each mechanism as the sum of test stimuli weighted by the factor loadings on each component. The first mechanism is a spot detector. The second mechanism is dominated by a horizontal periodic pattern around 4 c/deg and the third may be characterized as a narrow bar detector.
Depth representation, in both its geometric and its more generic forms, has often served as an impetus in artistic development through the millennia. The first historical mentions of art, by Plato and contemporaries in the 5th century BC, were provoked by the dramatic use of perspective in the scenery for the plays of Aeschylus and Sophocles. One of these innovative scene painters, Agatharchus, even wrote a commentary on his use of convergent perspective, whose effects had inspired several contemporary Greek geometers to analyze the productive transform mathematically.
KEYWORDS: Data modeling, Visual process modeling, Spatial frequencies, Human vision and color perception, Databases, Video compression, Composites, Visualization, Performance modeling, Image compression
A robust model of the human visual system (HVS) would have a major practical impact on the difficult technological problems of transmitting and storing digital images. Although most HVS models exhibit similarities, they may have significant differences in predicting performance. Different HVS models are rarely compared using the same set of psychophysical measurements, so their relative efficacy is unclear. The Modelfest organization was formed to solve this problem and accelerate the development of robust new models of human vision. Members of Modelfest have gathered psychophysical threshold data on the year one stimuli described at last year's SPIE meeting. Modelfest is an exciting new approach to modeling involving the sharing of resources, learning from each other's modeling successes and providing a method to cross-validate proposed HVS models. The purpose of this presentation is to invite the Electronic Imaging community to participate in this effort and inform them of the developing database, which is available to all researchers interested in modeling human vision. In future years, the database will be extended to other domains such as visual masking, and temporal processing. This Modelfest progress report summarizes the stimulus definitions and data collection methods used, but focuses on the results of the phase one data collection effort. Each of the authors has provided at least one dataset from their respective laboratories. These data and data collected subsequent to the submission of this paper are posted on the WWW for further analysis and future modeling efforts.
KEYWORDS: Visual process modeling, Data modeling, Spatial frequencies, Databases, Visualization, Image quality, Image compression, Human vision and color perception, Performance modeling, Linear filtering
Models that predict human performance on narrow classes of visual stimuli abound in the vision science literature. However, the vision and the applied imaging communities need robust general-purpose, rather than narrow, computational human visual system models to evaluate image fidelity and quality and ultimately improve imaging algorithms. Psychophysical measures of image quality are too costly and time consuming to gather to evaluate the impact each algorithm modification might have on image quality.
The importance of the center of the canvas has long been appreciated in art, as has the role of the eyes in revealing the personality of the subjects of portraits. Is there a consistent placement of the eyes relative to the canvas frame, based on the horizontal position of the eyes in portraits? Data from portraits over the past 2000 years show that one eye is centered with a standard deviation of less than +/- 5%. Classical texts on composition do not seem to mention the idea that the eyes as such should be positioned relative to the frame of the picture; the typical emphasis is on the placement of centers of mass in the frame or relative to the vanishing point in cases of central perspective. If such a compositional principle is not discussed in art analysis, it seems that its manifestation throughout the centuries and varieties of artistic styles (including the extreme styles of the 20th century) must be guided by unconscious perceptual processes.
Seven types of masking are discussed: multi-component contrast gain control, one-component transducer saturation, two-component phase inhibition, multiplicative noise, high spatial frequency phase locked interference, stimulus uncertainty, and noise intrusion. In the present vision research community, multi-component contrast gain is gaining in popularity while the one- and two-component masking models are losing adherents. In this paper we take the presently unpopular stance and argue against multi-component gain control models. We have a two-pronged approach. First, we discuss examples where high contrast maskers that overlap the test stimulus in both position and spatial frequency nevertheless produce little masking. Second, we show that alternatives to gain control are still viable, as long as uncertainty and noise intrusion effects are included. Finally, a classification is offered for different types of uncertainty effects that can produce large masking behavior.
KEYWORDS: Transducers, Interference (communication), Spatial frequencies, Systems modeling, Visualization, Psychophysics, Signal to noise ratio, Visual system, Target detection, Signal detection
The properties of spatial vision mechanisms are often explored psychophysically with simultaneous masking paradigms. A variety of hypotheses have been proposed to explain how the mask pattern utilized in these paradigms increases threshold. Numerous studies have investigated the properties of a particular origin of masking hypothesis but few have attempted to compare the properties of masking at several points in the process. Our study isolates masking due to lateral divisive inhibition at a point where mechanism responses are combined, and compares it with masking of the same target due to a nonlinearity either intrinsic to a mechanism or directly operating on the response of a single mechanism. We also measure the slopes of psychometric functions to examine the relationship between uncertainty and mask contrast. Studies of simultaneous masking utilizing a pedestal mask (an identical test and mask pattern) have measured facilitation for low contrast masks. This decrease in threshold from the solo target threshold is commonly referred to as the 'dipper' effect and has been explained as an increase in signal-to-noise ratio from the high unmasked level occurring as the visual system becomes more certain of target location. The level of uncertainty is indicated by the slope of sensitivity to the target as a function of target contrast in the threshold region. In these studies, high contrast masks have evoked an increase in target threshold. There have been many theories explaining this threshold increase. Some suggest that masking is the result of an intrinsic nonlinearity within a mechanism or of a contrast nonlinearity that operates directly on the output of a single mechanism. Others put the source of masking at a gain control operation which occurs when a surrounding set of mechanisms divide the response of a single mechanism by their summed response.
Still others attribute the masking to noise that is multiplicative relative to the neural response signal, or noise that intrudes on the detecting mechanism from neighboring mechanisms. A detailed review of this debate is provided by the paper by Klein et al., 3016-02 in these Proceedings. Threshold elevation functions that show the relationship between mask spatial frequency and masking magnitude cannot illuminate this debate, as we demonstrated at ARVO (1994). For that study, we generated threshold elevation functions (the ratio of unmasked versus masked target threshold) for multi-channel systems using computational models that invoked either divisive inhibition, a set of transducer nonlinearities or multiplicative noise. Threshold elevation functions were indistinguishable when each masking process was assumed to have similar strength. These results led us to design the experiment presented here, which attempts to compare the effects of two of these masking processes, lateral divisive inhibition and nonlinear transducer compression.
Radial sinusoids (blurry spoke patterns) appear dramatically saturated toward the brighter regions. The saturation is not perceptually logarithmic but exhibits a hyperbolic (Naka-Rushton) compression behavior at normal indoor luminance levels. The object interpretation of the spoke patterns was not consistent with the default assumption of any unidirectional light source, but implied a diffuse illumination (as if the object were looming out of a fog). The depth interpretation was consistent with the hypothesis that the compressed brightness profile provided the neural signal for perceived shape, as an approximation to computing the diffuse Lambertian illumination function for this surface. The surface material of the images was perceived as non-Lambertian to varying degrees, ranging from a chalky matte to a lustrous metallic.
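For readers unfamiliar with hyperbolic compression, the sketch below evaluates the standard Naka-Rushton form, in which the response is the luminance divided by the luminance plus a semisaturation constant. The semisaturation value and the luminance samples are arbitrary illustrations, not values estimated in the study.

```python
def naka_rushton(luminance, semisaturation, exponent=1.0):
    """Hyperbolic (Naka-Rushton) compression, normalized to a maximum of 1."""
    powered = luminance ** exponent
    return powered / (powered + semisaturation ** exponent)

# Equal luminance steps around an arbitrary mean level of 1.0.
dark, mean, bright = 0.1, 1.0, 1.9
sigma = 1.0  # hypothetical semisaturation constant

lower_step = naka_rushton(mean, sigma) - naka_rushton(dark, sigma)
upper_step = naka_rushton(bright, sigma) - naka_rushton(mean, sigma)
print(lower_step, upper_step)
```

The same luminance step produces a much smaller response change on the bright side of the mean than on the dark side, which is the sense in which a sinusoidal luminance profile would appear saturated toward its brighter regions.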
The properties of human stereoscopic mechanisms may be derived from dichoptic interaction and masking effects on stereoscopic detection thresholds in any relevant stimulus domain (spatial frequency, temporal frequency, disparity, orientation, etc.). The present study focuses on the spatial properties of mechanisms underlying stereoscopic depth detection. The computational approach is based on the full exploration of plausible model structures to characterize their idiosyncrasies, which often allows exclusion of proposed mechanisms by comparison with data obtained under conditions in which the idiosyncrasies should be expressed. For example, we conducted a detailed analysis of threshold elevation functions (TEFs) under plausible channel shapes, combination rules and masking behavior derived from previous studies. The analysis reveals that TEFs may be much narrower than and differ in shape from the underlying mechanisms. For example, only two discrete channels are required to produce TEFs peaking close to each fixed test frequency, with no relation to channel peaks. We apply this analysis to the stereospatial masking functions collected by Yang and Blake (1991) to determine the likely channel structure underlying the empirical masking performance. The analysis generally supports the two-mechanism model that they propose but shows that the assumptions underlying their estimates of the unmasked sensitivity function are incorrect. The analysis excludes stereospatial channels tuned below 2.5 c/deg, a region in which Schor, Wood, and Ogawa (1984) obtained evidence for many narrowly tuned channels by measuring disparity thresholds for targets with different peak tunings in the two eyes. Our computational model for the latter data is consistent with the lowest tuned channel being at 2.5 c/deg, this channel being narrowly tuned to dichoptic contrast differences, as described by Legge and Gu (1989) and Halpern and Blake (1988). 
Thus, all such stereo tuning data can be explained in a model in which all stereoscopic channels are tuned above 2.5 c/deg.
KEYWORDS: RGB color model, Human vision and color perception, Visualization, Optical filters, Calibration, Image resolution, Binary data, LCDs, Computing systems, Video
The precision of human vision requires displays to be accurate to about 0.2% of the luminance range. We present a technique by which this grey-level precision can be achieved with the use of an 8-bit color monitor. The basic idea is to 'steal' adjacent bits from the color variation for use in increasing the precision of the luminance variation. On a monitor with 8 bits per color gun, the technique can provide 1786 or more grey levels at a cost of one bit of color jitter, with standard D/A hardware. The color variations are invisible under almost all conditions.
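One reading of the bit-stealing idea that is consistent with the 1786 figure is sketched below: between each pair of adjacent 8-bit grey levels, the three guns are raised by one step in different combinations, giving seven distinct luminance steps per interval (255 × 7 + 1 = 1786). The luminance weights (Rec. 709 values) and the linear relation between gun value and luminance are assumptions for illustration only; a real monitor would need photometric calibration, and the published procedure may differ in detail.

```python
from itertools import product

# Assumed relative luminance weights of the R, G, B guns (Rec. 709 values).
WEIGHTS = (0.2126, 0.7152, 0.0722)

def bit_stolen_levels(max_level=255, weights=WEIGHTS):
    """Return (rgb, luminance) pairs sorted by luminance.

    Each gun sits at `base` or `base + 1`, so no triple departs from neutral
    grey by more than one gun step (the 'one bit of color jitter').
    """
    levels = {}
    for base in range(max_level):
        for dr, dg, db in product((0, 1), repeat=3):
            rgb = (base + dr, base + dg, base + db)
            # Linear gun-to-luminance mapping is a simplifying assumption.
            levels[rgb] = sum(w * c for w, c in zip(weights, rgb))
    return sorted(levels.items(), key=lambda item: item[1])

levels = bit_stolen_levels()
step_fraction = 1.0 / (len(levels) - 1)  # mean step as a fraction of the range
```

With plain 8-bit grey the step is 1/255, roughly 0.39% of the range, which is coarser than the 0.2% requirement; the finer ladder above brings the mean step well under that limit, although the individual steps are unequal because the three guns contribute different amounts of luminance.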
Stereoscopic and monocular alignment acuity were measured using sinusoidal displacements in time, space, and disparity of a single line stimulus. The stereoscopic detectability was not limited by the sensitivity for the monocular components of the spatio-temporal stereo-alignment target. In fact, the two tasks seemed to be controlled by largely independent processes. Monocular sensitivity was best at high spatial perturbation frequencies, almost independent of temporal frequency, while stereoscopic sensitivity was best at low temporal and medium spatial frequencies, and its surface had a substantially different morphology. Under these dynamic conditions the lowest thresholds of either kind were of the order of 10 arc sec, setting stringent limitations on the accuracy of stereoscopic displays. The spatio-temporal surfaces we measured show regions where sensitivity is reduced by an order of magnitude, suggesting modes in which dynamic human stereopsis is more tolerant of perturbations than suggested by classical data.
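To give a sense of what a 10 arc sec threshold means for a display, the short sketch below converts a visual angle to an on-screen distance. The viewing distance of 1 m is an assumption chosen for illustration and is not taken from the study.

```python
import math

def arcsec_to_mm(arcsec, viewing_distance_mm):
    """On-screen extent (mm) subtended by a visual angle given in arc seconds."""
    angle_rad = math.radians(arcsec / 3600.0)
    return viewing_distance_mm * math.tan(angle_rad)

# Assumed viewing distance of 1000 mm.
offset_mm = arcsec_to_mm(10.0, 1000.0)
```

At this assumed distance, 10 arc sec corresponds to roughly 0.05 mm on the screen, far smaller than a typical display pixel.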
A new technique for the presentation of cyclopean stereograms is described, in which complete
information for the two eyes is contained in a single, printed image. Such "autostereograms" may be
generated to contain an unlimited range of 3-D depth forms within certain constraints. The depth image also
occurs at multiple depth planes in front of and behind the physical plane, and may be designed to progress
recursively through these multiple planes. Specific autostereograms may be generated so as to be used to
continuously tile an indefinitely large surface. Further types can be devised to produce different depth
images in the two orthogonal viewing orientations. Finally, the autostereogram principle is used to explore
cyclopean perception based on the percept of binocular luster, which has surprising properties.
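The principle of carrying both eyes' information in one image can be sketched with the common repeating-pattern construction, in which each dot copies the dot one repetition period to its left and the period is shortened where the surface should appear nearer. This is a minimal sketch of that general construction, not the generation procedure of the paper; the function name and the parameters `period`, `max_shift` and `seed` are hypothetical.

```python
import random

def autostereogram(depth, period=40, max_shift=10, seed=0):
    """Build a binary single-image stereogram from a depth map.

    depth: list of rows, each a list of values in [0, 1] (1 = nearest).
    Each pixel copies the pixel `period - shift` columns to its left, so the
    local repetition period encodes the binocular disparity.
    """
    rng = random.Random(seed)
    image = []
    for depth_row in depth:
        row = []
        for x, d in enumerate(depth_row):
            separation = period - round(d * max_shift)
            if x >= separation:
                row.append(row[x - separation])  # constrained dot
            else:
                row.append(rng.randint(0, 1))    # free random dot
        image.append(row)
    return image

# Flat background with a raised central square (illustrative depth map).
depth_map = [[1.0 if 10 <= y < 30 and 60 <= x < 100 else 0.0
              for x in range(160)] for y in range(40)]
stereogram = autostereogram(depth_map)
```

Because the pattern repeats horizontally, the eyes can also fuse repetitions two or more periods apart, which is one way to understand the multiple depth planes mentioned in the abstract.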