Retinex at 50: color theory and spatial algorithms, a review

Abstract. Retinex Imaging shares two distinct elements: first, a model of human color vision; second, a spatial-imaging algorithm for making better reproductions. Edwin Land’s 1964 Retinex Color Theory began as a model of human color vision of real complex scenes. He designed many experiments, such as Color Mondrians, to understand why retinal cone quanta catch fails to predict color constancy. Land’s Retinex model used three spatial channels (L, M, S) that calculated three independent sets of monochromatic lightnesses. Land and McCann’s lightness model used spatial comparisons followed by spatial integration across the scene. The parameters of their model were derived from extensive observer data. This work was the beginning of the second Retinex element, namely, using models of spatial vision to guide image reproduction algorithms. Today, there are many different Retinex algorithms. This special section, “Retinex at 50,” describes a wide variety of them, along with their different goals and the ground truths used to measure their success. This paper reviews (and provides links to) the original Retinex experiments and image-processing implementations. Observer matches (measuring appearances) have extended our understanding of how human spatial vision works. This paper describes a collection of very challenging datasets, accumulated by Land and McCann, for testing algorithms that predict appearance.

1 Introduction
Edwin Land coined the word "Retinex" in 1964. 1 He used it to describe the theoretical need for three independent color channels to explain human color constancy. The word was a contraction of "retina" and "cortex." A Retinex is a theoretical spectral channel that makes spatial comparisons between scene regions so as to calculate "Lightness" sensations (the monochromatic range of appearances between light and dark in each channel).
Land had enthusiastically experimented with two-color projections in the late 1950s and early 1960s. 2 By that time, he had hundreds of patents on many different photographic systems. He was well aware of the possibilities, and limitations, of silver halide photography. Before his "Red and White" light projection experiments, he accepted the standard explanation of color, namely, that color was the result of the local quanta catches of receptors with different spectral sensitivities. Human color vision was thought to behave the way that color film did, in that color was a local phenomenon that resulted from spectral responses within each very small image segment. At that time, Land thought that the quanta catches of the triplet of retinal cones in a small retinal region generated color appearances.
An accidental observation, made by a colleague in a late-night experiment, changed everything Land "knew" about color. The colleague remarked that there was more color than expected from mixtures of photographic separations using red and white lights. Land responded: "Oh yes, that is adaptation." At 2 o'clock in the morning, Land sat up in bed, and said: "Adaptation, what adaptation?" He immediately returned to the lab to repeat the experiment. For the rest of his life, human color vision was a favorite research topic.
What was it that Land had seen, so briefly, that made him return to the lab in the middle of the night? Human trichromatic color theory and film have always been linked. When Thomas Young made his famous suggestion of human trichromacy in 1802, his colleague at the Royal Institution, Humphry Davy, was studying a black and white photographic system. Young was the editor of the Institution's journal that described Davy's work. 3 Young was well aware of silver halide's response to light.
That night, Land realized there was nothing he could do with a locally responsive silver-halide system to make film behave the way that vision did. The color appearances in those projections could not be understood from the quanta catches of receptors in a tiny local region. He realized that human color appearances are fundamentally different: spatial comparisons control color sensations.
This was a startling observation made by a man whose company was about to bet its future on instant color film. If vision had a different mechanism, what was it? How do humans process information from different parts of the visual spectrum?
Color constancy provided the important clue to the answer. Land's careful study of color appearance in three-color illumination led to the observation that spectral apparent lightnesses of an object in narrowband light were constant in variable amounts of illumination. The essential new idea was that spatial interaction of postreceptor neural processes depended on scene content, not the absolute amount of light. Film's color separations recorded and reproduced the relative amount of light. Vision used spatial image processing to calculate monochromatic lightness appearances of each spectral channel. Land replaced the spectral response of spots of light on the retina with the spatial comparisons of the entire retina for each spectral sensitivity. Land coined the word "Retinex" to describe the three independent spatial mechanisms that explain color constancy. 1 Color is the comparison of L, M, S Retinex monochromatic lightnesses.
Figure 1 illustrates the human visual pathway that begins with the visual pigments located in the distal tips of the cone and rod receptors in the retina (red ellipse). The quanta catch of these visual pigments initiates the spectral response to light. The receptors provide only the first response to the image on the retina. Appearance is the result of spatial processing along the entire visual pathway. John Dowling greatly expanded the work of Hecht and Wald by describing the complex retinal spatial interactions. 4 Berson 5 has recently shown spatial modulation from melanopsin photopigment in ganglion cells. In 1953, Kuffler 6 and Barlow 7 showed that retinal cells make spatial comparisons. Hubel and Wiesel 8 and DeValois and DeValois 9 found spatial comparison cells in the cortex. Zeki 10 found color constancy cells in V4 cortex. The dominant theme in research on the human visual pathway over the past 80 years has been the documentation of human spatial mechanisms at every stage along the visual pathway. Vision is a spatial process.

Vision Ratio-Making Sense
In 1974, Land wrote in his Friday Evening Discourse at the Royal Institution: "This Discourse is about a generally unrecognized animal sense, the ratio-making sense. It is the ratio-making sense which processes the radiation reaching our eyes in such a way as to discover the constant properties of objects in relation to the radiation falling on them." 11 Land put forward the idea that spatial comparisons, not receptor quanta catches, are the important stimuli for vision. Of course, quanta catches, as the first input step, play a role, but ratios of quanta catches play a much more fundamental role in synthesizing appearance. Perhaps Land's greatest contribution to vision research is the remarkable legacy of fascinating, simple but elegant, experiments. His "Red and White" projections and his "Color and Black & White Mondrians" changed the requirements of vision theories. Scenes required different mechanisms from quanta catch models. This paper will review Land's and others' experiments that help us understand humans' unique spatial vision.

Spatial Algorithms
The best description of the original spatial algorithms that calculated lightness is found in the original literature. Each of these articles describes important aspects of the model. In order to predict lightness in the "B&W Mondrian" and other test targets, the model varies the number and direction of paths. It includes a gradient threshold and a reset step that introduces normalization. Experiments showed that the reset step is the most interesting. Reset is key to the successful compression of HDR images. Frankle and McCann's 1983 patent 16 replaced paths with an array processor that calculated ratio, product, reset, and average using a multiresolution algorithm. This algorithm could calculate lightness predictions for a 512 × 512 image in seconds in 1980. This led to the algorithmic Zoom Processing 17 with O(N) computational efficiency. It is an extremely fast computational model and is even more efficient when combined with special purpose hardware. Sobol's modification 18 was incorporated into a line of commercial digital cameras. Review papers document the advances in the original Land and McCann Retinex theory and image processing algorithms over the past 50 years. 17,19-21 Figure 2 shows a map of papers and patents that incorporate the original ratio-threshold-product-reset algorithms. Reference 22 is a web page with links to the full text of those papers.
For a comprehensive review of "Land and McCann" algorithms and their implementation, see Ref. 21 (Chapter 32).
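The ratio, product, reset, and average steps named above can be illustrated with a short sketch. This is a simplified illustration in the spirit of the array-based approach, not the patented implementation: it assumes log10 radiances, four comparison directions, a sequence of power-of-two comparison distances in place of a true multiresolution pyramid, and it omits the gradient threshold. The function name and the parameter `iterations_per_scale` are hypothetical.

```python
import math

def ratio_product_reset_average(radiance, iterations_per_scale=4):
    """Return log10 lightness estimates (0.0 means the scene maximum, i.e. white)."""
    rows, cols = len(radiance), len(radiance[0])
    log_r = [[math.log10(v) for v in row] for row in radiance]
    maximum = max(max(row) for row in log_r)
    # The "old product" starts at the maximum everywhere.
    op = [[maximum] * cols for _ in range(rows)]
    # Largest power-of-two comparison distance that fits inside the image.
    shift = 1
    while shift * 2 < min(rows, cols):
        shift *= 2
    directions = [(0, 1), (1, 0), (0, -1), (-1, 0)]
    while shift >= 1:
        for _ in range(iterations_per_scale):
            for dr, dc in directions:
                new_op = [row[:] for row in op]
                for r in range(rows):
                    for c in range(cols):
                        sr, sc = r + dr * shift, c + dc * shift
                        if 0 <= sr < rows and 0 <= sc < cols:
                            # Ratio (log difference) times old product (log sum).
                            ip = op[sr][sc] + log_r[r][c] - log_r[sr][sc]
                            ip = min(ip, maximum)                 # reset
                            new_op[r][c] = 0.5 * (op[r][c] + ip)  # average
                op = new_op
        shift //= 2
    return [[v - maximum for v in row] for row in op]

image = [
    [10.0, 20.0, 40.0, 80.0],
    [5.0, 10.0, 20.0, 40.0],
    [80.0, 40.0, 20.0, 10.0],
    [1.0, 2.0, 4.0, 8.0],
]
lightness = ratio_product_reset_average(image)
# The same scene under seven times more uniform illumination.
brighter = [[7.0 * v for v in row] for row in image]
lightness_brighter = ratio_product_reset_average(brighter)
```

In this sketch the reset step pins the brightest pixels at 0.0, and multiplying every radiance by the same factor leaves the output unchanged, which is the normalization role the text attributes to reset.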

Two Distinct Parts: Model Vision and Make Reproductions
From the very beginning, the Retinex algorithm had two distinct, but related parts: Cameras require many improvements to mimic human vision, namely, cameras need to have color constancy and HDR scene compression. A successful model of spatial color vision can calculate color constancy in HDR scenes and write those sensations on LDR media. However, color photography research has shown that people prefer enhanced sensations over accurate reproductions, so color and tone-scale enhancements are needed to meet consumer preferences.
Over the past 5 decades of growth in digital imaging, there has been a parallel growth in spatial image processing.
This paper serves as a historical introduction to the "Retinex at 50" special section. It reviews the original vision experiments, updated to the present. In particular, it describes measurements of spatial vision to serve as ground truth for vision models.
1.5 Outline of the Paper
Section 1 (above) reviewed the early history and motivation of Retinex algorithms. As well, it provides an outline with links to the Land
[Fig. 3 caption, partial] The green circular paper on the left Mondrian has higher L illumination than the red circular paper on the right Mondrian. The illuminations were adjusted to make the L radiances from both circular papers equal. Nevertheless, the green circular paper looks dark, and the red circular paper looks light in L illumination. (c) In middle-wave (M) light, the green paper on the left looks light, and the red paper on the right looks darker despite equal M radiances. The green and red color appearances correlate with their different L and M lightnesses.
Section 3 describes Land's early exploration of appearance in HDR targets using his Black and White Mondrian experiment. This experiment led to Land and McCann's model of calculated lightness. It introduces the need for observer data to define the spatial properties of a model of lightness. It describes the use of observer data to understand the spatial processing of human vision, including appearance in HDR scenes influenced by intraocular glare.
Section 4 provides an introductory framework of additional Retinexes that have different goals, algorithms, and image processing properties.

Color Mondrians and Color Constancy
The Retinex algorithm began as a model of color vision. Its three independent (L, M, S) spatial color channels were needed to explain Land's Color Mondrian experiments. He adjusted the overall uniform illumination on each side so that the green paper in the left Mondrian and the red paper in the right had identical radiances. Appearance did not correlate with quanta catch. The expanded experiments showed that a single triplet of quanta catches can appear as any color, at any location in the Color Mondrian. 11,12 To understand how human vision does this, Land studied the Mondrians in each waveband. Figure 3(b) illustrates a portion of the two Mondrians in long-wave illumination. In Land's experiment, the circular green paper in the left Mondrian had the same radiance as the circular red paper in the right Mondrian. The green circle reflected a smaller percentage of long-wave light than the red circle. To make the left-green circle have the same long-wave radiance as the right-red circle, the L illumination on the left had to be increased. Figure 3(b) illustrates more long-wave illumination on the left Mondrian. Land recognized that a common, everyday phenomenon was happening here. We all have observed that when a cloud passes in front of the sun, we have less light falling on that scene. Nevertheless, the appearance of that scene changes only a small amount. Figure 3(b) illustrates a small darkening of all papers on the right caused by less illumination. The lightnesses of corresponding Mondrian papers in both Mondrians are nearly constant. In Land's experiment, the green circle appears dark, and the red circle appears light in long-wave illumination when they have identical radiances.
In Fig. 3(c), the green circle on the left Mondrian reflected more middle-wave light than the red circle on the right. In that case, the right Mondrian had increased middle-wave illumination. Again, increased uniform illumination of corresponding Mondrian papers makes very small increases in apparent lightness for all papers. Again, the lightnesses of all corresponding Mondrian papers in middle-wave light were nearly constant in variable illumination. The spatial relationships of the appearances of the two Mondrians were nearly constant. The green paper appeared lighter, and the red paper appeared darker in middle-wave illumination when they had identical radiances.
These observations explained to Land why vision has color constancy, while film does not. Color appearance correlates with the relative visual lightness in long-, middle-, and short-wave light. The Retinex is a theoretical independent channel that calculates the apparent monochromatic lightness of each image segment, for each spectral waveband. Color appearance correlates with three Retinex lightnesses (Fig. 4).
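The Color Mondrian argument above can be followed with a few lines of arithmetic. The sketch below is a simplification under stated assumptions: the (L, M, S) reflectances are hypothetical values rather than measured data, the illumination is treated as three ideal narrow bands, and a ratio to the channel maximum stands in for the full spatial comparison.

```python
import math

# Hypothetical (L, M, S) reflectances; illustrative values, not measured data.
PAPERS = {
    "white": (0.90, 0.90, 0.90),
    "green": (0.20, 0.60, 0.25),
    "red": (0.70, 0.15, 0.10),
    "gray": (0.40, 0.40, 0.40),
}

def radiances(illumination):
    """Radiance per waveband = reflectance x uniform illumination."""
    return {
        name: tuple(r * e for r, e in zip(refl, illumination))
        for name, refl in PAPERS.items()
    }

def lightness_relative_to_maximum(scene):
    """Stand-in for spatial comparison: ratio to the maximum in each waveband."""
    maxima = [max(rad[k] for rad in scene.values()) for k in range(3)]
    return {
        name: tuple(rad[k] / maxima[k] for k in range(3))
        for name, rad in scene.items()
    }

# Adjust the left illumination so the left green paper sends the same
# (L, M, S) radiances to the eye as the right red paper.
right_illum = (1.0, 1.0, 1.0)
left_illum = tuple(PAPERS["red"][k] / PAPERS["green"][k] for k in range(3))

left = radiances(left_illum)
right = radiances(right_illum)
left_lightness = lightness_relative_to_maximum(left)
right_lightness = lightness_relative_to_maximum(right)

same_radiance = all(
    math.isclose(left["green"][k], right["red"][k]) for k in range(3)
)
```

With these numbers, `same_radiance` is true, yet the left green paper keeps a low L lightness and a high M lightness while the right red paper shows the opposite pattern, and each paper has the same lightness triplet in both Mondrians.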

Quantitative Model of Color Constancy versus Observed Match Data
McCann et al. 14 measured color sensations in Color Mondrian color constancy experiments. The experiments used five sets of combinations of L, M, S narrowband illuminations. They showed that in uniform illumination, color sensations correlated with the paper's reflectance using cone spectral sensitivities. They designed a triplet of spectral filters (L-cone, M-cone, S-cone) that modified a telephotometer's spectral response to match that of human cone pigments. Using those three filters, they measured the relative cone quanta catches, and the cone reflectances of all of the papers used in the experiment. L-cone quanta catch was the L-cone-sensitivity meter reading from each paper in combined L, M, S illumination. Since cone-sensitivity spectra are so broad, each cone response includes some contribution from each L, M, S light. L-cone response is the sum of L light plus crosstalk contributions from M and S light. Cone reflectance values are the ratios of quanta catch values (each paper/white paper) in each combined L, M, S illumination. Cone reflectance values change with changes in the relative amounts of L, M, S illuminations.
McCann et al. 14 measured appearances (matches) and cone reflectances in five different illuminants. In all cases, color-constant appearances correlated with cone reflectance values for that illumination. In some cases, the change in illumination caused enough cone crosstalk to produce specific predicted departures from perfect constancy. Observer data correlated with the predicted departures. Apparent color constancy is limited by cone crosstalk. Apparent color constancy does not correlate with the surface reflectance of objects (measured with narrowband spectra), but rather with calculated L, M, S ratios of a paper's cone spectral response divided by a white paper's cone spectral responses.
Furthermore, McCann et al. 14 successfully modeled color sensations using the spatial algorithm described by Land and McCann. 12 This quantitative study provides important data on the limits of color constancy. It is an important set of ground-truth data for models of human color constancy.
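The definition of cone reflectance, and the way crosstalk makes it depend on the illuminant mixture, can be sketched numerically. The sensitivity matrix, paper reflectances, and illuminant intensities below are hypothetical values chosen only to illustrate the arithmetic; they are not the measurements of Ref. 14.

```python
import math

# Rows: L-, M-, S-cone. Columns: relative response to the L, M, S narrowband
# lights. Off-diagonal terms are crosstalk (hypothetical numbers).
CROSSTALK = [
    [1.00, 0.60, 0.05],
    [0.50, 1.00, 0.10],
    [0.01, 0.05, 1.00],
]
NO_CROSSTALK = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

def cone_quanta_catch(reflectance, illumination, sensitivity):
    """Each cone sums its response to all three narrowband lights."""
    return [
        sum(sensitivity[c][b] * illumination[b] * reflectance[b] for b in range(3))
        for c in range(3)
    ]

def cone_reflectance(paper, white, illumination, sensitivity):
    """Ratio of quanta catches: paper / white paper, per cone type."""
    p = cone_quanta_catch(paper, illumination, sensitivity)
    w = cone_quanta_catch(white, illumination, sensitivity)
    return [pc / wc for pc, wc in zip(p, w)]

red_paper = (0.70, 0.15, 0.10)      # narrowband reflectances (hypothetical)
white_paper = (0.90, 0.90, 0.90)
illuminant_1 = (1.0, 1.0, 1.0)
illuminant_2 = (1.0, 0.3, 1.0)      # less middle-wave light

ideal_1 = cone_reflectance(red_paper, white_paper, illuminant_1, NO_CROSSTALK)
ideal_2 = cone_reflectance(red_paper, white_paper, illuminant_2, NO_CROSSTALK)
real_1 = cone_reflectance(red_paper, white_paper, illuminant_1, CROSSTALK)
real_2 = cone_reflectance(red_paper, white_paper, illuminant_2, CROSSTALK)
```

Without crosstalk the two illuminants give the same cone reflectances, so a ratio-based model would predict perfect constancy. With crosstalk the L-cone reflectance of the red paper changes between the two illuminants, which is the kind of predicted departure from perfect constancy described above.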

Measurements of the Effects of Adaptation in Color Constancy
Additional color matching experiments showed that receptor adaptation cannot explain color appearance [see Ref. 21 (Chapter 27)]. These Color Mondrian experiments modified the surround to compensate for changes in scene averages caused by adjustments in overall illumination. Not only did the different color samples have constant radiances but also they had constant average scene radiances. Receptor adaptation cannot account for these color constancy experiments. As well, Grayworld and von Kries normalization cannot account for human color constancy.

Switching Color Constancy "OFF" and "ON"
Another experiment shut off color constancy in a complex scene. As proposed by Vadim Maximov, the experiment made two sets of papers with correlated reflectances, shifted in color space. The experiment used illumination with spectra that shifted the combined radiances to be identical. This complex scene made by the combination of reflectances and illuminations creates two displays with identical quanta catch. Identical quanta catches over the entire field of view generated identical sensations. Even though we should expect color constancy in a complex scene, these two complex displays shut constancy off. In principle, it is easy to do (Fig. 5). Imagine two Maximov shoeboxes: one for the upper Tatami and one for the lower. Select two filters that attenuate the color spectra but do not reduce the light at any wavelength to zero.
The experiment used Wratten Color Correction filters: CC40R and CC40C. These filters have different effects on appearances depending on how they are used. When the filters are viewed side-by-side on a lightbox, they appear as high-chroma red and cyan areas, surrounded by the lightbox white. They look like high-chroma papers. When the 40R filter is held close to one eye, the appearance of the room has a pale pink cast. Replacing 40R with 40C makes the color cast cyan. The room colors are almost constant. When viewed side-by-side, they are highly colored, but in a color constancy experiment, they generate small changes in appearance.

Two complex scenes with identical quanta catch
The experiment demands pairs of colored papers that have color differences equal to that of the Wratten Filters. Papers with such demanding specifications had to be manufactured to fit the measurements. Digital control of local printed areas was not generally available in 1990. McCann used an early digital xerographic Canon CLC 500 printer to make two Tatami with identical colorimetric shifts for all pairs of corresponding papers. The colored papers in A are shifted by the same amount in CIEXYZ space. The amount of the shift is equal and opposite to the shift caused by changing from a Wratten 40R to a Wratten 40C.
The experiment was to compare the color appearances in the two shoe boxes. One (Fig. 5, top) illustrates Tatami A with five colors that were shifted away from red in CC40C illumination; the other (Fig. 5, bottom) illustrates five colors shifted toward red in CC40R illumination. The papers were carefully manufactured to have the exact opposite shift in chromaticity as that caused by the filters.
Ordinarily, illumination has little or no noticeable effect. When we viewed the two Tatami side-by-side on a table in a room, there was very little change in appearance alternating the two filters.
When viewed in the Maximov Shoeboxes, the different sets of reflectances, in different illuminations, changed in appearance from looking different, to looking the same. Tatami A looked the same as Tatami B (Fig. 5, right). The result was that the color constancy mechanism for complex images was shut off using this pair of Maximov Shoeboxes. Despite the fact that the reflectances were different, the color appearances were the same.
Why did Maximov's boxes turn off color constancy? The answer is that both Tatami have to look identical because every pixel in their entire fields of view had identical cone quanta catches. The sets of papers were made to shift the entire image as much as the filters did. When viewed in isolation, the quanta catches for both were the same, everywhere in the field of view. Whenever two images have identical quanta catches everywhere, they look the same. It was a challenge to find a set of papers that all shifted the same amount. The reward for this control experiment was shutting off color constancy. Aw and Bw should no longer match in the Shoeboxes. If this is true, then it shows that color constancy is the result of spatial comparisons.

New maxima restore constancy
If the fundamental determinant of color appearance is the quanta catch at a pixel, then the small white frame should have only a small effect on appearance. Except for whites, every other pixel in the field of view is identical in Tatami A and Aw as well as B and Bw. Consider the change in appearance caused by the new whites in Aw and Bw (Fig. 6, right), compared to Tatami A and B (Fig. 5, right). Introducing white reflectances in different spectral illuminations in both Tatami revived color constancy.
Two careful observations are important here:
• First, the whites in Aw and Bw do not look exactly the same. Aw looks reddish in the CC40R box and coolish in the CC40C box. The influence of the illuminant shift is visible.
• Second, the two sets of five original papers look almost the same as they do in the room.
The whites still have a reddish, or coolish, cast depending on the illumination.
Nevertheless, the striking conclusion is that the introduction of white to both displays brought color constancy back to this complex scene. 23 Extended experiments showed that any new maximum in any of the L, M, S cone responses turned constancy back on. 24
These results support the early Retinex mechanisms using calculations that reset to the maxima in each waveband. 14 As well, observers noted the changes in color appearance of the white papers. That observation supports the hypothesis that small appearance changes are due to changes of overall quanta catches [Refs. 21 (Chapter 21), 23,24].
The changes in color appearances are consistent with the colors expected by normalizing each receptor set independently to a maximum reference. In other words, the colors observed are consistent with the Retinex Color Theory. Observers matched two chromatic and one achromatic samples in all illuminants. Observers reported that the achromatic paper was nearly constant in all spectral illuminants. However, the chromatic samples showed a small but distinctive shift in appearance matches to the Munsell Book. That signature shift correlates with changes in spatial edge ratios due to the overlap in spectral sensitivity of cone photopigments. 14 That signature was distinctly different from predictions made by an incomplete adaptation model. 27
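The logic of switching constancy off and back on can be sketched with three schematic wavebands. Everything numeric here is an assumption for illustration: the two filter transmittances are hypothetical stand-ins, not the CC40R and CC40C spectra, and normalizing each waveband to its maximum is used as a simple proxy for the reset mechanism.

```python
import math

BANDS = 3  # schematic long-, middle-, and short-wave bands

def radiances(papers, filt):
    """Radiance reaching the eye = paper reflectance x filter transmittance."""
    return [tuple(r * t for r, t in zip(paper, filt)) for paper in papers]

def normalize_to_maxima(display):
    """Divide each waveband by its maximum over the whole display."""
    maxima = [max(px[k] for px in display) for k in range(BANDS)]
    return [tuple(px[k] / maxima[k] for k in range(BANDS)) for px in display]

# Hypothetical filter transmittances (one reddish, one cyanish).
filter_1 = (1.0, 0.4, 0.4)
filter_2 = (0.4, 1.0, 1.0)

# Five hypothetical base papers; each set is shifted by the opposite filter.
base = [(0.8, 0.3, 0.2), (0.2, 0.7, 0.3), (0.3, 0.3, 0.8),
        (0.6, 0.6, 0.2), (0.5, 0.5, 0.5)]
papers_1 = [tuple(b * t for b, t in zip(p, filter_2)) for p in base]
papers_2 = [tuple(b * t for b, t in zip(p, filter_1)) for p in base]

# Without white: the two displays are identical everywhere.
display_1 = radiances(papers_1, filter_1)
display_2 = radiances(papers_2, filter_2)
identical = all(
    math.isclose(a[k], b[k])
    for a, b in zip(display_1, display_2) for k in range(BANDS)
)

# With the same white paper added to both displays: new maxima appear.
white = (0.9, 0.9, 0.9)
norm_1 = normalize_to_maxima(radiances(papers_1 + [white], filter_1))
norm_2 = normalize_to_maxima(radiances(papers_2 + [white], filter_2))
```

In this sketch `identical` is true, so any model driven by the displays alone must give the two sets the same output. Once the white is added, the normalized values of corresponding papers differ and track each paper's reflectance relative to white, consistent with the restored constancy described above.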

Color Mondrians in Illumination with Edges
All of the Color Constancy experiments described above used flat Mondrians in uniform illuminations. The Mondrian used in Ref. 14 is shown in Fig. 7(a). Recent experiments 28 measured appearances in nonuniform illuminations that had sharp shadows, which created edges in illumination. Human visual appearance mechanisms treat edges in illumination the same way they treat edges in reflectance. Observers were asked to quantify the degree of color constancy in more real-life illuminations. Figure 7(b) used an integrating illumination box (LDR illumination) that attempted to make uniform illumination. Observers reported that many facets with the same paint appeared nearly constant. Other facets with that paint did not. Color appearance correlates with the edges in the retinal image, not with the reflectance of each painted surface. 28
Carinna Parraman made a unique contribution. She painted the appearance of the two 3-D Mondrians in watercolors. She made two paintings by painstakingly reproducing the appearance of each facet (matching its sensation). The watercolor paintings were made using uniform illumination on the watercolor paper. Figure 8(a) shows her painting of the 3-D Mondrian in LDR illumination, and Fig. 8(b) shows the 3-D Mondrian in HDR illumination. She quantified her matching sensations of each scene segment by painting it and then measured sensations by measuring the reflectance of the watercolor painting. 28 Although tedious and demanding great skill in painting, this is an important advance in measuring appearance of HDR scenes. Parraman matched the entire complex scene with watercolor paints. When she measured the reflectance of each individual facet, she converted her sensations to a ground truth color value for each facet. A successful model of vision must predict these painted apparent reflectance sensation values for each facet.
In summary, the 3-D Mondrian experiments measured the limits of color constancy. While departures from ideal (perfect) color constancy are very small in uniform illumination, constancy erodes with the increase of spatial structure in illumination. Color sensations of identical surface reflectances change in real-world illumination. Edges in illumination are processed in the same manner as edges in reflectance. Cone quanta catch cannot discriminate between radiances modified by reflectance and radiances modified by illumination.

Summary: A Model of Human Color Vision
The body of work in Sec. 2 using Color Mondrians provides an extensive dataset for ground truth information for Color Constancy models. The experiments provide observer data for models of human vision that include: In retrospect, these quantitative data on the limits of observer color constancy are very important. One cannot just assume perfect color constancy when modeling human vision. Color sensations do not correlate with surface reflectances in complex natural scenes. That model needs to account for the fact that color constancy varies with scene content. Edges in illumination have the same visual impact as edges in reflectance. Universally effective spatial algorithms must mimic human spatial mechanisms. After all, reproductions are made solely for human viewing.

Black and White Mondrians: Lightness Constancy
When Land realized that human vision was a spatial mechanism, he approached image reproduction in a new way. He thought that reproduction of real scenes must incorporate a spatial model of vision. 29 The idea evolved to the sequence of capturing scene information; then, spatial processing to calculate visual sensation; then, writing sensations on film [Refs. 21 (Chapter 32), 16,30,31].
In 1968, Land and McCann extended Retinex Theory to include nonuniform illumination using the Black and White Mondrian experiment. 12 Here, gradients of illumination made near-white and near-black papers have the same retinal luminance (Fig. 9). Despite equal cone quanta catches, the white paper looked white and the black paper looked black. The Retinex lightness algorithm added thresholds and reset normalization to its spatial comparison mechanism. Spatial comparisons successfully modeled sensations. Land's Black and White Mondrian was the first quantitative study of appearance in high-dynamic range (HDR) imaging. It used a range of illuminations falling on the scene that was equal to the range of reflectances of objects in the scene. It asked observers the sensation question, namely, "What is the appearance of the papers?"
The Black and White Mondrian also points out a serious concern. One can never just look at a picture to evaluate the success of a computational algorithm's output. Algorithm analysis requires study of the output numerical values. When we look at an output image (Visual Inspection), human spatial image processing transforms radiance information into sensations. Since radiance does not correlate with appearance, a pixel's appearance tells you nothing about the numerical content of the output image. One cannot evaluate the computational success, or failure, of an algorithm by inspecting a processed image on a display. Human observations, while inspecting the display image, add vision's own spatial transformations. 14 Obviously, one has to use human observers to measure observer preferences for the most desirable camera images, but the evaluation of computational imaging requires an actual analysis of the numerical output values, without human signal processing.
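The role of the threshold and reset in the Black and White Mondrian can be illustrated in one dimension. The sketch below is a toy version under stated assumptions: three hypothetical papers along a single line, a smooth exponential illumination gradient, one out-and-back path instead of the many paths of the original model, and a hypothetical threshold of 0.02 log units. Values are reported from the return leg, after the reset has located the maximum.

```python
import math

def path_lightness(luminance, threshold):
    """Ratio-threshold-product-reset along an out-and-back path (log10 units)."""
    n = len(luminance)
    log_l = [math.log10(v) for v in luminance]
    order = list(range(n)) + list(range(n - 2, -1, -1))  # out, then back
    acc = 0.0
    result = [0.0] * n
    prev = order[0]
    for i in order[1:]:
        step = log_l[i] - log_l[prev]   # ratio of adjacent pixels
        if abs(step) < threshold:       # threshold removes slow gradients
            step = 0.0
        acc = min(acc + step, 0.0)      # product, then reset at a new maximum
        result[i] = acc                 # keep the most recent visit
        prev = i
    return result

# Black, gray, and white papers (hypothetical reflectances), 50 pixels each.
reflectance = [0.05] * 50 + [0.40] * 50 + [0.90] * 50
# Illumination falls smoothly so that the black paper's center (pixel 25)
# and the white paper's center (pixel 125) have equal luminance.
slope = math.log10(0.90 / 0.05) / 100
illumination = [10 ** (-slope * i) for i in range(150)]
luminance = [r * e for r, e in zip(reflectance, illumination)]

without_threshold = path_lightness(luminance, threshold=0.0)
with_threshold = path_lightness(luminance, threshold=0.02)
```

Without the threshold the output simply follows log luminance, so pixels 25 and 125 receive the same value. With the threshold the white paper is assigned 0.0 and the black paper roughly -1.23, close to the log reflectance ratio of the two papers, because each edge ratio still carries one pixel of gradient.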

Extending Measurements of Appearance
One of Edwin Land's greatest talents was his unique ability to think of critical experiments. His experiments tested the fundamental principles of a hypothesis or theory. As described above, Land used Color and Black and White Mondrian experiments as an exploration of the imaging properties of vision. These simple combinations of measurements of reflectance, illumination, and human sensations made an essential contribution to our thinking about appearance.
Can we add to Land's experiments with additional tests, which inform us about the fundamental mechanisms of vision and provide additional ground truths for our models? Can we use the quantitative measurements of human responses to scenes to better test our models?

Surrounds and averages
What are the important properties of an image's digital content? Should we look to image averages, contrast ranges, histograms, or other metrics of scene content?
Following the modeling protocol described in the 1960s, 14 McCann et al. measured the appearances of lightnesses using many types of scene contents. This set of targets included variations in reflectances, uniform and gradient illuminations, and visual phenomena in order to study vision's spatial properties. An essential part of this study was to include test targets in which appearances did not correlate with reflectances. Figure 10 shows a series of 15 black-and-white test targets used to evaluate lightness models. The targets were transparencies with a dynamic range of 1000:1 and an angular subtense of 30 × 25 deg. The targets included variations in scene average luminance, gradients in illumination, variations of simultaneous contrast, extremes in background, and combinations of edges and gradients. The entire scene of calibrated luminances was the input to each spatial vision model. Observers matched the lightness of all the areas in all targets. Models of appearance calculated sensations using scene radiances as input. The results compared calculated sensations for all image segments with corresponding observer matches. Observer inspection of processed images and observer preferences were not part of the evaluation. The results showed that all these design parameters shown in Fig. 10

Spatial relationships versus image statistics
Robert Savoy made a set of six targets using identical histograms, namely, he used constant areas of a dark gray test patch, and constant areas of maximum luminance (white) and minimum luminance (black) surrounds in a dark room. Figure 11 (top) shows the spatial arrangement of six scenes made from identical pixel populations. 32 The 30 deg × 25 deg displays had a constant 2.5 deg dark-gray square at the center. The background around test area T was constant (0.1% transmission), with the exception of the addition of a fixed number of maximum luminance pixels (1.0% transmission) in a variety of spatial arrangements. Figure 11 (middle row) shows the measurements of the variable appearance of test area T from identical pixel populations. The same pixel populations are just rearranged in their spatial locations. All six targets had the same-size constant luminance central square area, labeled T.
In Fig. 11 (left target), all the maximum radiance pixels surround the test square. Observers matched T to Lightness 1.5, nearly black.
In Fig. 11 (right target), all the maximum radiance pixels are adjacent to the test square on only one side. Observers matched the test square to Lightness 3.9, near to middle gray (Lightness 5.0). Other spatial arrangements gave intermediate matches. Despite identical histograms, lightness varied over 30% of the range from white to black when viewed separately.
The set of six targets has different spatial positions of maximum luminance pixels and different adjacent stimuli. Asymmetry, contiguity, and enclosure are important. There is no simple rule that explains this spatial data. The only direct conclusion is that neither scene averages (Grayworld) nor the population of luminances (histogram) controls appearance.
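The point that identical pixel populations can hide very different spatial arrangements can be illustrated with a small sketch. It is not a reconstruction of Savoy's targets: the 5 × 5 grid, the digit values, and the helper names (`make_target`, `histogram`, `mean_luminance`, `white_touching_test`) are all hypothetical. It only shows that two scenes can agree in histogram and in average (Grayworld) while differing in how many white pixels touch the test patch.

```python
from collections import Counter

# Hypothetical digit values standing in for the three luminances in a target.
WHITE, GRAY, BLACK = 1000, 50, 1
SIZE = 5
CENTER = (SIZE // 2, SIZE // 2)


def make_target(white_cells):
    """Black background, gray test patch at the center, white at the given cells."""
    scene = [[BLACK] * SIZE for _ in range(SIZE)]
    scene[CENTER[0]][CENTER[1]] = GRAY
    for row, col in white_cells:
        scene[row][col] = WHITE
    return scene


def histogram(scene):
    """Pixel population: how many pixels carry each value, ignoring position."""
    return Counter(value for row in scene for value in row)


def mean_luminance(scene):
    """Scene average, the quantity a Grayworld rule would use."""
    return sum(sum(row) for row in scene) / (SIZE * SIZE)


def white_touching_test(scene):
    """Number of white pixels among the eight neighbors of the test patch."""
    r0, c0 = CENTER
    return sum(
        scene[r0 + dr][c0 + dc] == WHITE
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )


# Eight white pixels enclosing the test patch ...
enclosed = make_target(
    [(r, c) for r in (1, 2, 3) for c in (1, 2, 3) if (r, c) != CENTER]
)
# ... versus the same eight white pixels gathered on one side of it.
one_side = make_target([(r, 3) for r in range(SIZE)] + [(r, 4) for r in (1, 2, 3)])

print(histogram(enclosed) == histogram(one_side))
print(mean_luminance(enclosed) == mean_luminance(one_side))
print(white_touching_test(enclosed), white_touching_test(one_side))
```

Any statistic computed from `histogram` or `mean_luminance` alone returns the same number for both scenes, so it cannot follow the different observer matches; only a spatial descriptor such as `white_touching_test` separates them.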

Local image statistics
There are a number of studies that provide a challenge to models of vision using local statistics. One study measures the appearance of a central gray square with eight surround squares. 33 Half of the surrounding squares are white, the other half black. The experiment measures the sensations of the central gray in all the combinations of spatial arrangements. Figure 12 is a plot of segment pattern versus log matching luminance (LML). The graph plots the eight-white elements; all 14 patterns with 4 white and 4 black elements; and 8-black elements in the surround. 33 They are sorted from left to right in order of increasing average LML. The two lowest LML values are from all-white, and 0 of 4 adjacent blacks. The next two patterns have one adjacent black, and the following seven LML values have two adjacent blacks. The next four LML values have three adjacent blacks, with more variability than previous patterns. The highest matching luminance is for the eight black squares.
Contrast is the psychophysical term used to describe the observation that a gray test area looks lighter when adjacent to black areas. The range of contrast effect from all-white to all-black surrounds is identified with small images of them on the vertical axis (Fig. 12). When we varied the eight half-white and half-black surrounding areas, we measured matching luminances that nearly covered the entire contrast range.
The adjacent segments have more influence than the diagonal segments on matching luminance. The data from the 14 test targets with 4-white and 4-black elements correlate with the number and location of gray-black edges/gray-white edges. 33 Those data do not correlate with the constant average luminance of the surround (Grayworld) and the constant pixel-luminance histogram of the test target.
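A minimal sketch of the distinction between adjacent and diagonal surround segments follows. The 3 × 3 string encoding and the function name `count_black_segments` are assumptions made for illustration; they are not taken from the cited study.

```python
# Positions in a 3 x 3 layout whose center (1, 1) is the gray test square.
ADJACENT = [(0, 1), (1, 0), (1, 2), (2, 1)]  # share an edge with the center
DIAGONAL = [(0, 0), (0, 2), (2, 0), (2, 2)]  # touch the center only at a corner


def count_black_segments(pattern):
    """Return (adjacent_blacks, diagonal_blacks) for a 3 x 3 pattern.

    pattern is three strings of 'W' (white), 'B' (black), with 'G' (gray)
    at the center, e.g. ["BWB", "WGW", "BWB"].
    """
    adjacent = sum(pattern[r][c] == "B" for r, c in ADJACENT)
    diagonal = sum(pattern[r][c] == "B" for r, c in DIAGONAL)
    return adjacent, diagonal


# Two surrounds with the same average (4 white, 4 black) but different edges.
corners_black = ["BWB", "WGW", "BWB"]
edges_black = ["WBW", "BGB", "WBW"]

print(count_black_segments(corners_black))
print(count_black_segments(edges_black))
```

Both patterns have the same surround average and the same histogram, yet the first has no gray-black edge at the test square and the second has four, which is the kind of difference the matching data are reported to follow.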
All of these detailed studies [Ref. 21 (Chapters 20 to 25)] point out that the spatial organization of boundaries is in control of sensations. Lightness appearance correlates with:
• spatial comparisons at edges;
• the direction of the spatial comparison;
• the enclosure by areas of higher luminance;
• the angular subtend of areas;
• and the separation from local maxima.
Scene statistics can neither account for observer matches nor model appearance.

Retinal Contrast
Simultaneous contrast is the familiar demonstration that surrounds affect appearance. Figure 13 illustrates the test target. This simple experiment uses two identical gray papers on white and black surrounds. Observers report that gray-on-white appears darker than the same gray-on-black. What makes the experiment more interesting is the fact that the retinal stimulus of the apparently darker square is higher than the other. When we consider intraocular glare, the white surround scatters light into its gray square, yet it looks darker. Why does more light look darker? Two powerful spatial mechanisms, "intraocular glare" and post-quanta-catch "neural contrast," tend to cancel each other. Neural contrast is slightly stronger than "glare" for this target. It overcompensates glare, making the gray-on-white darker.
The effects of intraocular glare are hard to see, except in severe clinical cases. Nevertheless, glare limits the range of light that reaches our retinas. Depending on the scene, the amount of glare can vary from very small to very large. A scene composed of just stars at night has little glare, while a beach scene will have an extremely low range of light on the retina. Despite this limit on the range of light on the retina, observers report that they see the richest, deepest blacks under high-average luminance and high glare conditions.
A set of HDR test targets with almost 6 log units of dynamic range was used to study the role of intraocular scatter [Ref. 21 (Chapters 14 to 19)]. The test targets have different backgrounds covering maximal to minimal glare. The target with half-white and half-black surround is shown in Fig. 14. Using Vos and van den Berg's Glare Spread Function, 34 it is possible to calculate the radiance image on the retina. The dynamic range of its retinal image is 2.0 log units. Depending on the content of the surround, the dynamic range of the retinal image changes from 1.5 to 4.0 log units.
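The dependence of retinal dynamic range on scene content can be sketched with a deliberately crude stand-in for glare. The code below does not implement Vos and van den Berg's Glare Spread Function, which depends on angular distance; it assumes instead a uniform veil proportional to the scene average. The glare fraction `G`, the two toy scenes, and the function names are hypothetical.

```python
import math


def retinal_image(scene, glare_fraction):
    """Add a uniform veil proportional to the scene mean.

    This is an assumed simplification, NOT the Vos and van den Berg GSF.
    """
    veil = glare_fraction * sum(scene) / len(scene)
    return [(1.0 - glare_fraction) * value + veil for value in scene]


def log_range(values):
    """Dynamic range in log10 units."""
    return math.log10(max(values) / min(values))


G = 0.01  # hypothetical fraction of light redistributed as glare

# Two toy scenes that both span 6 log units in the scene itself.
mostly_white = [1e6] * 90 + [1.0] * 10  # large bright surround, maximal glare
mostly_black = [1e6] * 1 + [1.0] * 99   # small bright area, minimal glare

for name, scene in (("mostly white", mostly_white), ("mostly black", mostly_black)):
    print(name, round(log_range(scene), 2), round(log_range(retinal_image(scene, G)), 2))
```

Even in this simplified form, the scene with more bright area ends up with the smaller range on the simulated retina, which is the qualitative behavior described above; the actual numbers depend on the real spread function and should not be read from this sketch.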
Young observers, with low levels of intraocular glare, were asked to make magnitude estimates of appearances of test areas in Fig. 15. Given the endpoints of sensations (White = 100, and Black = 1), the observers estimated the appearance of 40 gray squares, in 20 pairs of squares. The vertical axis in Fig. 15 is the magnitude estimates of lightness. The plots of the retinal response functions (retinal luminance versus lightness appearance) show markedly different functions depending on scene content.
The envelope of visual response functions is measured by these experiments. There is no single visual response function to light. The response varies with the specific scene content.
Intraocular glare causes large changes in the dynamic range of light on the retina as the result of scene content. This is illustrated in Fig. 16. The first powerful spatial process is optical. Glare from all parts of the scene reduces the retinal light range of a beach scene to very low levels. Nevertheless, apparent contrast is highest when retinal range is lowest. The second powerful spatial process is neural; it is performed by post-quanta-catch spatial processes.
The combination of these two processes is a cancelation of scene-dependent glare by scene-dependent neural contrast. The first spatial mechanism introduces substantial changes to the optical image, and the second mechanism transforms the neural response. Remarkably, the resulting sensations minimize the effects of intraocular glare. They show only small residual differences in appearance. Objects appear more constant because of the powerful post-quanta-catch neural processing.

Summary: Observer Data Defines a Model of Spatial Vision
The ensemble of Lightness experiments reviewed in Sec. 3 measures important properties of human vision. This ensemble reveals vision's unique pair of (optical and neural) spatial-image-processing mechanisms. Section 2 documents the need for three independent color Retinex channels, each with spatial lightness rendering.
To understand, and improve, our image reproduction algorithms, we must understand how human vision processes our reproductions. If a reproduction has to reproduce what we see in all scenes, then that process must have a sophisticated model of human spatial vision.

Retinex Scene Reproduction Algorithms
Land initiated the idea that we needed a model of spatial vision to make better reproductions. That model needed to capture the wide range of scene radiances as input, spatially compare them to calculate sensations, and then display them. 16,30,31 The Land and McCann Retinex Reproduction Algorithm has four ideas as its foundation. They are analogous to the four legs of a table.
1. Retinex is a model of human vision. The idea was to make better reproductions by incorporating an algorithm that mimicked vision. The first leg was extensive measurements of appearance in a wide variety of scenes in which appearances did not correlate with luminances [Refs. 13 and 21 (Chapter 35)].
2. The Land and McCann (L&M) Reset: As described in Ref. 12, it was an accident. The Retinex analog electronic circuit had a reset introduced by the electronics that acted to normalize the output. 35 When we modeled reset's properties, we learned that it acted to normalize the different values reported on different paths. We found empirically that reset and the right path length were all the parameters needed to model all of our difficult appearance test targets. We also found that the threshold, a logical operation designed to remove gradients, did not mimic vision (McCann). 17,20 Using the Land and McCann Reset, we learned that we could successfully mimic vision. We used that data to set the parameters of our reproduction model.
3. Computational efficiency: In the 1970s, any attempt to perform electronic image processing had to be extremely efficient. By 1975, we abandoned 1-D paths and moved to experimenting with 2-D array processors to implement our algorithms using 512 × 512 arrays. 16 The L&M Reset was extremely efficient as a design feature. The Zoom Multiresolution implementation 17 is O(N) in big-O notation.
4. Sensation versus perception: In 1980, at the AIC conference in Cambridge, United Kingdom, L&M Retinex made a major clarification of our language about the Retinex model. We turned to the JOSA definitions of Sensation and Perception. 36 We wanted to differentiate our bottom-up model (sensation) from Helmholtz's idea of discounting the illumination to recognize surface reflectance. Using the OSA definitions, perception implies recognition, implies top-down lightness generation, implies Helmholtz, not Land.
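The role of reset along a path can be sketched in a few lines. This is a minimal, assumed form of a ratio-product-reset calculation on a single 1-D path: it omits the threshold (which the text reports did not mimic vision), it does not average over many paths, and it is not the 2-D array or Zoom Multiresolution implementation. The function name `path_products` and the sample luminances are hypothetical.

```python
import math


def path_products(luminances):
    """Accumulate edge ratios along one path, in log10 units, with reset.

    The running product starts at 0 (log of 1), i.e. the starting area is
    provisionally treated as the maximum. Whenever the product would exceed
    0, it is reset to 0, which renormalizes the path to the new maximum.
    """
    logs = [math.log10(value) for value in luminances]
    product = 0.0
    outputs = [product]
    for previous, current in zip(logs, logs[1:]):
        product += current - previous  # ratio at this step, as a log sum
        if product > 0.0:              # reset: nothing is lighter than the maximum
            product = 0.0
        outputs.append(product)
    return outputs


# Hypothetical luminances met along a path: the path starts in a dim area,
# reaches the maximum at the second position, then moves on.
path = [10.0, 100.0, 50.0, 100.0, 5.0]
print([round(value, 3) for value in path_products(path)])
```

After the reset at the second position, every later output is expressed relative to the maximum met so far, regardless of where the path started; in this sketch that is the sense in which reset normalizes the different values reported on different paths.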
That lecture 36
An important additional problem is that the spatial algorithm that mimics vision resides in the middle of the scene-reproduction processing pipeline. Assuming that the model successfully calculates sensations, we still have the practical problem of transforming that 2-D array of sensations into the appropriate signal for the reproduction media device. The print or display device needs an image that is calibrated for its conversion process from digits to light, viewed by the observer. That postspatial process also requires chroma and tone-scale enhancements to suit consumers' preferences. 16
Unfortunately, it can be much more convenient to take a shortcut. If the goal is simply to make a better scene reproduction, one can take a photograph of a scene, apply a spatial algorithm, and send that processed image to the output device. This shortcut removes two tedious tasks:
• First, it omits camera calibration to capture accurate radiance information.
• Second, it replaces the task of matching sensations with just asking the observer to evaluate the output. Which image looks best? Or, does the image appear to have the desired improvement?
Many authors have used this approach. There is no doubt that their algorithms have made improved renditions of the images that they selected. But, are these algorithms successful models of vision? Do these algorithms provide a general solution to the problems of scene reproduction? Or, are they simply singular examples of trial-and-error image manipulations?
The biggest problem with the visual inspection technique is that it does not include a discussion of the role of human vision in the algorithm's evaluation process. If vision is a powerful spatial image processing mechanism, then what are the specific effects of using vision to measure success? Looking at the algorithm's output image means that the observer is applying those same spatial image processing mechanisms a second time in looking at the experiment.
It is a mistake to use observer preferences to evaluate the accuracy of a model of vision. It fails to separate the model's spatial processing from subsequent human spatial processing. Is the human processing the source of the improvement, rather than the digital algorithmic processing?

Additional Retinex Algorithms
In The Art and Science of HDR Imaging, Section F: HDR Image Processing, 21 McCann and Rizzi attempt to discriminate between all the different Retinexes and related algorithms. That description took about 100 pages to cover the history and make clear distinctions between algorithms.
McCann and Rizzi defined and differentiated the following: Land

Discrimination between Spatial Algorithms
The dual challenge of Retinex continues today. How do we model human vision? How do we make better reproductions using that model? The answer to that challenge will be determined by the ground-truth data that we decide are important in evaluating images.
The quality of ground-truth selection will determine the quality of the algorithms. We need to get beyond simple evaluation principles of observer preference, color balance, and HDR compression. The Retinex approach studied human vision to understand its mechanisms. By thoughtfully collecting sets of difficult scene content, we can improve our ability to discriminate between moderately successful algorithms for some scenes and excellent algorithms for all scenes. In recent decades, the number and diversity of spatial algorithms have expanded dramatically. However, visual inspection of processed images lacks the discrimination to identify superior algorithms.
The original Retinex process used measured sensations created by a collection of challenging scene content: color constancy, gradients in illumination, constant spatial statistics, and illumination with edges. Each of these scenes provides a different challenge for a model of human vision. A successful model of vision should be able to predict observer matches in all these scene contents.
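Using observer matches as ground truth amounts to comparing calculated sensations with matched lightnesses, target by target, rather than inspecting the output image. A minimal sketch of such a comparison follows; the target names and all numbers are made-up placeholders, not data from any of the cited experiments, and `rms_error` and `evaluate` are hypothetical helper names.

```python
import math


def rms_error(predicted, matched):
    """Root-mean-square difference between model output and observer matches."""
    if len(predicted) != len(matched):
        raise ValueError("each area needs one prediction and one match")
    squared = [(p - m) ** 2 for p, m in zip(predicted, matched)]
    return math.sqrt(sum(squared) / len(squared))


def evaluate(model_output, observer_matches):
    """Return per-target errors and the name of the worst-predicted target."""
    errors = {
        name: rms_error(model_output[name], observer_matches[name])
        for name in observer_matches
    }
    worst = max(errors, key=errors.get)
    return errors, worst


# Placeholder lightness values, invented only to exercise the functions.
observer_matches = {"gradient": [9.0, 6.5, 3.0], "contrast": [5.5, 4.5]}
model_output = {"gradient": [8.8, 6.4, 3.3], "contrast": [5.0, 5.0]}

errors, worst = evaluate(model_output, observer_matches)
print(errors, worst)
```

Reporting the worst target as well as the per-target errors reflects the requirement stated above: a model has to predict matches in all scene contents, so a good average across easy targets is not enough.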

Retinex Falls between Colorimetry and Perception of the Surface of Objects
Colorimetry makes the unstated assumption that spatial processes are absent from vision. While everyone agrees that quanta catch is necessary in a model of vision, no one should argue that it is sufficient. Human color vision is a spatial process.
There is an equally bad underlying "perception" assumption, namely, that "Objects Appear Constant" in all complex scenes. Here, the pendulum has swung to the opposite extreme. The underlying assumption is that a surface's reflectance controls its appearance. Unfortunately, many authors mistakenly cite Land's experiments as evidence for this idea. Some even cite Land's experiments as evidence that spatial image processing can "discount the illumination," so as to separate illumination from reflectance. Retinex does not do that. That notion is incompatible with Land's writings:
• The last sentence in Land's Ives Medal Address: "the function of retinex theory is to tell how the eye can ascertain reflectance in a field in which the illumination is unknowable and the reflectance is unknown." 12
• In the discussion of the "biological correlate of reflectance" [Refs. 11, 13, 15, 21 (Chapter 32)], Land cited many examples of test stimuli in which lightness did not correlate with physical reflectance.
Just as we cannot think that cone quanta catch alone can predict color, we cannot think that all objects always appear constant. Both the "Colorimetry" and the "Objects Appear Constant" assumptions are incompatible with accurate measurements of vision.
On the left side, the red patches fall on top of the yellow stripes; and on the right side, they fall on blue stripes. The left patches appear a purple red, while the right ones appear a yellow orange. In other words, the left patches appear more blue, and the right ones more yellow. 39 In Fig. 17 (bottom), the apparent lightnesses of the sets of red squares are different:
• In the L separation, the squares are lighter on the right;
• In the M separation, these squares are lighter on the right;
• In the S separation, the squares are darker on the right.
Land's Retinex Theory predicts that whenever L and M separations are lighter and the S separation is darker, then that patch will appear more yellow. Whenever the S separation is lighter, and L and M separations are darker, then those squares will appear more blue. Colors correlate with L, M, S lightnesses.
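The rule stated above can be written as a small function. This is only a restatement of that sentence in code, assuming the inputs are differences in apparent lightness (one patch minus the other) in the L, M, and S separations; the function name `predicted_hue_shift` and the sample numbers are hypothetical.

```python
def predicted_hue_shift(delta_l, delta_m, delta_s):
    """Direction of the color difference predicted from lightness differences.

    Each argument is the apparent lightness of one patch minus that of the
    other in the L, M, or S separation (positive means lighter).
    """
    if delta_l > 0 and delta_m > 0 and delta_s < 0:
        return "more yellow"
    if delta_l < 0 and delta_m < 0 and delta_s > 0:
        return "more blue"
    return "no prediction from this rule"


# Right patch minus left patch: lighter in L and M, darker in S.
print(predicted_hue_shift(0.6, 0.4, -0.8))
# Left patch minus right patch: the signs reverse.
print(predicted_hue_shift(-0.6, -0.4, 0.8))
```

The inputs are lightnesses as they appear in each separation, not surface reflectances or quanta catches, which is the distinction the surrounding text draws.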
Land's Retinex Theory predicts that color in complex scenes correlates with the apparent lightnesses in long-, middle-, and short-wave light. The triplet of Retinex Lightnesses, rather than the triplet of surface reflectances, predicts color appearance. That prediction still stands after more than 50 years.
The Retinex Theory of Color led to a wide variety of spatial image algorithms discussed in this "Retinex at 50" Special Issue. Land introduced the idea that a model of spatial vision should be the foundation of spatial image processing algorithms that make better scene reproductions. Furthermore, the measurements of observer sensations should be the ground truth used to design and to evaluate the success of these algorithms. This paper reviews the ground truth measurements that can help us model vision. Furthermore, these ground truth data help us find the general solution for image reproductions for all types of scenes. 40