With the widespread availability of rigorous electromagnetic (vector) analysis codes for describing the diffraction of electromagnetic waves by specific periodic grating structures, the insight and understanding of nonparaxial parametric diffraction grating behavior afforded by approximate methods (i.e., scalar diffraction theory) are being ignored in the education of most optical engineers today. Elementary diffraction grating behavior is reviewed, the importance of maintaining consistency in the sign convention for the planar diffraction grating equation is emphasized, and the advantages of discussing “conical” diffraction grating behavior in terms of the direction cosines of the incident and diffracted angles are demonstrated. Paraxial grating behavior for coarse gratings (*d* ≫ λ) is then derived and displayed graphically for five elementary grating types: sinusoidal amplitude gratings, square-wave amplitude gratings, sinusoidal phase gratings, square-wave phase gratings, and classical blazed gratings. Paraxial diffraction efficiencies are calculated, tabulated, and compared for these five elementary grating types. Since much of the grating community erroneously believes that scalar diffraction theory is only valid in the paraxial regime, the recently developed linear systems formulation of nonparaxial scalar diffraction theory is briefly reviewed, then used to predict the nonparaxial behavior (for transverse electric polarization) of both the sinusoidal and the square-wave amplitude gratings when the +1 diffracted order is maintained in the Littrow condition. This nonparaxial behavior includes the well-known Rayleigh (Wood’s) anomaly effects that are usually thought to be predicted only by rigorous (vector) electromagnetic theory.

## 1.

## Introduction

The fundamental diffraction problem consists of two parts: (i) determining the effects of introducing the diffracting aperture (or grating) upon the field immediately behind the screen and (ii) determining how it affects the field downstream from the diffracting screen (i.e., what is the field immediately behind the grating and how does it propagate).

A “diffraction grating” is an optical element that imposes a “periodic” variation in the amplitude and/or phase of an incident electromagnetic wave.^{1} It thus produces, through constructive interference, a number of discrete diffracted orders (or waves) which exhibit dispersion upon propagation. Diffraction gratings are thus widely used as dispersive elements in spectrographic instruments,^{2}^{–}^{5} although they can also be used as beam splitters or beam combiners in various laser devices or interferometers. Other applications include acousto-optic modulators or scanners.^{6}

One example of a diffraction grating would be a periodic array of a large number of very narrow slits. This would be a binary amplitude grating (completely opaque or completely transparent). Consider the cylindrical Huygens’ wavelet produced at each narrow slit when the grating is illuminated by a normally incident plane wave as shown in Fig. 1. Constructive interference occurs only in those discrete directions where the optical path difference from adjacent slits is an integral number of wavelengths (i.e., phase differences that are multiples of $2\pi $). Every point $P$ in the focal plane of the lens that satisfies this condition will exhibit a primary maximum. The angular width of this interference maximum depends upon the number of slits making up the grating. Figure 2 illustrates the one-dimensional profile of the Fraunhofer diffraction pattern of an array of slits as we progress from two slits (Young’s interference pattern) to three slits, to five slits, and to eleven slits.

The trend is evident. In the limit of a large number of very narrow slits, the primary interference maxima (diffraction orders) become narrower and narrower, with more and more small secondary maxima ($n-2$ for $n$ slits) in between them.
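The count of secondary maxima can be checked numerically. The sketch below is only an illustration: it assumes idealized, infinitely narrow slits, for which the normalized irradiance is $[\sin(N\delta/2)/(N\sin(\delta/2))]^{2}$ with $\delta$ the phase difference between adjacent slits. The function names and the sampling density are arbitrary choices that do not come from the text.

```python
import math


def n_slit_irradiance(delta, n_slits):
    """Normalized irradiance of n_slits ideal narrow slits vs. phase difference delta."""
    return (math.sin(n_slits * delta / 2) / (n_slits * math.sin(delta / 2))) ** 2


def count_secondary_maxima(n_slits, samples=20000):
    """Count local maxima strictly between two adjacent principal maxima.

    Principal maxima sit at delta = 0 and delta = 2*pi, so only the open
    interval between them is sampled (assumed sampling density: `samples`).
    """
    values = [
        n_slit_irradiance(2 * math.pi * k / samples, n_slits)
        for k in range(1, samples)
    ]
    count = 0
    for k in range(1, len(values) - 1):
        # ">" on the left and ">=" on the right counts a two-sample plateau once.
        if values[k] > values[k - 1] and values[k] >= values[k + 1]:
            count += 1
    return count


for n in (2, 3, 5, 11):
    print(n, count_secondary_maxima(n))
```

For the slit counts shown in Fig. 2, the count returned for each $n$ should agree with the $n-2$ rule.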

The first reported observation of diffraction grating effects was made in 1785 when Francis Hopkinson (one of the signers of the Declaration of Independence and George Washington’s first Secretary of the Navy) observed a distant street lamp through a fine silk handkerchief. He noticed that this produced multiple images, which to his astonishment did not change location with motion of the handkerchief. He mentioned his discovery to the astronomer David Rittenhouse. Rittenhouse recognized the observed phenomenon as a diffraction effect and promptly made a diffraction grating by wrapping fine wire around the threads of a pair of fine pitch screws. Knowing the pitch of his screws in terms of the Paris inch, he determined the approximate wavelength of light.^{7}

Since the diffraction angle for a given order varies with wavelength, a diffraction grating produces angular dispersion. This angular dispersion is illustrated in Fig. 3 for a grating with a period $d=10\text{\hspace{0.17em}\hspace{0.17em}}\mu \mathrm{m}$. Diffracted orders for wavelengths 450, 550, and 650 nm are plotted versus angle.

Spectral resolution and diffraction efficiency are quantities of practical interest in many diffraction grating applications. The diffraction efficiency is defined as the fraction of the incident optical power that appears in a given diffracted order of the grating. Note from Fig. 3 that the zero order exhibits no dispersion, and there is twice as much dispersion in the second order as there is in the first order.
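The dispersion described above can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming normal incidence and ignoring the sign convention (only angle magnitudes are computed from $\sin\theta = m\lambda/d$); the period and wavelengths are those quoted for Fig. 3, while the function and constant names are illustrative.

```python
import math


def diffraction_angle_deg(order, wavelength_um, period_um):
    """Diffraction angle magnitude at normal incidence: sin(theta) = m * lambda / d."""
    s = order * wavelength_um / period_um
    if abs(s) > 1:
        return None  # evanescent order, no propagating beam
    return math.degrees(math.asin(s))


PERIOD_UM = 10.0                      # grating period quoted for Fig. 3
WAVELENGTHS_UM = (0.450, 0.550, 0.650)

for m in range(0, 4):
    angles = [diffraction_angle_deg(m, w, PERIOD_UM) for w in WAVELENGTHS_UM]
    spread = angles[-1] - angles[0]   # angular spread from 450 to 650 nm
    print(m, [round(a, 3) for a in angles], round(spread, 3))
```

The zero order shows no spread, and the second-order spread comes out close to twice the first-order spread, as expected at these small angles.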

Diffraction gratings can be categorized according to several different criteria: their geometry, their material, their efficiency behavior, their method of manufacture, or their intended application. We thus talk about:

amplitude and phase gratings;

reflection and transmission gratings;

binary gratings;

symmetrical and blazed gratings;

plane and concave gratings;

ruled, holographic, and lithographic gratings;

Bragg type and Raman-Nath type gratings;

waveguide gratings; and

fiber gratings.

We acknowledge that this list of grating types is nonexhaustive and nonexclusive, but it is nonetheless useful for comparing and contrasting grating performance for different grating types, characteristics, and manufacturing techniques.

Joseph von Fraunhofer began his detailed study of diffraction gratings about 1821. He built the first ruling engine for fabricating reflection gratings on metallic substrates. His insight into the diffraction process led him to predict that diffraction efficiency behavior would “strain even the cleverest of physicists,” which it did for the next 150 years. Many of Fraunhofer’s findings were written up in great detail, so we are entirely justified in calling him the father of diffraction grating technology.^{8}^{,}^{9}

A whole new era of spectral analysis opened up with Rowland’s famous paper in 1882. He constructed sophisticated ruling engines and invented the “concave grating,” a device of spectacular value to modern spectroscopists.^{10}

John Strong, quoting G. R. Harrison, stated in a JOSA article in 1960: “It is difficult to point to another single device that has brought more important experimental information to every field of science than the diffraction grating. The physicist, the astronomer, the chemist, the biologist, the metallurgist, all use it as a routine tool of unsurpassed accuracy and precision, as a detector of atomic species to determine the characteristics of heavenly bodies and the presence of atmospheres in the planets, to study the structures of molecules and atoms, and to obtain a thousand and one items of information without which modern science would be greatly handicapped.”^{11}

A troublesome aspect of the multiple order behavior of diffraction gratings is that adjacent higher order spectra frequently overlap. In fact, in Fig. 3, one can see the third-order principal maximum for blue light almost overlapping the second-order red principal maximum. One can readily show that the second order for wavelengths 100, 200, and 300 nm is diffracted into the same directions as the first order for wavelengths 200, 400, and 600 nm.
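The overlap follows because, at normal incidence, the direction of an order depends only on the product $m\lambda$. The following sketch checks this with exact rational arithmetic; it assumes normal incidence and the 10 µm period quoted for Fig. 3, and the helper name `sin_theta` is illustrative.

```python
from fractions import Fraction


def sin_theta(order, wavelength_nm, period_nm):
    """Exact value of m * lambda / d, i.e. |sin(theta_m)| at normal incidence."""
    return Fraction(order * wavelength_nm, period_nm)


PERIOD_NM = 10_000  # assumed: the 10 um period used for Fig. 3

# Second order at 100, 200, 300 nm versus first order at 200, 400, 600 nm.
for shorter, longer in ((100, 200), (200, 400), (300, 600)):
    same = sin_theta(2, shorter, PERIOD_NM) == sin_theta(1, longer, PERIOD_NM)
    print(shorter, longer, same)

# Third-order blue (450 nm) versus second-order red (650 nm).
print(float(sin_theta(3, 450, PERIOD_NM)), float(sin_theta(2, 650, PERIOD_NM)))
```

The last line shows why the third-order blue maximum sits just beyond the second-order red maximum: $3\times 450$ nm is only slightly larger than $2\times 650$ nm.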

Two generalizations to the behavior of gratings must now be discussed. First, if the individual slits making up the grating have significant width (in order to transmit more light), the Fraunhofer diffraction pattern of an individual slit will form an envelope function modulating the strength of the discrete diffracted orders.^{12}^{–}^{15} For the case illustrated in Fig. 4, we have chosen the width of the slits to be one-third of the slit separation. You will note that every third diffracted order is absent. This is caused by the envelope function going to zero at those locations.
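The missing orders can be illustrated by sampling the single-slit envelope at the order locations. This sketch assumes the envelope takes the form $\mathrm{sinc}^{2}(mb/d)$ that appears in Eq. (27), with $\mathrm{sinc}(x)=\sin(\pi x)/(\pi x)$ taken as the normalized sinc; the names `envelope` and `DUTY_CYCLE` and the zero threshold are illustrative choices.

```python
import math


def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x), with sinc(0) = 1 (assumed convention)."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def envelope(order, duty_cycle):
    """Single-slit envelope sinc^2(m b / d) sampled at diffracted order m."""
    return sinc(order * duty_cycle) ** 2


DUTY_CYCLE = 1 / 3  # slit width b divided by slit separation d (Fig. 4 case)

for m in range(0, 7):
    value = envelope(m, DUTY_CYCLE)
    print(m, round(value, 4), "missing" if value < 1e-12 else "")
```

With a slit width of one-third of the separation, the envelope vanishes (to floating-point precision) at $m=3$ and $m=6$, which is the every-third-order absence noted above.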

The second generalization includes the situation where the light is incident upon the grating at an arbitrary angle ${\theta}_{i}$ rather than normal incidence. This situation will be taken care of by including the incident angle in the grating equation discussed in Sec. 2, where we will review the planar grating equation and the sign convention for numbering the various diffracted orders.

The more general phenomenon of “conical” diffraction that occurs with large obliquely incident angles will be discussed in Sec. 3 and the parametric behavior will be shown to be particularly simple and intuitive when formulated and displayed in terms of the direction cosines of the incident and diffracted angles. In Sec. 4, we will use the remarkably intuitive direction cosine diagram to portray the conical grating behavior exhibited in the presence of large obliquely incident beams and arbitrary orientation of the grating. Section 5 examines the paraxial diffraction efficiency behavior of several elementary grating types. Section 6 will review the underlying concepts of nonparaxial scalar diffraction theory and apply them to the sinusoidal and square-wave amplitude gratings when the $+1$ diffracted order is maintained in the “Littrow condition.” This nonparaxial behavior includes the well-known Rayleigh (Wood’s) anomaly effects that are usually thought to only be predicted by rigorous (vector) electromagnetic theory.^{16}

A summary, statement of conclusions, and an extensive set of references will then complete this paper.

## 2.

## Planar Grating Equation and Sign Convention

Monochromatic light of wavelength $\lambda $ incident upon a refractive transmission grating (interface between two dielectric media exhibiting a periodic surface relief pattern) of spatial period $d$ at an angle of incidence ${\theta}_{i}$ will be diffracted into the discrete angles ${\theta}_{m}$ according to the following (planar) grating equation:^{3}^{,}^{4}^{,}^{16}^{–}^{18}

## Eq. (1)

$${n}^{\prime}\,\sin {\theta}_{m}-n\,\sin {\theta}_{i}=-\frac{m\lambda}{d},\qquad m=0,\pm 1,\pm 2,\pm 3.$$

The equation for a reflection grating can be obtained by setting ${n}^{\prime}=-n$, just as we do when tracing rays from a reflecting surface:^{4}

## Eq. (2)

$$\sin {\theta}_{m}+\sin {\theta}_{i}=\frac{m\lambda}{nd},\qquad m=0,\pm 1,\pm 2,\pm 3.$$

Note that setting $m=0$ in Eq. (1) results in ${\theta}_{0}$ having the same sign as ${\theta}_{i}$. Likewise, setting $m=0$ in Eq. (2) results in ${\theta}_{0}$ having the opposite sign to ${\theta}_{i}$. We have thus adopted a sign convention that conforms to that used in geometrical optics whereby all angles are directional quantities measured from optical axes or surface normals to refracted or reflected rays. These directional angles are “positive if counterclockwise,” and “negative if clockwise.” An “angle” here is the smaller of the two angles that a ray forms with the axis or surface normal.

For a thin diffraction grating in air, we thus have $n={n}^{\prime}=1$, and the two grating equations can be combined to yield

## Eq. (3)

$$\sin {\theta}_{m}\mp \sin {\theta}_{i}=\mp \frac{m\lambda}{d},\qquad m=0,\pm 1,\pm 2,\pm 3.$$

Here the upper (minus) signs describe a transmission grating and the lower (plus) signs describe a reflection grating as illustrated in Fig. 5. Note from this figure that the zero order corresponds to the directly transmitted or specularly reflected beam.

The arrangement of the diffracted orders is the same for the two gratings except they are reflected about the plane of the reflection grating. Note also that the algebraic signs of two directional angles are different if they are measured on different sides of the grating normal. A final useful observation is that for both the transmission and the reflection grating, the positive diffracted orders lie on the same side of the grating normal as the incident beam; whereas the negative diffracted orders lie on the opposite side of the grating normal from the incident beam. A “plus” sign has thus been placed on the lower side of the grating normal in Fig. 5 and a “minus” sign has been placed on the upper side of the grating normal as an indicator of our sign convention. Some authors absorb the minus sign on the right side of Eq. (3) into the $m$, thus achieving a seemingly simpler equation. However, this results in a different sign convention for labeling the diffracted orders.

We have specifically chosen the form of Eq. (3) not only to maintain the sign convention for directional angles used almost exclusively in geometrical optics and optical design ray trace codes (positive if counterclockwise and negative if clockwise), but also to be consistent with the sign convention for labeling diffraction grating order numbers used by the popular *Diffraction Grating Handbook* published and distributed free by the Newport Corporation (formerly Richardson Grating Laboratory).^{19}
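The sign convention of Eq. (3) can be made concrete with a short calculation. The sketch below solves Eq. (3) for ${\theta}_{m}$ for either grating type; the function name, argument names, and the numerical example (0.55 µm light, 2 µm period, 20 deg incidence) are hypothetical and chosen only for illustration.

```python
import math


def diffracted_angle_deg(order, wavelength, period, incidence_deg, reflection=False):
    """Solve Eq. (3) for theta_m in degrees; return None if the order is evanescent.

    Transmission: sin(theta_m) =  sin(theta_i) - m * lambda / d
    Reflection:   sin(theta_m) = -sin(theta_i) + m * lambda / d
    """
    sin_i = math.sin(math.radians(incidence_deg))
    shift = order * wavelength / period
    sin_m = (-sin_i + shift) if reflection else (sin_i - shift)
    if abs(sin_m) > 1:
        return None
    return math.degrees(math.asin(sin_m))


# Hypothetical example: 0.55 um light, 2 um period, 20 deg incidence.
for m in range(-3, 4):
    print(m,
          diffracted_angle_deg(m, 0.55, 2.0, 20.0),
          diffracted_angle_deg(m, 0.55, 2.0, 20.0, reflection=True))
```

For $m=0$ the transmission case returns the incident angle and the reflection case returns its negative, which is the behavior of the zero order described above.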

The above grating equations are restricted to the special case where the grating grooves/lines are oriented perpendicular to the plane of incidence, i.e., the plane containing the incident beam and the normal to the grating surface. For this situation, all of the diffracted orders lie in the plane of incidence.

The more general phenomenon of conical diffraction that occurs with large obliquely incident angles is rarely discussed in elementary optics or physics textbooks. However, the formulation of a nonparaxial scalar diffraction theory^{20}^{–}^{23} provides a simple and intuitive means of gaining additional insight into this nonparaxial diffraction grating behavior.

## 3.

## Conical Diffraction in Direction Cosine Space

Consider diffraction from a conventional linear reflection grating. However, suppose the incident light strikes the grating at a large oblique angle (represented by direction cosines ${\alpha}_{i}$ and ${\beta}_{i}$) as illustrated in Fig. 6. The resulting diffraction behavior is described by the following grating equation written in terms of the direction cosines of the propagation vectors of the incident beam and the diffracted orders (the grooves are assumed to be parallel to the $y$ axis):^{24}

## Eq. (4)

$${\alpha}_{m}+{\alpha}_{i}=m\lambda /d,\qquad {\beta}_{m}+{\beta}_{i}=0,$$

where

## Eq. (5)

$${\alpha}_{m}=\sin {\theta}_{m}\cos {\varphi}_{0},\quad {\alpha}_{i}=-\sin {\theta}_{0}\cos {\varphi}_{0},\quad {\beta}_{i}=-\sin {\varphi}_{0}.$$

The diffracted orders now propagate along the surface of a cone and will strike the observation hemisphere in a cross section that is not a great circle, but instead a latitude slice as illustrated for a reflection grating in Fig. 6. Note that the direction cosines are obtained by merely projecting the respective points on the hemisphere down onto the plane of the aperture and normalizing to a unit radius. Even for large angles of incidence and large diffracted angles, the various diffracted orders are equally spaced and lie on a straight line only in the direction cosine space.

This behavior is even more evident in Fig. 7, in which the location of the incident beam and the diffracted orders are displayed in direction cosine space for a reflection grating whose grooves are parallel to the $y$ or $\beta $ axis. The diffracted orders are always exactly equally spaced in direction cosine space and lie in a straight line perpendicular to the orientation of the grating grooves. From Eq. (4), this equidistant spacing of diffracted orders is readily shown to be equal to the nondimensional quantity $\lambda /d$. The diffracted orders that lie inside the unit circle are real and propagate, and the diffracted orders that lie outside the unit circle are evanescent (and thus do not propagate).
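The unit-circle test for propagating orders is easy to apply in code. The sketch below evaluates Eq. (4) directly in direction cosine space for a reflection grating with grooves parallel to the $y$ axis; the incident direction cosines and the value of $\lambda /d$ are hypothetical, and the function name is illustrative.

```python
import math


def conical_orders(alpha_i, beta_i, wavelength_over_period, max_order=20):
    """Direction cosines (alpha_m, beta_m) of reflected orders from Eq. (4).

    Grooves are parallel to the y (beta) axis. Returns {m: (alpha, beta, propagating)}.
    """
    orders = {}
    for m in range(-max_order, max_order + 1):
        alpha_m = m * wavelength_over_period - alpha_i
        beta_m = -beta_i
        propagating = alpha_m ** 2 + beta_m ** 2 <= 1.0  # inside the unit circle
        orders[m] = (alpha_m, beta_m, propagating)
    return orders


# Hypothetical oblique beam: alpha_i = 0.3, beta_i = 0.5, lambda/d = 0.25.
result = conical_orders(0.3, 0.5, 0.25)
for m, (a, b, ok) in result.items():
    if ok:
        print(m, round(a, 3), round(b, 3))
```

Only the orders inside the unit circle are printed; every order shares the same $\beta$, and adjacent orders differ in $\alpha$ by exactly $\lambda /d$.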

For a reflection grating, the undiffracted zero order always lies diametrically opposite the origin of the $\alpha -\beta $ coordinate system from the incident beam. As the incident angle is varied, the diffraction pattern (size, shape, separation, and orientation of diffracted orders) remains unchanged but merely shifts its position maintaining the above relationship between the zero order and the incident beam. Note also that when the plane of incidence is perpendicular to the grating grooves (${\varphi}_{0}=0$), Eq. (4) reduces to the familiar grating equation presented in Eq. (3).

For a transmission grating, with our sign convention, the diffraction angle for the zero order is equal to the incident angle (${\theta}_{0}={\theta}_{i}$). Thus the coordinates of the location in the direction cosine diagram representing the zero order and the incident beam are superposed as shown in Fig. 8.

As with the case of the reflection grating, the diffracted orders remain equally spaced and in a straight line as the incident angle is changed, i.e., the diffraction pattern (size, shape, separation, and orientation of diffracted orders) again remains unchanged, merely shifting its position such that the zero order remains superposed upon the incident beam.

Figure 9 illustrates the propagating diffracted orders that would exist if a beam were normally incident upon a transmission diffraction grating with $\lambda /d=0.08333$. There would be precisely 25 propagating diffracted orders including the two at $\pm 90\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{deg}$. The uniform diffracted order spacing in direction cosine space $\mathrm{\Delta}\beta $ is contrasted with the increasing angular spacing $\mathrm{\Delta}\theta $ , and the even more rapidly increasing linear spacing $\mathrm{\Delta}x$, when the diffracted orders are projected upon a plane observation screen.
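The order count quoted for Fig. 9 can be verified by enumerating the orders that satisfy $|\sin {\theta}_{m}|\le 1$. This is a minimal sketch based on Eq. (3) for a transmission grating; the function name and the search bound are illustrative assumptions.

```python
import math


def propagating_orders(wavelength_over_period, sin_incidence=0.0):
    """Orders m of a transmission grating with |sin(theta_m)| <= 1 (from Eq. (3))."""
    m_limit = math.ceil(2.0 / wavelength_over_period)  # generous search bound
    return [
        m for m in range(-m_limit, m_limit + 1)
        if abs(sin_incidence - m * wavelength_over_period) <= 1.0
    ]


orders = propagating_orders(0.08333)
print(len(orders), orders[0], orders[-1])

# Angle of the outermost order; 12 * 0.08333 is just under 1, so it is near grazing.
print(math.degrees(math.asin(12 * 0.08333)))
```

With the rounded value 0.08333 the outermost orders fall just short of 90 deg; for $\lambda /d$ exactly equal to 1/12 they would lie exactly at $\pm 90$ deg.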

For normal incidence, the diffraction grating equation yields

## Eq. (6)

$$m=\frac{d}{\lambda}\,\sin {\theta}_{m},\quad \text{thus}\quad \frac{\mathrm{d}m}{\mathrm{d}{\theta}_{m}}=\frac{d}{\lambda}\,\cos {\theta}_{m}.$$

Taking the reciprocal of this derivative and writing it as a ratio of differences, we have

## Eq. (7)

$$\frac{\mathrm{\Delta}{\theta}_{m}}{\mathrm{\Delta}m}=\frac{\lambda}{d\,\cos {\theta}_{m}}.$$

Setting $\mathrm{\Delta}m$ equal to unity, we obtain the following expression for the angular spacing of “adjacent” diffracted orders as a function of diffracted angle:

## Eq. (8)

$$\mathrm{\Delta}{\theta}_{m}=\frac{\lambda}{d\,\cos {\theta}_{m}}.$$

Similarly, from Fig. 9, we can see that

## Eq. (9)

$${x}_{m}=L\,\tan {\theta}_{m},$$

where $L$ is the distance between the grating and the observation screen. Taking the derivative of ${x}_{m}$ with respect to $m$, we obtain

## Eq. (10)

$$\frac{\mathrm{\Delta}{x}_{m}}{\mathrm{\Delta}m}=\frac{\mathrm{d}{x}_{m}}{\mathrm{d}m}=\frac{\mathrm{d}{x}_{m}}{\mathrm{d}{\theta}_{m}}\frac{\mathrm{d}{\theta}_{m}}{\mathrm{d}m}=\frac{\lambda}{d}L\left(\frac{1}{\cos {\theta}_{m}}+\frac{{\sin}^{2}{\theta}_{m}}{{\cos}^{3}{\theta}_{m}}\right).$$

Again setting $\mathrm{\Delta}m$ equal to unity yields an expression for the linear spacing of adjacent diffracted orders projected upon a plane observation screen as a function of diffracted angle:

## Eq. (11)

$$\mathrm{\Delta}{x}_{m}=\frac{\lambda}{d}L\left(\frac{1}{\cos {\theta}_{m}}+\frac{{\sin}^{2}{\theta}_{m}}{{\cos}^{3}{\theta}_{m}}\right)=\frac{\lambda L}{d\,{\cos}^{3}{\theta}_{m}}.$$

Plotting Eqs. (8) and (11) gives a graphical comparison of the relative spacing between adjacent diffracted orders $\mathrm{\Delta}x$, $\mathrm{\Delta}\theta $, and $\mathrm{\Delta}\beta $.

Figure 10 indicates that both $\mathrm{\Delta}x$ and $\mathrm{\Delta}\theta $ increase without bound as the diffracted angle approaches 90 deg, whereas $\mathrm{\Delta}\beta $ remains constant for all diffracted angles. When projected upon a plane screen, the spacing of adjacent diffracted orders increases by a factor of two (100% increase) at a diffraction angle of merely 38 deg. The angular spacing of adjacent diffracted orders increases by a factor of two at a diffraction angle of 60 deg. If only a 5% increase in $\mathrm{\Delta}x$ were allowed, the diffraction angle would have to be held below 10 deg. For $\mathrm{\Delta}\theta $, a 5% increase is observed at a diffraction angle of 18 deg.
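The angles quoted above follow from Eqs. (8) and (11), whose growth factors relative to the on-axis spacing are $1/\cos {\theta}_{m}$ and $1/{\cos}^{3}{\theta}_{m}$, respectively. The sketch below inverts these factors; the function names are illustrative and not part of the original derivation.

```python
import math


def angular_growth(theta_deg):
    """Delta-theta relative to its on-axis value, from Eq. (8): 1 / cos(theta)."""
    return 1.0 / math.cos(math.radians(theta_deg))


def linear_growth(theta_deg):
    """Delta-x relative to its on-axis value, from the bracket of Eq. (11)."""
    c = math.cos(math.radians(theta_deg))
    return 1.0 / c + math.sin(math.radians(theta_deg)) ** 2 / c ** 3


def angle_for_growth(factor, power):
    """Angle (deg) at which 1 / cos^power(theta) equals `factor`."""
    return math.degrees(math.acos(factor ** (-1.0 / power)))


print(angle_for_growth(2.0, 3), angle_for_growth(2.0, 1))    # doubling of dx, dtheta
print(angle_for_growth(1.05, 3), angle_for_growth(1.05, 1))  # 5% growth of dx, dtheta
```

The four printed angles should fall close to the rounded values of 38, 60, 10, and 18 deg given in the text.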

## 4.

## General Grating Equation and the Direction Cosine Diagram

For obliquely incident beams and arbitrarily oriented gratings, a complicated three-dimensional diagram is required to depict the diffraction behavior in real space.^{25} However, the direction cosine diagram provides a simple and intuitive means of determining the diffraction grating behavior even for these general cases. The general grating equation for a reflection grating with arbitrarily oriented lines (grooves) is given by^{24}

## Eq. (12)

$$\begin{array}{l}{\alpha}_{m}+{\alpha}_{i}=\left(\frac{m\lambda}{d}\right)\sin \psi ,\\ {\beta}_{m}+{\beta}_{i}=\left(\frac{m\lambda}{d}\right)\cos \psi ,\end{array}$$

where the angle $\psi $ specifies the orientation of the grating grooves; setting $\psi =90\text{ deg}$ recovers Eq. (4), for which the grooves are parallel to the $y$ axis.

Note that in all cases, the zero order is diametrically opposite to the origin from the incident beam and the diffracted orders remain equally spaced in a straight line. However, this line is rotated about the zero order such that it is always perpendicular to the grating grooves. This simple behavior of conical diffraction from linear gratings when expressed in direction cosine space provides understanding and insight not provided by most textbook treatments. It is interesting to note that Rowland expressed the grating equation in terms of direction cosines in a paper published over 125 years ago.^{26}

We have demonstrated that when the grating equation is expressed in terms of the direction cosines of the propagation vectors of the incident beam and the diffracted orders, even wide-angle diffraction phenomena (including conical diffraction from arbitrarily oriented gratings) are shift invariant with respect to variations in the incident angle. New insight and an intuitive understanding of diffraction behavior for arbitrary grating orientation were then shown to result from the use of a simple direction cosine diagram.
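The general grating equation, Eq. (12), can be evaluated directly to locate the orders in the direction cosine diagram. The following is a sketch under stated assumptions: the incident direction cosines, $\lambda /d$, and $\psi $ are hypothetical values, and the function name is illustrative.

```python
import math


def general_orders(alpha_i, beta_i, wavelength_over_period, psi_deg, orders):
    """Direction cosines of reflected orders from Eq. (12) for groove angle psi."""
    psi = math.radians(psi_deg)
    step_a = wavelength_over_period * math.sin(psi)
    step_b = wavelength_over_period * math.cos(psi)
    return {m: (m * step_a - alpha_i, m * step_b - beta_i) for m in orders}


# Hypothetical case: alpha_i = 0.2, beta_i = -0.1, lambda/d = 0.3, psi = 30 deg.
pts = general_orders(0.2, -0.1, 0.3, 30.0, range(-2, 3))
for m, (a, b) in pts.items():
    status = "propagating" if a * a + b * b <= 1.0 else "evanescent"
    print(m, round(a, 4), round(b, 4), status)
```

Whatever the value of $\psi $, the zero order stays at $(-{\alpha}_{i},-{\beta}_{i})$ and adjacent orders remain separated by $\lambda /d$; only the direction of the line of orders, $(\sin \psi ,\cos \psi )$, changes.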

## 5.

## Paraxial Grating Behavior (Coarse Gratings)

In this section, we discuss the paraxial predictions of diffraction efficiency for five basic types of diffraction gratings: sinusoidal amplitude gratings, square-wave amplitude gratings, sinusoidal phase gratings, square-wave phase gratings, and the classical blazed grating (sawtooth groove profile). The paraxial diffraction efficiencies of various diffracted orders will then be tabulated and compared for these five elementary grating types. For all cases, transverse electric (TE) polarization for the incident beam has been assumed.

If the grating is placed immediately behind an aberration-free positive lens of focal length $f$ that is uniformly illuminated by a normally incident plane wave as illustrated in Fig. 12, the Fraunhofer diffraction pattern produced in the back focal plane of the lens is given by^{27}^{,}^{28}

## Eq. (13)

$${E}_{2}({x}_{2},{y}_{2})=\frac{{E}_{0}}{{\lambda}^{2}{f}^{2}}{\left|\mathcal{F}\{{t}_{A}({x}_{1},{y}_{1})\}{|}_{\genfrac{}{}{0ex}{}{\xi ={x}_{2}/\lambda f}{\eta ={y}_{2}/\lambda f}}\right|}^{2},$$

where

## Eq. (14)

$$\mathcal{F}\{{t}_{A}({x}_{1},{y}_{1})\}={\int}_{-\infty}^{\infty}{\int}_{-\infty}^{\infty}{t}_{A}({x}_{1},{y}_{1})\exp [-i2\pi ({x}_{1}\xi +{y}_{1}\eta )]\,\mathrm{d}{x}_{1}\,\mathrm{d}{y}_{1}.$$

Goodman^{27} and Gaskill^{28} each discussed in some detail the Fraunhofer and the Fresnel approximations and the geometrical criteria for each. Goodman, in particular, showed that the cosine obliquity factor in the more general Huygens–Fresnel principle must be approximately unity for both the Fraunhofer and the Fresnel approximations to be valid. It is this requirement that limits our diffraction angles to be paraxial angles.

The classical definition of a paraxial ray is that the ray must lie close to, and make a small angle with, the optical axis, i.e.,^{29}^{,}^{30}

## Eq. (15)

$$\sin \theta \sim \theta ,\quad \tan \theta \sim \theta ,\quad \text{and}\quad \cos \theta \sim 1.$$

This paraxial requirement obviously places strong limitations on the applicability of the results of this section concerning the grating period-to-wavelength ratio $d/\lambda $. The paraxial expressions in Eq. (15) are accurate to within 5% if the angle does not exceed about 18 deg. Although scalar diffraction theory is known to predict diffraction grating performance for TE-polarized light, not transverse magnetic (TM) or unpolarized light,^{22} at these paraxial angles there will be very little difference between the diffraction efficiency for the two orthogonal polarizations.
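The accuracy of the paraxial expressions in Eq. (15) can be tabulated directly. In this sketch each error is measured relative to the exact trigonometric value, which is one possible definition and an assumption made here; the function name is illustrative.

```python
import math


def paraxial_errors(theta_deg):
    """Relative errors of sin(t) ~ t, tan(t) ~ t and cos(t) ~ 1 w.r.t. the exact values."""
    t = math.radians(theta_deg)
    return {
        "sin": abs(t - math.sin(t)) / math.sin(t),
        "tan": abs(t - math.tan(t)) / math.tan(t),
        "cos": abs(1.0 - math.cos(t)) / math.cos(t),
    }


for angle in (5, 10, 18, 30):
    errs = paraxial_errors(angle)
    print(angle, {k: round(100 * v, 2) for k, v in errs.items()})
```

The cosine approximation is the limiting one: its error reaches roughly 5% near 18 deg, while the sine and tangent approximations are still noticeably more accurate at that angle.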

## 5.1.

### Sinusoidal Amplitude Grating

The complex amplitude transmittance of a thin sinusoidal amplitude grating can be written as

## Eq. (16)

$${t}_{A}({x}_{1},{y}_{1})=\left[\frac{1}{2}+\frac{a}{2}\,\cos (2\pi {x}_{1}/d)\right]\mathrm{rect}\left(\frac{{x}_{1}}{w},\frac{{y}_{1}}{w}\right).$$

We have assumed that the grating is bounded by a square aperture of width $w$. The parameter $a$ represents the peak-to-peak variation in amplitude transmittance and $d$ is the spatial period of the grating. Figure 13(a) shows a two-dimensional image of the grating, and Fig. 13(b) illustrates a profile of the amplitude transmittance in the $x$ direction.

If this grating is placed immediately behind an aberration-free positive lens of focal length $f$ that is uniformly illuminated by a normally incident plane wave as illustrated in Fig. 12, the Fraunhofer diffraction pattern produced in the back focal plane of the lens is given by Eq. (13).

Applying the scaling theorem and the convolution theorem of Fourier transform theory,^{28} we can write the Fourier transform of Eq. (16) as

## Eq. (17)

$$\mathcal{F}\{{t}_{A}({x}_{1},{y}_{1})\}=\left[\frac{1}{2}\delta (\xi ,\eta )+\frac{a}{4}\delta \left(\xi +\frac{1}{d},\eta \right)+\frac{a}{4}\delta \left(\xi -\frac{1}{d},\eta \right)\right]\ast \ast \,{w}^{2}\,\mathrm{sinc}(w\xi ,w\eta ),$$

where $\ast \ast $ denotes the two-dimensional convolution operation.^{28}

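Because convolution with a delta function simply replicates the sinc function at the location of that delta function, and because the two-dimensional sinc function is separable into the product of two one-dimensional sinc functions,^{28} evaluating at $\xi ={x}_{2}/\lambda f$ and $\eta ={y}_{2}/\lambda f$ gives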

## Eq. (18)

$$\mathcal{F}\{{t}_{A}({x}_{1},{y}_{1})\}{|}_{\genfrac{}{}{0ex}{}{\xi ={x}_{2}/\lambda f}{\eta ={y}_{2}/\lambda f}}={w}^{2}\,\mathrm{sinc}\left(\frac{{y}_{2}}{\lambda f/w}\right)\left[\frac{1}{2}\,\mathrm{sinc}\left(\frac{{x}_{2}}{\lambda f/w}\right)+\frac{a}{4}\,\mathrm{sinc}\left(\frac{{x}_{2}+\lambda f/d}{\lambda f/w}\right)+\frac{a}{4}\,\mathrm{sinc}\left(\frac{{x}_{2}-\lambda f/d}{\lambda f/w}\right)\right].$$

If there are many grating periods within the aperture, then $w\gg d$, and there will be negligible overlap between the three sinc functions; hence, there will be no cross terms in the squared modulus of this sum. Substituting this into Eq. (13) thus yields the diffracted irradiance distribution in the focal plane of the lens:

## Eq. (19)

$$E({x}_{2},{y}_{2})=\frac{{E}_{0}{w}^{4}}{{\lambda}^{2}{f}^{2}}\,{\mathrm{sinc}}^{2}\left(\frac{{y}_{2}}{\lambda f/w}\right)\left[\underbrace{\frac{1}{4}\,{\mathrm{sinc}}^{2}\left(\frac{{x}_{2}}{\lambda f/w}\right)}_{m=0}+\underbrace{\frac{{a}^{2}}{16}\,{\mathrm{sinc}}^{2}\left(\frac{{x}_{2}+\lambda f/d}{\lambda f/w}\right)}_{m=+1}+\underbrace{\frac{{a}^{2}}{16}\,{\mathrm{sinc}}^{2}\left(\frac{{x}_{2}-\lambda f/d}{\lambda f/w}\right)}_{m=-1}\right].$$

We thus have three discrete diffracted waves or “orders,” each of which is a scaled replica of the Fraunhofer diffraction pattern of the square aperture bounding the grating. The central diffraction lobe is called the “zero order,” and the two side lobes are called the plus and minus “first orders.” The spatial separation of the first orders from the zero order is $\lambda f/d$, whereas the width of the main lobe of all orders is $2\lambda f/w$ as shown in Fig. 14.

The diffraction efficiency is defined as the fraction of the incident optical power that appears in a given diffracted order (usually the $+1$ order) of the grating. Integrating the irradiance distribution representing a given diffracted order and dividing by the incident optical power ${P}_{o}={E}_{0}{w}^{2}$ gives the diffraction efficiency for that order. Since, for any $b$ and ${x}_{o}$,

## Eq. (20)

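$${\int}_{-\infty}^{\infty}{\int}_{-\infty}^{\infty}\frac{1}{{b}^{2}}\,{\mathrm{sinc}}^{2}\left(\frac{x-{x}_{o}}{b},\frac{y}{b}\right)\mathrm{d}x\,\mathrm{d}y=1,$$

the efficiency of each order (with $b=\lambda f/w$) is simply its coefficient inside the square bracket of Eq. (19), as listed in Table 1.

## Table 1

Diffraction efficiencies for Fig. 14.

Order # | Efficiency |
---|---|
0 | 0.25 |
$+1$ | ${a}^{2}/16$ |
$-1$ | ${a}^{2}/16$ |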

The $+1$ diffracted order thus contains at most (if the quantity $a$ is equal to unity) 6.25% of the optical power incident upon a sinusoidal amplitude grating. This very low diffraction efficiency is not adequate for many applications. As seen in Table 1, the sum of the efficiencies of all three orders is only equal to $1/4+{a}^{2}/8$. The rest of the incident optical power is lost through absorption by the grating.

We will find later in Sec. 6 that a nonparaxial analysis indicates somewhat better performance for certain combinations of grating period and incident angle.
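The entries in Table 1 can be cross-checked numerically. The sketch below assumes that, for a thin grating with $w\gg d$ in the paraxial regime, the efficiency of order $m$ equals the squared modulus of the $m$th Fourier-series coefficient of the amplitude transmittance over one period, which is consistent with the coefficients in Eqs. (17) and (19). The function names and the sample count are illustrative choices.

```python
import cmath
import math


def fourier_coefficient(transmittance, order, samples=256):
    """Fourier-series coefficient c_m of a periodic transmittance t(u), u = x/d in [0, 1)."""
    total = 0j
    for k in range(samples):
        u = k / samples
        total += transmittance(u) * cmath.exp(-2j * math.pi * order * u)
    return total / samples


def efficiency(transmittance, order):
    """Paraxial scalar efficiency, taken here as |c_m|^2 (thin grating, w >> d)."""
    return abs(fourier_coefficient(transmittance, order)) ** 2


a = 1.0  # peak-to-peak amplitude modulation of Eq. (16)
grating = lambda u: 0.5 + 0.5 * a * math.cos(2 * math.pi * u)

for m in (-2, -1, 0, 1, 2):
    print(m, round(efficiency(grating, m), 6))
```

For $a=1$ the zero order carries 25% and each first order 6.25% of the incident power, while all higher orders vanish; the same routine can be applied to other periodic transmittance profiles by replacing `grating`.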

## 5.2.

### Square-Wave Amplitude Grating

The complex amplitude transmittance of a thin square-wave amplitude grating can be written as

## Eq. (21)

$${t}_{A}({x}_{1},{y}_{1})=\left[\mathrm{rect}\left(\frac{{x}_{1}}{b}\right)(1)\ast \ast \frac{1}{d}\,\mathrm{comb}\left(\frac{{x}_{1}}{d}\right)\delta ({y}_{1})\right]\mathrm{rect}\left(\frac{{x}_{1}}{w},\frac{{y}_{1}}{w}\right),$$

where $b$ is the width of each transmitting slit and $d$ is the grating period.

Again, applying the scaling theorem and the convolution theorem of Fourier transform theory,^{28} we can write

## Eq. (22)

$$\mathcal{F}\{{t}_{A}({x}_{1},{y}_{1})\}=[b\,\mathrm{sinc}(b\xi )\delta (\eta )][\mathrm{comb}(d\xi )(1)]\ast \ast \,{w}^{2}\,\mathrm{sinc}(w\xi ,w\eta ).$$

However, since the sinc function is separable and the two-dimensional convolution of two separable functions can be written as the product of two one-dimensional convolutions, the above equation can be written as

## Eq. (23)

$$\mathcal{F}\{{t}_{A}({x}_{1},{y}_{1})\}={w}^{2}\frac{b}{d}\left\{[\mathrm{sinc}(b\xi )][d\,\mathrm{comb}(d\xi )]\ast \mathrm{sinc}(w\xi )\right\}\mathrm{sinc}(w\eta ).$$

Also the product of a sinc function with a comb function can be written as an infinite sum of shifted and scaled delta functions,^{28} hence,

## Eq. (24)

$$\mathcal{F}\{{t}_{A}({x}_{1},{y}_{1})\}={w}^{2}\frac{b}{d}\left\{\left[\sum _{m=-\infty}^{\infty}\mathrm{sinc}\left(\frac{mb}{d}\right)\delta (\xi -m/d)\right]*\mathrm{sinc}(w\xi )\right\}\mathrm{sinc}(w\eta ).$$Due to the replication property of convolution with delta functions, we can now write the quantity in the curly brackets as an infinite series of shifted and scaled sinc functions, thus eliminating the convolution operation from the above equation:

## Eq. (25)

$$\mathcal{F}\{{t}_{A}({x}_{1},{y}_{1})\}={w}^{2}\frac{b}{d}\left[\sum _{m=-\infty}^{\infty}\mathrm{sinc}\left(\frac{mb}{d}\right)\mathrm{sinc}\left(\frac{\xi -m/d}{1/w}\right)\right]\mathrm{sinc}(w\eta ).$$Evaluating this function at spatial frequencies $\xi ={x}_{2}/\lambda f$ and $\eta ={y}_{2}/\lambda f$, and again writing as a two-dimensional function, we obtain

## Eq. (26)

$$\mathcal{F}\{{t}_{A}({x}_{1},{y}_{1})\}{|}_{\genfrac{}{}{0ex}{}{\xi ={x}_{2}/\lambda f}{\eta ={y}_{2}/\lambda f}}={w}^{2}\frac{b}{d}\left[\sum _{m=-\infty}^{\infty}\mathrm{sinc}\left(\frac{mb}{d}\right)\mathrm{sinc}\left(\frac{{x}_{2}-m\lambda f/d}{\lambda f/w},\frac{{y}_{2}}{\lambda f/w}\right)\right].$$Since $w\gg d$, there is again negligible overlap between the discrete diffracted orders, and there will be no cross terms in the squared modulus of this sum. The Fraunhofer diffraction pattern predicted by Eq. (13) for a square-wave amplitude grating is thus given by

## Eq. (27)

$$E({x}_{2},{y}_{2})=\frac{{E}_{0}{w}^{4}}{{\lambda}^{2}{f}^{2}}\frac{{b}^{2}}{{d}^{2}}\left[\sum _{m=-\infty}^{\infty}{\mathrm{sinc}}^{2}\left(\frac{mb}{d}\right){\mathrm{sinc}}^{2}\left(\frac{{x}_{2}-m\lambda f/d}{\lambda f/w},\frac{{y}_{2}}{\lambda f/w}\right)\right].$$There is thus a myriad of diffracted orders produced by the square-wave amplitude grating as shown in Fig. 16. However, they are rapidly attenuated by the ${\mathrm{sinc}}^{2}$ envelope function. The irradiance distribution representing the $m$’th diffracted order is thus given by

## Eq. (28)

$${E}_{m}({x}_{2},{y}_{2})={E}_{0}{w}^{2}\frac{{b}^{2}}{{d}^{2}}\text{\hspace{0.17em}}{\mathrm{sinc}}^{2}\left(\frac{mb}{d}\right)\left[\frac{1}{{(\lambda f/w)}^{2}}\text{\hspace{0.17em}}{\mathrm{sinc}}^{2}\left(\frac{{x}_{2}-m\lambda f/d}{\lambda f/w},\frac{{y}_{2}}{\lambda f/w}\right)\right].$$The optical power contained in the $m$’th diffraction order is obtained by integrating the above irradiance distribution over all space in the ${x}_{2}-{y}_{2}$ plane; however, due to Eq. (20), the integral of the quantity in square brackets is just unity, and we simply obtain

## Eq. (29)

$${P}_{m}({x}_{2},{y}_{2})={E}_{0}{w}^{2}\frac{{b}^{2}}{{d}^{2}}\text{\hspace{0.17em}}{\mathrm{sinc}}^{2}\left(\frac{mb}{d}\right).$$The diffraction efficiency of the $m$’th diffracted order is just the above optical power divided by the optical power in the incident beam, ${P}_{o}={E}_{0}{w}^{2}$, or

## Eq. (30)

$$\text{efficiency}\equiv \frac{{P}_{m}({x}_{2},{y}_{2})}{{P}_{o}}=\frac{{b}^{2}}{{d}^{2}}\text{\hspace{0.17em}}{\mathrm{sinc}}^{2}\left(\frac{mb}{d}\right).$$One can readily calculate that a square-wave amplitude grating with transparent and opaque strips of equal width ($b=d/2$) results in only 10% of the incident optical power being diffracted into the $+1$ order. This is a little better than we achieved with the sinusoidal amplitude grating, but still not adequate for many applications. Figure 17 illustrates the diffraction efficiency of the first several orders as a function of the parameter $b/d$.

In spite of the fact that increasing the parameter $b/d$ reduces the absorption of the grating, we see that for $b/d>0.5$, all of the additional transmitted power, plus some, goes into the zero order, with the efficiency of the $+1$ order actually diminishing with increasing $b/d$.

Table 2 lists the efficiency for the first several orders for $b/d=0.5$. Note that the efficiency of all even orders is identically zero because the zeros of the envelope function in Eq. (28) fall exactly upon the even diffracted orders. We can also see from Fig. 17 that the maximum efficiency that can be achieved for the second order is 0.025 for $b/d=0.25$ or 0.75.

## Table 2

Diffraction efficiencies for b/d=0.5.

Order # | Efficiency |
---|---|
0 | 0.250 |
$\pm 1$ | 0.101 |
$\pm 2$ | 0.000 |
$\pm 3$ | 0.011 |
$\pm 4$ | 0.000 |
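As a quick numerical check of Eq. (30) and Table 2, the following Python sketch evaluates the efficiency for $b/d=0.5$ and scans the duty cycle for the second-order maximum. The function names and the 0.001 scan step are illustrative choices, and the sinc function is assumed to be the normalized form, $\mathrm{sin}(\pi x)/(\pi x)$, consistent with its zeros falling on the even orders.

```python
import math


def sinc(x):
    """Normalized sinc, sin(pi x)/(pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def square_wave_amplitude_efficiency(m, duty):
    """Eq. (30): efficiency of order m for duty cycle duty = b/d."""
    return duty**2 * sinc(m * duty) ** 2


# Reproduce Table 2 (b/d = 0.5)
for m in range(5):
    print(m, round(square_wave_amplitude_efficiency(m, 0.5), 3))

# Scan b/d from 0 to 1 to locate the largest second-order efficiency
best_eff, best_duty = max(
    (square_wave_amplitude_efficiency(2, k / 1000.0), k / 1000.0)
    for k in range(1001)
)
print(best_eff, best_duty)
```

The scan peaks at $1/(4{\pi}^{2})\approx 0.025$, which is the second-order maximum quoted above for $b/d=0.25$ or 0.75.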

## 5.3.

### Sinusoidal Phase Grating

One of the disadvantages of amplitude gratings is that much of the incident optical power is lost through absorption, whereas phase gratings can be made with virtually no absorption losses. Transmission phase gratings can consist of periodic index of refraction variations, or of a periodic surface relief structure, in a thin transparent optical material. A reflection phase grating is merely a surface relief grating covered with some highly reflective material.

Following Goodman,^{27} a thin sinusoidal phase grating can be defined by the amplitude transmittance function:

## Eq. (31)

$${t}_{A}({x}_{1},{y}_{1})=\mathrm{exp}[i\frac{a}{2}\text{\hspace{0.17em}}\mathrm{sin}(2\pi {x}_{1}/d)]\mathrm{rect}(\frac{{x}_{1}}{w},\frac{{y}_{1}}{w}),$$Making use of the Bessel function identity^{27}

## Eq. (32)

$$\mathrm{exp}[i\frac{a}{2}\text{\hspace{0.17em}}\mathrm{sin}(2\pi {x}_{1}/d)]=\sum _{m=-\infty}^{\infty}{J}_{m}\left(\frac{a}{2}\right)\mathrm{exp}(i2\pi m{x}_{1}/d),$$where ${J}_{m}$ is the Bessel function of the first kind of order $m$,^{28} it is readily shown that, within the paraxial limitation, the irradiance distribution in the back focal plane of the lens is given by

## Eq. (33)

$$E({x}_{2},{y}_{2})={E}_{0}{w}^{2}\left\{\sum _{m=-\infty}^{\infty}{J}_{m}^{2}\left(\frac{a}{2}\right)\left[\frac{1}{{(\lambda f/w)}^{2}}\text{\hspace{0.17em}}{\mathrm{sinc}}^{2}\left(\frac{{x}_{2}-m\lambda f/d}{\lambda f/w},\frac{{y}_{2}}{\lambda f/w}\right)\right]\right\},$$and the diffraction efficiency of the $m$’th diffracted order is given by^{1}^{,}^{22}^{,}^{27}^{,}^{31}^{,}^{32}

## Eq. (34)

$$\text{efficiency}\equiv \frac{{P}_{m}({x}_{2},{y}_{2})}{{P}_{o}}={J}_{m}^{2}\left(\frac{a}{2}\right).$$The conservation of energy is easily shown for this perfectly conducting paraxial ($d\gg \lambda $) reflection grating at normal incidence because the sum over $m$ from $-\infty $ to $\infty $ of the squared Bessel function in Eq. (33) is equal to unity.

Since the Fraunhofer diffraction integral implicitly contains the paraxial approximation, the diffraction angle is proportional to the displacement on the focal plane containing the Fraunhofer diffraction pattern:

## Eq. (35)

$${\theta}_{x}={\mathrm{tan}}^{-1}({x}_{2}/f)\approx {x}_{2}/f,\phantom{\rule[-0.0ex]{2.0em}{0.0ex}}{\theta}_{y}={\mathrm{tan}}^{-1}({y}_{2}/f)\approx {y}_{2}/f.$$Recalling our definitions of radiometric quantities, it is clear that the diffracted intensity distribution (radiant power per unit solid angle) emanating from the grating is thus proportional to the diffracted irradiance distribution (radiant power per unit area) incident upon the focal plane as given by Eq. (33):

## Eq. (36)

$$I({\theta}_{x},{\theta}_{y})={I}_{0}\sum _{m=-\infty}^{\infty}{J}_{m}^{2}\left(\frac{a}{2}\right)\left[\frac{1}{{(\lambda f/w)}^{2}}\text{\hspace{0.17em}}{\mathrm{sinc}}^{2}\left(\frac{{x}_{2}-m\lambda f/d}{\lambda f/w},\frac{{y}_{2}}{\lambda f/w}\right)\right].$$Figure 18 illustrates the diffracted intensity distribution as a function of diffraction angle ${\theta}_{x}$ and groove depth $h$, for a sinusoidal “reflection” grating with period $d=20\lambda $ operating at normal incidence.

The maximum value of ${J}_{1}^{2}(a/2)$ is 0.3386 and occurs for $a=3.68$, corresponding to a groove depth of $h=0.293\lambda $. The diffraction efficiency of the first few orders for this value of $a$ is tabulated in Table 3. Note that the energy falls off rapidly, with 99.88% of the diffracted radiant power contained in diffracted orders $|m|\le 3$. This paraxial model is accurate only for very coarse gratings ($d\gg \lambda $).

## Table 3

Diffraction efficiencies for a=3.68, corresponding to h=0.293λ.

Order # | Efficiency |
---|---|
0 | $1.003\times {10}^{-1}$ |
$\pm 1$ | $3.386\times {10}^{-1}$ |
$\pm 2$ | $9.970\times {10}^{-2}$ |
$\pm 3$ | $1.093\times {10}^{-2}$ |
$\pm 4$ | $6.320\times {10}^{-4}$ |
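The entries of Table 3 follow directly from Eq. (34). The sketch below evaluates ${J}_{m}^{2}(a/2)$ with a plain power-series Bessel function so that it needs only the Python standard library. The helper names are hypothetical, and the 40-term truncation is an illustrative choice that is more than sufficient for arguments of order unity.

```python
import math


def bessel_j(m, x, terms=40):
    """Bessel function of the first kind, integer order m, from its power series."""
    n = abs(m)
    total = 0.0
    for k in range(terms):
        total += ((-1) ** k / (math.factorial(k) * math.factorial(k + n))
                  * (x / 2.0) ** (2 * k + n))
    # Negative orders follow from J_{-n}(x) = (-1)^n J_n(x)
    return total if (m >= 0 or n % 2 == 0) else -total


def sinusoidal_phase_efficiency(m, a):
    """Eq. (34): paraxial efficiency of order m for phase parameter a."""
    return bessel_j(m, a / 2.0) ** 2


a = 3.68
table = {m: sinusoidal_phase_efficiency(m, a) for m in range(-4, 5)}
within_3 = sum(table[m] for m in range(-3, 4))   # power in orders |m| <= 3
for m in range(5):
    print(m, f"{table[m]:.3e}")
print(within_3)
```

Summing the efficiencies over all tabulated orders gives a value very close to unity, which is the conservation of energy noted below Eq. (34).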

The paraxial behavior described by Eq. (36) above leads to the common misconception that it is impossible to get more than 33.86% of the incident energy into the first diffracted order with a sinusoidal phase grating. “Nothing could be further from the truth!” In fact, if you decrease the grating period, the diffracted angles increase and the higher orders eventually go evanescent. When only the zero and $\pm 1$ diffracted orders remain, changing the incident angle will cause the $-1$ order to go evanescent. Then one can vary the groove depth to squelch the energy in the zero order. For a perfectly conducting sinusoidal reflectance grating, we can thus get 100% of the incident energy in the $+1$ diffracted order!^{33}

In addition to being a paraxial ($d\gg \lambda $) grating, if the sinusoidal reflection grating is also shallow (i.e., the groove depth is much less than a wavelength of the incident light), then the diffraction efficiency of the first orders of the sinusoidal reflection grating can be approximated by

## Eq. (37)

$$\text{efficiency}\approx R{\left(\frac{a}{4}\right)}^{2},$$where $R$ is the reflectance of the grating surface.

Figure 19 compares the predicted diffraction efficiency of this approximation with the results of Eq. (34) for a perfectly conducting surface ($R=1$) and illustrates how shallow the grating must be to satisfy various error tolerances. Note that the above approximation exhibits only a 1% error in the prediction of diffraction efficiency of the $+1$ diffracted order at $h=0.0318\lambda $, a 5% error at $h=0.0702\lambda $, and a 10% error at $h=0.098\lambda $.
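The quoted error tolerances can be checked numerically. The sketch below rests on two assumptions that are not stated explicitly in this section: that the shallow-grating approximation is the small-argument limit ${J}_{1}^{2}(a/2)\approx {(a/4)}^{2}$ scaled by $R$, and that $a=4\pi h/\lambda $ for a reflection grating at normal incidence (inferred from the pairing of $a=3.68$ with $h=0.293\lambda $ above). The function names are hypothetical.

```python
import math


def bessel_j1(x, terms=30):
    """First-order Bessel function of the first kind from its power series."""
    return sum((-1) ** k / (math.factorial(k) * math.factorial(k + 1))
               * (x / 2.0) ** (2 * k + 1) for k in range(terms))


def exact_efficiency(h_over_lambda, R=1.0):
    """Eq. (34) for the +1 order, scaled by the reflectance R."""
    a = 4.0 * math.pi * h_over_lambda   # assumed relation between a and h
    return R * bessel_j1(a / 2.0) ** 2


def shallow_efficiency(h_over_lambda, R=1.0):
    """Assumed small-argument form: R (a/4)^2 = R (pi h / lambda)^2."""
    return R * (math.pi * h_over_lambda) ** 2


def relative_error(h_over_lambda):
    """Fractional overestimate of the shallow approximation."""
    return shallow_efficiency(h_over_lambda) / exact_efficiency(h_over_lambda) - 1.0


for h in (0.0318, 0.0702, 0.098):
    print(h, f"{100.0 * relative_error(h):.2f}%")
```

Under these assumptions, the approximation overestimates the $+1$ order efficiency by close to 1%, 5%, and 10% at the three groove depths quoted above.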

## 5.4.

### Square-Wave Phase Grating

Let us first look at a special case of a rectangular phase grating in which the peak-to-peak phase step is equal to $\pi $ (this should result in zero efficiency for the zero diffracted order) and the duty cycle is $b/d=0.5$, as illustrated in Fig. 20. From Euler’s equation

## Eq. (38)

$$\mathrm{exp}(i\varphi )=\mathrm{cos}(\varphi )+i\text{\hspace{0.17em}}\mathrm{sin}(\varphi ),$$a phase of either zero or $\pi $ gives a complex amplitude transmittance of either $+1$ or $-1$, which can be written as

## Eq. (39)

$${t}_{A}({x}_{1},{y}_{1})=\left\{\left[2\text{\hspace{0.17em}}\mathrm{rect}\left(\frac{{x}_{1}}{d/2}\right)(1)**\frac{1}{d}\mathrm{comb}\left(\frac{{x}_{1}}{d}\right)\delta ({y}_{1})\right]-1\right\}\mathrm{rect}\left(\frac{{x}_{1}}{w},\frac{{y}_{1}}{w}\right).$$Following the discussion of the square-wave amplitude grating, we obtain a Fraunhofer diffraction pattern given by

## Eq. (40)

$$E({x}_{2},{y}_{2})=\frac{{E}_{0}{w}^{4}}{{\lambda}^{2}{f}^{2}}\left[\sum _{m=-\infty}^{\infty}{\mathrm{sinc}}^{2}\left(\frac{m}{2}\right){\mathrm{sinc}}^{2}\left(\frac{{x}_{2}-m\lambda f/d}{\lambda f/w},\frac{{y}_{2}}{\lambda f/w}\right)\right],$$and the diffraction efficiency of the $m$’th diffracted order is given by

## Eq. (41)

$$\text{efficiency}\equiv \frac{{P}_{m}({x}_{2},{y}_{2})}{{P}_{o}}={\mathrm{sinc}}^{2}\left(\frac{m}{2}\right)\phantom{\rule[-0.0ex]{1.0em}{0.0ex}}\text{for}\text{\hspace{0.17em}\hspace{0.17em}}m\ne 0.$$Table 4 thus lists the efficiency for the first several orders for this special case of a rectangular phase grating. Note that the $\pi $ phase step has eliminated the zero order, and the efficiency of all other even orders is identically zero because the zeros in the envelope function in Eq. (40) fall exactly upon the even diffracted orders. This thus maximizes the efficiency of the remaining orders.

## Table 4

Diffraction efficiencies for a square-wave phase grating with a π phase step.

Order # | Efficiency |
---|---|
0 | 0.000 |
$\pm 1$ | $4.053\times {10}^{-1}$ |
$\pm 2$ | 0.000 |
$\pm 3$ | $4.503\times {10}^{-2}$ |
$\pm 4$ | 0.000 |
$\pm 5$ | $1.621\times {10}^{-2}$ |

We have thus seen that the maximum efficiency of the $+1$ diffracted order (in the paraxial limit) increases from 0.0625 for a sinusoidal amplitude grating, to 0.1013 for a rectangular amplitude grating, to 0.3386 for a sinusoidal phase grating, and to 0.4053 for a rectangular phase grating.

Before we proceed to discuss the classical blazed grating, we want to derive the general solution for the diffraction behavior of an “arbitrary rectangular phase grating.” This derivation will lay the groundwork for studying the behavior of diffraction gratings with “arbitrary groove shapes.”

For a rectangular phase grating with an arbitrary phase step, the complex amplitude transmittance can be written as

## Eq. (42)

$${t}_{A}({x}_{1},{y}_{1})=\mathrm{exp}[i\varphi ({x}_{1})],$$where the phase variation is given by

## Eq. (43)

$$\varphi ({x}_{1})=a\text{\hspace{0.17em}}\mathrm{rect}\left(\frac{{x}_{1}}{b}\right)(1)**\frac{1}{d}\mathrm{comb}\left(\frac{{x}_{1}}{d}\right)\delta ({y}_{1}).$$This phase variation is illustrated graphically in Fig. 22.

Since this is an even function, it can be decomposed into a discrete cosine Fourier series. The Fourier series coefficients for the above periodic function can be shown to be given by

thus

## Eq. (45)

$$\varphi ({x}_{1})=\frac{a}{2}+\sum _{n=1}^{\infty}{c}_{n}\text{\hspace{0.17em}}\mathrm{cos}(2\pi n{x}_{1}/d).$$However, we can ignore the constant term resulting from the fact that $\varphi ({x}_{1})$ as illustrated above does not have a zero mean. The rectangular phase variation is thus represented as a superposition of cosinusoidal phase variations:

## Eq. (46)

$$\varphi ({x}_{1})=\sum _{n=1}^{\infty}{c}_{n}\text{\hspace{0.17em}}\mathrm{cos}(2\pi n{x}_{1}/d).$$A thin rectangular phase grating can thus be defined by the amplitude transmittance function:

## Eq. (47)

$${t}_{A}({x}_{1},{y}_{1})=\mathrm{exp}[i\sum _{n=1}^{\infty}{c}_{n}\text{\hspace{0.17em}}\mathrm{cos}(2\pi n{x}_{1}/d)].$$But this can be written as the infinite product:

## Eq. (48)

$${t}_{A}({x}_{1},{y}_{1})=\prod _{n=1}^{\infty}\{\mathrm{exp}[i{c}_{n}\text{\hspace{0.17em}}\mathrm{cos}(2\pi n{x}_{1}/d)]\}.$$Making use of the Bessel function identity^{34}

## Eq. (49)

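$$\mathrm{exp}[iz\text{\hspace{0.17em}}\mathrm{cos}(\theta )]=\sum _{m=-\infty}^{\infty}{i}^{m}{J}_{m}(z)\mathrm{exp}(im\theta ),$$and the convolution theorem, the Fourier transform of the product in Eq. (48) can be written as a cascade of convolutions:

## Eq. (50)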

$$\mathcal{F}\{{t}_{A}({x}_{1},{y}_{1})\}=\{{[\sum _{m=-\infty}^{\infty}{J}_{m}({c}_{n})\delta (\xi -n\text{\hspace{0.17em}}m/d)]}_{n=1}*{[\sum _{m=-\infty}^{\infty}{J}_{m}({c}_{n})\delta (\xi -n\text{\hspace{0.17em}}m/d)]}_{n=2}\phantom{\rule{0ex}{0ex}}*{[\sum _{m=-\infty}^{\infty}{J}_{m}({c}_{n})\delta (\xi -n\text{\hspace{0.17em}}m/d)]}_{n=3}*\cdots *{[\sum _{m=-\infty}^{\infty}{J}_{m}({c}_{n})\delta (\xi -n\text{\hspace{0.17em}}m/d)]}_{n=\infty}\}.$$Although the above expression might at first appear to be rather unwieldy, it is rather easily solved numerically with the array operations provided with the MATLAB software package. In fact, the above operation results in an array of delta functions that represents the diffracted orders produced by the rectangular phase grating. The squared moduli of the coefficients of those terms are the efficiencies of the diffracted orders.

We can now readily calculate the diffraction efficiencies for a paraxial rectangular phase grating with an arbitrary phase step and duty cycle. Figure 23 graphically illustrates the efficiency of the first few diffracted orders produced by a rectangular phase grating with a phase step of $\pi $ as a function of the duty cycle ($b/d$). Note that when $b/d$ equals either zero or unity, no phase variations exist, and all of the diffracted energy remains in the undiffracted beam (zero order). Also note that for $b/d=0.5$, we obtain the same results as those tabulated in Table 4.
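These limiting cases can be cross-checked without the Bessel-function convolution of Eq. (50). The Python sketch below uses a different route, namely the closed-form Fourier coefficients of the two-level transmittance: writing ${t}_{A}=1+[\mathrm{exp}(ia)-1]p({x}_{1})$, where $p$ is a 0/1 square wave of duty cycle $b/d$, gives order amplitudes $[\mathrm{exp}(ia)-1](b/d)\mathrm{sinc}(mb/d)$ for $m\ne 0$. This shortcut is not the numerical method described above, and the function names are hypothetical.

```python
import cmath
import math


def sinc(x):
    """Normalized sinc, sin(pi x)/(pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def rect_phase_efficiency(m, phase_step, duty):
    """Paraxial efficiency of order m for phase step a and duty cycle b/d."""
    step = cmath.exp(1j * phase_step) - 1.0       # exp(ia) - 1
    if m == 0:
        coeff = 1.0 + step * duty                 # mean value of the transmittance
    else:
        coeff = step * duty * sinc(m * duty)      # m-th Fourier coefficient
    return abs(coeff) ** 2


# Phase step of pi, duty cycle 0.5: compare with Table 4
for m in range(6):
    print(m, f"{rect_phase_efficiency(m, math.pi, 0.5):.4f}")

# Energy check: the efficiencies of all orders should sum to (nearly) unity
total = sum(rect_phase_efficiency(m, math.pi, 0.5) for m in range(-2000, 2001))
print(total)
```

Because the order efficiencies fall off only as $1/{m}^{2}$, the energy sum converges slowly; truncating at $|m|\le 2000$ leaves a residual of a few parts in ${10}^{4}$.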

Similarly, Fig. 24 graphically illustrates the efficiency of the first few diffracted orders produced by a rectangular phase grating with a duty cycle of 0.5 as a function of the phase step $a$. Note that the even orders are absent. Equation (50) and Figs. 23 and 24 together constitute a comprehensive graphical display of the parametric paraxial performance of square-wave phase gratings.

The above technique can also be used to calculate the paraxial diffraction efficiencies of a reflection grating with arbitrary groove shape by merely supplying the appropriate Fourier coefficients in Eq. (44).

## 5.5.

### Classical Blazed Grating

The concept of a blazed grating is that each groove should be so formed that independently, by means of geometrical optics, it redirects the incident light in the direction of a chosen diffracted order, thus making it appear to “blaze” when viewed from that direction. Lord Rayleigh was the first to describe the ideal groove shape in 1874.^{35} He wrote: “…the retardation should gradually alter by a wavelength in passing over each element of the grating and then fall back to its previous value, thus springing suddenly over a wavelength.” He was not very optimistic about achieving such geometry, but 36 years later, in 1910, Wood^{36} produced the first grating that we would call “blazed” for use in the infrared. He did this with a tool of carborundum, ruled into copper.

A missing insight that we now take for granted was provided by John Anderson in 1916 while working at the Mt. Wilson Observatory. He demonstrated that superior gratings could be produced by “burnishing” (plastic deformation of the surface) rather than cutting the grooves into the substrate.^{37} The material thus had to be soft enough to accept local deformation and at the same time be highly polished.

The classical blazed grating is thus a reflection grating with a sawtooth groove profile as shown in Fig. 25. Such gratings have been manufactured for over 150 years by scribing, or burnishing, a series of grooves upon a good optical surface. Originally, this surface was one of highly polished speculum metal.

A major advance in the development of diffraction gratings was the discovery by John Strong in 1936 that vacuum deposited aluminum on glass is a far superior medium into which to rule grating grooves than speculum metal, which had been almost universally used for nearly a century.^{38} Therefore, in recent times, diffraction gratings have been ruled in thin layers of aluminum or gold deposited upon a glass substrate.

Blazed gratings can be designed for a particular wavelength, incident angle, and diffracted order. The blaze angle ${\theta}_{B}$ of the grating is given by

## Eq. (51)

$${\theta}_{B}={\mathrm{tan}}^{-1}(h/d),$$where $h$ is the groove depth and $d$ is the grating period. For a paraxial grating designed to operate at normal incidence, the groove depth must be equal to

## Eq. (52)

$$h={n}_{B}{\lambda}_{B}/2,$$where ${n}_{B}$ is the blaze (or design) order and ${\lambda}_{B}$ is the blaze (or design) wavelength. The specularly reflected plane wavefront segments will then be out of phase by precisely $2\pi $, thus producing constructive interference for that wavelength and diffracted order. Stated another way, the reflected phase variation over one period of the above grating can be written as

## Eq. (53)

$$\varphi ({x}_{1})=\frac{2\pi}{\lambda}\mathrm{OPD}({x}_{1})=\frac{2\pi}{\lambda}2h\frac{{x}_{1}}{d}=2\pi {n}_{B}{\lambda}_{B}{x}_{1}/(\lambda d).$$Making use of the replication properties of convolution with a comb function, the complex amplitude transmittance (or reflectance in this case) of a grating blazed for the ${n}_{B}$’th order can thus be written as

## Eq. (54)

$${t}_{A}({x}_{1})=\mathrm{rect}\left(\frac{{x}_{1}}{d}\right)\mathrm{exp}(-i2\pi {n}_{B}{\lambda}_{B}{x}_{1}/\lambda d)*\frac{1}{d}\mathrm{comb}\left(\frac{{x}_{1}}{d}\right).$$Using the scaling theorem and the convolution theorem of Fourier transform theory, we can write

## Eq. (55)

$$\mathcal{F}\{{t}_{A}({x}_{1})\}=\mathrm{sinc}[d(\xi -{n}_{B}{\lambda}_{B}/\lambda d)][d\text{\hspace{0.17em}}\mathrm{comb}(d\xi )].$$The irradiance of the Fraunhofer diffraction pattern in the ${x}_{2}-{y}_{2}$ observation plane a distance $z$ from the grating is proportional to the squared modulus of the Fourier transform of the complex amplitude distribution emerging from the grating:

## Eq. (56)

$${E}_{2}({x}_{2})\propto \frac{1}{\lambda z}|\mathcal{F}\{{t}_{A}({x}_{1})\}{|}_{\xi ={x}_{2}/\lambda z}{|}^{2},$$which, upon substitution of Eq. (55), gives

## Eq. (57)

$${E}_{2}({x}_{2})\propto {\mathrm{sinc}}^{2}\left[\frac{{x}_{2}-({n}_{B}{\lambda}_{B}/\lambda )\lambda z/d}{\lambda z/d}\right]\frac{1}{\lambda z/d}\mathrm{comb}\left(\frac{{x}_{2}}{\lambda z/d}\right).$$When operating at the blaze wavelength $\lambda ={\lambda}_{B}$, the peak of the ${\mathrm{sinc}}^{2}$ function is centered on the ${n}_{B}$’th diffracted order and all of the other delta functions (diffracted orders) fall on the zeros of the ${\mathrm{sinc}}^{2}$ function. All of the reflected energy is thus diffracted into the ${n}_{B}$’th diffracted order. Figure 26 shows a plot of diffraction efficiency versus ${x}_{2}\times \lambda z/d$ for a coarse grating blazed to operate in the second order at normal incidence for a wavelength of 550 nm. If $d\gg {n}_{B}{\lambda}_{B}$, we can be assured, from the planar grating equation, Eq. (3), that the ${n}_{B}$’th order will be diffracted at a paraxial angle and this predicted behavior will be accurate.
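Sampling the ${\mathrm{sinc}}^{2}$ envelope of Eq. (57) at the comb locations ${x}_{2}=m\lambda z/d$ gives a paraxial efficiency of ${\mathrm{sinc}}^{2}(m-{n}_{B}{\lambda}_{B}/\lambda )$ for the $m$’th order at normal incidence. The Python sketch below evaluates that expression. The function names are hypothetical, a perfectly reflecting surface is assumed, and the 650 nm test wavelength is an arbitrary illustrative choice.

```python
import math


def sinc(x):
    """Normalized sinc, sin(pi x)/(pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def blazed_efficiency(m, n_blaze, wavelength, blaze_wavelength):
    """Paraxial efficiency of order m: the Eq. (57) envelope sampled at order m."""
    return sinc(m - n_blaze * blaze_wavelength / wavelength) ** 2


# Grating blazed for the second order at 550 nm, used at the blaze wavelength
at_blaze = {m: blazed_efficiency(m, 2, 550.0, 550.0) for m in range(-2, 6)}
print(at_blaze)

# Away from the blaze wavelength the power spreads over neighboring orders
off_blaze = {m: blazed_efficiency(m, 2, 650.0, 550.0) for m in range(-2, 6)}
print(off_blaze)

# The efficiencies still sum to (nearly) unity over all orders
total = sum(blazed_efficiency(m, 2, 650.0, 550.0) for m in range(-2000, 2001))
print(total)
```

At the blaze wavelength only the second order is populated, while at 650 nm the same grating shares its power among several orders, with the total remaining essentially unity.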

If the incident angle is nonzero, there would be an additional linear phase variation over the entire grating (not each facet individually). Equation (54) describing the complex amplitude distribution emerging from the reflecting blazed grating would thus have to be modified as follows:

## Eq. (58)

$${t}_{A}({x}_{1})=\left[\mathrm{rect}\left(\frac{{x}_{1}}{d}\right)\mathrm{exp}(-i2\pi {n}_{B}{\lambda}_{B}{x}_{1}/\lambda d)*\frac{1}{d}\mathrm{comb}\left(\frac{{x}_{1}}{d}\right)\right]\mathrm{exp}\left(-i2\pi \frac{{\theta}_{0}}{\lambda}{x}_{1}\right),$$and its Fourier transform becomes

## Eq. (59)

$$\mathcal{F}\{{t}_{A}({x}_{1})\}=\{\mathrm{sinc}[d(\xi -{n}_{B}{\lambda}_{B}/\lambda d)][d\mathrm{comb}(d\xi )]\}*\delta (\xi -{\theta}_{0}/\lambda ).$$Evaluating at $\xi ={x}_{2}/\lambda z$ and substituting into Eq. (56) yields the following expression for the diffraction pattern projected onto a screen at a distance $z$ from the grating:

## Eq. (60)

$${E}_{2}({x}_{2})\propto {\mathrm{sinc}}^{2}\left[\frac{{x}_{2}-(\frac{{n}_{B}{\lambda}_{B}}{\lambda}+\frac{{\theta}_{0}d}{\lambda})\frac{\lambda z}{d}}{\frac{\lambda z}{d}}\right]\frac{1}{\frac{\lambda z}{d}}\mathrm{comb}\left[\frac{{x}_{2}-\left(\frac{{\theta}_{0}d}{\lambda}\right)\frac{\lambda z}{d}}{\frac{\lambda z}{d}}\right].$$Introducing an arbitrary incident angle will thus shift both the ${\mathrm{sinc}}^{2}$ envelope function and the diffracted orders by precisely the same amount. Therefore, under “paraxial” conditions, the diffraction efficiency does not change with incident angle. For example, if we illuminate the above grating blazed for the second order with an incident angle equal to the blaze angle $({\theta}_{i}={\theta}_{B})$, the incident beam will strike the individual facets at normal incidence and the second order will be retroreflected as illustrated in Fig. 27. This situation $({\theta}_{i}={\theta}_{2})$ is referred to as the Littrow condition for the second order,^{19} and the efficiency will remain at 100% as shown in Fig. 28. The zero order will of course be specularly reflected from the plane of the grating, and the $+1$ order will be diffracted normal to the plane of the grating.

The product of a ${\mathrm{sinc}}^{2}$ function with a comb function can be written as an infinite sum of shifted and scaled delta functions,^{28} each of which represents a different diffracted order. Equation (60) can, therefore, be rewritten as

## Eq. (61)

$${E}_{2}({x}_{2})\propto \sum _{m=-\infty}^{\infty}{\mathrm{sinc}}^{2}\left(m-\frac{{n}_{B}{\lambda}_{B}}{\lambda}\right)\delta \left[{x}_{2}-\left(m+\frac{{\theta}_{0}d}{\lambda}\right)\frac{\lambda z}{d}\right].$$For polychromatic light, we can represent the resulting diffracted orders with a summation over the discrete diffracted orders of an integral over some spectral band $\mathrm{\Delta}\lambda ={\lambda}_{2}-{\lambda}_{1}$:

## Eq. (62)

$${E}_{2}({x}_{2})\propto \sum _{m=-\infty}^{\infty}{\int}_{{\lambda}_{1}}^{{\lambda}_{2}}{\mathrm{sinc}}^{2}\left(m-\frac{{n}_{B}{\lambda}_{B}}{\lambda}\right)\delta \left[{x}_{2}-\left(m+\frac{{\theta}_{0}d}{\lambda}\right)\frac{\lambda z}{d}\right]\mathrm{d}\lambda .$$Figure 29 schematically illustrates the dispersive behavior over the visible spectrum of a grating blazed for the first order at a wavelength of 500 nm. The seven classical discrete colors: red (${\lambda}_{1}=650\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{nm}$), orange (${\lambda}_{2}=600\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{nm}$), yellow (${\lambda}_{3}=550\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{nm}$), green (${\lambda}_{4}=500\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{nm}$), blue (${\lambda}_{5}=450\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{nm}$), indigo (${\lambda}_{6}=400\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{nm}$), and violet (${\lambda}_{7}=350\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{nm}$) are obtained by replacing the integral in the above equation by a discrete summation:

## Eq. (63)

$${E}_{2}({x}_{2})\propto \sum _{m=-\infty}^{\infty}\sum _{\lambda ={\lambda}_{1}}^{{\lambda}_{7}}{\mathrm{sinc}}^{2}\left(m-\frac{{n}_{B}{\lambda}_{B}}{\lambda}\right)\delta \left[{x}_{2}-\left(m+\frac{{\theta}_{0}d}{\lambda}\right)\frac{\lambda z}{d}\right].$$Figure 30 illustrates that the dispersion is indeed doubled if the grating is blazed for the second diffracted order. Note also that the diffraction efficiency is substantially reduced for all wavelengths other than the blaze wavelength.
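The doubling of the dispersion can be illustrated with the order locations implied by the comb function in Eq. (60), which places order $m$ at ${x}_{2}=(m+{\theta}_{0}d/\lambda )\lambda z/d$. In the Python sketch below, the grating period of 20 µm and the screen distance of 1 m are hypothetical values chosen only to keep the angles paraxial, and the function names are illustrative.

```python
def order_position(m, wavelength, period, z, theta0=0.0):
    """Paraxial screen position of order m: (m + theta0*d/lambda) * lambda*z/d."""
    return (m + theta0 * period / wavelength) * wavelength * z / period


# Seven discrete colors used in the text, in meters
colors = {"red": 650e-9, "orange": 600e-9, "yellow": 550e-9, "green": 500e-9,
          "blue": 450e-9, "indigo": 400e-9, "violet": 350e-9}

period = 20e-6   # hypothetical grating period d
z = 1.0          # hypothetical distance to the observation screen


def spectrum_width(m):
    """Distance on the screen between the red and violet lines of order m."""
    return (order_position(m, colors["red"], period, z)
            - order_position(m, colors["violet"], period, z))


for m in (1, 2):
    print(m, spectrum_width(m))
```

With these values the second-order spectrum is spread over exactly twice the distance of the first-order spectrum, independent of the incident angle, since a nonzero ${\theta}_{0}$ shifts every line by the same amount ${\theta}_{0}z$.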

In this section, we have systematically described in detail the paraxial behavior of five different classical grating types: the sinusoidal amplitude grating, the square-wave amplitude grating, the sinusoidal phase grating, the square-wave phase grating, and the blazed reflection grating (sawtooth profile). The result of the paraxial diffraction efficiency analyses of these five grating types is summarized in Table 5.

## Table 5

Paraxial efficiencies of various grating types (optimized for +1 order).

Grating type | Zero order | First order | Second order | Third order | Fourth order |
---|---|---|---|---|---|
Sinusoidal amplitude | 0.250 | 0.0625 | N/A | N/A | N/A |
Square-wave amplitude | 0.250 | 0.101 | 0.000 | 0.011 | 0.000 |
Sinusoidal phase | 0.1003 | 0.3386 | 0.0997 | 0.0109 | 0.0006 |
Square-wave phase | 0.0000 | 0.4053 | 0.0000 | 0.0450 | 0.0000 |
Classical blazed | 0.0000 | 1.0000 | 0.0000 | 0.0000 | 0.0000 |

## 6.

## Nonparaxial Scalar Diffraction Theory

As discussed briefly in Sec. 1–Sec. 5, it is well-known that the paraxial irradiance distribution on a plane in the far field (Fraunhofer region) of a diffracting aperture is given by the squared modulus of the Fourier transform of the complex amplitude distribution emerging from the diffracting aperture.^{27}^{,}^{28} A slight variation of Eq. (13), without the presence of the lens, can be written as

## Eq. (64)

$$E({x}_{2},{y}_{2})=\frac{{E}_{0}}{{\lambda}^{2}{z}^{2}}|\mathcal{F}\{{U}_{o}^{+}({x}_{1},{y}_{1})\}{|}_{\xi =\frac{{x}_{2}}{\lambda z},\eta =\frac{{y}_{2}}{\lambda z}}{|}^{2}.$$Here ${U}_{o}^{+}({x}_{1},{y}_{1})={U}_{o}^{-}({x}_{1},{y}_{1}){t}_{1}({x}_{1},{y}_{1})$ is the complex amplitude distribution emerging from the diffracting aperture of complex amplitude transmittance ${t}_{1}({x}_{1},{y}_{1})$, and ${U}_{o}^{-}({x}_{1},{y}_{1})$ is the complex amplitude incident upon the aperture.

The spatial frequencies $\xi $ and $\eta $ are the reciprocal variables in Fourier transform space. Also the Fresnel diffraction integral is given by the Fourier transform of the product of the aperture function with a quadratic phase factor.^{27}^{,}^{28} Implicit in both the Fresnel and the Fraunhofer approximation is a “paraxial limitation” that restricts their use to small diffraction angles and small angles of incidence.^{27}^{,}^{28} This paraxial limitation severely restricts the conditions under which this conventional linear systems formulation of scalar diffraction theory adequately describes real diffraction phenomena.

A linear systems approach to modeling nonparaxial scalar diffraction phenomena has been developed by normalizing the spatial variables by the wavelength of light:^{20}^{–}^{23}

## Eq. (65)

$$\widehat{x}=x/\lambda ,\phantom{\rule[-0.0ex]{1.0em}{0.0ex}}\widehat{y}=y/\lambda ,\phantom{\rule[-0.0ex]{1.0em}{0.0ex}}\widehat{z}=z/\lambda ,\mathrm{etc}.$$The reciprocal variables in Fourier transform space become the “direction cosines” of the propagation vectors of the plane wave components in the angular spectrum of plane waves discussed by Ratcliffe,^{39} Goodman,^{27} and Gaskill:^{28}

## Eq. (66)

$$\alpha =\widehat{x}/\widehat{r},\phantom{\rule[-0.0ex]{1.0em}{0.0ex}}\beta =\widehat{y}/\widehat{r},\phantom{\rule[-0.0ex]{1.0em}{0.0ex}}\text{and}\phantom{\rule[-0.0ex]{1.0em}{0.0ex}}\gamma =\widehat{z}/\widehat{r}.$$By incorporating sound radiometric principles into scalar diffraction theory, it becomes evident that the squared modulus of the Fourier transform of the complex amplitude distribution emerging from the diffracting aperture yields “diffracted radiance” (not irradiance or intensity):^{20}^{–}^{23}

## Eq. (67)

$$\begin{array}{cc}{L}^{\prime}(\alpha ,\beta -{\beta}_{0})=K\frac{{\lambda}^{2}}{{A}_{s}}{|\mathcal{F}\{{U}_{o}^{\prime}(\widehat{x},\widehat{y};0)\mathrm{exp}(i2\pi {\beta}_{0}\widehat{y})\}|}^{2}& \text{for}\text{\hspace{0.17em}\hspace{0.17em}}{\alpha}^{2}+{\beta}^{2}\le 1\\ {L}^{\prime}(\alpha ,\beta -{\beta}_{0})=0& \text{for}\text{\hspace{0.17em}\hspace{0.17em}}{\alpha}^{2}+{\beta}^{2}>1.\end{array}$$For large incident and/or diffracted angles, the diffracted radiance distribution function will be truncated by the unit circle in direction cosine space. Evanescent waves are then produced and the equation for diffracted radiance must be renormalized. The renormalization factor in Eq. (67) is given by^{20}^{–}^{23}

## Eq. (68)

$$K=\frac{{\int}_{\alpha =-\infty}^{\infty}{\int}_{\beta =-\infty}^{\infty}L(\alpha ,\beta -{\beta}_{0})\mathrm{d}\alpha \text{\hspace{0.17em}}\mathrm{d}\beta}{{\int}_{\alpha =-1}^{1}{\int}_{\beta =-\sqrt{1-{\alpha}^{2}}}^{\sqrt{1-{\alpha}^{2}}}L(\alpha ,\beta -{\beta}_{0})\mathrm{d}\alpha \text{\hspace{0.17em}}\mathrm{d}\beta}.$$In spite of the almost universal belief that “in no way can scalar theory deal with cut-off anomalies,”^{40} the renormalization factor $K$ in Eq. (67), defined by Eq. (68), enables this linear systems formulation of nonparaxial scalar diffraction theory to predict and model the well-known Wood’s (Rayleigh) anomalies^{16} that occur in the diffraction efficiency behavior of the simple amplitude transmission gratings discussed in the following two sections of this paper.

This renormalization process is also consistent with the law of conservation of energy. However, it is significant that this linear systems formulation of nonparaxial scalar diffraction theory has been derived by the application of Parseval’s theorem and not by merely heuristically imposing the law of conservation of energy.^{20}^{–}^{23}

## 6.1.

### Rayleigh Anomalies from Sinusoidal Amplitude Transmission Gratings

Since many individual measurements are required to completely characterize the efficiency behavior of a given grating, it has become commonplace to make diffraction efficiency measurements with a given diffracted order in the Littrow condition.^{19} For transmission gratings, a given diffracted order satisfies the Littrow condition if ${\theta}_{m}=-{\theta}_{i}$. For reflection gratings, the Littrow condition is satisfied if the given diffracted order is antiparallel to the incident beam, i.e., ${\theta}_{m}={\theta}_{i}$. This allows the experimenter to leave the detector and the source in a fixed location and merely rotate the grating between measurements.

As previously shown in Table 1 of Sec. 5.1, for a narrow beam normally incident upon a paraxial sinusoidal amplitude grating with modulation of unity, five-eighths of the incident energy is absorbed and three-eighths of it is transmitted. Twenty-five percent of the total incident energy is contained in the zero order and six and one-quarter percent is contained in each of the $+1$ and $-1$ orders.

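If the $+1$ diffracted order is in the Littrow condition (${\theta}_{1}=-{\theta}_{i}$) as shown in Fig. 31, the grating equation expressed in Eq. (3) results in the following expression for the incident angle

## Eq. (69)

$${\theta}_{i}={\mathrm{sin}}^{-1}\left(\frac{1}{2}\frac{\lambda}{d}\right).$$

Substituting Eq. (69) into Eq. (3) yields

## Eq. (70)

$$\mathrm{sin}\,{\theta}_{m}=-\left(m-\frac{1}{2}\right)\frac{\lambda}{d}.$$

Hence, the $+1$ and $-1$ diffracted orders produced by a sinusoidal amplitude grating propagate at angles: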

## Eq. (71)

$${\theta}_{1}=-{\mathrm{sin}}^{-1}\left(\frac{1}{2}\frac{\lambda}{d}\right)\phantom{\rule[-0.0ex]{1.0em}{0.0ex}}\text{and}\phantom{\rule[-0.0ex]{1.0em}{0.0ex}}{\theta}_{-1}={\mathrm{sin}}^{-1}\left(\frac{3}{2}\frac{\lambda}{d}\right).$$

Note that the signs of these two angles are consistent with the sign convention previously illustrated in Fig. 5. Figure 31 illustrates this situation for $\lambda /d=0.4$.
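As a quick numerical check of Eq. (71), the following minimal Python sketch evaluates the incident and diffracted angles for the $\lambda /d=0.4$ case of Fig. 31. The function name `littrow_angles_deg` is hypothetical, and the incident angle is simply taken from the Littrow condition ${\theta}_{1}=-{\theta}_{i}$.

```python
import math

def littrow_angles_deg(lam_over_d):
    """Angles in degrees from Eq. (71), with the +1 order held in Littrow.

    Returns (theta_i, theta_plus1, theta_minus1); theta_minus1 is None
    when the -1 order is evanescent, i.e. (3/2) * lam/d exceeds 1.
    """
    theta_plus1 = -math.degrees(math.asin(0.5 * lam_over_d))
    theta_i = -theta_plus1  # Littrow condition: theta_1 = -theta_i
    s_minus1 = 1.5 * lam_over_d
    theta_minus1 = math.degrees(math.asin(s_minus1)) if s_minus1 <= 1.0 else None
    return theta_i, theta_plus1, theta_minus1

theta_i, theta_plus1, theta_minus1 = littrow_angles_deg(0.4)
print(f"theta_i = {theta_i:.2f} deg")
print(f"theta_+1 = {theta_plus1:.2f} deg")
print(f"theta_-1 = {theta_minus1:.2f} deg")
```

For $\lambda /d=0.4$ this gives a $+1$ order at about $-11.5$ deg and a $-1$ order at about $36.9$ deg, with opposite signs as required by the sign convention.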

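As $\lambda /d$ is increased (with the grating rotated to keep the $+1$ order in the Littrow condition), both the angle of incidence and the diffraction angles increase. If we use Eq. (71) to calculate the value of $\lambda /d$ at which the $-1$ diffracted order goes evanescent (${\theta}_{-1}=\pi /2$), we obtain

## Eq. (72)

$$\mathrm{sin}(\pi /2)=1=\frac{3}{2}\frac{\lambda}{d}\quad \text{or}\quad \lambda /d=2/3.$$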

Clearly, the total amount of energy transmitted through this thin grating does not vary as the angle of incidence of the narrow beam is increased. Thus when the $-1$ diffracted order goes evanescent, the energy that was contained in it (6.25% of the incident energy) is redistributed into the two remaining propagating orders (the Rayleigh anomaly phenomenon).

According to Eq. (68), the renormalization constant $K$ is equal to

## Eq. (73)

$$K=\frac{{\eta}_{-1}+{\eta}_{0}+{\eta}_{1}}{{\eta}_{0}+{\eta}_{1}}=\frac{0.0625+0.25+0.0625}{0.25+0.0625}=1.2.$$

Note the 20% increase in the diffraction efficiency of both the zero and the $+1$ diffracted orders for $\lambda /d>0.667$.^{41} It is thus possible to get a maximum diffraction efficiency of 0.075 for the $+1$ order with a sinusoidal amplitude grating. In spite of this increase over the paraxial prediction of Sec. 5.1, this low diffraction efficiency, combined with the fact that precision sinusoidal amplitude gratings are difficult to fabricate, explains why they are rarely used for practical applications.
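The arithmetic of Eq. (73) can be reproduced with a short Python sketch. It is only an illustration under stated assumptions: the paraxial efficiencies are the values quoted above from Sec. 5.1, the cut-off test for the $-1$ order comes from Eq. (71), and the names `ETA_PARAXIAL` and `renormalized_efficiencies` are hypothetical.

```python
# Paraxial efficiencies of a unit-modulation sinusoidal amplitude grating
# (values quoted in the text from Table 1 of Sec. 5.1).
ETA_PARAXIAL = {-1: 0.0625, 0: 0.25, +1: 0.0625}

def renormalized_efficiencies(lam_over_d):
    """Return (K, {order: efficiency}) with the +1 order held in Littrow.

    Intended for 0 < lam_over_d < 2, where the zero and +1 orders propagate.
    """
    propagating = [0, +1]
    # From Eq. (71), sin(theta_-1) = (3/2) * lam/d, so the -1 order
    # propagates only while that value stays below 1.
    if 1.5 * lam_over_d < 1.0:
        propagating.append(-1)
    total = sum(ETA_PARAXIAL.values())  # transmitted fraction, 0.375
    k = total / sum(ETA_PARAXIAL[m] for m in propagating)
    return k, {m: k * ETA_PARAXIAL[m] for m in propagating}

for ratio in (0.4, 0.8):
    k, eff = renormalized_efficiencies(ratio)
    print(f"lambda/d = {ratio}: K = {k:.3f}, eta_+1 = {eff[+1]:.4f}")
```

Below the cut-off the factor is unity and the paraxial values are recovered; above it the sketch returns $K=1.2$ and a $+1$ efficiency of 0.075, matching the numbers in the text.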

## 6.2.

### Rayleigh Anomalies from Square-Wave Amplitude Gratings

The paraxial behavior of the square-wave amplitude grating was discussed in detail in Sec. 5.2. Equation (28) indicated that there is a myriad of diffracted orders produced; however, they are rapidly attenuated by a ${\mathrm{sinc}}^{2}$ envelope function. For a 50% duty cycle square-wave amplitude grating ($d=2b$), the zeros of the envelope function fall precisely on the even diffraction orders as illustrated in Fig. 33. We see from Eq. (28) and Fig. 33 that the diffraction efficiency of the $m$’th diffracted order is given by

## Eq. (74)

$${\eta}_{m}=\frac{1}{4}\text{\hspace{0.17em}}{\mathrm{sinc}}^{2}\left(\frac{m}{2}\right).$$

The paraxial diffraction efficiencies of the first 19 diffracted orders of a square-wave amplitude grating with a 50% duty cycle are listed in Table 6. Note that 25% of the incident energy is contained in the zero diffracted order, all even orders are identically zero, and the remaining diffracted orders contain another 25%. The remaining 50% of the energy in the incident beam is absorbed by the opaque strips making up the square-wave amplitude grating.

## Table 6

Diffraction efficiencies for the first 19 diffracted orders of a square-wave amplitude grating with $b/d=0.5$.

Order $m$ | Efficiency ${\eta}_{m}$ |
---|---|
0 | 0.2500 |
$\pm 1$ | 0.1013 |
$\pm 2$ | 0.0000 |
$\pm 3$ | 0.0113 |
$\pm 4$ | 0.0000 |
$\pm 5$ | 0.0041 |
$\pm 6$ | 0.0000 |
$\pm 7$ | 0.0021 |
$\pm 8$ | 0.0000 |
$\pm 9$ | 0.0013 |
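The entries of Table 6 follow directly from Eq. (74). The Python sketch below regenerates them, assuming the normalized convention $\mathrm{sinc}(x)=\mathrm{sin}(\pi x)/(\pi x)$, which is the convention that reproduces the tabulated values; the function names are hypothetical.

```python
import math

def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x), with sinc(0) = 1 (assumed convention)."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def eta(m):
    """Paraxial efficiency of order m for a 50% duty cycle, Eq. (74)."""
    return 0.25 * sinc(m / 2) ** 2

for m in range(10):
    label = "0" if m == 0 else f"+/-{m}"
    print(f"{label:>5} | {eta(m):.4f}")

# Energy in the 19 tabulated orders: zero order plus both signs of m = 1..9.
total_19 = eta(0) + 2 * sum(eta(m) for m in range(1, 10))
print(f"sum over 19 orders = {total_19:.4f}")
```

The 19 tabulated orders carry slightly less than the full transmitted fraction of 0.5; the small remainder resides in the odd orders beyond $m=\pm 9$.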

When operating in the Littrow condition, the diffracted orders are distributed symmetrically about the grating normal as shown in Fig. 34. For small $\lambda /d$, there are many diffracted orders, but they all have small diffraction angles. As $\lambda /d$ is increased, both the angle of incidence and the diffraction angles increase, and the higher diffracted orders start going evanescent.

Since the diffracted orders are distributed symmetrically about the grating normal, a positive and a negative order always go evanescent simultaneously. Figure 34 illustrates the situation for a transmission grating with $\lambda /d=0.25$ and the $+1$ diffracted order satisfying the Littrow condition (${\theta}_{1}=-{\theta}_{i}$).

Using Eq. (69) to calculate at what value of $\lambda /d$ the $+2$ diffracted order goes evanescent, we obtain

## Eq. (75)

$$\mathrm{sin}(-\pi /2)=-1=-\left(2-\frac{1}{2}\right)\frac{\lambda}{d}\phantom{\rule[-0.0ex]{1.0em}{0.0ex}}\text{or}\phantom{\rule[-0.0ex]{1.0em}{0.0ex}}\lambda /d=2/3.$$

Similarly, the $-1$ order goes evanescent when

## Eq. (76)

$$\mathrm{sin}(\pi /2)=1=-\left(-1-\frac{1}{2}\right)\frac{\lambda}{d}\phantom{\rule[-0.0ex]{1.0em}{0.0ex}}\text{or}\phantom{\rule[-0.0ex]{1.0em}{0.0ex}}\lambda /d=2/3.$$

We likewise discover that the $-2$ and $+3$ diffracted orders go evanescent when $\lambda /d=2/5$, and the $-3$ and $+4$ diffracted orders go evanescent when $\lambda /d=2/7$, etc.
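The pattern in Eqs. (75) and (76) generalizes: setting $|\mathrm{sin}\,{\theta}_{m}|=|m-\frac{1}{2}|\,\lambda /d=1$ gives a cut-off at $\lambda /d=2/|2m-1|$. The following Python sketch, with the hypothetical helper `cutoff_ratio`, lists these cut-offs as exact fractions under that assumption.

```python
from fractions import Fraction

def cutoff_ratio(m):
    """lambda/d at which order m goes evanescent with the +1 order in Littrow.

    Follows from |m - 1/2| * (lambda/d) = 1, the form used in Eqs. (75), (76).
    """
    return Fraction(2, abs(2 * m - 1))

for m_neg, m_pos in [(-1, 2), (-2, 3), (-3, 4)]:
    assert cutoff_ratio(m_neg) == cutoff_ratio(m_pos)  # orders leave in pairs
    print(f"orders {m_neg:+d} and {m_pos:+d}: lambda/d = {cutoff_ratio(m_pos)}")
```

The pairs $(-1,+2)$, $(-2,+3)$, and $(-3,+4)$ share the cut-offs 2/3, 2/5, and 2/7, which is the simultaneous loss of a positive and a negative order described above.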

Hence, when plotting diffraction efficiency versus $\lambda /d$, there are only two propagating orders (the zero order and the $+1$ order that is being maintained in the Littrow condition) for $\lambda /d>2/3$. All other orders are evanescent.

As with the sinusoidal amplitude grating, the total amount of energy transmitted through a square-wave amplitude grating does not vary as the angle of the incident beam is increased. Thus as each pair of diffracted orders goes evanescent, the energy that was contained by them is redistributed into the remaining propagating orders (again the Rayleigh grating anomaly phenomenon) according to the nonparaxial scalar diffraction theory summarized earlier in this section. The renormalization constant $K$ is equal to

## Eq. (77)

$$K=\frac{\sum _{m=-\infty}^{\infty}{\eta}_{m}}{\sum _{\mathrm{prop}.\mathrm{orders}}{\eta}_{m}}=\frac{0.5}{\sum _{\mathrm{prop}.\mathrm{orders}}{\eta}_{m}}.$$

The diffraction efficiencies of the zero order and of the $+1$ order, which is maintained in the Littrow condition, are plotted versus $\lambda /d$ in Fig. 35 for a square-wave amplitude diffraction grating.

Note in Fig. 35 the incremental increase in the diffraction efficiency of both the zero and the $+1$ diffracted orders as successive pairs of diffracted orders go evanescent.^{41} A major increase is observed for $\lambda /d>0.667$, once the $-1$ order has gone evanescent, after which the renormalization factor has a value of

## Eq. (78)

$$K=\frac{0.5}{{\eta}_{0}+{\eta}_{1}}=\frac{0.5}{0.25+0.1013}=1.423.$$

It is thus possible to get a maximum diffraction efficiency of 0.1442 for the $+1$ order with a square-wave amplitude grating. This is a 42.3% increase over the paraxial value of 0.1013.
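The staircase behavior described for Fig. 35 can be sketched numerically by combining Eq. (74) with Eq. (77). The Python code below is a minimal illustration, not the authors' implementation: it assumes the normalized sinc convention, treats order $m$ as propagating while $|m-\frac{1}{2}|\,\lambda /d<1$ (the form used in Eqs. (75) and (76)), and uses the hypothetical function names `eta` and `littrow_efficiencies`.

```python
import math

def eta(m):
    """Paraxial efficiency of order m, Eq. (74), with sinc(x) = sin(pi x)/(pi x)."""
    if m == 0:
        return 0.25
    x = m / 2
    return 0.25 * (math.sin(math.pi * x) / (math.pi * x)) ** 2

def littrow_efficiencies(lam_over_d):
    """Return (K, eta_0, eta_+1) after renormalization, for 0 < lam_over_d < 2."""
    m_max = int(1.0 / lam_over_d) + 2  # generous bound on the propagating orders
    # Order m propagates while |sin(theta_m)| = |m - 1/2| * lam/d < 1.
    prop_sum = sum(eta(m) for m in range(-m_max, m_max + 1)
                   if abs(m - 0.5) * lam_over_d < 1.0)
    k = 0.5 / prop_sum  # 0.5 is the total transmitted fraction in Eq. (77)
    return k, k * eta(0), k * eta(1)

for ratio in (0.05, 0.25, 0.5, 0.8):
    k, e0, e1 = littrow_efficiencies(ratio)
    print(f"lambda/d = {ratio:4.2f}: K = {k:.4f}, eta_0 = {e0:.4f}, eta_+1 = {e1:.4f}")
```

For $\lambda /d>2/3$ the sketch returns a $+1$ efficiency of about 0.1442, in agreement with the maximum quoted above, while for small $\lambda /d$ the factor $K$ approaches unity and the paraxial values of Table 6 are recovered.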

## 7.

## Summary and Conclusions

Elementary diffraction grating behavior (including diffraction efficiency and dispersion) was reviewed and early challenges in the development of diffraction grating fabrication technology were discussed. The importance of maintaining consistency in the sign convention for the planar diffraction grating equation was emphasized. The advantages of discussing conical diffraction grating behavior in terms of the direction cosines of the incident and diffracted angles were demonstrated, particularly for oblique incident angles and arbitrary grating orientation.

The paraxial grating behavior for coarse gratings ($d\gg \lambda $) was derived and displayed graphically for five elementary grating types: the sinusoidal amplitude grating, the square-wave amplitude grating, the sinusoidal phase grating, the square-wave phase grating, and the classical blazed grating (sawtooth groove profile). Paraxial diffraction efficiencies for various diffracted orders were calculated, tabulated, and compared for these five elementary grating types.

Since much of the grating community erroneously believes that scalar diffraction theory is only valid in the paraxial regime ($d\gg \lambda $), it was emphasized that this limitation is due to an “unnecessary” paraxial approximation in the traditional Fourier treatment of scalar diffraction theory, not a limitation of scalar theory itself. The development of a linear systems formulation of “nonparaxial scalar diffraction theory”^{20}^{–}^{23} was thus briefly reviewed, then used to predict the nonparaxial behavior of both the sinusoidal and the square-wave amplitude transmission gratings when the $+1$ diffracted order is maintained in the Littrow condition. This nonparaxial behavior included the well-known Rayleigh anomaly effects that are usually thought to require rigorous (vector) electromagnetic theory.

A companion paper, *Understanding Diffraction Grating Behavior, Part II*, is currently in progress and will discuss in detail the limits of applicability of nonparaxial scalar diffraction theory to sinusoidal reflection (holographic) gratings as a function of the ratio of grating period to wavelength.

## References

## Biography

**James E. Harvey** received his PhD in optical sciences from the University of Arizona. He is a retired associate professor from CREOL at the College of Optics and Photonics of the University of Central Florida, and is currently a senior optical engineer with Photon Engineering, LLC, Tucson, Arizona, USA. He is credited with more than 220 publications and conference presentations in diverse areas of applied optics. He is a member of OSA and a fellow and past board member of SPIE.

**Richard N. Pfisterer** received his bachelor’s and master’s degrees in optical engineering from the Institute of Optics at the University of Rochester in 1979 and 1980, respectively. He is a cofounder and president of Photon Engineering, LLC. Previously, he was the head of optical design at TRW (now Northrop Grumman) and a senior optical engineer at Breault Research Organization. He is credited with more than 20 articles and conference presentations in the areas of optical design, stray light analysis, and phenomenology. He is a member of OSA and SPIE.