Theoretical study for aerial image intensity in resist in high numerical aperture projection optics and experimental verification with one-dimensional patterns

Abstract. In optical lithography, high-performance exposure tools are indispensable for obtaining not only fine patterns but also precise pattern widths. Because an accurate theoretical method is necessary to predict these values, several pioneering and valuable studies have been reported. However, there may be some ambiguity or lack of consensus regarding the treatment of diffraction by the object, the incoming inclination factor onto the image plane in scalar imaging theory, and the paradoxical behavior of an inclined plane wave entering the image plane in vector imaging theory. We have reconsidered imaging theory in detail and phenomenologically resolved the paradox. By comparing the theoretical aerial image intensity with the experimental pattern width for one-dimensional patterns, we have validated our theoretical considerations.

Shibuya, Takada, and Nakashima

© The Authors. Published by SPIE under a Creative Commons Attribution 3.0 Unported

Introduction
Since optical lithography requires not only fine patterns but also precise pattern widths, an accurate imaging model is necessary to predict the transfer characteristics of the projection optical system. Several researchers have proposed advanced arguments and useful theories. [1][2][3][4][5][6][7][8][9] These previous studies, however, involve ambiguity in the treatment of the emerging inclination factor from the object and the incoming inclination factor onto the image in scalar imaging theory. In addition, we point out a paradoxical phenomenon between the electric field intensity and the photon number on the image plane in vector imaging theory.
Fundamentally, scalar imaging theory is only an approximation, but it has been widely used for evaluating optical imaging performance and is also the basis for constructing vector imaging theory. Therefore, self-consistency is essential in scalar imaging theory. By respecting the law of energy conservation, or the law of radiance, [10,11] we have already introduced inclination factors and reconstructed the scalar imaging theory to make it self-consistent. As a result, we have derived and confirmed that the scalar imaging theory satisfies the following self-consistency conditions.
(1) Correspondence principle between wave optics and geometrical optics: In the limit λ → 0, the wave-optical point spread function (PSF) should be equal to the spot diagram. [8]
(2) Reciprocity: When the object and the image are exchanged with each other, the PSF is perfectly similar to the original one. [7,8]
(3) Parseval theorem: [12] Even in the case of high numerical aperture (NA), this mathematical theorem for the Fourier transform is physically satisfied as energy conservation between the pupil and the image. [7]
In this paper, we reconsider the imaging theory in detail and compare the theoretical aerial image intensity with the experimental pattern width in resist for one-dimensional (1-D) patterns in optical lithography.
In Sec. 2.1, by considering the irradiance on the illuminated plane, we point out the importance of energy conservation and present the meaning of the incoming inclination factor. We also briefly explain the fundamental meaning of the Poynting vector and energy flow.
In Sec. 2.2, we review and discuss scalar imaging theory. We introduce the illuminating inclination factor onto the object (mask), the emerging inclination factor from the object, and the incoming inclination factor onto the image (wafer). [7][8][9][10][11][13] As in previous papers, [2,7,9] we also introduce the well-known radiometric correction (RC) factor, which is due to the ratio between the cross-sectional area of the exit plane wave from the object and that of the entrance plane wave onto the image in projection optics. Moreover, by considering the change in the cross-sectional area of the inclined illuminating plane wave, we point out a scaling factor for the amplitude of the incoming illuminating plane waves, which is introduced here for the first time, to the best of our knowledge. As a result, since the above factors cancel each other, the Fourier imaging theory is exactly fulfilled in scalar imaging theory such as the Hopkins theory. [14] Consequently, the consistent theory we present does not need any explicit correction factor.
In order to comprehend the scalar imaging theory, we discuss some additional aspects in Appendix A.
In Sec. 2.3, we discuss vector imaging theory. Although the vector theory is similar to the scalar theory, the calculation of vector diffraction on the object is different from that of scalar diffraction. Also, the treatment on the image is different.
In Sec. 2.4, we point out a paradox: the electric field intensity appears to contradict the incoming energy (or photon number, or Poynting vector) on the resist surface. By considering the substantial optical path length due to oblique incidence, we can phenomenologically resolve this contradiction. [9]
In Sec. 3, by comparing experimental results with numerical calculations of the 1-D aerial image intensity in the resist, we confirm the validity of the imaging theory, which directly treats the electric field in the resist. Because optical lithography places severe requirements on the size and quality of patterns, comparing numerical calculations of aerial image intensity with experimental results for high-NA projection is meaningful and worthwhile.
Even though there are many useful lithography simulators, their detailed simulation procedures are not necessarily disclosed, so we developed in-house software. Its validity, checked by comparison with other well-known simulators, is summarized in Appendix B.
Although further discussion may be necessary, we expect our proposal to be useful for optical lithography and imaging optics.

Basic Algorithm for Imaging Theory
In Secs. 2.1 and 2.2, we review and reconsider scalar imaging theory, taking energy conservation into account in particular. We discuss not only projection optics but also illumination optics. In Secs. 2.3 and 2.4, we reconsider vector imaging theory, taking into account in particular the paradox between the electric field intensity and the incoming energy on the resist.

Fundamental Concept of Inclination Factor
The illuminated area of an inclined plane wave scales by $1/\cos\theta$, as shown in Fig. 1. Therefore, by conservation of energy, the irradiance scales by $\cos\theta$. This is the fundamental meaning of the incoming inclination factor. [8] It is also called the winter effect. [15] Since the amplitude is proportional to the square root of the energy density, the effective amplitude of the incoming plane wave on the illuminated plane can be regarded as scaling by $\sqrt{\cos\theta}$. This phenomenon can also be understood from the viewpoint of geometrical optics. Since a ray carries a certain energy, if the ray enters the illuminated plane obliquely, the area it covers on the plane scales by $1/\cos\theta$, and the energy density on this plane therefore scales by $\cos\theta$.
From another point of view, since a ray carries a certain energy, the normal component of the energy flux scales by $\cos\theta$, as shown in Fig. 2. This is consistent with the decrease in irradiance.
From the perspective of electromagnetic theory, the Poynting vector represents the energy flow. In the simple case of Figs. 1 and 2, in which the plane wave propagates in an isotropic medium, the Poynting vector is directed along the ray and its normal component on the illuminated plane also scales by $\cos\theta$.
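As a minimal numerical sketch of the inclination factor described above, the following Python snippet evaluates the irradiance and amplitude scaling for a given incidence angle. The function name `incoming_inclination` is hypothetical and is introduced only for illustration; it assumes a lossless plane wave in an isotropic medium.

```python
import math

def incoming_inclination(theta_rad):
    """Return (irradiance_factor, amplitude_factor) for incidence angle theta.

    Hypothetical helper: assumes a plane wave in an isotropic, lossless medium.
    """
    if not 0.0 <= theta_rad < math.pi / 2:
        raise ValueError("theta must lie in [0, pi/2)")
    area_factor = 1.0 / math.cos(theta_rad)          # illuminated area grows by 1/cos(theta)
    irradiance_factor = 1.0 / area_factor            # energy conservation gives cos(theta)
    amplitude_factor = math.sqrt(irradiance_factor)  # amplitude ~ sqrt(energy density)
    return irradiance_factor, amplitude_factor

# Example: a plane wave inclined by 60 degrees.
irr, amp = incoming_inclination(math.radians(60.0))
print(irr, amp)
```

At 60 degrees the irradiance is halved, which is the same statement as the normal component of the Poynting vector scaling by cos θ.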
By extending the concept of the incoming inclination factor by analogy to the plane wave emerging from the object, we also introduce the emerging inclination factor.
By taking the incoming inclination factor into account, the correspondence principle between wave optics and geometrical optics [8] and the physical fulfillment of the Parseval theorem [7] are satisfied. The combination of the emerging inclination factor and the incoming inclination factor leads to the reciprocity theorem. [7,8] In addition, we point out that the incoming and emerging inclination factors are not new concepts and are well known for diffraction at an aperture in conventional textbooks. [12] We also understand that the incoming inclination factor is used for numerical beam propagation calculations in beam synthesis propagation, which is a function of the optical design program CODE-V.

Scalar Imaging Theory
The amplitude per unit area of a propagating plane wave is modified (transformed) by propagation through the optics, diffraction by the object (mask), and incidence onto the image (wafer). We explain these changes step by step by referring to Fig. 3. Here, $\theta_s$ is the angle between the propagation direction of the illuminating plane wave onto the object and the optical axis, $\theta$ is the angle between the propagation direction of the emerging plane wave from the object and the optical axis, and $\theta'$ is the angle between the propagation direction of the incoming plane wave onto the image and the optical axis.
① Illumination scaling factor: scaling (transformation) of amplitude of incoming illuminating plane waves.
Since the incoming illuminating plane wave is inclined, its width (cross-sectional area) scales by a factor of $\cos\theta_s$ (relative to the on-axis case of $\theta_s = 0$). Thus, to conserve energy, the energy density (or the energy flow density) of the incoming plane wave scales by $1/\cos\theta_s$. Since the amplitude is proportional to the square root of the energy density, the amplitude of the incoming plane wave scales by $1/\sqrt{\cos\theta_s}$. To the best of our knowledge, this factor has not been explicitly pointed out before.
In the above argument, we implicitly assume that the condenser lens is an f-sin θ lens, as shown in Fig. 4. NAi is the numerical aperture emerging from the source and is related to the given illuminated area (size) on the object (mask). For an f-sin θ lens, NAi is constant with respect to the source height $h_s = f \sin\theta_s$. Therefore, we can easily recognize the properties of the illuminating plane wave. Even if the condenser lens is not actually an f-sin θ lens, we can consider instead the combination of an f-sin θ lens and a modified source intensity distribution that gives the equivalent illuminating radiance. Therefore, the assumption of an f-sin θ lens can be adopted without loss of generality in theoretical consideration.
For simplicity, we also assume f-sin θ lenses for both the front part and the rear part of the projection optics. In practice, these parts of the projection optics are not necessarily f-sin θ lenses. [16] However, since we mainly discuss the optical relation between object and image, this assumption can be accepted without loss of generality in theoretical consideration.
② Incoming mask factor: illuminating inclination factor onto the object (mask)
The illuminated mask field size is $1/\cos\theta_s$ times the width (cross-sectional area) of the incoming illuminating plane wave. Thus, to conserve energy, the energy density on the mask plane scales by $\cos\theta_s$. Since the amplitude is proportional to the square root of the energy density, the amplitude on the mask plane scales by a factor of $\sqrt{\cos\theta_s}$.
③ Mask Fourier transform: Fourier transform by the object (mask)
An incident illuminating plane wave is modulated by the transmittance pattern of the object. The modulated distribution is created just behind the object and is decomposed into various Fourier components. This is simply the Fourier transform.
④ Emerging mask factor: emerging inclination factor from the object (mask)
Each Fourier component is transformed into a certain exit (emerging) plane wave. The width (cross-sectional area) of the exit plane wave scales by $\cos\theta$. Thus, conservation of energy requires the energy density to scale by $1/\cos\theta$ and the amplitude of the exit plane wave to scale by $1/\sqrt{\cos\theta}$. Even though this inclination factor is important, it has sometimes been overlooked. By considering this change in the width of the emerging plane wave, it can be shown that the radiance characteristic of the object is Lambertian, as explained in Appendix A.
⑤ RC: magnification factor from object space to image space
The cross-sectional area of the plane wave in the image space is $|\beta|^2 \cos\theta' / \cos\theta$ times that in the object space. Here, β is the lateral magnification of the projection optics ($|\beta| = \mathrm{NAo}/\mathrm{NAi}$, where NAo is the object-side NA of the projection optics and NAi is the NA on the image side). Therefore, the energy density scales by $\cos\theta / (|\beta|^2 \cos\theta')$ and the wave amplitude scales by $\sqrt{\cos\theta} / (|\beta| \sqrt{\cos\theta'})$. This is the so-called RC. [2,7,9]
⑥ Incoming image factor: incoming inclination factor onto the image (wafer)
The image area size is $1/\cos\theta'$ times the cross-sectional area of the incoming plane wave. Thus, conservation of energy requires the amplitude at the image plane to scale by $\sqrt{\cos\theta'}$. We refer to this term as the incoming inclination factor [8] and have called this phenomenon the winter effect. [15] This factor is exactly the same as the incoming mask factor of ②.
⑦ Image Fourier transform: Fourier transform on the image (wafer)
For a certain point source, many plane waves are generated by diffraction at the object. They propagate through the projection lens and enter the image plane. Thus, the image amplitude distribution is obtained by summation of these plane waves. This is formally represented by the Fourier transform.
Multiplying the factors of ①, ②, ④, ⑤, and ⑥,
$$\frac{1}{\sqrt{\cos\theta_s}} \times \sqrt{\cos\theta_s} \times \frac{1}{\sqrt{\cos\theta}} \times \frac{\sqrt{\cos\theta}}{|\beta| \sqrt{\cos\theta'}} \times \sqrt{\cos\theta'} = \frac{1}{|\beta|}$$
is obtained. Namely, the angle-dependent correction factors of ①, ②, ④, ⑤, and ⑥ compensate for one another, leaving only the constant $1/|\beta|$. Therefore, only the mask Fourier transform of ③ and the image Fourier transform of ⑦ are practically active. Namely, the plane wave that corresponds to the Fourier transform of the transmittance pattern of the object (mask) is completely transformed (regenerated) on the image (wafer) plane. Then the summation, or the Fourier transform, of these plane waves gives the image amplitude distribution. This result is exactly the same as in Ref. 14.
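The mutual compensation of the factors ①, ②, ④, ⑤, and ⑥ can be checked numerically. The following is a minimal sketch, not the authors' implementation; the function names and dictionary keys are hypothetical, and the angles are treated as independent inputs because the cancellation does not depend on how $\theta$ and $\theta'$ are related.

```python
import math

def scalar_chain_factors(theta_s, theta, theta_img, beta):
    """Amplitude factors of steps 1, 2, 4, 5, and 6 (angles in radians).

    theta_s: illumination angle, theta: emerging angle from the object,
    theta_img: incoming angle onto the image, beta: lateral magnification.
    """
    return {
        "1_illumination_scaling": 1.0 / math.sqrt(math.cos(theta_s)),
        "2_incoming_mask": math.sqrt(math.cos(theta_s)),
        "4_emerging_mask": 1.0 / math.sqrt(math.cos(theta)),
        "5_radiometric_correction": math.sqrt(math.cos(theta))
        / (abs(beta) * math.sqrt(math.cos(theta_img))),
        "6_incoming_image": math.sqrt(math.cos(theta_img)),
    }

def net_amplitude_factor(theta_s, theta, theta_img, beta):
    """Product of all step factors; only the constant 1/|beta| should remain."""
    return math.prod(scalar_chain_factors(theta_s, theta, theta_img, beta).values())

# Example with arbitrary angles and a 1/4 reduction system.
print(net_amplitude_factor(0.10, 0.20, 0.90, 0.25))
```

For any admissible angles the product reduces to the angle-independent constant $1/|\beta|$, which is why only the two Fourier transforms of ③ and ⑦ remain active.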

Vector Imaging Theory
Vector imaging theory is constructed on the basis of scalar imaging theory. For propagation through the optics, the vector theory is similar to the scalar theory. Therefore, the illumination scaling factor of ① and the RC of ⑤ are applied in the same way as in scalar imaging theory. However, to treat diffraction by the object (mask), we use a method different from that of scalar imaging theory. For a fine-pattern object, the diffraction should be treated by rigorous electromagnetic theory, such as rigorous coupled wave analysis (RCWA). The incoming mask factor of ②, the mask Fourier transform of ③, and the emerging mask factor of ④ are automatically included in the RCWA calculation. In RCWA, the illuminating entrance plane wave is assumed to extend infinitely and its amplitude is normalized. The emerging plane wave is also assumed to extend infinitely, and its amplitude is obtained numerically. Therefore, there is no ambiguity in the mathematical treatment of diffraction by the object in vector imaging theory. Other methods, such as the finite-difference time-domain method, are also available; since their results can be compared with those of RCWA, the argument based on RCWA extends in principle to them.
For a plane wave, the energy flux density is proportional to the square of the electric field. Thus, the concept of amplitude in scalar imaging theory corresponds to the electric field in vector imaging theory. Since we directly consider the electric field on the image plane, the incoming image factor of ⑥ is not necessary in vector imaging theory. We should consider the components of the electric field produced by the summation of all entrance plane waves. Namely, the image Fourier transform of ⑦ is evaluated for each electric field component separately. Therefore, there is no ambiguity in the treatment of the electric field in vector imaging theory. However, as shown in Sec. 2.4, there is a paradoxical issue related to the incoming image factor of ⑥.

Paradoxical Phenomenon for Vector Imaging Theory
In Fig. 5(a), the phase-shifting mask [17,18] is illuminated coherently (σ = 0) by an s-polarized source (the electric field is normal to the plane of the paper) and the L/S pattern (45 or 32 nm L/S) is reimaged by the interference between the first and minus-first order diffracted waves. Since the entrance plane wave is inclined, its energy flux density is increased by $1/\cos\theta'$ and its amplitude, or its electric field, is increased by $1/\sqrt{\cos\theta'}$. As the electric field is normal to the plane of the paper, the electric field distribution on the image plane, which is produced by the interference between the two diffracted waves, also scales by $1/\sqrt{\cos\theta'}$. Therefore, the electric field intensity scales by $1/\cos\theta'$, as shown in Fig. 5(b), which is calculated by the lithography simulator PROLITH.
Since the chemical response is a function of the square of the electric field, the resist response might also increase as a function of $1/\cos\theta'$. However, the number of photons is exactly the same in these two cases (45 and 32 nm L/S), and according to our simple calculations, the normal component of the Poynting vector (energy flux) on the image plane is the same. Thus, the increase in response might appear to contradict the invariance of energy, which is paradoxical.
This contradiction can be phenomenologically explained as follows. If we assume an infinitesimally thin resist of thickness h, when the entrance beam is inclined, the optical path length is substantially increased by $1/\cos\theta''$, as shown in Fig. 6. Here, $\theta''$ is the angle of inclination in the resist. (Note that we should use this angle instead of $\theta'$ for calculating the RC of ⑤ and the incoming image factor of ⑥.) Therefore, the absorption increases by $1/\cos\theta''$ and the chemical response also increases in the same manner. [9] This is exactly equivalent to considering the increase in the electric field intensity by $1/\cos\theta''$. This explanation is phenomenological, but it may be a clear reason for the increase in the resist response as the entrance beam is inclined. Thus, there is no contradiction. In the case of finite thickness, if the resist is divided into many thin layers, the same argument can be applied.
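To give a feeling for the size of the path-length factor $1/\cos\theta''$, the following sketch evaluates it for two-beam imaging of an L/S pattern. It is an illustration under stated assumptions rather than the calculation used for Fig. 5: the function name is hypothetical, the 193 nm wavelength is taken from the simulation parameters listed in Appendix B, and the resist refractive index of 1.7 is an assumed value that the text does not give.

```python
import math

def resist_path_factor(half_pitch_nm, wavelength_nm=193.0, n_resist=1.7):
    """Return 1/cos(theta'') for two-beam imaging of an L/S pattern.

    half_pitch_nm: line (= space) width on the wafer.
    n_resist: ASSUMED real refractive index of the resist (not from the paper).
    """
    pitch = 2.0 * half_pitch_nm
    # Interference of the +1 and -1 orders: pitch = wavelength / (2 n sin(theta)).
    n_sin_theta = wavelength_nm / (2.0 * pitch)
    # Snell's law: n sin(theta) is conserved across planar interfaces.
    sin_in_resist = n_sin_theta / n_resist
    if sin_in_resist >= 1.0:
        raise ValueError("pattern too fine: the wave is evanescent in the resist")
    return 1.0 / math.sqrt(1.0 - sin_in_resist ** 2)

factor_45 = resist_path_factor(45.0)  # 45 nm L/S
factor_32 = resist_path_factor(32.0)  # 32 nm L/S
print(factor_45, factor_32)
```

Under these assumptions the finer pattern has the larger factor, in line with the stronger resist response for the more steeply inclined beams.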
This argument is closely connected to the concept of the incoming inclination factor in scalar imaging theory. Assuming a resist with extremely high absorption, such as a black body, all the energy is absorbed in an infinitesimally thin surface layer. Thus, there is substantially no difference between the absorption of normally incident waves and that of obliquely incident waves. Therefore, we should introduce the incoming inclination factor of ⑥ in order to eliminate the effect of the intensity increase of the entrance plane wave in scalar imaging theory. (However, if we apply scalar imaging theory to a resist with finite absorption, the substantial increase in optical path length should be considered. [7])

3 Comparing the Simulation Results with Experimental Data for One-Dimensional Patterns
In order to confirm the validity of the imaging theory, especially with regard to the paradox, we compare experiment with simulation. Although it would be desirable to take the resist process into account, it involves many parameters to be optimized, so the comparison could become ambiguous. Therefore, we compare the simulated aerial image with the experimental results. Since the detailed simulation procedures of publicly available commercial lithography simulators are not known to us, we have developed in-house software. A numerical comparison between well-known commercial simulators and our in-house simulator is presented in Appendix B. Although our theoretical discussion is not restricted to 1-D patterns, we experimentally examine 1-D patterns for simplicity.

Experimental and Simulation Conditions
Simulation is fundamentally based on vector imaging theory. In our in-house simulator, diffraction by the object (mask) is calculated by RCWA. We consider two simulation methods, with and without the incoming image factor of ⑥.
The CD, which refers to the pattern width in the field of optical lithography, is experimentally defined as the bottom width of the remaining resist, as shown in Fig. 7. Experimental data are obtained at the best focus and best dose (exposure energy) for a standard (anchor) pattern, which is the 40 nm L/S pattern. The best dose and best focus are determined from the dose-focus process window.
Definition of CD for simulation: The aerial image intensity is defined as the square of the electric field. The threshold for the aerial image intensity is defined as the value that gives the desired CD for the standard pattern (40 nm L/S pattern). The resist is divided into 20 layers, and the aerial image intensity is calculated on the 21 boundaries. For comparison with the experimental pattern width, we adopt the average of the intensities on the five boundaries just above the bottom of the resist.
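The simulated CD definition above can be sketched in a few lines of Python. This is only an illustrative reading of the procedure, not the in-house simulator: all function names are hypothetical, the resist is assumed to be positive tone (the line remains where the intensity stays below the threshold), the five averaged boundaries are assumed to exclude the bottom boundary itself, and the profile is assumed to contain a single line.

```python
import numpy as np

def average_near_bottom(layer_profiles, n_boundaries=5):
    """Average the profiles on the boundaries just above the resist bottom.

    layer_profiles has shape (n_total_boundaries, n_x), ordered top to bottom.
    ASSUMPTION: the bottom boundary itself (last row) is excluded.
    """
    profiles = np.asarray(layer_profiles, dtype=float)
    return profiles[-(n_boundaries + 1):-1].mean(axis=0)

def line_cd(x, intensity, threshold):
    """Width of the region around the darkest point with intensity < threshold."""
    x = np.asarray(x, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    i_min = int(np.argmin(intensity))
    if intensity[i_min] >= threshold:
        return 0.0  # nothing stays below the threshold: no line remains

    def crossing(step):
        i = i_min
        while 0 <= i + step < len(x) and intensity[i + step] < threshold:
            i += step
        j = i + step
        if not 0 <= j < len(x):
            return x[i]  # threshold not reached inside the window
        # Linear interpolation between sample i (below) and j (at or above).
        t = (threshold - intensity[i]) / (intensity[j] - intensity[i])
        return x[i] + t * (x[j] - x[i])

    return crossing(+1) - crossing(-1)

def calibrate_threshold(x, intensity, target_cd, iterations=60):
    """Bisection for the threshold that reproduces target_cd on the anchor."""
    lo, hi = float(np.min(intensity)), float(np.max(intensity))
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if line_cd(x, intensity, mid) < target_cd:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Synthetic anchor: ideal two-beam image of a 40 nm L/S pattern (80 nm pitch).
x_nm = np.linspace(-40.0, 40.0, 801)
anchor = 0.5 * (1.0 - np.cos(2.0 * np.pi * x_nm / 80.0))
threshold = calibrate_threshold(x_nm, anchor, target_cd=40.0)
print(threshold, line_cd(x_nm, anchor, threshold))
```

The threshold calibrated on the anchor would then be reused unchanged for the other pattern widths, so that differences in simulated CD reflect only the aerial image.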

Comparison and Discussion
We compare the simulated CD with the experimental pattern width. In order to discuss the necessity of the incoming image factor of ⑥, we numerically calculate the image intensities by two methods.
First, we calculate the aerial image intensity by simply using the vector imaging theory described in Sec. 2.3. That is, we do not consider the incoming inclination factor of ⑥. In Fig. 8, simulations are compared with the experimental CD at best focus and dose. Since the absolute value of the best focus is not experimentally measured, the focus is varied in the simulation, starting from 80 nm [Fig. 8(a)].
Second, we consider the incoming image factor of ⑥. This is consistent with a constant number of photons entering the resist. In Fig. 9, simulations are compared with the experimental CD. In the simulation data, the focus is varied from 90 nm [Fig. 9(a)] to 50 nm [Fig. 9(c)] in water. In the resist, these correspond to 105, 82, and 58 nm, respectively.
Figures 8(b) and 9(b) might correspond to the best-focus simulation data. Comparing these, especially when the pattern width is narrower than 70 nm, the simulation data in Fig. 8(b) agree with the experimental data more closely than those in Fig. 9(b). For wide patterns, simulation and experiment do not agree well. This difference might be due to the effect of the resist process, which depends on the nominal pattern width. Therefore, we can conclude that simulation without the incoming image factor of ⑥ is correct in vector imaging theory.
To further confirm the validity of our conclusion, it would be valuable to compare experimental data under defocus and for two-dimensional (2-D) patterns.

Summary
In optical lithography, high-performance exposure tools are necessary to obtain not only fine patterns but also precise pattern widths. Therefore, an accurate theoretical method is needed to predict these values precisely. At the same time, lithography experiments enable us to evaluate the validity of imaging theory. However, there might be some ambiguity or lack of consensus regarding the treatment of diffraction by the object in scalar imaging theory and the paradoxical phenomenon for the inclined entrance plane wave in vector imaging theory. Therefore, we have reconsidered the imaging theory in detail and compared the theoretical aerial image intensity with the experimental pattern width.
It would be desirable to take the resist process into account. However, this process involves many parameters to be optimized, so a comparison between theory and experiment might become complex and ambiguous. Thus, we have concentrated on the comparison between the image intensity in the resist and the experimental pattern width.
In order to discuss the necessity of the incoming inclination factor onto the image (incoming image factor of ⑥), we have calculated the image intensities by two methods. First, we calculated the aerial image intensity by simply using vector imaging theory. In other words, we considered the strengthening of the electric field of the inclined wave due to the RC of ⑤ but did not consider the incoming image factor of ⑥. Even though this appears to contradict the conservation of the energy entering the resist, we can phenomenologically explain it by the fact that the substantial optical path is elongated. Second, we considered the incoming image factor of ⑥. This is consistent with a constant number of photons entering the resist. From the comparison between the experimental 1-D pattern width and the simulated aerial image pattern, simulation without the incoming image factor seems to be correct in terms of vector imaging theory.
We feel that further discussion is necessary and that experiments for defocus, larger 1-D features, and 2-D patterns should be carried out. However, we expect our proposal and demonstration to be useful for optical lithography and fundamental optics.

Appendix A: Imaging Theory

A1 Dividing the Process (Transformation) of ⑤ (RC) into Two Processes of ⑤-A (Object Space to Stop Correction) and ⑤-B (Stop to Image Space Correction)
The process of ⑤ (RC) can be divided into two processes, ⑤-A and ⑤-B. ⑤-A is the process in which the emerging plane wave propagates from the object space to the stop (or the virtual stop), and ⑤-B is the process from the stop to the image space.
In step ⑤-A (object space to stop correction), since the plane wave emerging from the object is inclined, its width (cross-sectional area) scales by $\cos\theta$ relative to the normally emerging plane wave. Thus, when the amplitude per unit area on the plane wave is given, the total energy contained in this cross-sectional area should be multiplied by $\cos\theta$. Since we assume that the front part of the projection optics is an f-sin θ lens, as shown in Fig. 10, and the Fourier coordinate is proportional to $\sin\theta$, the height of the focal point of the plane wave on the stop is proportional to the Fourier coordinate of the object. Therefore, the energy on the stop scales by $\cos\theta$ and the amplitude on the stop scales by $\sqrt{\cos\theta}$.
In step ⑤-B (stop to image space correction), the width (cross-sectional area) of the incoming plane wave onto the image is $\cos\theta'$ times that of the normally incident plane wave. The energy density scales by $1/\cos\theta'$ and the amplitude scales by $1/\sqrt{\cos\theta'}$ on the plane wave. This consideration is the same as in ① (illumination scaling factor).
From ⑤-A (object space to stop correction) and ⑤-B (stop to image space correction), we obtain $\sqrt{\cos\theta} / \sqrt{\cos\theta'}$. In addition, the cross-sectional area of the normally incident plane wave in image space is $|\beta|^2$ times that of the normally emerging plane wave in object space. Thus, the energy density of the incoming plane wave scales by $1/|\beta|^2$ relative to the emerging plane wave and the amplitude scales by $1/|\beta|$.
Therefore, the amplitude scales in total by $\sqrt{\cos\theta} / (|\beta| \sqrt{\cos\theta'})$, which is equal to ⑤ of RC.
By adding the entrance reference sphere as shown in Figs. 11 and 12, ④ (emerging mask factor) + ⑤-A (object space to stop correction) can be understood as follows. As in the argument for the emerging mask factor ④, the width (cross-sectional area) of the exit plane wave scales by $\cos\theta$, so the diffracted spread width on the (infinite) entrance reference sphere is proportional to $1/\cos\theta$, as shown in Fig. 11. Thus, the energy density scales by $\cos\theta$ on the entrance reference sphere, while the amplitude scales by $\sqrt{\cos\theta}$. As shown in Fig. 12, if the front part of the projection lens is an f-sin θ lens, an elemental area on the stop is $\cos\theta$ times the corresponding area on the entrance reference sphere. Therefore, the energy density on the stop scales by $1/\cos\theta$ and the amplitude scales by $1/\sqrt{\cos\theta}$. By considering these two effects, we obtain $\sqrt{\cos\theta} \times 1/\sqrt{\cos\theta} = 1$. Namely, the Fourier transform of the object transmittance, which is produced just behind the object by the mask Fourier transform ③, appears exactly on the stop without additional correction factors.
In addition, Hopkins wrote "Fourier spectrum of object appears over the entrance pupil sphere." [14] However, he did not address how it quantitatively appears. Since the Fourier imaging theory is completely fulfilled in his mathematical treatment, [14] we think his statement means the fact shown in Figs. 11 and 12. We have already pointed out this issue. [13] If we assume that his proposition is quantitatively fulfilled and consider the amplitude transformation from the entrance reference sphere to the exit reference sphere, we derive a different result, namely that the Fourier imaging theory is not completely fulfilled. [13]

A3 Processes of ③ (Mask Fourier Transform), ④ (Emerging Mask Factor), and ⑤-A (Object Space to Stop Correction)
By considering a pinhole object, the processes of ③ (mask Fourier transform), ④ (emerging mask factor), and ⑤-A (object space to stop correction) can be explained as follows. [7,9] When we consider a pinhole object, spatial coherence due to the illumination condition does not affect the image properties at all. The pinhole can be represented by a 2-D δ-function because the pinhole is an infinitesimally small 2-D surface element. As the Fourier transform of a δ-function is uniform with respect to the Fourier coordinates, every emerging plane wave has equal energy. Since the Fourier coordinate is proportional to $\sin\theta$, as shown in Fig. 12, the corresponding coordinate interval on the reference sphere is proportional to $1/\cos\theta$. From these two results, the energy density on the entrance reference sphere scales by $\cos\theta$ and the amplitude scales by $\sqrt{\cos\theta}$. This means that the pinhole is Lambertian.
In addition, this result is consistent with the following fact: since the width (cross-sectional area) of the inclined emerging plane wave scales by $\cos\theta$ relative to the normally emerging plane wave, the diffraction angle scales by $1/\cos\theta$.
Any object can be considered to consist of a sum of pinholes (2-D δ-functions). Therefore, in the mathematical limit of incoherent illumination (coherence factor σ = ∞), they form a Lambertian surface.
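The pinhole argument can be illustrated numerically in one dimension. The sketch below assumes only what the text states, namely that the energy is uniform per unit Fourier coordinate and that the Fourier coordinate is proportional to $\sin\theta$; the function name is hypothetical. The energy per unit angle on the reference sphere is then the derivative of the Fourier coordinate with respect to $\theta$.

```python
import numpy as np

def energy_per_unit_angle(theta):
    """Energy density on the reference sphere for a pinhole (1-D sketch).

    Uniform energy per unit Fourier coordinate u = sin(theta) means that the
    energy per unit angle is du/dtheta, evaluated here by finite differences.
    """
    u = np.sin(theta)  # Fourier coordinate, proportional to sin(theta)
    return np.gradient(u, theta, edge_order=2)

theta = np.linspace(0.0, np.radians(80.0), 2001)
density = energy_per_unit_angle(theta)
max_error = float(np.max(np.abs(density - np.cos(theta))))
print(max_error)
```

The finite-difference result follows $\cos\theta$ to within the discretization error, which is the cosine law of a Lambertian emitter.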

A4 Reference Sphere in Place of Pupil Sphere
Even though the term "pupil sphere" has sometimes been used in previous papers, such as Ref. 14, the entrance pupil and exit pupil are not generally spherical and fundamentally have astigmatism, as shown in Fig. 13. [19,20] The term "reference sphere" is therefore more appropriate than "pupil sphere."

A5 Processes of ⑤-B (Stop to Image Space Correction) and ⑥ (Incoming Image Factor)
Considering the transformation from stop to image, the processes of ⑤-B and ⑥ compensate for each other.
By adding the exit reference sphere as shown in Figs. 14 and 15, the processes of ⑤-B (stop to image space correction) + ⑥ (incoming image factor) can be considered as follows. As shown in Fig. 14, in the propagation from the stop to the exit reference sphere, the corresponding area size scales by $1/\cos\theta'$. Therefore, the diffraction angle from this area on the exit reference sphere scales by $\cos\theta'$ relative to the case in which $\theta' = 0$, as shown in Fig. 15. Thus, the diffraction size on the plane perpendicular to the propagating direction scales by $\cos\theta'$. Thus, the intensity on this plane scales by $1/\cos\theta'$ and the amplitude scales by $1/\sqrt{\cos\theta'}$. This factor is due to the stop to image space correction of ⑤-B.
By considering this effect of $1/\sqrt{\cos\theta'}$ and the incoming image factor ⑥ of $\sqrt{\cos\theta'}$, $(1/\sqrt{\cos\theta'}) \times \sqrt{\cos\theta'} = 1$ is obtained. Thus, a Fourier component of object transmittance, which appears at the stop, is retransformed by inverse Fourier transform and deposited on the image as a plane wave. This is also the reason for the physical fulfillment of the Parseval theorem.
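The chain of scalings in this section can be written compactly as follows. The symbols $A$ (area on the exit reference sphere), $\Delta\phi$ (diffraction angle), $I$ (intensity on the plane perpendicular to propagation), and $U$ (amplitude) are shorthand introduced here for illustration, each normalized by its value at $\theta' = 0$.

```latex
\frac{A(\theta')}{A(0)} = \frac{1}{\cos\theta'}
\;\Longrightarrow\;
\frac{\Delta\phi(\theta')}{\Delta\phi(0)} = \cos\theta'
\;\Longrightarrow\;
\frac{I(\theta')}{I(0)} = \frac{1}{\cos\theta'}
\;\Longrightarrow\;
\frac{U(\theta')}{U(0)} = \frac{1}{\sqrt{\cos\theta'}},
\qquad
\underbrace{\frac{1}{\sqrt{\cos\theta'}}}_{\text{⑤-B}}
\times
\underbrace{\sqrt{\cos\theta'}}_{\text{⑥}}
= 1 .
```

The product is independent of $\theta'$, so each Fourier component reaches the image with unchanged relative weight.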

Appendix B: Comparing Our In-House Software with General Lithography Simulators
Although many useful lithography simulators are available, such as PROLITH by KLA-Tencor, Dr.LiTHO by Fraunhofer, Sentaurus Lithography (S-Litho) by Synopsys, the Berkeley Lithography Simulator, and HyperLith by Panoramic Technology, the details of the models used in these simulators are not necessarily available. Thus, we developed our own in-house software.
We confirm the validity of our software by comparing its simulation results with those of PROLITH and S-Litho. The simulation parameters are an NA of 1.35, an exposure wavelength of 193 nm, an illumination coherence factor σ of 0.2, s-polarized illumination, a lateral magnification β = 4X, and zero defocus (0 nm). The mask structure is the same as that in Sec. 3.1.
The image intensities calculated by MaskSide-mode in PROLITH, by WaferSide-mode in PROLITH, by SourceIntensity-mode in S-Litho, by OpenFrame-mode in S-Litho, and by our in-house simulator without the incoming image factor ⑥ almost completely coincide with each other after normalization. In all cases, the aerial image intensities are normalized by the intensity for the uniformly white (clear) pattern. Thus, the validity of our in-house simulator is confirmed. In contrast, when we include the factor ⑥ in our in-house simulator, this coincidence is not obtained. In addition, we found that the illumination scaling factor ① has hardly any effect in these cases.
Aerial image intensities for the 40 nm line/80 nm pitch pattern are compared in Fig. 16, and those for the 40 nm line/400 nm pitch pattern are compared in Fig. 17. These figures show the result of WaferSide-mode in PROLITH, that of our in-house simulator with the factor ⑥ and without the factor ①, and that of our in-house simulator with neither the factor ⑥ nor the factor ①. We also compared defocused images and obtained the same results.
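To illustrate why including the factor ⑥ changes a clear-field-normalized image, the following toy sketch may help. It is not the in-house simulator or any commercial model. It assumes on-axis coherent illumination, s-polarized two-beam interference of two equal orders at $\pm\theta'$, and an image-space refractive index `n_image = 1.44` (an assumed value, roughly that of water at 193 nm). The function name and the order amplitude of 0.5 are hypothetical.

```python
import numpy as np

def two_beam_image(x_nm, pitch_nm, wavelength_nm=193.0, n_image=1.44,
                   include_factor_6=False):
    """Toy s-polarized two-beam image, normalized by the clear-field intensity.

    The clear field is modeled as one unit-amplitude plane wave at normal
    incidence, for which the factor sqrt(cos(theta')) equals 1, so its
    intensity is 1 and no further division is needed.
    """
    # Two plane waves at +/- theta' give an intensity period of
    # wavelength / (2 * n_image * sin(theta')); solve for sin(theta').
    sin_t = wavelength_nm / (2.0 * n_image * pitch_nm)
    if sin_t >= 1.0:
        raise ValueError("pitch too small for propagating two-beam interference")
    cos_t = np.sqrt(1.0 - sin_t**2)
    amp = 0.5  # hypothetical amplitude of each order
    if include_factor_6:
        amp *= np.sqrt(cos_t)  # incoming image factor applied per order
    k_x = 2.0 * np.pi * n_image * sin_t / wavelength_nm  # equals pi / pitch_nm
    # s-polarization: the two fields are parallel and add as scalars
    field = 2.0 * amp * np.cos(k_x * np.asarray(x_nm, dtype=float))
    return field**2, cos_t

x = np.linspace(-40.0, 40.0, 81)  # one 80 nm period, 1 nm steps
without_6, cos_t = two_beam_image(x, 80.0)
with_6, _ = two_beam_image(x, 80.0, include_factor_6=True)
```

In this toy model the two normalized images differ by the constant factor $\cos\theta'$, because the clear-field reference is unaffected by the factor while the oblique orders are attenuated. This is one simple way such a factor can break agreement after normalization.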
Masato Shibuya is a professor in the Department of Media and Image Technology, Tokyo Polytechnic University. He graduated from Tokyo Institute of Technology in 1977 with a master's degree in physics and joined Nikon Corporation, where he designed space and lithography optics and studied resolution enhancement technology. He is an inventor of the phase-shifting mask. He received his PhD from the University of Tokyo in 1996. He joined Tokyo Polytechnic University in 2001. He has been an SPIE fellow since 2015.
Akira Takada is an optical designer in the Applied Photonics Laboratory, Topcon Corporation. He received his BS degree in physics from Nihon University, Tokyo, in 1993 and his PhD from Tokyo Polytechnic University, Japan, in 2009. His research interests are optical lithography, diffractive optics, and micro-optics.
Toshiharu Nakashima received his BS degree in 1990 and his MS degree in industry mechanical engineering from University of Tokyo, Japan, in 1992. In the same year, he joined Nikon Corporation in Japan. He is currently working on the imaging application in the field of optical microlithography.