Parametric source-mask-numerical aperture co-optimization for immersion lithography

Abstract. Source mask optimization (SMO) is a leading resolution enhancement technique in immersion lithography at the 45-nm node and beyond. Current SMO approaches, however, fix the numerical aperture (NA), which has a strong impact on the depth of focus (DOF). A higher NA could realize a higher resolution but reduce the DOF; it is very important to balance the requirements of NA between resolution and the DOF. In addition, current SMO methods usually result in complicated source and mask patterns that are expensive or difficult to fabricate. This paper proposes a parametric source-mask-NA co-optimization (SMNO) method to improve the pattern fidelity, extend the DOF, and reduce the complexity of the source and mask. An analytic cost function is first composed based on an integrative vector imaging model, in which a differentiable function is applied to formulate the source and mask patterns. Then, the derivative of the cost function is deduced and a gradient-based algorithm is used to solve the SMNO problem. Simulation results show that the proposed SMNO can achieve the optimum combination of parametric source, mask, and NA to maintain high pattern fidelity within a large DOF. In addition, the complexities of the source and mask are effectively reduced after optimization.


Introduction
Optimization techniques play an important role in improving the pattern fidelity and depth of focus (DOF) of current optical lithography systems. Source optimization is mainly aimed at altering the source shape [1,2], whereas mask optimization is aimed at modulating the amplitude of the electric field to compensate for the optical proximity effect [3,4]. In 2002, Rosenbluth et al. proposed the first source mask optimization (SMO) method, which exploits the synergy between the source and mask to achieve a higher resolution [5]. Since then, a number of SMO methods have been proposed in the literature [6][7][8][9]. Most of these methods are based on a scalar imaging model that is no longer accurate for a numerical aperture NA > 0.6 [10]. In high-NA immersion lithography systems, the vector nature of the electromagnetic field must be taken into account. Recently, we proposed a pixelated SMO based on a vector imaging model that significantly improved the simulation precision for lithography at the 45-nm node and beyond [11,12]. Previous SMO methods fixed the NA and did not consider the mutual impact between the NA and the source and mask. Prior work has demonstrated that a larger NA can realize a higher resolution, but the DOF decreases because of the relation $\mathrm{DOF} = k_2 \cdot \lambda / \mathrm{NA}^2$ [13], where $\lambda$ is the wavelength and $k_2$ is the process factor. Hence, it is highly important to pursue the optimal NA during the SMO procedure so as to realize the designated resolution and achieve high image fidelity within a larger DOF. In addition, current pixelated SMO methods dramatically increase the complexity and fabrication cost of the optimized source and mask patterns; thus, they suffer from an inherent disadvantage in manufacturing [14,15]. To overcome these limitations, this paper proposes a parametric source-mask-NA co-optimization (SMNO) method to improve the pattern fidelity within a large DOF and to reduce the complexity of the source and mask patterns.
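The trade-off expressed by the DOF relation can be illustrated numerically. The following is a minimal sketch: the wavelength of 193 nm is the ArF value used later in the simulations, whereas the process factor `k2 = 0.5` and the function name `depth_of_focus` are illustrative assumptions, not values or names taken from this paper.

```python
# Rayleigh-type scaling quoted in the text: DOF = k2 * wavelength / NA**2.
# k2 = 0.5 is an assumed, illustrative process factor.

def depth_of_focus(na: float, wavelength_nm: float = 193.0, k2: float = 0.5) -> float:
    """Return the depth of focus in nm for a given numerical aperture."""
    if na <= 0:
        raise ValueError("NA must be positive")
    return k2 * wavelength_nm / na ** 2

# A larger NA shrinks the DOF quadratically.
for na in (0.9, 1.0, 1.2, 1.35):
    print(f"NA = {na:.2f} -> DOF ~ {depth_of_focus(na):.1f} nm")
```

Under these assumptions, raising the NA from 1.0 to 1.35 reduces the DOF by a factor of 1.35², which is why the NA is treated as an optimization variable rather than a fixed setting.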
To our knowledge, this paper is the first to solve the parametric SMNO problem based on a vector imaging model. First, the vector imaging model described in Refs. 11 and 16 is used to formulate the SMNO framework, which significantly improves the simulation precision for the 45-nm node in immersion lithography. Then, an analytic model of the parametric source, mask, and NA is built. Because the arch function is differentiable, it is used to approximate the parametric source pattern, phase-shifting mask (PSM) pattern, and NA values so that a gradient-based algorithm can be applied to the SMNO problem. In particular, the source is modeled by its partial coherence factors and opening angle, and the mask is represented by the main features and the serifs. During the SMNO process, all parameters are optimized simultaneously by using the steepest-descent algorithm. In order to validate the proposed SMNO algorithm, simulations based on a quasar source and a two-dimensional mask are presented as an example. The simulations show that, in comparison with the SMO method, the proposed SMNO method finds a combination of source, mask, and NA that gives superior imaging performance over a wider DOF. In addition, the parametric source maintains an extremely simple distribution, which avoids customizing the diffractive optical elements or installing an expensive pixel source generator. The parametric mask greatly reduces the mask complexity, which benefits mask fabrication and lowers the cost.
The remainder of the paper is organized as follows. Section 2 briefly summarizes the vector imaging model used in this paper. The analytic and parametric source, mask, and NA are modeled in Sec. 3. The SMNO algorithm is proposed and described at length in Sec. 4. The simulations are provided in Sec. 5. Finally, conclusions are drawn in Sec. 6.

Vector Imaging Model for Immersion Lithography
In this paper, we choose the vector imaging model described in Refs. 11 and 16 as the basis for developing the SMNO framework. This vector imaging model can provide accurate simulation results even when NA > 1 in immersion lithography. The accuracy of the model has been proven by comparison with PROLITH [11]. In the following part, we use $(x, y)$ and $(f, g)$ to represent the coordinate systems in the spatial and frequency domains. $(\alpha, \beta, \gamma)$ is the direction cosine of the light propagating through $(f, g)$. The direction cosines on the source side, mask side, and image side are denoted as $(\alpha_s, \beta_s, \gamma_s)$, $(\alpha, \beta, \gamma)$, and $(\alpha_i, \beta_i, \gamma_i)$, respectively.
A schematic of the lithography imaging process is illustrated in Fig. 1. A source point $S(\alpha_s, \beta_s)$ in the source plane emits polarized light $E_{in} = [E_x, E_y]$, which propagates in the direction $k$ and is incident on the mask $M(x, y)$. The electric field on the exit pupil, $E_{ext}$, is expressed in terms of the following quantities: $\delta$ is the defocus factor, representing the deviation of the actual focal plane from the best focus position; $\mathcal{F}\{\cdot\}$ is the Fourier transform; and $\Psi(\alpha_i, \beta_i)$ is a 3 × 2 transfer matrix. The expression also contains a radiometric correction factor and the pupil function $Q_{NA}(\alpha_i, \beta_i)$, which denotes the diffraction-limited effect in an optical system and can be formulated as a circle function [Eq. (2)] that depends on the NA and on $n_i$, the refractive index of the immersion medium on the image side.
The electric field at the image plane, $E_{im}$, is written in terms of the reduction factor $R$ of the projection lens and the inverse Fourier transform $\mathcal{F}^{-1}\{\cdot\}$. The complete aerial image intensity $I_{im}$ of the partially coherent source $S$ is then given by Eq. (4). In order to obtain unit intensity in the aerial image [17], the normalized aerial image intensity $I_{AI}$ can be written as

$$I_{AI}(x_i, y_i, z_i) = \frac{I_{im}(x_i, y_i, z_i)}{I_{clear}(x_i, y_i, z_i)},$$

where $I_{clear}(x_i, y_i, z_i)$ is the image intensity of the mask $M_{clear}$, in which all entries equal 1. $I_{clear}$ can be calculated by the same procedure as $I_{im}$.
The aerial image represents the distribution of the optical intensity on the wafer plane that causes the exposure of the resist. Usually, the exposure dose is described by the aerial image threshold value $r_t$ in a constant threshold resist model [18]. The resist is developed where the aerial image intensity $I_{AI}$ is larger than $r_t$. For numerical reasons, we employ a sigmoid function instead of a hard threshold to calculate the resist image [19]. Then, the exposed resist image $Z$ can be expressed as

$$Z = \frac{1}{1 + \exp[-a(I_{AI} - r_t)]},$$

where $a$ dictates the steepness of the sigmoid function.
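The sigmoid resist model can be sketched in a few lines. This is only an illustration of the soft threshold: the threshold `r_t = 0.25`, the steepness `a = 25`, the function name `resist_image`, and the small sample aerial image are assumptions chosen for the example, not values from this paper.

```python
import math

def resist_image(aerial, r_t=0.25, a=25.0):
    """Apply the sigmoid soft threshold to each pixel of a 2-D aerial image.

    aerial: nested list of normalized intensities I_AI.
    r_t, a: assumed threshold and steepness (illustrative values).
    """
    return [[1.0 / (1.0 + math.exp(-a * (i - r_t))) for i in row] for row in aerial]

# Hypothetical 2 x 3 aerial image.
aerial = [[0.05, 0.25, 0.60],
          [0.10, 0.30, 0.90]]
z = resist_image(aerial)
for row in z:
    print(["%.3f" % v for v in row])
```

A pixel whose intensity equals `r_t` maps to exactly 0.5, and the output tends to 0 or 1 away from the threshold, which keeps the resist image differentiable with respect to the aerial image.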

Analytic Model of the Parametric Source, Mask, and NA
In the aerial image equation, the source, mask, and NA are all binarized and have been formulated using a rectangular function. The rectangular function is usually avoided in optimization because it is not differentiable. In this section, we instead use a differentiable function to describe the source, mask, and NA, which is more convenient for optimization in inverse lithography. We introduce the name "arch function" for a rational approximation of the rectangular function with steepness $b$. As shown in Fig. 2, the arch function is approximately equivalent to the rectangular function when $b$ is sufficiently large.
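To make the role of the steepness concrete, the sketch below compares a smooth rational function with a hard rectangle. The specific form $1/[1 + (2|x|/w)^{2b}]$ is an assumption chosen because it is rational and differentiable; the exact arch function of this paper is not reproduced here, and the names `arch`, `rect`, and `rms_error` are hypothetical.

```python
import math

def arch(x, width, b):
    """Assumed smooth stand-in for rect(x / width).

    Form: 1 / (1 + (2|x| / width)**(2b)); it equals 0.5 at |x| = width / 2.
    """
    return 1.0 / (1.0 + (2.0 * abs(x) / width) ** (2.0 * b))

def rect(x, width):
    """Hard rectangular function of the given width, centered at 0."""
    return 1.0 if abs(x) <= width / 2.0 else 0.0

def rms_error(b, width=1.0, n=401):
    """RMS difference between arch and rect sampled uniformly on [-1, 1]."""
    xs = [-1.0 + 2.0 * k / (n - 1) for k in range(n)]
    sq = [(arch(x, width, b) - rect(x, width)) ** 2 for x in xs]
    return math.sqrt(sum(sq) / len(sq))

for b in (2, 10, 40):
    print(f"b = {b:2d}: RMS error = {rms_error(b):.4f}")
```

With this assumed form, the transition region around the edges narrows as `b` grows, so the RMS difference from the rectangle decreases while the function remains differentiable in `x` and `width`.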

Source Model
Let $S(\alpha_s, \beta_s) \in \mathbb{R}^{N_s \times N_s}$ denote the source with all entries equal to 0 or 1, where $N_s$ is the dimension of the source. The parametric source can be described by partial coherence factors, such as the outer sigma $\sigma_{out}$ and the sigma width $\sigma_{width}$, and by the opening angle $\theta$. For annular illumination, the binary source shape is described by the rectangular function.
In order to calculate the derivatives with respect to the source parameters, the source is modeled by a two-dimensional arch function. The annular illumination $S_A(\alpha_s, \beta_s; \sigma_{out}, \sigma_{width}) \in [0, 1]$ is formulated with the arch function, whose steepness is represented by $b_s$. For quasar illumination, the source is also restricted by the opening angle $\theta$. The quasar illumination $S_Q$ can be modeled by the product of $S_A$ and a radial function $K_R$ [Eq. (10)],

$$S_Q = S_A \cdot K_R,$$

where $K_R(\alpha_s, \beta_s; \theta_1, \theta_2) \in [0, 1]$ is also a differentiable arch function, and $\theta_1$ and $\theta_2$ represent the opening angles of the vertical and the horizontal poles, respectively. Figure 3 shows the quasar source with $b_s = 10$ and $b_s = 40$. The root mean square errors between the approximate source and the binary source are 3.4% and 0.41%, respectively. Thus, the source shape approaches the binarized distribution when $b_s$ is large enough.
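The construction of a differentiable quasar source can be sketched as follows. Several choices here are assumptions made only for illustration: the smooth rectangle `arch` has an assumed form, the four poles are centered on the horizontal and vertical axes, the same steepness is used radially and angularly, and the two pole windows are merged with a smooth union. The resulting error values are therefore not expected to reproduce the figures quoted in the text.

```python
import math

def arch(x, width, b):
    """Assumed smooth rectangle: 1 / (1 + (2|x| / width)**(2b))."""
    return 1.0 / (1.0 + (2.0 * abs(x) / width) ** (2.0 * b))

def pole_offsets(alpha, beta):
    """Angular distance (rad) to the nearest horizontal and vertical axis."""
    phi = abs(math.atan2(beta, alpha))      # in [0, pi]
    d_h = min(phi, math.pi - phi)           # distance to the alpha axis
    d_v = abs(math.pi / 2.0 - phi)          # distance to the beta axis
    return d_h, d_v

def quasar(n_s, s_out, s_width, th1, th2, b_s=None):
    """Quasar source on an n_s x n_s grid over [-1, 1]^2; binary if b_s is None."""
    s_c = s_out - s_width / 2.0             # mid-radius of the ring
    grid = []
    for j in range(n_s):
        beta = -1.0 + 2.0 * j / (n_s - 1)
        row = []
        for i in range(n_s):
            alpha = -1.0 + 2.0 * i / (n_s - 1)
            dr = math.hypot(alpha, beta) - s_c
            d_h, d_v = pole_offsets(alpha, beta)
            if b_s is None:
                ring = abs(dr) <= s_width / 2.0
                pole = d_v <= th1 / 2.0 or d_h <= th2 / 2.0
                row.append(1.0 if ring and pole else 0.0)
            else:
                s_a = arch(dr, s_width, b_s)               # smooth annulus S_A
                k_v = arch(d_v, th1, b_s)                  # vertical poles
                k_h = arch(d_h, th2, b_s)                  # horizontal poles
                k_r = 1.0 - (1.0 - k_v) * (1.0 - k_h)      # smooth union K_R
                row.append(s_a * k_r)                      # S_Q = S_A * K_R
        grid.append(row)
    return grid

def rmse(a, ref):
    """Root mean square difference between two equally sized grids."""
    n = len(a) * len(a[0])
    total = sum((x - y) ** 2 for ra, rb in zip(a, ref) for x, y in zip(ra, rb))
    return math.sqrt(total / n)

# Illustrative parameters (assumed): sigma_out = 0.9, sigma_width = 0.2, 30-deg poles.
args = (41, 0.9, 0.2, math.pi / 6.0, math.pi / 6.0)
binary = quasar(*args)
s10 = quasar(*args, b_s=10)
s40 = quasar(*args, b_s=40)
print(f"RMSE b_s=10: {rmse(s10, binary):.4f}, b_s=40: {rmse(s40, binary):.4f}")
```

Because every factor is a smooth function of `s_out`, `s_width`, `th1`, and `th2`, the source in this sketch can be differentiated with respect to each parameter, which is the property the gradient-based optimization relies on.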

Mask Model
Let $M(x, y) \in \mathbb{R}^{N \times N}$ be the mask with all entries equal to $t_m \exp(i\varphi_m)$, where $N$ is the dimension of the mask, and $t_m$ and $\varphi_m$ are the feature transmittance and phase shift of the attenuated PSM (AttPSM), respectively. For a 6% AttPSM, $t_m = \sqrt{0.06}$ and $\varphi_m = \pi$ in the phase-shift region, whereas $t_m = 1$ and $\varphi_m = 0$ in the non-phase-shift region.
The parametric mask is composed of the main feature and the serif, where the serif is placed at the corner of the main feature. The main feature is controlled by the feature width and feature height, while the serif is modeled by the serif size and the serif offset. The serif offset is the distance that the serif deviates from the main feature. The schematic of the parametric mask is shown in Fig. 4.
For the binary mask with all entries equal to 0 or 1, the $l$th main feature $m_F^l \in [0, 1]$ can be expressed by using the arch function with steepness $b_m$, where $m_w^l$ and $m_h^l$ are the width and height of the feature, and $x_0^l$ and $y_0^l$ represent the center position of the feature.
The $j$th serif $m_S^j \in [0, 1]$ can be expressed in the same way, where $m_{ss}^j$ and $m_{so}^j$ are the serif size and serif offset, and $x_0^j$ and $y_0^j$ represent the center position of the serif.
The function of the whole binary mask, $m'$, can then be derived from the summation of the main features and the serifs, where $m' \in [0, 1]$ describes the distribution of the mask. Because the mask distribution is binarized, we convert $m'$ into a binary pattern $m''$ when simulating the aerial and resist images [Eq. (15)].
That is, we use $m'$ when calculating the derivatives with respect to the main feature and serif parameters, and use $m''$ instead of $m'$ when performing the aerial and resist image simulation.
The AttPSM can be modeled by the product of the attenuated layer $M_{att}(m')$ and the phase shift layer $\Phi(m')$.
Thus, each independent main feature $M_F^l$ and serif $M_S^j$ of the AttPSM can be written in this product form [Eqs. (17) and (18)].
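The parametric mask description can be illustrated for a single main feature. In the sketch below, the smooth rectangle `arch` is an assumed form, a feature is taken as the product of two one-dimensional arch functions, the binarization threshold of 0.5 is an assumption, and the background is assumed to have unit transmission. The 45 nm × 360 nm line and the 6% transmittance with a phase of π follow the values given in the text.

```python
import cmath
import math

def arch(x, width, b):
    """Assumed smooth rectangle: 1 / (1 + (2|x| / width)**(2b))."""
    return 1.0 / (1.0 + (2.0 * abs(x) / width) ** (2.0 * b))

def main_feature(x, y, x0, y0, m_w, m_h, b_m=10):
    """Continuous feature value in [0, 1] for a rectangle centered at (x0, y0)."""
    return arch(x - x0, m_w, b_m) * arch(y - y0, m_h, b_m)

def binarize(m, threshold=0.5):
    """Binary mask value used for image simulation (threshold is assumed)."""
    return 1.0 if m > threshold else 0.0

def att_psm(m_bin, t_m=math.sqrt(0.06), phi_m=math.pi):
    """Complex transmission: t_m * exp(i * phi_m) in the feature, 1 elsewhere."""
    return m_bin * t_m * cmath.exp(1j * phi_m) + (1.0 - m_bin)

# One 45 nm x 360 nm line centered at the origin, sampled at a few points (nm).
for x in (0.0, 22.0, 23.0, 300.0):
    m = main_feature(x, 0.0, 0.0, 0.0, 45.0, 360.0)
    t = att_psm(binarize(m))
    print(f"x = {x:6.1f} nm: m' = {m:.3f}, transmission = {t.real:+.3f}")
```

The continuous value is what would be differentiated with respect to the feature width and height, while the binarized value and the complex transmission are what an image simulation would consume.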

NA Model
NA can also be modeled by the differentiable arch function to replace the circle function of Eq. (2). Then, $Q_{NA}(\alpha_i, \beta_i)$ is written in terms of the arch function with steepness $b_{na}$ [Eq. (19)]. Just as with the binarization of the mask in Eq. (15), the pupil function should also be binarized, as $Q'_{NA}$, when performing the aerial and resist image simulation.
$Q_{NA}$ in Eq. (19) is only used to calculate the derivative with respect to NA.
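The reason for a differentiable pupil can be seen in a short sketch. The following assumes that the pupil cutoff in direction-cosine space is $\mathrm{NA}/n_i$ and that the smooth pupil has the form $1/[1 + u^{2 b_{na}}]$ with $u = \rho\, n_i / \mathrm{NA}$; both the form and the function names are assumptions for illustration. The analytic derivative with respect to NA is compared with a finite difference.

```python
import math

def pupil_hard(alpha_i, beta_i, na, n_i=1.44):
    """Binary circular pupil: 1 inside the assumed cutoff radius NA / n_i."""
    return 1.0 if math.hypot(alpha_i, beta_i) <= na / n_i else 0.0

def pupil_smooth(alpha_i, beta_i, na, n_i=1.44, b_na=20):
    """Assumed differentiable pupil: 1 / (1 + u**(2 b_na)), u = rho * n_i / NA."""
    u = math.hypot(alpha_i, beta_i) * n_i / na
    return 1.0 / (1.0 + u ** (2.0 * b_na))

def d_pupil_d_na(alpha_i, beta_i, na, n_i=1.44, b_na=20):
    """Analytic derivative of pupil_smooth with respect to NA."""
    u = math.hypot(alpha_i, beta_i) * n_i / na
    p = u ** (2.0 * b_na)
    return 2.0 * b_na * p / (na * (1.0 + p) ** 2)

# Compare with a central finite difference near the pupil edge.
a_i, b_i, na, h = 0.68, 0.0, 1.0, 1e-6
fd = (pupil_smooth(a_i, b_i, na + h) - pupil_smooth(a_i, b_i, na - h)) / (2.0 * h)
print(f"analytic = {d_pupil_d_na(a_i, b_i, na):.6f}, finite difference = {fd:.6f}")
```

The hard pupil has zero derivative almost everywhere, so it gives a gradient-based method nothing to follow; the smooth version yields a nonzero gradient concentrated near the pupil edge.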

Cost Function
Given a binary target pattern $\tilde{Z} \in \mathbb{R}^{N \times N}$ with all entries equal to 0 or 1, the pattern error (PE) $F'_Z$ can be expressed as the difference between $\tilde{Z}$ and the resist image $Z$ [Eq. (21)]:

$$F'_Z = \|\tilde{Z} - Z\|_2^2 = \sum_{\tau} (\tilde{Z}_\tau - Z_\tau)^2,$$

where $\|\cdot\|_2^2$ is the square of the Euclidean distance between the two arguments and $\tau$ indexes the grid points of the resist image $Z$. The PE explicitly describes the pattern fidelity, which is a very important metric when evaluating lithography performance.
In order to improve the convergence of the optimization, we add to the cost function the difference between the aerial image $I_{AI}$ and the target pattern $\tilde{Z}$ [Eq. (22)], in which a constant $c$ modifies the amplitude of the target pattern. When the DOF is sufficiently large, the aerial image $I_{AI}$ will be as close to the target pattern $\tilde{Z}$ as possible. Thus, we combine Eqs. (21) and (22) to form the cost function $F$, in which $\omega_g \in [0, 1]$ is the weight of $F'_Z$. In order to maintain high pattern fidelity over a large DOF, the final cost function $D$ also includes an off-focus term: it combines $F_{foc}$ and $F_{defoc}$, the cost functions at the focal and defocus planes, through the weight parameter $\omega_{foc} \in [0, 1]$.
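The structure of the cost function can be sketched as follows. The convex weighting by $\omega_g$ and $\omega_{foc}$ (a weight and its complement) and the form of the aerial-image term as the squared distance between $c\tilde{Z}$ and $I_{AI}$ are assumptions consistent with the description above; the tiny 2 × 2 arrays and all function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equally sized 2-D grids."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def plane_cost(target, resist, aerial, w_g=0.5, c=1.0):
    """Cost at one focal plane: weighted pattern error plus aerial-image term.

    Assumed form: w_g * ||Z~ - Z||^2 + (1 - w_g) * ||c * Z~ - I_AI||^2.
    """
    scaled = [[c * t for t in row] for row in target]
    return w_g * sq_dist(target, resist) + (1.0 - w_g) * sq_dist(scaled, aerial)

def total_cost(f_foc, f_defoc, w_foc=0.5):
    """Assumed combination of the in-focus and off-focus costs."""
    return w_foc * f_foc + (1.0 - w_foc) * f_defoc

# Hypothetical 2 x 2 example.
target = [[0.0, 1.0], [1.0, 0.0]]
f_foc = plane_cost(target, [[0.0, 1.0], [1.0, 0.0]], [[0.1, 0.9], [0.8, 0.2]])
f_defoc = plane_cost(target, [[0.0, 1.0], [0.0, 0.0]], [[0.2, 0.6], [0.4, 0.3]])
print(f"F_foc = {f_foc:.4f}, F_defoc = {f_defoc:.4f}, D = {total_cost(f_foc, f_defoc):.4f}")
```

In this toy case the in-focus resist image matches the target, so only the aerial-image term contributes at focus, whereas the defocused plane is penalized by both terms.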

SMNO Algorithm
The SMNO algorithm can be formulated as the search for the optimal source, mask, and NA that minimize the cost function $D$, such that

$$\{S, M, Q_{NA}\} = \arg\min D.$$

In order to find the best combination of the source, mask, and NA, the steepest-descent method is used to implement the proposed SMNO algorithm. The parameter set at the $k$th iteration, $\{P_n\}^{(k)}$, which includes all parameters of the source, mask, and NA, can be calculated by

$$\{P_n\}^{(k)} = \{P_n\}^{(k-1)} + \Lambda \cdot d^{(k-1)},$$

where $P$ represents the parameters of the source, mask, and NA, and $\Lambda$ is a vector that represents the step lengths of the parameters. The step length should be assigned before the optimization for each individual parameter $P$. The steepest-descent direction is denoted by $d$ and is obtained from the partial derivatives of the cost function $D$ with respect to each parameter $P$ of the source, mask, and NA, which in turn require the derivatives of the cost function $F$ at the focal and defocus positions. From Eqs. (28)-(31), we find that $\partial D / \partial P$ can be derived once $\partial I_{AI} / \partial P$ is calculated.
Note that all the parameters of the source, mask, and NA are constrained by boundaries in lithography. In order to reduce the bound-constrained optimization problem to an unconstrained optimization problem, we adopt a parametric transformation that converts $P$ into $\Omega$, where $\Omega$ is a function of the parameter $P$, $b_p$ is a constant, and $P_{max}$ and $P_{min}$ are the maximum and minimum values of the parameter $P$, respectively, so that $P \in [P_{min}, P_{max}]$ and $\Omega \in (-\infty, \infty)$. Conversely, the parameter $P$ can be recovered from $\Omega$. Then, the derivative of the cost function $D$ with respect to the parameter $P$ can be converted to the derivative of $D$ with respect to $\Omega$.
Therefore, the derivative $\partial I_{AI} / \partial P$ can also be calculated by $\partial I_{AI} / \partial \Omega$. In the following, we calculate the derivatives with respect to the parameters of the source, mask, and NA.
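The effect of the parametric transformation can be illustrated with a toy problem. The sketch below assumes a logistic mapping between the unbounded variable $\Omega$ and the bounded parameter $P$; the transformation actually used in this paper may differ. The quadratic cost, the bounds, the step length, and all function names are likewise illustrative assumptions.

```python
import math

def to_p(omega, p_min, p_max, b_p=1.0):
    """Map an unbounded omega to P in (p_min, p_max) (assumed logistic form)."""
    return p_min + (p_max - p_min) / (1.0 + math.exp(-b_p * omega))

def to_omega(p, p_min, p_max, b_p=1.0):
    """Inverse mapping, valid for p strictly inside the bounds."""
    return math.log((p - p_min) / (p_max - p)) / b_p

def dp_domega(p, p_min, p_max, b_p=1.0):
    """Derivative dP/dOmega of the logistic mapping, expressed through P."""
    return b_p * (p - p_min) * (p_max - p) / (p_max - p_min)

def descend(grad_d, p0, p_min, p_max, step=10.0, iters=200):
    """Steepest descent in omega space: omega += step * d, d = -dD/dOmega."""
    omega = to_omega(p0, p_min, p_max)
    for _ in range(iters):
        p = to_p(omega, p_min, p_max)
        d = -grad_d(p) * dp_domega(p, p_min, p_max)   # chain rule
        omega += step * d
    return to_p(omega, p_min, p_max)

# Toy cost D(P) = (P - target)^2 for one bounded parameter (values assumed).
p_min, p_max = 0.9, 1.35
inside = descend(lambda p: 2.0 * (p - 1.1), 1.2, p_min, p_max)
outside = descend(lambda p: 2.0 * (p - 1.5), 1.2, p_min, p_max)
print(f"minimum inside bounds  -> P = {inside:.6f}")
print(f"minimum outside bounds -> P = {outside:.6f} (stays below {p_max})")
```

When the unconstrained minimum lies inside the bounds, the iteration converges to it; when it lies outside, the parameter approaches the nearest bound without crossing it, so no explicit projection step is needed.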

Derivative of the source parameter
The derivative of the cost function $D$ with respect to the source parameter $\Omega(s_v)$, where $s_v$ represents the source parameters $\sigma_{out}$, $\sigma_{width}$, $\theta_1$, and $\theta_2$, contains the term $\partial I_{AI} / \partial \Omega(s_v)$, which can be derived from Eq. (5).
We use $I'_{im}$ to represent both $I_{im}$ and $I_{clear}$; then, $\partial I'_{im} / \partial \Omega(s_v)$ can be derived from Eq. (4), where $E$ represents the electric field of the mask $M$ or of the mask $M_{clear}$, respectively. For quasar illumination, the source $S$ is defined by $S_Q$ as in Eq. (10). The derivatives $\partial S_Q / \partial \Omega(s_v)$ are given in Appendix A. By combining Eqs. (37) and (39), we can obtain the derivatives $\partial D / \partial \Omega(\sigma_{out})$, $\partial D / \partial \Omega(\sigma_{width})$, $\partial D / \partial \Omega(\theta_1)$, and $\partial D / \partial \Omega(\theta_2)$ accordingly.

Derivative of the mask parameter
The derivative of the cost function $D$ with respect to the mask parameter $\Omega(m_z)$, where $m_z$ represents the mask parameters $m_w$, $m_h$, $m_{ss}$, and $m_{so}$ in Eq. (16), contains the term $\partial I_{AI} / \partial \Omega(m_z)$, which can be derived from Eq. (5).
This term requires the derivatives of $E_{im}$ and its complex conjugate $E^*_{im}$, which are evaluated for each individual feature on the mask $M$; here, $M_V$ represents both $M_F$ and $M_S$. The derivative of $M_V$ can be derived from Eqs. (17) and (18), where $m_V$ represents both $m_F$ and $m_S$. The derivatives of the mask parameters, $\partial m_V / \partial \Omega(m_z)$, are listed in Appendix B. By substituting $\partial m_V / \partial \Omega(m_z)$ into Eq. (46), the derivative of the cost function $D$ with respect to the mask parameter $\Omega(m_z)$ can be solved.

Derivative of the NA
The derivative of the cost function $D$ with respect to $\Omega(\mathrm{NA})$ contains the term $\partial I_{AI} / \partial \Omega(\mathrm{NA})$, which can be derived from Eq. (5).
Finally, the derivative of the cost function $D$ with respect to the NA, $\partial D / \partial \Omega(\mathrm{NA})$, can be solved.
The steepest-descent direction $d$ can be calculated after all the derivatives have been solved. The parameters are then optimized iteratively by the steepest-descent method until the termination condition is satisfied.

Implementation of Parametric SMNO
In order to demonstrate the validity of the proposed optimization method, we illustrate the simulation results for the SMNO in this section. For comparison, a parametric SMO without NA optimization is also performed in the simulations.
The lithography system is an argon fluoride (ArF) immersion lithography system with variable NA. The refractive index of the immersion medium is $n = 1.44$. The wavelength is $\lambda = 193$ nm. The reduction factor of the projection lens is $R = 4$.
The simulations use the parametric source and mask. The source shape is quasar with Y polarization. Four source parameters, $\sigma_{out}$, $\sigma_{width}$, $\theta_1$, and $\theta_2$, are optimized. The source array dimension $N_s$ is 41. A 45-nm AttPSM line-space pattern is used as the target in the simulation. There are three lines in the 600 nm × 600 nm wafer area, and the mask array dimension $N$ is 600. As shown in Fig. 5(a), each line feature is 45 nm in width and 360 nm in height, while the space between the lines is 90 nm. The feature transmittance is $t_m = \sqrt{0.06}$, with a 180 deg phase shift to enhance the resolution. With the modeling in Sec. 3.2, we can change the feature width, feature height, and serifs to compensate for the optical proximity effects. All the main features and serifs are numbered in Fig. 5(b). Because the mask pattern has fourfold symmetry, we need only optimize the mask parameters in the top-left quarter, which simplifies the optimization procedure. Thus, 10 mask parameters are optimized: $m_w^1$ and $m_h^1$ of main feature 1, $m_w^2$ and $m_h^2$ of main feature 2, $m_{ss}^4$ and $m_{so}^4$ of serif 4, $m_{ss}^5$ and $m_{so}^5$ of serif 5, and $m_{ss}^6$ and $m_{so}^6$ of serif 6. In total, 15 parameters are optimized: 4 source parameters, 10 mask parameters, and the NA. Nonoptimized initial parameter values serve as the starting point for the optimization. The initial parameter values, minimum values $P_{min}$, and maximum values $P_{max}$ are listed in Table 1.
The simulations are performed using the optimization algorithm in Sec. 4. The simulation for SMO uses the same optimization conditions as SMNO, except that the NA is fixed. Simulation results for the SMO and SMNO are listed in the last two rows of Table 1 and are also shown in Fig. 6. The results show that both the source and mask parameters are optimized by SMO and by SMNO.
It is noted that the optimal NA for this kind of target pattern is 1.013, which can realize the designated resolution and maintain high pattern fidelity within a large DOF.
The optimized source, mask, and resist images at the best focus, 100-nm defocus, and 150-nm defocus positions are all shown in Fig. 6. The red solid line in the figure indicates the target pattern. The PEs are also marked on the top of each resist image. From Fig. 6, one can see that without optimization, the mask can only be printed on the wafer at the best focus position, with a large PE of 22,656, whereas no pattern can be printed at the 100-nm and 150-nm defocus positions. When the parametric source and mask have been optimized by using SMO, the mask can be printed at both the best focus and the defocus positions. The resist image is highly faithful to the target pattern, with a PE of 5800 at the best focus. However, the PE equals 36,168 at the 150-nm defocus position, which is too large to maintain pattern fidelity. With SMNO, the resist image maintains high pattern fidelity through a large DOF. The PEs of SMNO at the best focus, 100-nm defocus, and 150-nm defocus positions are 8038, 7430, and 14,186, respectively. Although the PE of SMNO is larger than that of SMO by 2238 at the best focus position, the PE values at the defocus positions are significantly smaller: the PEs of SMNO at the 100-nm and 150-nm defocus positions are 656 and 21,982 smaller, respectively, than those of SMO. Figure 7 shows a comparison of the PEs at various defocus positions: the PE of SMO increases drastically with defocus, whereas SMNO maintains a low PE over a large DOF. The comparison of optimization results between the SMO and SMNO demonstrates that SMNO can effectively improve the pattern fidelity and enlarge the DOF, and that it is necessary to include NA in the optimization.
In contrast to pixelated SMO, the source and mask maintain a very low complexity after optimization. The source has been optimized by adjusting the partial coherence factors and opening angles, while the mask has been optimized by changing the feature sizes and adding serifs. There is no need to customize the diffractive optical elements or purchase an expensive pixel source generator when using the parametric source. The parametric source and mask can, therefore, effectively reduce the cost of fabrication and provide high stability in production. Figure 8 presents the convergence curves of the cost functions for SMO and SMNO. The red-circle line represents the cost function of SMO, while the black-triangle line represents that of SMNO. By incorporating NA in the optimization, the cost function can be reduced much further.
The above performance comparisons reveal that the proposed parametric SMNO method can effectively improve the pattern fidelity and enhance the robustness of the optical lithography systems. NA should be considered in the optimization when optimizing the source and mask.

Conclusion
This paper proposes a parametric SMNO method using a vector imaging model to improve the pattern fidelity and DOF. We develop an analytical approach to the parametric source, mask, and NA, which could be effectively applied in inverse lithography. The mathematical expressions of the derivatives of source, mask, and NA parameters are derived on the basis of the vector imaging model. The steepest-descent algorithm is used to optimize the source, mask, and NA iteratively. Simulation results show that the SMNO produces a better source, mask, and NA combination compared to SMO. Simulation results also reveal that it is necessary to optimize NA to achieve superior lithography performance. The optimized parametric source, mask, and NA are capable of maintaining high pattern fidelity within a large DOF. The parametric source shape and mask layout are in an extremely simple distribution, which could effectively reduce the cost of fabrication and provide high stability in high-volume production. In our future work, we will improve the parametric expression of the mask to model a more complex shape, such as an L-shape and T-shape, and apply the SMNO to full-chip mask optimization.