Co-optimization of the mask, process, and lithography-tool parameters to extend the process window

Abstract. Optimization technologies have been widely applied to improve lithography performance, such as optical proximity correction and source mask optimization (SMO). However, most published optimization technologies were performed under fixed process conditions, and only a few parameters were optimized. A method for mask, process, and lithography-tool parameter co-optimization (MPLCO) is developed to extend the process window. A normalized conjugate gradient algorithm is proposed to improve the convergence efficiency of the MPLCO when optimizing different scale parameters. In addition, a parametric mask and source are used in the MPLCO that could obtain exceedingly low mask and source complexity compared with a traditional SMO.


Introduction
Mask, process, and lithography-tool parameters are the most important aspects for achieving high lithography performance. Various methods have been proposed to optimize these parameters in recent years. Optical proximity correction is the main technology to optimize the mask pattern to enhance the resolution. 1 Source mask optimization (SMO) improves the lithography performance by co-optimizing the illumination source and mask. 2 The optimized mask structure and source shape obtained via SMO suffer from extreme complexity, especially for the pixelated mask and source, [3][4][5] which leads to difficulty in manufacturing. Furthermore, the parameters related to the process, such as the film stacks, postexposure bake (PEB), and photoresist development, have a strong impact on the process window (PW). 6,7 However, most published optimization technologies were implemented under fixed process conditions. 3,8,9 Moreover, lithography-tool parameters such as the numerical aperture (NA) and source parameter also determine the PW. 1 In practice, the parameters related to the mask, process, and lithography tool simultaneously impact the lithography performance. 10,11 The lithography effects caused by multiple parameter errors could compensate for each other, 12 indicating that the co-optimization of multiple parameters could improve the PW. However, no effective methods have been published for co-optimizing the mask, process, and lithography-tool parameters.
In this study, for the first time to our knowledge, a co-optimization method is developed to simultaneously optimize the mask, process, and lithography-tool parameters. The PW is extended by co-optimizing the parameters rather than optimizing them independently. An effective co-optimization flow is built to realize this goal. A conjugate gradient algorithm is used to calculate the search direction in the optimization.
The parameters are optimized along the search direction, where a line search is applied within the parameters' boundary. Mask, process, and lithography-tool parameter co-optimization (MPLCO) is a multidimensional optimization method. As large-scale parameters are obstructed by small-scale parameters during co-optimization, we also propose a normalized conjugate gradient (NCG) algorithm to improve the convergence efficiency of the MPLCO. In order to reduce the complexity of the mask and source, a parametric mask and source are used in the MPLCO. The mask pattern is optimized by adding bias and changing the feature transmittance, while the source shape is optimized by only adjusting the sigma value in the MPLCO. As an example, we chose eight parameters in the MPLCO to optimize 45-nm node patterns. The MPLCO results show that the depth of focus (DOF) of the line-space pattern is 93.75% larger than when only the mask and source are optimized. At the same time, the complexity of the mask and source has been reduced compared with the pixelated SMO in Ref. 3. In order to further improve the PW of the SMO work in Ref. 3, we optimize the process parameters of the SMO. The DOF of the SMO with optimized process parameters is 13.54% larger than that of the SMO for the line-space pattern but still smaller than that of MPLCO. The co-optimization results demonstrate that a large PW can be obtained only if the mask, process, and lithography-tool parameters are optimized simultaneously.
The remainder of the paper is organized as follows. In Sec. 2, the MPLCO is described in detail. The optimization results are presented in Sec. 3. Finally, the conclusions are summarized in Sec. 4.

Co-Optimization Method
The goal of MPLCO is to achieve high lithography performance in a large PW. To realize this goal, we first create an effective cost function to evaluate the lithography performance. The cost function is designed both to drive the optimization toward a larger PW and to ease the convergence of the optimization procedure. Then, the parameters are optimized along the search direction iteratively until the largest PW is found.

Cost Function
The cost function is composed of the DOF, image contrast, and normalized image log-slope (NILS). Usually, the PW is measured under a designated metrology plane, where the metrology plane is a slice through the aerial image or photoresist pattern at which the measurement is made. The overlap of the PWs (common PW) through various metrology planes can be used to evaluate the overall lithography performance. A common method for evaluating the overlap PW is to examine the DOF at a specific exposure latitude (EL), and a larger DOF indicates better lithography performance. 11,13 Thus, the DOF is chosen as the main objective function in the optimization. The DOF is a highly nonlinear function, so its optimization is often trapped at a local minimum when a gradient-based algorithm is used. 14,15 Some other objective functions should be used to assist in optimizing the DOF. Because the image contrast and NILS are strongly related to the DOF 14,16-18 and converge much more easily than the DOF, they can serve as assistant objective functions in promoting the optimization of the DOF. Therefore, the cost function is formulated as a superposition of the objective functions mentioned above by assigning different weights.
In the cost function $F$, $P^{t}_{\mathrm{Contrast}}$, $P^{t}_{\mathrm{NILS}}$, and $P_{\mathrm{DOF@EL}_j}$ represent the value of the aerial image contrast, the NILS, and the DOF with different ELs, respectively; $w_1$, $w_2$, and $\varepsilon_j$ are the corresponding weights of the objective functions, respectively; $j$ represents the $j$'th EL value; $t$ is the $t$'th metrology plane.
The cost function is beneficial to optimizing the PW, while the image contrast and NILS are used to promote the optimization of the DOF. Thus, the weight ε j should be set much larger than w 1 and w 2 in optimization.
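As an illustration of the weighted superposition described above, the following Python sketch combines the three metrics over a set of metrology planes. The function name `cost_function`, the dictionary layout, the per-plane summation of the DOF terms, and in particular the leading minus sign (chosen here so that a minimizer enlarges the DOF) are assumptions made for illustration, not the exact form used in this work. The default weights are those quoted in the simulation conditions.

```python
def cost_function(planes, w1=0.2, w2=0.1, eps=(10.0, 2.0)):
    """Weighted superposition of contrast, NILS, and DOF at several ELs.

    planes: one dict per metrology plane with keys 'contrast', 'nils',
            and 'dof' (one DOF value per exposure latitude).
    The minus sign is an assumption: it turns "larger metrics are better"
    into a quantity that a descent method can reduce.
    """
    total = 0.0
    for plane in planes:
        total += w1 * plane["contrast"] + w2 * plane["nils"]
        total += sum(e * dof for e, dof in zip(eps, plane["dof"]))
    return -total


# Hypothetical metric values for a single metrology plane.
baseline = [{"contrast": 0.5, "nils": 2.0, "dof": [0.3, 0.4]}]
improved = [{"contrast": 0.5, "nils": 2.0, "dof": [0.5, 0.6]}]
print(cost_function(baseline), cost_function(improved))
```

Because the DOF weights are much larger than the contrast and NILS weights, a change in DOF dominates the value of the sketch, while the two assistant terms still contribute a slope when the DOF is flat.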
PROLITH software is used to simulate the DOF, contrast, and NILS owing to its superior ability in lithography simulation, where the mask, process, and lithography-tool models are accurately established.

Co-Optimization Flow
In this section, the parametric-based MPLCO is proposed and realized by an NCG algorithm. First, all parameters in the optimization are normalized to the same scale. Then, the parameters are optimized along the search direction iteratively until the best combination of mask, process, and lithography-tool parameters is found. Because the combination of mask, process, and lithography-tool parameters is a multidimensional point in the optimization space, we use "point" instead of "combination" and use $x_n$ to denote each parameter in the following sections.
The co-optimization flow can be generalized in the following steps, as shown in Fig. 1.
Step 1: Set the initial point and constraints of the parameters.
The constraints contain the range of parameters, the weight of the objective functions, the increment of parameters for derivative calculation, and the termination conditions. The termination conditions can be set as the number of iterations.
Step 2: Normalize the different scale parameters to the same range of [0, 1]. The large-scale parameters are likely to be obstructed by the small-scale parameters when optimizing different scale parameters. Normalization is carried out to eliminate the scale differences of the lithography parameters and to improve the convergence efficiency. The normalization is expressed as
$$\bar{x}_n = \frac{x_n - x_{\min}}{x_{\max} - x_{\min}}, \qquad (2)$$
where $\bar{x}_n$ is the normalized parameter, $x_n$ is the nonnormalized original parameter, and $x_{\max}$ and $x_{\min}$ are the maximum and minimum values of $x_n$.
Step 3: Calculate the search direction at the initial or iteration point. A gradient-based algorithm is chosen to calculate the search direction owing to its fast convergence speed in solving lithography problems. 4,[19][20][21][22] At the first iteration, the search direction is
$$d^{(0)} = -\nabla F(\{x_n\}^{(0)}), \qquad (3)$$
where $\{x_n\}^{(0)}$ is the initial point, and $d^{(0)}$ is the first search direction. For the $k$'th iteration, according to the Fletcher-Reeves conjugate gradient method, the search direction is expressed as 23
$$d^{(k)} = -\nabla F(\{x_n\}^{(k)}) + \frac{\|\nabla F(\{x_n\}^{(k)})\|_2^2}{\|\nabla F(\{x_n\}^{(k-1)})\|_2^2} \cdot d^{(k-1)}, \quad (k \ge 1), \qquad (4)$$
where $\|\cdot\|_2$ is the $L_2$ norm, and $\{x_n\}^{(k)}$ represents the $k$'th iterative point. As there is no explicit analytical relation between the cost function $F$ and the point $\{x_n\}$, the calculation of the derivative is intractable. We use a numerical approximation of the gradient instead of calculating the derivative of $F$. Hence, the approximate backward derivative is written as
$$\nabla F(\{x_n\}^{(k)}) = \frac{F(\{x_n\}^{(k)}) - F(\{x_n\}^{(k)} - \{\Delta x_n\})}{\{\Delta x_n\}}, \qquad (5)$$
where $\Delta x_n$ is a small increment or variation between the current iteration point and its neighborhood. For each parameter $x_n$, $\Delta x_n$ needs to be set before calculating $\nabla F$. When $x_n$ is normalized to $\bar{x}_n$, $\Delta \bar{x}_n$ and $\nabla F$ are the normalized values.
Step 4: Calculate the local boundary along the search direction.
We define two types of boundaries: the global boundary and the local boundary. The global boundary is the range of parameters set in step 1, while the local boundary consists of the upper and lower limits along the search direction. The local boundary falls within the range of the global boundary. The local boundary calculation ensures that the co-optimization of the parameters always stays within the valid search space. For the two-parameter local boundary calculation in Fig. 2, the points $A(a_1, a_2)$ and $B(b_1, b_2)$ are the global boundaries of the two parameters $x_1$ and $x_2$. $U(u_1, u_2)$ and $V(v_1, v_2)$ are the local boundaries along the direction $d(d_1, d_2)$ from the $k$'th iterative point $X(x_1^k, x_2^k)$, which cannot exceed the range of $A(a_1, a_2)$ and $B(b_1, b_2)$. The line search is carried out between the points $X(x_1^k, x_2^k)$ and $U(u_1, u_2)$. The local upper boundary point $U$ is calculated by Eq. (6), where $u_n$ is the local upper boundary point along the direction $d$, $a_i$ and $b_i$ are the global upper and lower boundary values, and $t$ is the dimension of the parameters. The function $\min\{\cdot\}$ returns the smallest of the ratios $(a_i - x_i)/d_i$. In Eq. (6), $\min\{(a_i - x_i)/d_i\}$ ensures that the local boundary point value $u_k$ is always less than or equal to the global boundary point value $a_k$. However, the calculation of $\min\{(a_i - x_i)/d_i\}$ brings an obstacle in optimizing different scale parameters because the local boundary of the large-scale parameter is usually restricted by the small-scale parameter. The local boundary of the large-scale parameter always falls within a small range along the search direction $d$, which makes it difficult to optimize the large-scale parameter. For example, the scale for the mask bias value is usually >20 nm, while the scale for the illumination sigma value is 1. The mask bias value is a large-scale parameter compared with the illumination sigma value. 
The local boundary of mask bias is likely restricted by the illumination sigma value when the two parameters are optimized together. However, there will be no difference in scales if the parameters are normalized to the same range in step 2, and the large-scale parameter will not be limited by the small-scale parameter.
Step 5: Apply a line search to find the optimal iteration point along the search direction.
After the local boundary is calculated, a line search is applied to find the optimum value in the segment between the iterative point X and the upper local boundary U. The Hopfinger golden section search method is used for the line search because it can handle cost functions that are not unimodal. 24 We do not present the details of the line search method here; interested readers may refer to Ref. 24.
Step 6: If the termination condition is satisfied, go to step 7.
Otherwise, go to step 3. The MPLCO could find the best combination of mask, process, and lithography-tool parameters iteratively until the termination condition is satisfied.
Step 7: Output the optimized parameters of the mask, process, and lithography tool and the corresponding cost function F.
It is noted that a restitution operation (an inverse operation of normalization) is necessary for the simulation of F when calculating the search direction (step 3) and applying the line search (step 5). All parameters need to be restituted to their original scales for the PROLITH simulation of DOF, contrast, and NILS.
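The normalization, the restitution operation, and the backward-difference gradient can be sketched together in Python. In this sketch the callable `simulate` is a hypothetical stand-in for the lithography simulation of F (it is not a PROLITH call), the helper names are illustrative, and perturbing one normalized parameter at a time is an assumed reading of the vector notation in Eq. (5).

```python
def normalize(x, x_min, x_max):
    """Map each parameter from its own range onto [0, 1]."""
    return [(xi - lo) / (hi - lo) for xi, lo, hi in zip(x, x_min, x_max)]


def restitute(x_bar, x_min, x_max):
    """Inverse of normalize: recover the original-scale parameters."""
    return [lo + xb * (hi - lo) for xb, lo, hi in zip(x_bar, x_min, x_max)]


def backward_gradient(simulate, x_bar, dx_bar, x_min, x_max):
    """Backward-difference gradient of F in normalized coordinates.

    Every evaluation of F first restitutes the point, because the
    simulator is assumed to expect original-scale parameters.
    """
    f0 = simulate(restitute(x_bar, x_min, x_max))
    grad = []
    for n, step in enumerate(dx_bar):
        shifted = list(x_bar)
        shifted[n] -= step  # step back along parameter n only
        f_shift = simulate(restitute(shifted, x_min, x_max))
        grad.append((f0 - f_shift) / step)
    return grad


# Hypothetical ranges: a mask bias in nm and a sigma value.
x_min, x_max = [0.0, 0.0], [40.0, 1.0]
toy_cost = lambda x: 2.0 * x[0] + 3.0 * x[1]  # stand-in for the simulator
x_bar = normalize([20.0, 0.6], x_min, x_max)
print(backward_gradient(toy_cost, x_bar, [0.01, 0.01], x_min, x_max))
```

For the linear toy cost, the normalized gradient equals the original-scale slope multiplied by the parameter range, which shows how normalization rescales the sensitivity of each parameter.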
Without the parameter normalization step (step 2) and the restitution operation, the optimization procedure can be denoted as a conventional conjugate gradient (CCG) method. The NCG method is developed on the basis of the CCG method and is an improvement of the CCG method. The normalizing operation could greatly improve the convergence when optimizing different-scale parameters, and the advantage is demonstrated in Sec. 3.2.
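The difference between the CCG and NCG procedures can be illustrated with a small Python sketch of the Fletcher-Reeves direction and the local boundary step. The function names, the two-parameter values, and the handling of negative direction components through the lower bound are assumptions for illustration; using the same direction in both coordinate systems is a simplification that isolates the boundary effect.

```python
def fletcher_reeves(grad, prev_grad=None, prev_dir=None):
    """Steepest descent on the first call, Fletcher-Reeves update afterwards."""
    if prev_grad is None or prev_dir is None:
        return [-g for g in grad]
    beta = sum(g * g for g in grad) / sum(g * g for g in prev_grad)
    return [-g + beta * d for g, d in zip(grad, prev_dir)]


def max_step(x, d, lower, upper):
    """Largest step s for which x + s*d stays inside the global boundary."""
    ratios = []
    for xi, di, lo, hi in zip(x, d, lower, upper):
        if di > 0:
            ratios.append((hi - xi) / di)
        elif di < 0:
            ratios.append((lo - xi) / di)
    return min(ratios) if ratios else 0.0


# Hypothetical example: mask bias in [0, 40] nm and outer sigma in [0.5, 1.0].
lower, upper = [0.0, 0.5], [40.0, 1.0]
x = [20.0, 0.8]
d = [1.0, 1.0]

# Original scales: the sigma range caps the step, so the bias barely moves.
raw_step = max_step(x, d, lower, upper)
bias_move_raw = raw_step * d[0]  # roughly 0.2 nm

# Normalized scales: both parameters have the same room to move.
x_bar = [(xi - lo) / (hi - lo) for xi, lo, hi in zip(x, lower, upper)]
norm_step = max_step(x_bar, d, [0.0, 0.0], [1.0, 1.0])
bias_move_norm = norm_step * d[0] * (upper[0] - lower[0])  # roughly 16 nm
print(bias_move_raw, bias_move_norm)
```

In the unnormalized case the line search can explore only a sliver of the mask-bias range, which mirrors the behavior attributed to the CCG method; after normalization the same direction spans a large part of that range.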
The MPLCO is a multidimensional optimization method; thus, all parameters related to the mask, process, and lithography tool could be optimized simultaneously. The convergence robustness of the MPLCO is supported in three respects. First, the cost function is constructed to converge easily. As the DOF optimization is easily trapped at a local minimum, we add the image contrast and NILS as assistant objective functions. Second, the normalization in the MPLCO enables the optimization of different scale parameters. Third, the Hopfinger golden section search method used in the MPLCO is able to handle a nonunimodal cost function.
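For readers unfamiliar with golden section line searches, the following Python sketch shows the textbook version applied to a step length between the iterative point and the local boundary. It is not the Hopfinger variant of Ref. 24: the textbook method assumes a unimodal function on the interval, and the function name and tolerance are illustrative choices.

```python
import math


def golden_section_min(phi, lo, hi, tol=1e-6):
    """Textbook golden section search for a minimizer of phi on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0  # about 0.618
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = phi(c), phi(d)
    while b - a > tol:
        if fc < fd:
            # Minimizer lies in [a, d]; reuse c as the new upper probe.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = phi(c)
        else:
            # Minimizer lies in [c, b]; reuse d as the new lower probe.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = phi(d)
    return (a + b) / 2.0


# Hypothetical one-dimensional cost along a search direction,
# with s = 0 at the iterative point and s = 1 at the local boundary.
best_s = golden_section_min(lambda s: (s - 0.3) ** 2, 0.0, 1.0)
print(best_s)
```

Each iteration reuses one of the two previous probe values, so only one new evaluation of the cost is needed per step, which matters when every evaluation is a full lithography simulation.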

Simulation Conditions
Two typical geometries of a 45-nm node are used to demonstrate the validity of the MPLCO: the semidense line-space pattern and the dense contact hole. The duty ratios of these patterns are 1:2 and 1:1, respectively. The critical dimension (CD) of the line is 45 nm, while the CD of the contact hole is 60 nm. 3 The PROLITH programming interface is used for communication between the optimization algorithm and PROLITH. 25 The PROLITH simulation conditions are the same as those in Ref. 3 except for the parameters that are optimized. We use a water ($n_{\mathrm{water}} = 1.44$) immersion lithography system with the demagnification factor $R = 4$. The source shape is annular with a fixed sigma width $\sigma_{\mathrm{width}} = 0.15$. The illumination is Y-polarized for the line-space pattern and TE-polarized for the contact hole. 3 Since a tunable transmittance mask is available, the feature transmittance could be varied over a large range, 26 and an attenuated phase-shift mask is used to enhance the lithography capability. The well-calibrated photoresist JSR ARX2895JN is used with a thickness of 120 nm for the line-space pattern and 102 nm for the contact hole. 3 Dual bottom antireflective coatings (BARCs) are applied to reduce the substrate reflectivity and are optimized under the initial parameter settings. The thicknesses of the BARCs are 33 and 43 nm for the line-space pattern and 24 and 43 nm for the contact hole. The PW is constrained by the tolerances of $\Delta\mathrm{CD} = \pm 10\%\,\mathrm{CD}$, an 80-deg sidewall angle, and a 10% photoresist loss to maintain pattern fidelity. 27 It is necessary to use a robust photoresist model to optimize the process parameters. In order to accurately simulate the process conditions, a full physical photoresist model, a three-stage PEB model, and the Mack development model are used in PROLITH. 25 The full physical photoresist model is capable of simulating the reaction and diffusion of a chemically amplified photoresist accurately. 
The three-stage PEB model describes the temperature variation when the wafer is placed on a hotplate or chillplate and the wafer transition time between the two plates. 6 The three-stage temperature profile is represented by nine parameters: the bake temperature, the bake duration, and the temperature rise (or fall) time for each stage. The initial temperature is 25°C. The temperatures for the hotplate stage, transfer stage, and chillplate stage are 110, 45, and 25°C, respectively. The rise (or fall) times for these three stages are 5, 50, and 5 s, respectively. 25 Eight parameters including the mask, process, and lithography-tool parameters are selected as an example to demonstrate the validity of the MPLCO. For the mask parameters, the mask bias and feature transmittance are optimized. For the process parameters, the hotplate duration, transition duration, chillplate duration, and photoresist development time are chosen for the optimization. For the lithography-tool parameters, the outer sigma σ out of annular illumination and the NA are selected for the optimization.
The DOF at EL = 5% is chosen to evaluate the PW, where EL = 5% is a common value for 45-nm lithography and is closer to the actual case. 10 The goal of co-optimization is to extend the DOF at EL = 5% to be as large as possible. The DOF at EL = 3% is another objective function in the optimization. The weights for $P_{\mathrm{Contrast}}$, $P_{\mathrm{NILS}}$, $P_{\mathrm{DOF@EL=5\%}}$, and $P_{\mathrm{DOF@EL=3\%}}$ in $F$ are $w_1 = 0.2$, $w_2 = 0.1$, $\varepsilon_1 = 10$, and $\varepsilon_2 = 2$, respectively. Only one metrology plane is set for each pattern, and the metrology plane could cover a whole pitch. The PWs measured at the metrology planes are the overlap PWs for these two infinite and periodic patterns. The metrology plane for the line space is a cutline perpendicular to the line direction, while for the contact hole the cutline is set at the center of the hole. The optimization terminates after 20 iterations of the line search.
The unoptimized parameters for the line space and contact hole are set as the initial point $\{x\}^{(0)}$. The parameter ranges are chosen so that the values of the parameters are realizable in lithography. The initial value $\{x\}^{(0)}$, the minimum value (lower global boundary), the maximum value (upper global boundary), and the increment of the parameters $\{\Delta x\}$ are listed in Table 1. Owing to the different properties of these parameters, different increments $\Delta x_n$ and $\Delta \bar{x}_n$ are set to calculate the gradient for the CCG and NCG methods, respectively. It is noted that $\Delta x_n$ is the actual value of the parameter, while $\Delta \bar{x}_n$ is the normalized value.

Co-Optimization Results
The results are demonstrated in two aspects. We first present the improvement in the PW by using the MPLCO method. The convergence efficiency of the MPLCO by using the CCG (MPLCO-CCG) method and the proposed NCG (MPLCO-NCG) method are also compared. Then, the mask and source complexities are compared between the MPLCO method and the pixelated SMO in Ref. 3.
The optimized parameters of the MPLCO-CCG and MPLCO-NCG methods are listed in the last two columns in Table 1. The MPLCO-CCG results show no changes for the mask bias, PEB duration, and develop time, although the small-scale parameters are well optimized. This is mainly due to the drawback of the local boundary calculation when optimizing the different scale parameters. The optimization of the large-scale parameters is limited by the small-scale parameters in the optimization procedure. However, when all the parameters are normalized to the same scale in the MPLCO-NCG method, the optimization can move further. The normalization thus improves the convergence efficiency of the optimization. Figure 3 illustrates the convergence curves of the MPLCO-CCG and MPLCO-NCG methods, which show that the MPLCO-NCG method could effectively reduce the cost function.
The MPLCO-NCG method greatly decreases the hotplate and chillplate duration but increases the development time. The PEB mainly affects the reaction of chemically amplified photoresist, which thereby impacts the profile of the photoresist. The entire PEB durations are 29.96 and 99.67 s for the line-space pattern and contact hole, which are shorter than the initial setting of 100 s. The development time affects the CD of the photoresist, which has been optimized to 60 and 24.37 s for the line-space pattern and contact hole. The optimized process parameters ensure the exact chemical reaction of the photoresist with an acceptable profile in a large PW.
The EL versus DOF curves for the line-space pattern and contact hole are illustrated in Fig. 4. The curves denoted with triangles and stars represent the PW of the MPLCO-CCG and MPLCO-NCG methods, respectively. We use the dashed line to indicate the DOF at EL = 5% in the figure. The DOF at EL = 5% of the MPLCO-CCG and MPLCO-NCG methods are 0.349 and 0.558 μm for the line-space pattern, respectively, and 0.344 and 0.356 μm for the contact hole, respectively. The PW is greatly extended after co-optimization, especially when the mask, process, and lithography-tool parameters are well optimized by using the MPLCO-NCG method. The DOF at EL = 5% of the MPLCO-NCG method increases by 59.89 and 3.49% for the line-space pattern and contact hole, respectively, compared with the MPLCO-CCG method. This is mainly because the large-scale parameters are not optimized when using the MPLCO-CCG method. The comparison between MPLCO-CCG and MPLCO-NCG indicates that the more parameters are well optimized, the larger the PW that can be obtained. Because SMO was applied to extend the PW in Ref. 3, the EL versus DOF curve of that SMO result, denoted with squares, is also shown in Fig. 4. The DOF at EL = 5% of the SMO is 0.288 μm for the line-space pattern and 0.172 μm for the contact hole, which are much smaller than the DOF of MPLCO-NCG. The comparison between the MPLCO and SMO also supports the conclusion that a larger PW can be obtained when more parameters are involved in the optimization.
In order to demonstrate the merits of MPLCO and the impact of the process parameters on lithography, we optimize only the process parameters under the condition that the mask and source are optimized by the SMO in Ref. 3. The increments of the NCG method are $\Delta \bar{x}_n = 0.01$ for the line-space pattern and $\Delta \bar{x}_n = 0.1$ for the contact hole.
The parameter settings and optimized results are listed in Table 2. The EL versus DOF curves are shown in Fig. 4. The DOF at EL = 5% of the SMO with process optimization (PO) are 0.327 μm for the line-space pattern and 0.178 μm for the contact hole, which are 13.54 and 3.49% larger than those of the SMO. This verifies that the PW could be extended by optimizing the process parameters. From Fig. 4, we also find that the DOF at EL = 5% for the MPLCO-NCG method is still larger than that of the SMO with PO. Although the maximum EL of the MPLCO-NCG method is smaller than that of the SMO and the SMO with PO, the DOF at EL = 5% of the MPLCO-NCG method definitely increases. The results indicate that a large PW could be obtained only if the mask, process, and lithography-tool parameters are optimized simultaneously.
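The relative DOF gains quoted in this section follow directly from the reported DOF values at EL = 5%. The short Python check below recomputes them; the dictionary keys and the helper name `relative_gain` are illustrative, while the DOF values (in μm) are those reported in the text.

```python
def relative_gain(new, ref):
    """Relative increase of new over ref, in percent."""
    return 100.0 * (new - ref) / ref


# DOF at EL = 5% in micrometers, as reported in the text.
dof = {
    "line-space": {"SMO": 0.288, "SMO+PO": 0.327, "CCG": 0.349, "NCG": 0.558},
    "contact": {"SMO": 0.172, "SMO+PO": 0.178, "CCG": 0.344, "NCG": 0.356},
}

for pattern, d in dof.items():
    print(pattern,
          "NCG vs CCG: %.2f%%" % relative_gain(d["NCG"], d["CCG"]),
          "SMO+PO vs SMO: %.2f%%" % relative_gain(d["SMO+PO"], d["SMO"]))

# Line-space gain of MPLCO-NCG over the SMO result of Ref. 3.
print("NCG vs SMO (line-space): %.2f%%"
      % relative_gain(dof["line-space"]["NCG"], dof["line-space"]["SMO"]))
```

The same percentage, 3.49%, appears twice for the contact hole because the ratios 0.356/0.344 and 0.178/0.172 happen to be equal.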
The second advantage of the proposed MPLCO method is to reduce the mask and source complexity. A parametric mask and source are used in the MPLCO. For comparison, Fig. 5 shows the mask pattern and source shape of the SMO, MPLCO-CCG method, and MPLCO-NCG method. For the contact hole of the SMO, the mask pattern is much more complex compared with the MPLCO-CCG and MPLCO-NCG results. A complex mask is difficult to manufacture and will greatly increase the cost of mask fabrication. For the parametric mask, the mask pattern has only been biased rather than applying complex mask correction, which could reduce the complexity of the mask. The source shapes of the SMO results suffer from an intricate distribution with a nonuniform intensity. It is necessary to use a customized diffractive optical element or an expensive programmable illuminator to generate the pixelated source. However, one just needs to adjust the sigma value for the parametric source in which the source shape always has high uniformity and stability.
The mask error enhancement factor (MEEF) is also compared. The MEEFs of the SMO, MPLCO-CCG, and MPLCO-NCG results are 5.45, 4.74, and 4.47 for the line-space pattern, respectively, and 12.69, 15, and 15.39 for the contact hole, respectively. Overall, MPLCO brings almost no improvement in the MEEF compared with SMO: the MEEF of the MPLCO result is smaller than that of the SMO result for the line-space pattern but larger for the contact hole. Contact holes suffer the largest MEEFs of any feature type because a change in the mask size of a regular contact hole leads to a change in exposure dose. 27 The contact hole optimized by SMO has more clear areas at its corners, so a change in mask size leads to a smaller change in exposure dose than for the regular contact hole optimized by MPLCO. Therefore, the MEEF of the regular contact hole optimized by MPLCO is larger than that of the contact hole optimized by SMO.

Conclusion and Discussion
This paper proposes an MPLCO method that can optimize the mask, process, and lithography-tool parameters simultaneously. An NCG algorithm is used in the MPLCO to improve the convergence efficiency and is successfully applied to optimize different scale parameters. As the process parameters have a strong impact on lithography performance, it is necessary to use a full physical photoresist model to simulate the process behavior accurately when optimizing the process parameters. The PW of the SMO with an optimized process is larger than that of the SMO but still smaller than that of MPLCO. The results indicate that the more parameters are optimized simultaneously, the larger the PW that can be obtained.
In addition, the complexity of the mask and source could be greatly reduced by using the MPLCO compared with the SMO. The mask pattern has only been biased rather than applying complex mask correction, while the source shape is controlled by adjusting the sigma value. The assumption and simplification for the mask are sufficient when optimizing the one-dimensional line-space pattern and two-dimensional contact hole. The cost of fabrication could be greatly reduced by using the parametric mask and source. In our future work, we will model the complex two-dimensional mask pattern in parametric form and apply the MPLCO to it.