Mask-filtering-based inverse lithography

Abstract. We propose a new regularization framework for inverse lithography that regularizes masks directly by applying a mask filtering technique to improve computational efficiency and to enhance mask manufacturability. This technique is different from the conventional regularization method that regularizes a mask by incorporating various penalty functions to the cost function. We design a specific mask filter for this purpose. Moreover, we introduce a metric called edge distance error (EDE) to guide mask synthesis and establish the correlation between pattern error and edge placement error (EPE) via EDE. We prove that EDE has the same dimension as EPE and has a continuous expression as pattern error. Simulation results demonstrating the validity and efficiency of the proposed method are presented.


Introduction
With ever-shrinking feature sizes, the physical characteristics of the optics have an increasingly strong impact on the imaging system. In particular, the band-limited imaging system causes the output pattern to be a warped version of the input mask. [1] Several resolution enhancement techniques (RETs) have been developed to improve the performance of optical lithography. [1-3] Optical proximity correction (OPC) is one of these RETs. [4] Its objective is to synthesize an input mask that delivers a desired output pattern. Inverse lithography technique (ILT), as an active approach to OPC, is considered an economically viable way to meet various challenges in future technology nodes. The computational efficiency of ILT is a central concern, especially when handling a large-scale (or full-chip) optimization problem.
Generally, ILT treats mask synthesis as an inverse mathematical problem that aims to minimize a cost function measuring the difference between the output and desired patterns. Various computational techniques have been proposed in the literature to deal with this inverse problem, such as the level-set method, [5-8] the discrete cosine transform (DCT)-based method, [9] and the gradient-based method. [10-15] The level-set method treats a mask as a sophisticated continuum, [5-8] and consequently, the boundary of the mask is iteratively evolved according to an optimization algorithm. The DCT-based method transforms a mask to the frequency space using a two-dimensional DCT. [9] The low-frequency components of the mask are retained, and the corresponding coefficients are iteratively changed in the optimization process. As a result, the synthesized mask only possesses low-frequency components and is therefore less complex. Both techniques can produce a smooth mask contour, but both are limited in their ability to search the whole solution space. The gradient-based method treats a mask directly as a raster image made of pixels and synthesizes it pixel by pixel, iterating along a search direction given by steepest descent, conjugate gradient, and so on. [8,10-15] It is probably the most popular technique in the literature because of its high flexibility and its ease of understanding and implementation.
Since the mask is discretized into pixels in ILT, such flexibility often causes the synthesized mask to be a gray-level image that may contain small, unwanted block objects, such as isolated holes, protrusions, and jagged edges, which in turn cannot be realized in real manufacturing. [10-15] To address these problems, regularization approaches are introduced to guarantee that the synthesized mask is binary and less complex. In the literature, almost all regularization approaches take the regularization terms as penalty functions incorporated into a cost function with corresponding weights [10-15] as

$$\min_{M}\; \big\|\Gamma\{M\} - Z^{*}\big\|_2^2 + \sum_i \lambda_i R_i(M). \tag{1}$$

Here, the operator $\Gamma\{\cdot\}$ implements the forward mapping from the input mask $M$ to the output pattern, $Z^{*}$ is the desired output pattern, and $R_i(M)$ are the various regularization terms. Generally, regularization terms can be classified into two types: one is related to manufacturability, such as the quadratic penalty term, [11,12] the total variation penalty term, [11,12] and the wavelet penalty term; [13,15] the other is related to the fabrication process, such as the image slope term [14,16] and the mask error enhancement factor (MEEF). [17,18] $\lambda_i$ is the weight of the corresponding regularization term $R_i(M)$. It should be noted that $\lambda_i$ plays a critical role in the optimization process; however, how and why values of $\lambda_i$ are chosen is rarely discussed in the literature. In practice, it is usually set to a constant. [10-15]
Here, we take the quadratic penalty term as an example to illustrate how the weight $\lambda$ impacts the optimization process. In this case, both a low pattern error and a low mask quadratic error are desired, where the pattern error is calculated by $\|\Gamma\{M\} - Z^{*}\|_2^2$ and the mask quadratic error is equal to the value of the quadratic penalty term $R(M)$. As shown in Fig. 1, a smaller weight $\lambda$ of this quadratic penalty term results in rapid convergence of the pattern error. It is observed from Fig. 1(a) that the pattern error may meet the requirement after 44 iterations, although the mask quadratic error is still high at that point, so extra iterations are needed to reduce it. On the other hand, a larger $\lambda$ results in good performance on the mask quadratic error while causing poor convergence of the pattern error, as Fig. 1(d) reveals. As Fig. 1 shows, the pattern error and the mask quadratic error do not converge in step under such a regularization framework. Because a smaller $\lambda$ will not achieve the regularization effect, whereas a larger $\lambda$ may result in a larger pattern error, it is difficult to choose an appropriate constant value of $\lambda$. Moreover, the choice of $\lambda$ is closely related to the mask features and the simulation resolution. Mathematically, a sound approach is to adapt $\lambda$ at each iteration. However, this in turn increases the number of design variables and is generally difficult to accomplish.
Most recently, we proposed an alternative regularization framework that regularizes the mask directly by using a mask filtering technique. [19] In such a framework, the original cost function Eq. (1) is changed to Eq. (2) as

$$\min_{M}\; F\{S[M]\} = \big\|\Gamma\{S[M]\} - Z^{*}\big\|_2^2, \tag{2}$$

where $S[\cdot]$ is a mask filtering operator, the design of which is based on manufacturing constraints. Note that the pattern error of the filtered mask $S[M]$, rather than that of the original mask $M$, is calculated and iteratively reduced in the optimization process. Filtering is widely used in signal and image processing and has been used for many years in various application fields as a numerical method to ensure regularity or existence of solutions to engineering problems, such as structural topology optimization. [20,21] In this article, gray-level transitions and small, unwanted block objects in the mask are all interpreted as unwanted noise, and it is therefore natural to use filters to remove or prevent this noise in order to satisfy the manufacturing constraints. Section 2.4 details this mask filtering technique.
Moreover, we introduce a metric called edge distance error (EDE) to guide mask synthesis in the ILT framework and establish the correlation between pattern error and edge placement error (EPE) via EDE. EPE is popularly used in polygon-based OPC to convey critical dimension (CD) information; it is essentially the CD error at one side. [4] However, it is seldom used in an ILT framework because of its discrete form. One reason is that the gradient (or sensitivity) of EPE with respect to the mask, calculated by numerical differentiation, has a computational complexity of $O(K^2)$, where $K$ is the total number of mask pixels in the simulation area; this is significantly slower than an analytical gradient calculation with a computational complexity of $O[K\log(K)]$. [10,11] Therefore, pattern error instead of EPE is applied in an ILT framework for its continuous expression and high computational efficiency. [6-8,10-15] The pattern error employs an approximated, continuous resist model and is defined as the square of the $L_2$ norm of the difference between the output pattern of the input mask and the desired pattern, which makes it continuous and explicitly differentiable with respect to the input mask. [10,11] However, pattern error is a dimensionless quantity that depends strongly on the mask features and simulation parameters, such as simulation area and simulation resolution. For this reason, pattern error is not popular in industry. In this paper, we therefore introduce the metric EDE, which has the same dimension as EPE and a continuous expression like pattern error. A detailed description of EDE is given in Sec. 2.2.
In addition, as the CD decreases, the printed dimension becomes increasingly sensitive to fluctuations of the fabrication process, which limits the yield in the semiconductor industry. Instead of using process penalty terms, such as the image slope term and the MEEF, a statistical strategy is applied: a cost function is minimized under different process variations weighted by their statistical probability to enhance the robustness of layout patterns. [7,22-27] This method is directly related to the fabrication process, is well understood, and is easily accomplished, whereas using process penalty terms can be considered a roundabout regularization approach that requires a deeper understanding of mask topology and the imaging system.
The remainder of this paper is organized as follows. Section 2 details the proposed mask filtering technique. Section 3 provides simulation results to demonstrate the validity and efficiency of the proposed method. Finally, we draw some conclusions in Sec. 4.

Lithography Imaging Model
In this section, we review the general lithography imaging model in ILT. [10,15] Abstractly, the imaging process for optical lithography is mathematically described as

$$Z(r) = \Gamma\{M(r)\}, \tag{3}$$

where $r$ represents the spatial coordinates $(x, y)$, and the operator $\Gamma\{\cdot\}$ implements the forward mapping from the input mask $M(r)$ to the output pattern $Z(r)$. In practice, $\Gamma\{\cdot\}$ in Eq. (3) consists of the projection optics effect and the resist effect.
The projection optics effect, namely the optical image in resist $I(r)$, can be modeled as a pupil function with a partially coherent illumination source. [28] This is called the partially coherent imaging system, [29] which can be approximated by the sum of coherent systems method, [4] the optimal coherent approximation approach, [30] or the analytical circle-sampling technique [31] as the superposition of several coherent systems

$$I(r) = \sum_{q=1}^{Q} \mu_q\, \big|h_q(r) \otimes M(r)\big|^2. \tag{4}$$

Here, $h_q(r)$ is the $q$'th optical kernel, $\mu_q$ is the eigenvalue of the $q$'th kernel with $Q$ kernels in total, and $\otimes$ denotes the two-dimensional convolution. The resist effect can be approximated by a constant threshold resist model using the following logarithmic Sigmoid function: [11]

$$\mathrm{sig}[I(r)] = \frac{1}{1 + \exp\{-a\,[I(r) - t]\}}, \tag{5}$$

with $a$ being the steepness of the Sigmoid function and $t$ being the threshold. In practice, $t$ is equal to the threshold level of the resist. Combining Eqs. (4) and (5), we can write the lithography imaging equation as

$$Z(r) = \Gamma\{M(r)\} = \frac{1}{1 + \exp\Big\{-a\Big[\sum_{q=1}^{Q}\mu_q\,\big|h_q(r)\otimes M(r)\big|^2 - t\Big]\Big\}}. \tag{6}$$
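As a rough illustration of Eqs. (4) to (6), the following Python sketch evaluates the forward model on a tiny grid. The single 3 × 3 Gaussian kernel, its unit weight, the toy mask, and the helper names (`conv2_same`, `aerial_image`, `resist_pattern`) are assumptions made for illustration only; real optical kernels $h_q$ come from a decomposition of the partially coherent system and are generally complex valued.

```python
import math

def conv2_same(img, ker):
    """Zero-padded 2-D convolution that keeps the image size (kernel may be complex)."""
    H, W = len(img), len(img[0])
    kh, kw = len(ker), len(ker[0])
    ch, cw = kh // 2, kw // 2
    out = [[0j] * W for _ in range(H)]
    for y in range(H):
        for x in range(W):
            acc = 0j
            for i in range(kh):
                for j in range(kw):
                    yy, xx = y - (i - ch), x - (j - cw)
                    if 0 <= yy < H and 0 <= xx < W:
                        acc += ker[i][j] * img[yy][xx]
            out[y][x] = acc
    return out

def aerial_image(mask, kernels, weights):
    """Eq. (4): I = sum_q mu_q * |h_q (conv) M|^2."""
    H, W = len(mask), len(mask[0])
    I = [[0.0] * W for _ in range(H)]
    for h, mu in zip(kernels, weights):
        field = conv2_same(mask, h)
        for y in range(H):
            for x in range(W):
                I[y][x] += mu * abs(field[y][x]) ** 2
    return I

def resist_pattern(I, a=100.0, t=0.7):
    """Eq. (5): Sigmoid resist model with steepness a and threshold t."""
    return [[1.0 / (1.0 + math.exp(-a * (v - t))) for v in row] for row in I]

# Hypothetical single real kernel: a normalized 3x3 Gaussian (sigma = 1 pixel).
g = [[math.exp(-((i - 1) ** 2 + (j - 1) ** 2) / 2.0) for j in range(3)] for i in range(3)]
norm = sum(map(sum, g))
kernel = [[v / norm for v in row] for row in g]

# Toy mask: a rectangle of ones (5 rows by 3 columns) on a 9x9 grid.
mask = [[1.0 if 2 <= y <= 6 and 3 <= x <= 5 else 0.0 for x in range(9)] for y in range(9)]
Z = resist_pattern(aerial_image(mask, [kernel], [1.0]))  # Eq. (6)
```

The defaults `a=100` and `t=0.7` follow the values quoted in the simulation section. A practical implementation would replace the explicit loops with FFT-based convolution, which is where the $O[K\log(K)]$ cost mentioned in the introduction comes from.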

Edge Distance Error
Due to the low-pass nature of the optical imaging system, $Z(r)$ is typically a blurred version of $M(r)$. Generally, the $L_2$ norm is employed as a metric to evaluate the difference between the output pattern $Z(r)$ and the desired pattern $Z^{*}(r)$ as

$$F\{M\} = \big\|Z(r) - Z^{*}(r)\big\|_2^2 = \sum_{r}\big[Z(r) - Z^{*}(r)\big]^2. \tag{7}$$

Here, $F\{M\}$ is called pattern error or fidelity error. The only difference between the two is that fidelity error uses a Sigmoid function to characterize the resist effect, whereas pattern error uses a step function. Their values are almost the same as long as the steepness $a$ of the Sigmoid function is large enough. Therefore, in this paper, we call $F\{M\}$ pattern error without distinguishing between them. Note that pattern error is a continuous function, and hence the gradient of $F\{M\}$ with respect to the mask can be calculated analytically. However, this metric is not intuitive: its magnitude is not directly related to the CD error and depends strongly on the mask features and simulation parameters, such as the simulation grid size. In other words, different simulation parameters result in a different pattern error even for the same pattern.
Therefore, we derive a metric from pattern error and explicitly relate it to the EPE commonly used in industry. This metric, EDE, should convey CD information and be independent of the mask features and simulation parameters. Figure 2(a) depicts the pixel-based representation of a mask pattern and its output pattern on the wafer, where the red dots are discrete sampling elements (pixels) of the patterns, $S_{\mathrm{shadow}}$ denotes the absolute difference area between the desired pattern contour and the output pattern contour, and $L$ is the perimeter of the desired pattern contour. EDE is defined as

$$\mathrm{EDE} = \frac{S_{\mathrm{shadow}}}{L}. \tag{8}$$

This means that EDE has the dimension of length and thus has an intuitive physical meaning.
Assuming the grid size is small enough in Fig. 2(a), the absolute difference area $S_{\mathrm{shadow}}$ can be approximated by multiplying the total number of elements in shadow by the element area as

$$S_{\mathrm{shadow}} = N\cdot(\delta_x\cdot\delta_y). \tag{9}$$

Here, $N$ is the total number of red dots (elements) in shadow, and $\delta_x$ and $\delta_y$ are the lengths of the element along the $x$ and $y$ directions, respectively, as shown in Fig. 2(a). Since the value of each element in the output pattern is either 0 or 1, according to the definition of pattern error in Eq. (7), the number $N$ is approximately equal to the pattern error, namely, $N = F\{M\}$. So, the absolute difference area $S_{\mathrm{shadow}}$ can be expressed as

$$S_{\mathrm{shadow}} = F\{M\}\cdot(\delta_x\cdot\delta_y). \tag{10}$$

Substituting Eq. (10) into Eq. (8), we have the expression of EDE as

$$\mathrm{EDE} = F\{M\}\cdot\frac{\delta_x\cdot\delta_y}{L}. \tag{11}$$

Note that Eq. (11) directly relates EDE to pattern error $F\{M\}$. The factor $(\delta_x\cdot\delta_y)/L$ is a constant determined by the simulation resolution and the desired pattern, and it converts the dimensionless pattern error into a quantity with the dimension of length, like EPE. This means that EDE is continuous like pattern error, and the computational complexity of EDE is the same as that of pattern error, i.e., $O(N)$, where $N$ is the total number of elements in shadow as shown in Fig. 2(a).
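Equation (11) can be sketched in a few lines of Python. The helper names and the toy 8 × 8 patterns below are hypothetical, and the perimeter is obtained by counting exposed pixel edges, which is an assumption of this sketch: for a real layout, $L$ would be taken from the design polygons.

```python
def pattern_error(Z, Z_target):
    """Eq. (7): squared L2 distance between output and desired patterns."""
    return sum((z - zt) ** 2 for rz, rt in zip(Z, Z_target) for z, zt in zip(rz, rt))

def contour_length(target, dx, dy):
    """Perimeter of a binary pattern, counted along exposed pixel edges."""
    H, W = len(target), len(target[0])

    def at(y, x):
        return target[y][x] if 0 <= y < H and 0 <= x < W else 0

    length = 0.0
    for y in range(H):
        for x in range(W):
            if target[y][x]:
                # exposed left/right edges have length dy, top/bottom edges dx
                length += dy * ((not at(y, x - 1)) + (not at(y, x + 1)))
                length += dx * ((not at(y - 1, x)) + (not at(y + 1, x)))
    return length

def ede(Z, Z_target, dx, dy):
    """Eq. (11): EDE = F{M} * (dx * dy) / L."""
    return pattern_error(Z, Z_target) * dx * dy / contour_length(Z_target, dx, dy)

# Desired pattern: a 4x4-pixel square. Output pattern: the same square missing its top row.
target = [[1 if 2 <= y <= 5 and 2 <= x <= 5 else 0 for x in range(8)] for y in range(8)]
output = [[1 if 3 <= y <= 5 and 2 <= x <= 5 else 0 for x in range(8)] for y in range(8)]
value_nm = ede(output, target, dx=2.5, dy=2.5)
print(value_nm)  # 0.625
```

Here four pixels differ, so the shadow area is 4 × 6.25 nm² = 25 nm², the perimeter is 16 × 2.5 nm = 40 nm, and the EDE is 0.625 nm.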
Alternatively, the absolute difference area $S_{\mathrm{shadow}}$ can be formulated as an integral of EPE taken along the closed desired pattern contour curve $C$:

$$S_{\mathrm{shadow}} = \oint_{C} \big|\mathrm{EPE}(p)\big|\, dl, \tag{12}$$

where $p$ denotes an infinitesimal segment on the desired pattern contour curve and $dl$ is the corresponding segment length. When the pattern contour curve is discretized into a finite number of segments, the pattern is represented as multiple polygons; this representation is popularly used in polygon-based OPC. In this case, as shown in Fig. 2(b), the absolute difference area $S_{\mathrm{shadow}}$ can be approximated as

$$S_{\mathrm{shadow}} \approx \sum_{i} \big|\mathrm{EPE}(p_i)\big|\cdot l_i. \tag{13}$$

Here, $p_i$ is the $i$'th segment, and $l_i$ is the corresponding length of the segment $p_i$. Substituting Eq. (13) into Eq. (8), EDE can be alternatively expressed as

$$\mathrm{EDE} \approx \frac{\sum_{i} \big|\mathrm{EPE}(p_i)\big|\cdot l_i}{L} = \frac{\sum_{i} \big|\mathrm{EPE}(p_i)\big|\cdot l_i}{\sum_{i} l_i}. \tag{14}$$

Therefore, EDE may be interpreted as the mean EPE, more precisely the length-weighted mean of the absolute EPE along the contour. Equations (11) and (14) establish the correlation between pattern error and EPE, and these two metrics are in this sense equivalent via EDE. Any of pattern error, EPE, or EDE can act as a metric (or cost function) to guide mask synthesis. However, since EDE has the same dimension as EPE and a continuous expression like pattern error, it is preferable to the other two.
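The polygon-based reading of EDE in Eqs. (13) and (14) amounts to a length-weighted average. A minimal sketch follows; the function name and the four-segment example values are made up for illustration.

```python
def ede_from_epe(epe_values, segment_lengths):
    """Eqs. (13)-(14): length-weighted mean of |EPE| over contour segments."""
    shadow_area = sum(abs(e) * l for e, l in zip(epe_values, segment_lengths))  # Eq. (13)
    return shadow_area / sum(segment_lengths)  # Eq. (14), with L = sum of l_i

# Hypothetical closed contour split into four segments (all values in nm).
epe_nm = [2.0, -1.0, 0.0, 3.0]
len_nm = [10.0, 20.0, 10.0, 20.0]
mean_epe = ede_from_epe(epe_nm, len_nm)  # (20 + 20 + 0 + 60) / 60
```

Because the absolute value is taken per segment, positive and negative placement errors do not cancel, which matches the use of the absolute difference area in Eq. (8).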
Furthermore, EDE can convey local CD information and can be weighted by adding metrology windows. In practice, customers are sometimes concerned only with some special locations (hotspots) in the resist. In this case, we add a window function around the hotspots as shown in Fig. 3. The value inside the metrology windows is usually set to 1 and that outside to 0. The weighted (or local) area $S^{w}_{\mathrm{shadow}}$ is expressed as

$$S^{w}_{\mathrm{shadow}} = N_w\cdot(\delta_x\cdot\delta_y). \tag{15}$$

Here, $N_w$ is the total number of elements in shadow inside the metrology windows as shown in Fig. 3 and is approximately equal to the weighted pattern error $F_w\{M\}$ as

$$N_w \approx F_w\{M\} = \sum_{r} w(r)\,\big[Z(r) - Z^{*}(r)\big]^2, \tag{16}$$

where $w(r)$ is a window function represented by rectangular functions with appropriate shifting and scaling constants. Hence, the weighted EDE in this case is

$$\mathrm{EDE}_w = \frac{S^{w}_{\mathrm{shadow}}}{L_w} = F_w\{M\}\cdot\frac{\delta_x\cdot\delta_y}{L_w}, \tag{17}$$

where $L_w$ is the length of the desired pattern contour within the metrology windows. For simplicity, $G(M)$ is used to represent the weighted EDE. Generally, $G(M)$ is treated as a cost function to guide mask synthesis under nominal conditions, i.e., without defocus, dose variations, and so on. To enhance process robustness, process variations should be taken into account during mask synthesis. Here, we use the expectation of the weighted EDE under different variations as a cost function, expressed by

$$J(M) = \zeta\big[\psi(v)\cdot G(M; v)\big], \tag{18}$$

where $\zeta$ denotes the expectation operation over $v$; $v$ is a vector representing a combination of multiple process variations including, for example, defocus, exposure dose variation, and lens aberrations; and $\psi(v)$ is the statistical probability of the corresponding process variations, which is defined by users and is usually obtained via experiments or measurements of lithographic tools. $J(M)$ is called the statistical EDE and is used as a cost function to guide the mask synthesis. The gradient of $J(M)$ with respect to the mask $M$ will be used in the optimization process. According to Refs. 8, 13, and 15, the gradient of $J(M)$ with respect to the mask $M$ is given as

$$\nabla J(M) = \zeta\Big\{\psi(v)\,\frac{2a\,\delta_x\delta_y}{L_w}\sum_{q=1}^{Q}\mu_q\Big(h_q^{\mathrm{flip}}(r; v)\otimes\big[w\odot(Z - Z^{*})\odot Z\odot(1 - Z)\odot\big(h_q^{\dagger}(r; v)\otimes M\big)\big] + h_q^{\mathrm{flip}\,\dagger}(r; v)\otimes\big[w\odot(Z - Z^{*})\odot Z\odot(1 - Z)\odot\big(h_q(r; v)\otimes M\big)\big]\Big)\Big\}, \tag{19}$$

where $\dagger$ means the conjugate operator; $\odot$ denotes the element-wise product; $Z$ is the output pattern under the process variations $v$; $a$ is the steepness of the Sigmoid function in Eq. (5); $h_q(r; v)$ is the $q$'th optical kernel under the process variations $v$; and $h_q^{\mathrm{flip}}(r; v)$ is the up-down and left-right flip of $h_q(r; v)$.
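A small sketch may help to see how the metrology window and the process statistics enter the cost. It assumes that the expectation in Eq. (18) is taken as a probability-weighted sum over a discrete set of process conditions; the one-row patterns, the window, the contour length `L_w`, and the probabilities are all hypothetical values.

```python
def weighted_pattern_error(Z, Z_target, w):
    """Weighted pattern error: sum of w * (Z - Z*)^2 over all pixels."""
    return sum(wi * (z - zt) ** 2
               for rz, rt, rw in zip(Z, Z_target, w)
               for z, zt, wi in zip(rz, rt, rw))

def weighted_ede(Z, Z_target, w, dx, dy, L_w):
    """Weighted EDE G(M): weighted pattern error times pixel area over L_w."""
    return weighted_pattern_error(Z, Z_target, w) * dx * dy / L_w

def statistical_ede(ede_values, probabilities):
    """Eq. (18) on a discrete set of conditions: J = sum_v psi(v) * G(M; v)."""
    return sum(p * g for p, g in zip(probabilities, ede_values))

target = [[0, 1, 1, 1, 1, 0]]
window = [[0, 0, 1, 1, 1, 1]]            # metrology window covers the right half only
outputs = {                              # output patterns under three assumed conditions
    "nominal":  [[0, 0, 1, 1, 1, 0]],    # error lies outside the window
    "defocus":  [[0, 0, 1, 1, 0, 0]],    # one error pixel inside the window
    "overdose": [[0, 1, 1, 1, 1, 1]],    # one error pixel inside the window
}
psi = {"nominal": 0.5, "defocus": 0.25, "overdose": 0.25}

G = {k: weighted_ede(Z, target, window, dx=2.5, dy=2.5, L_w=5.0) for k, Z in outputs.items()}
J = statistical_ede([G[k] for k in outputs], [psi[k] for k in outputs])
```

With these numbers, the nominal condition contributes nothing because its only error pixel is outside the window, while the other two conditions each give a weighted EDE of 1.25 nm, so the statistical EDE is 0.625 nm.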

Inverse Lithography Problem Definition and Regularization
The objective of inverse lithography is to synthesize an input mask that delivers a desired output pattern. To guarantee the manufacturability of the synthesized mask, the mask quadratic error and complexity should be considered. The quadratic metric $R_Q(M)$ and the complexity metric $R_{TV}(M)$, i.e., total variation, are usually adopted to quantify the corresponding performance. In this paper, we focus on the binary mask. So, the quadratic metric $R_Q(M)$ and the complexity metric $R_{TV}(M)$ are expressed, respectively, [11,12] as

$$R_Q(M) = \sum_{k=1}^{\Omega} 4\,M_k\,(1 - M_k), \tag{20}$$

$$R_{TV}(M) = \|D M\|_1, \tag{21}$$

where $\Omega$ is the simulation area or the number of pixels in $M$, $M_k$ is the value of the $k$'th pixel, $\|\cdot\|_1$ is the $L_1$ norm, and $D$ is an operator of the first derivative

$$D = \Big[\frac{\partial}{\partial x},\ \frac{\partial}{\partial y}\Big]^{T}. \tag{22}$$

Therefore, combining the optimization objectives of the mask quadratic error, the complexity, and the statistical EDE, we state the inverse lithography problem as
Finding $M^{*}(r)$ to minimize: $J(M)$, $R_Q(M)$, and $R_{TV}(M)$
subject to: $0 \le M \le 1$.
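The two manufacturability metrics can be sketched as follows. The quadratic form $4M(1-M)$ and the forward-difference total variation used here follow the penalty terms commonly used in the cited pixel-based ILT literature; the exact normalization of the source equations is not recoverable, so both functions should be read as assumptions of this sketch.

```python
def quadratic_metric(M):
    """Quadratic (binarization) metric: sum of 4*M*(1-M), zero for a binary mask."""
    return sum(4.0 * m * (1.0 - m) for row in M for m in row)

def total_variation(M):
    """L1 norm of forward differences along x and y."""
    H, W = len(M), len(M[0])
    tv = 0.0
    for y in range(H):
        for x in range(W):
            if x + 1 < W:
                tv += abs(M[y][x + 1] - M[y][x])
            if y + 1 < H:
                tv += abs(M[y + 1][x] - M[y][x])
    return tv

gray = [[0.0, 0.5], [1.0, 1.0]]     # one gray pixel
binary = [[0.0, 0.0], [1.0, 1.0]]   # fully binary mask
rq_gray, rq_binary = quadratic_metric(gray), quadratic_metric(binary)
tv_gray = total_variation(gray)
```

Each pixel contributes at most 1 to the quadratic metric (at $M = 0.5$) and nothing when it is exactly 0 or 1, which is why this metric measures how far a mask is from binary.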

Mask Filtering Method
In this section, we propose an alternative method to solve this multiobjective minimization problem. We first interpret gray-level transitions and small, unwanted block objects, such as isolated holes, protrusions, jagged edges, or other layouts that cannot be fabricated, as unwanted noise in the mask, and then we design a specific filter $S[\cdot]$ to remove or prevent this noise to satisfy manufacturing constraints:

$$\hat{M} = S[M]. \tag{23}$$

After the filtering process, the quadratic metric $R_Q(\hat{M})$ and the complexity metric $R_{TV}(\hat{M})$ of the filtered mask $\hat{M}$ are rather small. Then, we calculate the cost function of this filtered mask:

$$J(S[M]) = \zeta\big[\psi(v)\cdot G(S[M]; v)\big]. \tag{24}$$

Thus, the multiobjective minimization problem is converted into a simpler single-objective minimization problem as
Finding $M^{*}(r)$ to minimize: $J(S[M])$
subject to: $0 \le M \le 1$.
We employ an iterative method to solve this problem. In the iteration process, we ensure that the statistical EDE of the filtered mask, i.e., $J(S[M])$, decreases iteratively. Note that each obtained mask is filtered and therefore satisfies all the optimization objectives except for the statistical EDE; namely, it satisfies the manufacturing constraints. As a result, we only need to reduce the statistical EDE of this filtered mask. This approach is called the mask filtering technique.
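The filter operator $S[\cdot]$ can be designed based on different mask manufacturing rules. The most basic filter should map the gray-level image to values close to a distinct 0 or 1 and guarantee that the mask is less complex. We therefore define a basic mask filter as

$$S[M](r) = \frac{1}{1 + \exp\{-a_S\,[(O \otimes M)(r) - t_S]\}}, \tag{25}$$

$$O(r) = \tau\,\exp\Big(-\frac{\|r - r_0\|^2}{2\sigma_O^2}\Big), \tag{26}$$

$$\tau = \Big[\sum_{r \in O}\exp\Big(-\frac{\|r - r_0\|^2}{2\sigma_O^2}\Big)\Big]^{-1}. \tag{27}$$

Here, the steepness of this Sigmoid function is $a_S$ and the threshold is $t_S$. $O$ is a Gaussian filter of width $\sigma_O$ that relieves mask complexity, where $r_0$ is the center point and $\|r - r_0\|$ means the distance from $r$ to $r_0$. $\tau$ is the normalized weight of the Gaussian filter, where the sum in Eq. (27) runs over the $\Omega_1$ pixels of $O(r)$. The gradient of $S[\cdot]$ with respect to $M$ is given as

$$\frac{\partial S[M](r)}{\partial M(r_1)} = a_S\,S[M](r)\,\{1 - S[M](r)\}\,O(r - r_1). \tag{28}$$

The detailed derivation of Eq. (28) is given in the Appendix.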
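The two stages of the basic filter, Gaussian smoothing followed by a steep Sigmoid, can be sketched as below. The 5 × 5 kernel with a width of one pixel and the 9 × 9 toy mask are assumptions chosen to keep the example small (the simulations in this paper use a 21 × 21 kernel); the Gaussian is written in the standard $\exp(-\|r - r_0\|^2 / 2\sigma^2)$ form, which is also an assumption of this sketch.

```python
import math

def gaussian_kernel(size, sigma):
    """Gaussian kernel normalized so that its weights sum to 1."""
    c = size // 2
    g = [[math.exp(-((i - c) ** 2 + (j - c) ** 2) / (2.0 * sigma ** 2))
          for j in range(size)] for i in range(size)]
    tau = 1.0 / sum(map(sum, g))
    return [[tau * v for v in row] for row in g]

def conv2_same(img, ker):
    """Zero-padded 2-D convolution that keeps the image size."""
    H, W = len(img), len(img[0])
    kh, kw = len(ker), len(ker[0])
    ch, cw = kh // 2, kw // 2
    out = [[0.0] * W for _ in range(H)]
    for y in range(H):
        for x in range(W):
            for i in range(kh):
                for j in range(kw):
                    yy, xx = y - (i - ch), x - (j - cw)
                    if 0 <= yy < H and 0 <= xx < W:
                        out[y][x] += ker[i][j] * img[yy][xx]
    return out

def mask_filter(M, O, a_s=300.0, t_s=0.5):
    """Sigmoid of the Gaussian-blurred mask, with steepness a_s and threshold t_s."""
    blurred = conv2_same(M, O)
    return [[1.0 / (1.0 + math.exp(-a_s * (v - t_s))) for v in row] for row in blurred]

# Toy mask: a solid 5x5 block plus one isolated pixel in the top-right corner.
M = [[1.0 if 2 <= y <= 6 and 2 <= x <= 6 else 0.0 for x in range(9)] for y in range(9)]
M[0][8] = 1.0
S = mask_filter(M, gaussian_kernel(5, 1.0))
```

In this toy case the isolated pixel blurs to a value well below the threshold and is suppressed, while the interior of the block stays at 1.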
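Combining Eqs. (19) and (28), the gradient of $J(S[M])$ with respect to $M$ is

$$\nabla_M J(S[M]) = a_S\, O^{\mathrm{flip}} \otimes \Big\{ S[M] \odot \big(1 - S[M]\big) \odot \nabla J(\hat{M})\big|_{\hat{M} = S[M]} \Big\}, \tag{29}$$

where $\nabla J(\hat{M})$ is Eq. (19) evaluated at the filtered mask, $\odot$ denotes the element-wise product, and $O^{\mathrm{flip}}$ is the up-down and left-right flip of $O$ (equal to $O$ for a symmetric Gaussian). With the gradient Eq. (29), we apply a steepest descent method to solve this problem. [11] The optimization procedure is as follows.
Iteration 0: Since the value of the mask is bound constrained to [0, 1], we use the following parametric transformation:

$$M = \frac{1 + \cos\Theta}{2}. \tag{30}$$

Then, given a desired output pattern $Z^{*}(r)$, we compute the initial input mask $\Theta_0$ as

$$\Theta_0 = \arccos\big\{2\,\big[\kappa_1\,(H \otimes Z^{*}) + \kappa_2\big] - 1\big\}, \tag{31}$$

where $\kappa_1$ and $\kappa_2$ are parameters to adjust the initial value of the mask; for example, $\kappa_1 = 0.90$ and $\kappa_2 = 0.05$ in this paper. We do this because $M(i, j) = 0$ or 1 would reduce the gradient at location $(i, j)$ to 0 and would therefore reduce the optimization freedom. [15] $H(r)$ is a Gaussian function that makes the initial mask continuous so that the gradient with respect to the initial mask is smooth. $H(r)$ is defined as

$$H(r) = \eta\,\exp\Big(-\frac{\|r - r_0\|^2}{2\sigma_H^2}\Big), \tag{32}$$

where $r_0$ is the center point and $\|r - r_0\|$ means the distance from $r$ to $r_0$. $\Omega_2$ is the number of pixels in $H(r)$, and $\eta$ is the normalized weight

$$\eta = \Big[\sum_{r \in H}\exp\Big(-\frac{\|r - r_0\|^2}{2\sigma_H^2}\Big)\Big]^{-1}, \tag{33}$$

with the sum running over the $\Omega_2$ pixels of $H(r)$. Finally, we calculate the initial gradient

$$\nabla_{\Theta_0} J = -\tfrac{1}{2}\,\sin\Theta_0 \odot \nabla_M J(S[M])\big|_{M = M_0}.$$

Iteration k:
Step 1: Search the step length $\gamma_k \in \mathbb{R}$ in the direction $\nabla_{\Theta_k} J$,

$$\gamma_k = \arg\min_{\gamma}\, J\big(\Theta_k - \gamma\,\nabla_{\Theta_k} J\big). \tag{38}$$

Step 2: Update $\Theta_{k+1}$, $M_{k+1}$, and $S_{k+1}$,

$$\Theta_{k+1} = \Theta_k - \gamma_k\,\nabla_{\Theta_k} J, \tag{39}$$

$$M_{k+1} = \frac{1 + \cos\Theta_{k+1}}{2}, \tag{40}$$

$$S_{k+1} = S[M_{k+1}]. \tag{41}$$

Step 3: Calculate the gradient for the next iteration,

$$\nabla_{\Theta_{k+1}} J = -\tfrac{1}{2}\,\sin\Theta_{k+1} \odot \nabla_M J(S[M])\big|_{M = M_{k+1}}.$$

If the termination criterion is satisfied, go to Stop. Else, return to Step 1.
Stop: Obtain the optimized mask,

$$M_S = S[M_{k+1}].$$

In the above procedure, the iteration is terminated when $\|\nabla_{\Theta_{k+1}} J\| < \Lambda$ or $J(\Theta_{k+1}) < \Xi$ or $k > \Psi$, where $\Lambda$ is the prescribed minimum value of the gradient norm, $\Xi$ is the prescribed minimum value of the statistical EDE, and $\Psi$ is the prescribed upper limit of the number of iterations. The termination criterion $\|\nabla_{\Theta_{k+1}} J\| < \Lambda$ means that the iteration stops when the gradient is zero or very small.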
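The structure of the iteration (cosine parametrization, chain rule through the mask filter, steepest descent) can be sketched on a one-dimensional toy problem. Several simplifications are assumed here and are not part of the procedure above: the lithography cost $J$ is replaced by a stand-in squared error between the filtered mask and a target, the initial smoothing by $H$ is skipped, the filter steepness is lowered to 10 so that gradients do not vanish on such a small grid, the exact line search is replaced by simple backtracking from $\gamma = 0.3$, and the loop runs for a fixed 50 iterations.

```python
import math

def gauss1d(size, sigma):
    """Normalized 1-D Gaussian kernel (weights sum to 1)."""
    c = size // 2
    g = [math.exp(-((i - c) ** 2) / (2.0 * sigma ** 2)) for i in range(size)]
    total = sum(g)
    return [v / total for v in g]

def conv_same(x, k):
    """Zero-padded 1-D convolution that keeps the signal length."""
    c, n = len(k) // 2, len(x)
    return [sum(k[j] * x[i - (j - c)] for j in range(len(k)) if 0 <= i - (j - c) < n)
            for i in range(n)]

A_S, T_S = 10.0, 0.5            # toy filter steepness and threshold (assumed values)
O = gauss1d(5, 1.0)             # toy Gaussian filter
TARGET = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0]

def to_mask(theta):
    """Parametric transformation M = (1 + cos(theta)) / 2, which keeps M in [0, 1]."""
    return [(1.0 + math.cos(t)) / 2.0 for t in theta]

def mask_filter(M):
    return [1.0 / (1.0 + math.exp(-A_S * (u - T_S))) for u in conv_same(M, O)]

def cost(theta):
    """Stand-in cost: squared error between the filtered mask and the target."""
    return sum((s - z) ** 2 for s, z in zip(mask_filter(to_mask(theta)), TARGET))

def grad_theta(theta):
    S = mask_filter(to_mask(theta))
    dJ_dS = [2.0 * (s - z) for s, z in zip(S, TARGET)]
    inner = [A_S * s * (1.0 - s) * g for s, g in zip(S, dJ_dS)]
    dJ_dM = conv_same(inner, O)          # O is symmetric, so its flip equals O
    return [-0.5 * math.sin(t) * g for t, g in zip(theta, dJ_dM)]

# Iteration 0: initial mask 0.90 * target + 0.05, mapped to theta.
theta = [math.acos(2.0 * (0.90 * z + 0.05) - 1.0) for z in TARGET]
history = [cost(theta)]
for k in range(50):
    g = grad_theta(theta)
    gamma = 0.3
    while gamma > 1e-8:                  # backtracking: halve the step until the cost drops
        trial = [t - gamma * gi for t, gi in zip(theta, g)]
        if cost(trial) < history[-1]:
            theta = trial
            break
        gamma *= 0.5
    history.append(cost(theta))
filtered = mask_filter(to_mask(theta))   # every iterate is read out through the filter
```

The point of the sketch is that the quantity being reduced is always the cost of the filtered mask, so each intermediate result is already regularized; swapping the stand-in cost for the statistical EDE and its gradient Eq. (19) would give the procedure described above.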

Simulations
Simulations were performed on a partially coherent imaging system with annular source illumination whose outer radius was $\sigma_{\mathrm{out}} = 0.7$ and whose inner radius was $\sigma_{\mathrm{in}} = 0.4$. The wavelength in the simulations was set at 193 nm, and the numerical aperture (NA) was 1.35. The resist effect was approximated by a Sigmoid function with $a = 100$ and $t = 0.7$. The Gaussian filter $O(r)$ consisted of 21 × 21 pixels with $\sigma_O = 4$. The parameters of the Sigmoid function in the proposed filter $S[\cdot]$ were $a_S = 300$ and $t_S = 0.5$. The parameters $\kappa_1$ and $\kappa_2$ of the initial mask in Eq. (31) were 0.90 and 0.05, respectively; $H(r)$ consisted of 21 × 21 pixels with $\sigma_H = 2$. The window function $w(r)$ had the same size as the mask image, and all its values were set to 1. Instead of computing the step length $\gamma_k$ in Eq. (38) accurately, we set $\gamma_k$ to a constant 0.3 in each iteration. Since this paper focuses on developing a new regularization framework, process variations are not taken into consideration in the simulations; that is, $v$ is the nominal process condition and therefore $\psi(v) = 1$. All the simulations were carried out with in-house MATLAB codes on an HP Z800 (3.47 GHz Xeon) workstation running a 64-bit Windows 7 operating system.
Figure 4 depicts an example of a desired pattern and its output pattern on the wafer. In this case, the true absolute area between the desired pattern contour and its output pattern contour is $1.853 \times 10^{4}$ nm$^2$, the perimeter of the desired pattern is $1.70 \times 10^{3}$ nm, and therefore, the true EDE is 10.90 nm. Table 1 summarizes the relative error compared to the true EDE when using different pixel grid sizes. The EDE in Table 1 is calculated by Eq. (11), and the pattern error is calculated by Eq. (7). From Table 1, it is observed that the magnitude of pattern error varies with the pixel grid size, whereas EDE does not. When the pixel grid size is small enough (e.g., 0.5 nm), the EDE calculated by the proposed method is approximately equal to the true EDE. As the pixel grid size increases, the accuracy of EDE remains acceptable. So, the EDE calculated by the proposed method can be used to guide mask synthesis.
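The grid-size behavior described above follows directly from Eqs. (9) and (11), as the short calculation below illustrates. It uses the area and perimeter quoted in the text and an idealized pixel count $N = S_{\mathrm{shadow}}/(\delta_x \delta_y)$, so discretization error is ignored; the list of grid sizes is hypothetical and the numbers are not a reproduction of Table 1.

```python
area_nm2 = 1.853e4       # true absolute area between the contours (from the text)
perimeter_nm = 1.70e3    # perimeter of the desired pattern (from the text)
ede_true = area_nm2 / perimeter_nm            # Eq. (8)

rows = []
for delta in (0.5, 1.0, 2.5, 5.0):            # assumed pixel grid sizes in nm
    pattern_error = area_nm2 / delta ** 2     # idealized N = S_shadow / (dx * dy), Eq. (9)
    ede = pattern_error * delta ** 2 / perimeter_nm   # Eq. (11)
    rows.append((delta, pattern_error, ede))

for delta, pattern_error, ede in rows:
    print(f"grid {delta:4.1f} nm  pattern error {pattern_error:10.1f}  EDE {ede:6.2f} nm")
```

The pattern error changes by a factor of 100 between the 0.5 nm and 5 nm grids, while the EDE stays at 10.90 nm, because the pixel area in Eq. (11) cancels the grid dependence of the pixel count.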

Mask Filter
As shown in Eq. (25), the proposed mask filter consists of two portions: a Gaussian convolution operation and a Sigmoid (or thresholding) operation. Figure 5 demonstrates these filtering operations, where $O(r)$ is a Gaussian filter with a size of 21 × 21 pixels and $\sigma_O = 4$; $M$ is an intermediate mask pattern of the kind commonly encountered during the optimization process of ILT, with a size of 321 × 321 pixels and a grid resolution of 2.5 nm; and $S[M]$ is the filtered pattern calculated by Eq. (25). As expected, the Gaussian convolution operation, $O \otimes M$, weakens the weight of the small details in $M$, and the Sigmoid operation leads to a sharper contour. As a result, the filtered pattern $S[M]$ has a lower complexity, and its mask quadratic error (denoted as QE in Fig. 5) is reduced from $4.28 \times 10^{4}$ to 531.
Figure 6 presents another set of simulations for the proposed filter, where $M$ is an input mask pattern with a size of 321 × 321 pixels and a grid resolution of 2.5 nm. Some objects that are difficult to manufacture in practice were artificially introduced into this input mask pattern, including small isolated holes, protrusions, hollows, and irregular features, shown inside the red circles. From the perspective of signal processing, these details can be considered high-frequency noise in the mask and can be evaluated with total variation. [11,12] As revealed in Fig. 6, the total variation (denoted as TV in Fig. 6) of $M$ is reduced from 3080 to 2416 by the Gaussian convolution operation, which removes these small details. The subsequent Sigmoid operation leads to a close-to-binary mask with a total variation of 2677, which is 13.1% lower than that of the original mask $M$. On the other hand, it is interesting to find that the EDE values of the output patterns of the mask $M$, the Gaussian-filtered mask, and the filtered mask $S[M]$ are almost the same. That is because the optical lithography system, with its low-pass nature, does not deliver high-frequency details to the output pattern on the wafer. Similar to the optical lithography system, the mask filter acts as a low-pass filter that removes the details produced in ILT without distorting the output pattern on the wafer. As demonstrated in Figs. 5 and 6, the proposed filter reduces the mask complexity and achieves a close-to-binary mask, so that the filtered mask $S[M]$ can be manufactured in practice.
Figure 7 shows the simulated images obtained with the proposed method for a desired pattern with a CD of 45 nm. The optimization is terminated after 200 iterations. The desired pattern $M^{*}$, which is commonly encountered in the design of static random access memory circuits, consists of 321 × 321 pixels with a grid resolution of 2.5 nm. As expected, the mask patterns optimized by the proposed method achieve a much smaller EDE than that obtained by simply inputting the desired pattern $M^{*}$ as the mask pattern. It is also observed that the optimized gray mask $M_S$ is very close to the postprocessed mask $M_P$ and reaches an almost identical output pattern and EDE. This demonstrates the validity of the proposed method for synthesizing a regular mask pattern and reaching a considerably low EDE.
Figure 8 shows some intermediate results obtained in the iteration process by the proposed method, where M#n means the mask that is obtained after the n'th iteration and is calculated by using Eq. (41). The EDE means the EDE between the output pattern of M#n and the desired pattern. Note that each intermediate mask obtained by the proposed method is very close to binary and has a low mask complexity. This demonstrates that the proposed method can filter (regularize) the mask to eliminate gray-level transitions and small, unwanted objects. For comparison, Fig. 9 shows some intermediate results obtained in the iteration process by the conventional regularization method. The conventional regularization method incorporates different penalty terms into the cost function with corresponding weights and then seeks the minimum of the weighted cost function. In this case, we take the quadratic term as an example, and the corresponding weight $\lambda$ is set at 0.1. From Fig. 9, it is observed that the intermediate results of this method possess gray-level transitions. The EDE may satisfy a 5% CD error after 50 iterations, but the mask quadratic error is still high at that point, so extra iterations are needed to reduce it even though the EDE has already reached the demanded result. This is one of the drawbacks of the conventional regularization method. Comparing Fig. 8 to Fig. 9, the intermediate masks obtained by the proposed method have both a lower mask quadratic error and a lower mask complexity, which is a clear improvement over the conventional regularization method. Since the EDE and the mask quadratic error do not converge in step under the conventional regularization method, it needs additional iterations to achieve a low level of both, whereas in the proposed method the iteration (optimizing process) can be stopped whenever the EDE reaches the demanded result, without worrying about manufacturability.
Figure 10 depicts the convergence properties of the different methods. The results of the conventional regularization method with different weights $\lambda$ demonstrate that a small weight causes fast convergence of the EDE but slow convergence of the mask quadratic error, whereas a large weight results in fast convergence of the mask quadratic error while finally causing a higher EDE. That means a smaller $\lambda$ will not achieve the regularization effect, whereas a larger $\lambda$ may result in a large EDE. For this reason, it is difficult to choose a value of the weight $\lambda$ that serves both goals. This is the second drawback of the conventional regularization method. On the other hand, it is observed that the EDE of the proposed method converges rapidly while the mask quadratic error remains at a low level, which demonstrates that all the intermediate masks satisfy the mask quadratic error constraints.

Results of Mask Filtering Technique
Another two sets of simulations under different illumination conditions are shown in Fig. 11. Simulation of the designed mask pattern $M_1$ is performed on a partially coherent imaging system with annular source illumination ($\sigma_{\mathrm{out}}/\sigma_{\mathrm{in}} = 0.$ […]). The corresponding mask obtained by the proposed method consists of 401 × 401 pixels with a grid resolution of 2.5 nm. Simulation of the designed mask pattern $M_2$ is performed on a partially coherent imaging system with quasar source illumination ($\sigma_{\mathrm{out}}/\sigma_{\mathrm{in}}/\mathrm{deg} = 0.9/0.6/45°$) and an NA of 1.25. The mask $M_{2S}$ obtained by the proposed method consists of 361 × 361 pixels with a grid resolution of 2.5 nm. Figure 11 demonstrates that the proposed method can synthesize a mask pattern under different imaging conditions and can reach a considerably low EDE.
We also performed simulations for more complicated patterns by using the proposed method. Figure 12 depicts the results for one desired pattern $M^{*}$, which is a contact layer of the benchmark AND-OR-INVERT gate circuit layout, [32] consisting of 601 × 1081 pixels with a grid resolution of 2.5 nm, i.e., the simulation area is 1500 × 2700 nm$^2$. The proposed method results in a smooth mask pattern with an EDE of 2.11 nm, compared to 8.37 nm obtained by simply inputting the desired pattern as the mask pattern. These results further demonstrate that the proposed method is capable of achieving a small EDE and of ensuring the regularity of the synthesized mask.
Table 2 summarizes the average runtime of each iteration using different methods with different mask patterns. As revealed in Table 2, the average runtime of each iteration of the proposed method is almost the same as that of the conventional method. That is because the proposed method only adds the computation of Eq. (28), whose runtime is far smaller than the total calculation time of an iteration of the conventional regularization method. In other words, the proposed method enhances the mask manufacturability at an almost equal runtime per iteration. From this perspective, the proposed method is more efficient than the conventional regularization method.

Conclusions
In this paper, we have demonstrated the application of a mask filtering technique and the metric EDE to solve the inverse lithography problem. The mask filtering technique interprets gray-level transitions and small, unwanted block objects as unwanted noise in the mask and employs a filter to remove this noise to satisfy manufacturing constraints. The proposed filter consists of two portions: a Gaussian convolution operation to weaken the weight of the small details in the mask and a thresholding operation to produce a sharper contour. The advantage of this approach is that it enhances the manufacturability of each intermediate mask without raising computational complexity and avoids the need to choose weights for various regularization terms. In addition, we introduce a metric called EDE to guide mask synthesis and establish the correlation between pattern error and EPE. EDE is defined as the absolute area between the desired pattern contour and its output pattern contour divided by the perimeter of the desired pattern. It can be interpreted as the mean EPE and can be approximated by pattern error multiplied by a constant factor that depends only on the simulation resolution and the desired pattern. Therefore, EDE has the same dimension as EPE and a continuous expression like pattern error. The mask filtering technique and the metric EDE are expected to have direct applications in mask optimization and synthesis for optical lithography in the semiconductor industry.

Appendix
[…] (45)
where $r$, $r_1$, and $\rho$ denote the spatial coordinates $(x, y)$.
Noting that $\Omega_1$ is the number of pixels in $O(r)$ and $a_S$ is the steepness of the Sigmoid function in $S[M]$, we finally derive the gradient of the mask filter $S[M]$ with respect to $M$ as

$$\frac{\partial S[M](r)}{\partial M(r_1)} = a_S\,S[M](r)\,\{1 - S[M](r)\}\,O(r - r_1).$$
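The chain-rule steps behind this gradient can be sketched as follows. This is a reconstruction under the assumption that $S[\cdot]$ is the Sigmoid of the Gaussian-blurred mask, as described in Sec. 2.4; the intermediate symbol $u$ is introduced here for illustration and is not taken from the original derivation.

```latex
\begin{aligned}
u(r) &= (O \otimes M)(r) = \sum_{\rho} O(r-\rho)\,M(\rho), \\
S[M](r) &= \frac{1}{1+\exp\{-a_S\,[u(r)-t_S]\}}, \\
\frac{\partial S[M](r)}{\partial u(r)} &= a_S\,S[M](r)\,\{1-S[M](r)\}, \qquad
\frac{\partial u(r)}{\partial M(r_1)} = O(r-r_1), \\
\frac{\partial S[M](r)}{\partial M(r_1)} &= a_S\,S[M](r)\,\{1-S[M](r)\}\,O(r-r_1), \\
\frac{\partial J}{\partial M(r_1)} &= a_S \sum_{r} O(r-r_1)\,S[M](r)\,\{1-S[M](r)\}\,
\frac{\partial J}{\partial S[M](r)}.
\end{aligned}
```

The last line is the sum over $r$ that turns into a convolution with the flipped filter when the gradient of the cost is propagated through the mask filter.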