8 November 2013 Mask-filtering-based inverse lithography
J. of Micro/Nanolithography, MEMS, and MOEMS, 12(4), 043003 (2013). doi:10.1117/1.JMM.12.4.043003
Abstract
We propose a new regularization framework for inverse lithography that regularizes masks directly by applying a mask filtering technique to improve computational efficiency and to enhance mask manufacturability. This technique is different from the conventional regularization method that regularizes a mask by incorporating various penalty functions to the cost function. We design a specific mask filter for this purpose. Moreover, we introduce a metric called edge distance error (EDE) to guide mask synthesis and establish the correlation between pattern error and edge placement error (EPE) via EDE. We prove that EDE has the same dimension as EPE and has a continuous expression as pattern error. Simulation results demonstrating the validity and efficiency of the proposed method are presented.
Lv, Xia, and Liu: Mask-filtering-based inverse lithography

1. Introduction

With ever-shrinking feature size, the physical characteristics of optics have stronger impacts on the imaging system. In particular, the band-limited imaging system causes the output pattern to be a warped version of the input mask.1 Several resolution enhancement techniques (RETs) have been developed to improve the performance of optical lithography.1–3 Optical proximity correction (OPC) is one of these RETs.4 Its objective is to synthesize an input mask that delivers a desired output pattern. Inverse lithography technique (ILT), as an active approach to OPC, is considered an economically viable way to meet various challenges in future technology nodes. The computational efficiency of ILT is a key concern, especially when handling a large-scale (or full-chip) optimization problem.

Generally, ILT treats mask synthesis as an inverse mathematical problem that aims at minimizing a cost function for the difference between the output and desired patterns. Various computation techniques have been proposed in the literature to deal with this inverse problem, such as the level-set method,5–8 the discrete cosine transform (DCT)-based method,9 and the gradient-based method.10–15 The level-set method treats a mask as a sophisticated continuum,5–8 and consequently, the boundary of the mask is iteratively evolved according to an optimization algorithm. The DCT-based method transforms a mask to the frequency space using a two-dimensional DCT.9 The low frequency components of the mask are adopted, and the corresponding coefficients are iteratively changed in the optimization process. As a result, the synthesized mask only possesses low frequency components and is therefore less complex. Both of these techniques can produce a smooth mask contour, but both are limited in their ability to search the whole solution space. The gradient-based method treats a mask directly as a raster image made up of pixels, which is synthesized pixel by pixel by iterating along a search direction given by steepest descent, conjugate gradient, and so on.8,10–15 It is probably the most popular technique in the literature due to its high flexibility and its ease of understanding and implementation.

Since the mask is discretized into pixels in ILT, such flexibility often causes the synthesized mask to be a gray-level image that may possess small, unwanted block objects, such as isolated holes, protrusions, and jagged edges, which in turn cannot be realized in actual manufacturing.10–15 To address these problems, regularization approaches are introduced to guarantee that the synthesized mask is binary and less complex. In the literature, almost all the regularization approaches take the regularization terms as penalty functions incorporated into a cost function with corresponding weighted parameters10–15 as

(1)

$$F\{M\} = \left\| \Gamma\{M\} - Z^{*} \right\|_2^2 + \sum_{i=1}^{n} \lambda_i R_i(M).$$

Here, the operator Γ{·} implements the forward mapping from the input mask M to the output pattern, Z* is the desired output pattern, and R_i(M) are the various regularization terms. Generally, regularization terms can be classified into two types: one is related to the manufacturability, such as the quadratic penalty term,11,12 the total variation penalty term,11,12 and the wavelet penalty term;13,15 the other type is related to the fabrication process, such as the image slope term14,16 and the mask error enhancement factor (MEEF).17,18 λ_i is the weight of the corresponding regularization term R_i(M). It should be noted that λ_i plays a critical role in the optimization process; however, how and why values of λ_i are chosen is rarely discussed in the literature. From experience, it is usually set to a constant at the outset.10–15

Here, we take the quadratic penalty term as an example to illustrate how the weighted parameter λ impacts the optimization process. In this case, both a low pattern error and a low mask quadratic error are preferred, where the pattern error is calculated by ‖Γ{M} − Z*‖₂² and the mask quadratic error is equal to the value of the quadratic penalty term R(M). As shown in Fig. 1, a smaller weighted parameter λ of this quadratic penalty term results in a rapid convergence on pattern error. It is observed from Fig. 1(a) that the pattern error may meet the requirement after 44 iterations, but the mask quadratic error is still quite high at this moment, so extra iterations are needed to reduce it. On the other hand, a larger λ results in good performance on the mask quadratic error while causing poor convergence on the pattern error, as Fig. 1(d) reveals. As shown in Fig. 1, the convergence of the pattern error and the mask quadratic error under such a regularization framework is out of synchronization. Notice that a smaller λ will not achieve the regularization effects, whereas a larger λ may result in a larger pattern error; it is, therefore, difficult to choose an appropriate constant value of λ. Moreover, the choice of λ is closely related to the mask features and the simulation resolution. Mathematically, a sounder approach is to adapt λ at each iteration. However, this increases the number of design degrees of freedom and is generally difficult to accomplish.

Fig. 1. Convergence of pattern error, mask quadratic error, and cost function values with different weighted parameters λ of the quadratic penalty term, for the desired pattern M1 in Fig. 11.

Most recently, we proposed an alternative regularization framework that regularizes the mask directly by using a mask filtering technique.19 In such a framework, the original cost function in Eq. (1) is changed to Eq. (2) as

(2)

$$F\{M\} = \left\| \Gamma\{S[M]\} - Z^{*} \right\|_2^2,$$
where S[·] is a mask filtering operator, the design of which is based on manufacturing constraints. It is noted that the pattern error F{M} of the filtered mask S[M] instead of the original mask M is calculated and will be iteratively reduced in the optimization process. The filtering technique is widely used in signal and image processing and has been used for many years in various application fields as a numerical method to ensure regularity or existence of solutions to an engineering problem, such as structural topology optimization problems.20,21 In this article, gray-level transitions and small, unwanted block objects in the mask are all interpreted as unwanted noise, and it is therefore natural to use filters to remove or prevent this noise in order to satisfy the manufacturing constraints. Section 2.4 details this mask filtering technique.

Moreover, we introduce a metric called edge distance error (EDE) to guide mask synthesis in the ILT framework and establish the correlation between pattern error and edge placement error (EPE) via EDE. EPE is popularly used in polygon-based OPC to convey critical dimension (CD) information; it is essentially the CD error at one side.4 However, it is seldom used in an ILT framework due to its discrete form. One reason is that the gradient (or sensitivity) of EPE with respect to the mask, calculated by a numerical differentiation method, has a computational complexity of O(K²), where K is the total number of mask pixels in the simulation area; this is significantly slower than an analytical gradient calculation, which has a computational complexity of O[K log(K)].10,11 Therefore, pattern error instead of EPE is applied in an ILT framework for its continuous expression and high computational efficiency.6–8,10–15 The pattern error employs an approximated and continuous resist model and is defined as the square of the L2 norm of the difference between the output pattern of the input mask and the desired feature, which makes pattern error continuous and explicitly differentiable with respect to the input mask.10,11 However, pattern error is a dimensionless quantity and depends strongly on the mask feature and simulation parameters, such as simulation area and simulation resolution. For this reason, pattern error is not popular in the industry. In this paper, we therefore introduce the metric EDE, which has the same dimension as EPE and a continuous expression like pattern error. The detailed description of EDE is given in Sec. 2.2.

In addition, as the CD decreases, the printed dimension becomes increasingly sensitive to fluctuations of the fabrication process, which limits the yield in the semiconductor industry. Instead of using process penalty terms, such as the image slope term and the MEEF, a statistical strategy is applied to minimize a cost function under different process variations weighted by their statistical probability to enhance the robustness of layout patterns.7,22–27 This method is directly related to the fabrication process and is well understood and easily accomplished, whereas using the process penalty terms can be considered a roundabout regularization approach that requires deeper understanding of mask topology and the imaging system.

The remainder of this paper is organized as follows. Section 2 details the proposed mask filtering technique. Section 3 provides the simulation results to demonstrate the validity and efficiency of the proposed method. Finally, we draw some conclusions in Sec. 4.

2. Methodology

2.1. Lithography Imaging Model

In this section, we review the general lithography imaging model in ILT.10,15 Abstractly, the imaging process for optical lithography is mathematically described as

(3)

$$Z(r) = \Gamma\{M(r)\},$$
where r represents spatial coordinates (x,y), and the operator Γ{·} implements the forward mapping from the input mask M(r) to the output pattern Z(r). In practice, the Γ{·} in Eq. (3) consists of the projection optics effect and the resist effect.

The projection optics effect, namely the optical image in resist I(r), can be modeled as a pupil function with a partially coherent illumination source.28 This is called the partially coherent imaging system,29 which can be approximated by the sum of the coherent systems method,4 the optimal coherent approximation approach,30 or the analytical circle-sampling technique31 as the superposition of several coherent systems

(4)

$$I(r) = \sum_{q=1}^{Q} \mu_q \left| h_q(r) \otimes M(r) \right|^2.$$

Here, h_q(r) is the q’th optical kernel, μ_q is the eigenvalue of the q’th kernel with Q kernels in total, and ⊗ denotes the two-dimensional convolution. The resist effect can be approximated by a constant threshold resist model using the following logarithmic Sigmoid function:11

(5)

$$\mathrm{sig}[I(r)] = \frac{1}{1 + e^{-a\,(I(r) - t)}},$$
with a being the steepness of the Sigmoid function and t being the threshold. In reality, t is equal to the threshold level of the resist.

Combining Eqs. (4) and (5), we can write the lithography imaging equation as

(6)

$$Z(r) = \Gamma\{M(r)\} = \mathrm{sig}[I(r)] = \mathrm{sig}\!\left[\sum_{q=1}^{Q} \mu_q \left| h_q(r) \otimes M(r) \right|^2\right].$$
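As a rough illustration of Eqs. (4) to (6), the forward model can be sketched in plain Python on a tiny grid. This is only a sketch under stated assumptions: the helper names (`conv2d_same`, `sigmoid`, `output_pattern`) are hypothetical, the convolution is zero-padded, and a single 3×3 box-blur kernel stands in for the real optical kernels h_q, which would come from a decomposition of the partially coherent system.

```python
import math

def conv2d_same(image, kernel):
    """Zero-padded 'same'-size 2-D convolution on nested lists."""
    rows, cols = len(image), len(image[0])
    krows, kcols = len(kernel), len(kernel[0])
    cr, cc = krows // 2, kcols // 2
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for u in range(krows):
                for v in range(kcols):
                    y, x = i - (u - cr), j - (v - cc)
                    if 0 <= y < rows and 0 <= x < cols:
                        acc += kernel[u][v] * image[y][x]
            out[i][j] = acc
    return out

def sigmoid(x, a, t):
    """Eq. (5), evaluated in a numerically stable way."""
    z = a * (x - t)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def output_pattern(mask, kernels, weights, a, t):
    """Eq. (6): Z = sig[ sum_q mu_q |h_q (conv) M|^2 ]."""
    rows, cols = len(mask), len(mask[0])
    intensity = [[0.0] * cols for _ in range(rows)]
    for h, mu in zip(kernels, weights):
        field = conv2d_same(mask, h)
        for i in range(rows):
            for j in range(cols):
                # abs() also covers complex-valued kernels
                intensity[i][j] += mu * abs(field[i][j]) ** 2
    return [[sigmoid(v, a, t) for v in row] for row in intensity]

# Toy usage: a 3x3 square on a 5x5 grid and one hypothetical box-blur kernel.
mask = [[1.0 if 1 <= i <= 3 and 1 <= j <= 3 else 0.0 for j in range(5)]
        for i in range(5)]
box = [[1.0 / 9.0] * 3 for _ in range(3)]
Z = output_pattern(mask, [box], [1.0], a=100.0, t=0.7)
print([round(v, 3) for v in Z[2]])
```

In this toy case only the center of the square prints, which mirrors the point that the output pattern is a blurred, thresholded version of the mask.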

2.2. Edge Distance Error

Due to the low-pass nature of the optical imaging system, Z(r) is typically a blurred version of M(r). Generally, the L2 norm is employed as a metric to evaluate the difference between the output pattern Z(r) = Γ{M(r)} and the desired pattern Z*(r) as

(7)

$$F\{M(r)\} = \left\| \Gamma\{M(r)\} - Z^{*}(r) \right\|_2^2.$$

Here, F{M} is called pattern error or fidelity error. The only difference between pattern error and fidelity error is that fidelity error uses a Sigmoid function to characterize the resist effect, whereas pattern error uses a step function. The values of fidelity error and pattern error are almost the same since the steepness of the Sigmoid function a is large enough. Therefore, in this paper, we would like to call F{M} pattern error without distinguishing between them. It is noted that pattern error is a continuous function and hence, the gradient of F{M} with respect to the mask can be analytically calculated. However, this metric is not intuitive, for its magnitude is not directly related to the CD error and strongly depends on the mask feature and simulation parameters, such as simulation grid size. In other words, different simulation parameters will result in a different pattern error although with the same pattern.

Therefore, we try to derive a metric from pattern error and explicitly relate it to the commonly used EPE in industry. This metric EDE should convey CD information and be independent of the mask feature and simulation parameters. Figure 2(a) depicts the pixel-based representation of a mask pattern and its output pattern on the wafer, where the red dots are discrete sampling elements (pixels) of the patterns, S_shadow denotes the absolute difference area between the desired pattern contour and the output pattern contour, and L is the perimeter of the desired pattern contour. EDE is defined as

(8)

$$\mathrm{EDE} = \frac{S_{\mathrm{shadow}}}{L}.$$

Fig. 2. (a) Pixel-based representation of a mask pattern and its output pattern on the wafer and (b) polygon-based representation of a mask pattern.

This means that EDE has the dimension of length and thus has an intuitive physical meaning.

Assuming the grid size is small enough in Fig. 2(a), the absolute difference area S_shadow can be approximated by multiplying the total number of elements in shadow and the element area as

(9)

$$S_{\mathrm{shadow}} = N\cdot(\delta x\cdot\delta y).$$

Here, N is the total number of red dots (elements) in shadow, and δx and δy are the lengths of the element along the x and y directions, respectively, as shown in Fig. 2(a). Since the value of the element in the output pattern is either 0 or 1, according to the definition of pattern error in Eq. (7), the number N is approximately equal to the pattern error, namely, N = F{M}. So, the absolute difference area S_shadow can be expressed as

(10)

$$S_{\mathrm{shadow}} = N\cdot(\delta x\cdot\delta y) = F\{M\}\cdot(\delta x\cdot\delta y).$$

Substituting Eq. (10) into Eq. (8), we have the expression of EDE as

(11)

$$\mathrm{EDE}(M) = \frac{S_{\mathrm{shadow}}}{L} = \frac{\delta x\cdot\delta y}{L}\cdot F\{M\}.$$

It is noted that Eq. (11) directly relates EDE to pattern error F{M}. The factor (δx·δy)/L is a constant determined by the simulation resolution and the desired pattern; it gives EDE the dimension of length, like EPE. This means that EDE is continuous, as pattern error is, and the computational complexity of EDE is the same as that of pattern error, i.e., O(N), where N is the total number of elements in shadow as shown in Fig. 2(a).
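A small numerical sketch can make the role of the factor (δx·δy)/L concrete. The example below is an illustration, not the paper's test case: it assumes a hypothetical 100 nm × 60 nm desired rectangle whose printed version is pulled in by 5 nm on every side, rasterizes both at three grid sizes, and evaluates Eqs. (7) and (11). The function names are made up for this sketch.

```python
def pattern_error(output, desired):
    """Eq. (7) for already-computed patterns: squared L2 norm of the difference."""
    return sum((o - d) ** 2
               for ro, rd in zip(output, desired)
               for o, d in zip(ro, rd))

def edge_distance_error(output, desired, dx, dy, perimeter):
    """Eq. (11): EDE = (dx * dy / L) * F{M}."""
    return dx * dy / perimeter * pattern_error(output, desired)

def rasterize(x0, x1, y0, y1, grid, width=140.0, height=100.0):
    """Binary image of an axis-aligned rectangle, sampled at pixel centers."""
    nx, ny = round(width / grid), round(height / grid)
    return [[1.0 if x0 <= (i + 0.5) * grid < x1 and y0 <= (j + 0.5) * grid < y1
             else 0.0
             for i in range(nx)] for j in range(ny)]

perimeter = 2 * (100.0 + 60.0)  # L of the desired rectangle, in nm
results = {}
for grid in (1.0, 2.5, 5.0):
    desired = rasterize(20.0, 120.0, 20.0, 80.0, grid)
    printed = rasterize(25.0, 115.0, 25.0, 75.0, grid)
    results[grid] = (pattern_error(printed, desired),
                     edge_distance_error(printed, desired, grid, grid, perimeter))
    print(grid, results[grid])
```

In this idealized case the pattern error changes by a factor of 25 across the three grids while the EDE stays the same, because the rectangle edges fall on pixel boundaries at every grid size; for general shapes the EDE would only be approximately grid independent.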

Alternatively, the absolute difference area S_shadow can be formulated as an integral of EPE taken along the closed desired pattern contour curve C:

(12)

$$S_{\mathrm{shadow}} = \oint_{C} \mathrm{EPE}(p)\,\mathrm{d}\ell,$$
where p denotes an infinitesimal segment on the desired pattern contour curve and dℓ is the corresponding segment length. When the pattern contour curve is discretized into a finite number of segments, the pattern is represented as multiple polygons, and this representation is popularly used in polygon-based OPC. In this case, as shown in Fig. 2(b), the absolute difference area S_shadow can be approximated as

(13)

$$S_{\mathrm{shadow}} = \oint_{C} \mathrm{EPE}(p)\,\mathrm{d}\ell = \sum_{i} \mathrm{EPE}(p_i)\, l_i.$$

Here, p_i is the i’th segment, and l_i is the corresponding length of the segment p_i. Substituting Eq. (13) into Eq. (8), EDE can be alternatively expressed as

(14)

$$\mathrm{EDE}(M) = \frac{S_{\mathrm{shadow}}}{L} = \frac{1}{L}\sum_{i} \mathrm{EPE}(p_i)\, l_i.$$

Therefore, EDE may be interpreted as the length-weighted mean EPE. Equations (11) and (14) establish the correlation between pattern error and EPE: via EDE, the pattern error scaled by the constant (δx·δy)/L is approximately equal to the mean EPE, so the two metrics are equivalent in this sense. Any of pattern error, EPE, or EDE can act as a metric (or cost function) to guide mask synthesis. However, since EDE has the same dimension as EPE and a continuous expression like pattern error, it combines the advantages of the other two.
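The polygon form in Eq. (14) amounts to a length-weighted average, which can be sketched as follows. The function name and the sample values are hypothetical, the segments are assumed to cover the whole contour so that L is the sum of the l_i, and EPE is taken by magnitude on the assumption that S_shadow is an absolute area.

```python
def ede_from_epe(epe_values, segment_lengths):
    """Eq. (14): EDE = (1 / L) * sum_i EPE(p_i) * l_i, with L = sum_i l_i."""
    perimeter = sum(segment_lengths)
    # Assumption: use |EPE| so that inward and outward errors do not cancel.
    weighted = sum(abs(e) * l for e, l in zip(epe_values, segment_lengths))
    return weighted / perimeter

# Four edges of a 100 nm x 60 nm rectangle with illustrative EPE values in nm.
ede = ede_from_epe([4.0, -6.0, 4.0, -6.0], [100.0, 60.0, 100.0, 60.0])
print(ede)
```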

Furthermore, EDE can convey the local CD information and can be weighted by adding some metrology windows. Considering a practical case, customers are sometimes only concerned about some special locations (hotspots) in the resist. In this case, we add a window function around the hotspots as shown in Fig. 3. The value inside the metrology windows is usually set at 1 and that outside at 0. The weighted (or local) area S_shadow is expressed as

(15)

$$S_{\mathrm{shadow}} = N_w\cdot(\delta x\cdot\delta y).$$

Fig. 3. Schematic of the weighted (local) edge distance error (EDE) with metrology windows.

Here, N_w is the total number of elements in shadow as shown in Fig. 3 and is approximately equal to the weighted pattern error F_w{M} as

(16)

$$N_w = F_w\{M\} = \left\| w(r)\cdot\{\Gamma\{M\} - Z^{*}\} \right\|_2^2,$$
where w(r) is a window function represented by rectangular functions with appropriate shifting and scaling constants. Hence, the weighted EDE in this case is

(17)

$$\mathrm{EDE}_w(M) = \frac{\delta x\cdot\delta y}{L_w}\cdot \left\| w(r)\cdot\{\Gamma\{M\} - Z^{*}\} \right\|_2^2,$$
where L_w is the length of the desired pattern contour within the metrology windows.
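A minimal sketch of Eq. (17) follows. The strip-shaped patterns, the window, and the value of L_w are all illustrative assumptions: the two-row strip is treated as a section of a longer vertical line, so the only contour inside the window is the 5 nm stretch of the left edge.

```python
def weighted_ede(output, desired, window, dx, dy, contour_length_in_window):
    """Eq. (17): EDE_w = (dx * dy / L_w) * || w . (Z - Z*) ||_2^2."""
    fw = sum((w * (o - d)) ** 2
             for ro, rd, rw in zip(output, desired, window)
             for o, d, w in zip(ro, rd, rw))
    return dx * dy / contour_length_in_window * fw

# Two rows of a line pattern on a 2.5 nm grid; the printed left edge is
# shifted inward by one pixel, the right edge by one pixel as well.
desired = [[0, 1, 1, 1, 1, 0], [0, 1, 1, 1, 1, 0]]
printed = [[0, 0, 1, 1, 0, 0], [0, 0, 1, 1, 0, 0]]
window = [[1, 1, 1, 0, 0, 0], [1, 1, 1, 0, 0, 0]]  # hotspot: left edge only
ede_w = weighted_ede(printed, desired, window, 2.5, 2.5,
                     contour_length_in_window=5.0)
print(ede_w)
```

With the window restricted to the left edge, the result equals the one-pixel (2.5 nm) displacement of that edge, which is the local CD information the window is meant to isolate.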

For simplicity, G(M) is used to represent the weighted EDE. Generally, G(M) is treated as a cost function to guide mask synthesis under nominal conditions, i.e., with no defocus, dosage variation, or other process variations. In order to enhance the process robustness, process variations should be taken into account during mask synthesis. Here, we use the expectation of the weighted EDE under different variations as a cost function as expressed by

(18)

J(M)=ζ[ψ(v)·G(M;v)],
where ζ denotes the expectation operation over v; v is a vector representing a combination of multiple process variations, for example, defocus, exposure dosage variation, and lens aberrations; and ψ(v) is the statistical probability of the corresponding process variations, which is defined by users and is usually obtained via various experiments or measurements of lithographic tools. J(M) is called the statistical EDE and is used as a cost function to guide the mask synthesis. The gradient of J(M) with respect to mask M will be used in the optimization process. According to Refs. 8, 13, and 15, the gradient of J(M) with respect to mask M is given as

(19)

$$\nabla_M J = \sum_{v} \frac{a\cdot(\delta x\cdot\delta y)}{L_w}\cdot\psi(v)\cdot\left\{\sum_{q=1}^{Q} \mu_q h_q^{\mathrm{flip}}(r;v) \otimes \left[w\cdot(Z - Z^{*})\cdot Z\cdot(1 - Z)\cdot\left(h_q(r;v)\otimes M\right)^{*}\right]\right\} + \sum_{v} \frac{a\cdot(\delta x\cdot\delta y)}{L_w}\cdot\psi(v)\cdot\left\{\sum_{q=1}^{Q} \left[\mu_q h_q^{\mathrm{flip}}(r;v)\right]^{*} \otimes \left[w\cdot(Z - Z^{*})\cdot Z\cdot(1 - Z)\cdot\left(h_q(r;v)\otimes M\right)\right]\right\},$$
where * denotes the complex conjugate; a is the steepness of the Sigmoid function in Eq. (5); h_q(r;v) is the q’th optical kernel under the process variations v; and h_q^flip(r;v) is the up–down and left–right flip of h_q(r;v).
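For a finite set of sampled process conditions, the statistical EDE of Eq. (18) can be read as a probability-weighted sum, matching the sum over v in Eq. (19). The sketch below assumes exactly that reading; `statistical_ede` and the placeholder model `toy_ede` are hypothetical, and a real implementation would evaluate the weighted EDE G(M; v) with the imaging model under each condition.

```python
def statistical_ede(mask, conditions, ede_under):
    """Probability-weighted sum of G(M; v) over sampled process conditions.

    conditions: list of (v, psi_v) pairs whose probabilities sum to 1.
    ede_under:  callable returning the weighted EDE G(M; v) for one condition.
    """
    return sum(prob * ede_under(mask, v) for v, prob in conditions)

def toy_ede(mask, defocus_nm):
    """Placeholder model (assumption): EDE in nm grows quadratically with defocus."""
    return 2.0 + 2.5 * (defocus_nm / 50.0) ** 2

conditions = [(0.0, 0.5), (-50.0, 0.25), (50.0, 0.25)]  # (defocus in nm, probability)
j_value = statistical_ede(mask=None, conditions=conditions, ede_under=toy_ede)
print(j_value)
```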

2.3. Inverse Lithography Problem Definition and Regularization

The objective of inverse lithography is to synthesize an input mask that delivers a desired output pattern. In order to guarantee the manufacturability of the synthesized mask, mask quadratic error and complexity should be considered. The quadratic metric R_Q(M) and the complexity metric R_TV(M), i.e., total variation, are usually adopted to quantify the corresponding performance. In this paper, we focus on the binary mask. So, the quadratic metric R_Q(M) and the complexity metric R_TV(M) are expressed, respectively,11,12 as

(20)

$$R_Q(M) = \int_{\Omega} \left[1 - (2M - 1)^2\right] \mathrm{d}r,$$

(21)

$$R_{TV}(M) = \left\|\frac{\partial M}{\partial x}\right\|_1 + \left\|\frac{\partial M}{\partial y}\right\|_1 = \left\|DM\right\|_1 + \left\|MD^{T}\right\|_1,$$
where Ω is the simulation area or the number of pixels in M, ‖·‖₁ is the L1 norm, and D is an operator of the first derivative

(22)

$$D = \begin{bmatrix} 1 & -1 & & & 0 \\ & 1 & -1 & & \\ & & \ddots & \ddots & \\ & & 1 & -1 & \\ 0 & & & 1 & -1 \end{bmatrix}.$$
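On a pixel grid, the two metrics reduce to simple sums. The sketch below is one possible discretization, assuming forward differences between neighboring pixels for D and a plain sum over pixels for the integral in Eq. (20); the function names are illustrative.

```python
def quadratic_penalty(mask):
    """Eq. (20) on a pixel grid: zero for binary pixels, largest at M = 0.5."""
    return sum(1.0 - (2.0 * m - 1.0) ** 2 for row in mask for m in row)

def total_variation(mask):
    """Eq. (21) with forward differences: ||D M||_1 + ||M D^T||_1."""
    rows, cols = len(mask), len(mask[0])
    vertical = sum(abs(mask[i][j] - mask[i + 1][j])
                   for i in range(rows - 1) for j in range(cols))
    horizontal = sum(abs(mask[i][j] - mask[i][j + 1])
                     for i in range(rows) for j in range(cols - 1))
    return vertical + horizontal

binary = [[0.0, 0.0, 0.0],
          [0.0, 1.0, 0.0],
          [0.0, 0.0, 0.0]]          # binary, but with an isolated pixel
gray = [[0.5] * 3 for _ in range(3)]  # smooth, but fully gray

print(quadratic_penalty(binary), total_variation(binary))
print(quadratic_penalty(gray), total_variation(gray))
```

The two toy masks show why both metrics are needed: the isolated pixel has no quadratic error but a nonzero total variation, while the uniform gray mask has no variation but the maximum quadratic error.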

Therefore, combining the optimization objectives of the mask quadratic error, the complexity, and the statistical EDE, we state the inverse lithography problem as

  • Finding M*(r) to minimize: J(M), R_Q(M), and R_TV(M)

  • subject to: 0 ≤ M ≤ 1.

It is noted that this problem has three competing minimization objectives. In the literature, they are usually combined with certain weights λ1 and λ2 and stated as a single-objective minimization problem:10–15

  • Finding M*(r) to minimize: J(M) + λ1 R_Q(M) + λ2 R_TV(M)

  • where: λ1, λ2 ≥ 0,

  • subject to: 0 ≤ M ≤ 1.

2.4. Mask Filtering Method

In this section, we propose an alternative method to solve this multiobjective minimization problem. We first interpret gray-level transitions and small, unwanted block objects, such as isolated holes, protrusions, jagged edges, or other layouts that cannot be fabricated, as unwanted noise in the mask, and then we design a specific filter S[·] to remove or prevent this noise to satisfy manufacturing constraints

(23)

$$\tilde{M} = S[M].$$

After the filtering process, the quadratic metric R_Q and the complexity metric R_TV of the filtered mask M̃ are rather small. Then, we calculate the cost function of this filtered mask

(24)

$$J(S[M]) = \zeta\left[\psi(v)\cdot G(S[M]; v)\right].$$

Thus, the multiobjective minimization problem is converted into a simpler single-objective minimization problem as

  • Finding M*(r) to minimize: J(S[M])

  • subject to: 0 ≤ M ≤ 1.

We employ an iterative method to solve this problem. In the iteration process, we ensure that the statistical EDE, i.e., J(S[M]), of the filtered mask is iteratively decreasing. It is noted that each obtained mask is filtered and satisfies all the optimization objectives except for the statistical EDE; namely, it satisfies the manufacturing constraints. As a result, we only need to reduce the statistical EDE of this filtered mask. This approach is called the mask filtering technique.

The filter operator S[·] can be designed based on different mask manufacturing rules. The most basic filter should drive the gray-level image toward distinct values of 0 or 1 and guarantee that the mask is less complex. We, therefore, define a basic mask filter as

(25)

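$$S[M] = \mathrm{sig}[O \otimes M].$$

Here, the steepness of this Sigmoid function is a_S and the threshold is t_S. O is a Gaussian filter to relieve mask complexity,

(26)

$$O(r) = \tau^{-1}\cdot e^{-(1/2)\left(\|r - r_0\|/\sigma_O\right)^2},$$
where r_0 is the center point and ‖r − r_0‖ means the distance from r to r_0. τ is the normalized weight of the Gaussian filter,

(27)

$$\tau = \int_{\Omega_1} e^{-(1/2)\left(\|r - r_0\|/\sigma_O\right)^2}\,\mathrm{d}r,$$
where Ω_1 is the number of pixels in O(r). The gradient of S[·] with respect to M is given as

(28)

$$\nabla_M S[M] = a_S\cdot O^{\mathrm{flip}} \otimes \left\{\mathrm{sig}[O \otimes M]\cdot\left[1 - \mathrm{sig}(O \otimes M)\right]\right\}.$$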

The detailed derivation of Eq. (28) is given in the Appendix.
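The basic filter of Eqs. (25) to (27) can be sketched on a small grid as follows. The helper names are hypothetical, the convolution is zero-padded, and the 5×5 kernel with σ_O = 1 is a scaled-down stand-in for the 21×21 kernel used in the simulations; a_S = 300 and t_S = 0.5 follow the values quoted in Sec. 3.

```python
import math

def gaussian_kernel(size, sigma):
    """Eqs. (26)-(27): Gaussian weights normalized so that they sum to 1."""
    c = size // 2
    raw = [[math.exp(-0.5 * ((u - c) ** 2 + (v - c) ** 2) / sigma ** 2)
            for v in range(size)] for u in range(size)]
    tau = sum(sum(row) for row in raw)
    return [[val / tau for val in row] for row in raw]

def conv2d_same(image, kernel):
    """Zero-padded 'same'-size 2-D convolution on nested lists."""
    rows, cols = len(image), len(image[0])
    cr, cc = len(kernel) // 2, len(kernel[0]) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for u, krow in enumerate(kernel):
                for v, k in enumerate(krow):
                    y, x = i - (u - cr), j - (v - cc)
                    if 0 <= y < rows and 0 <= x < cols:
                        out[i][j] += k * image[y][x]
    return out

def sigmoid(x, a, t):
    """Numerically stable 1 / (1 + exp(-a (x - t)))."""
    z = a * (x - t)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def mask_filter(mask, kernel, a_s, t_s):
    """Eq. (25): S[M] = sig[O (conv) M]."""
    return [[sigmoid(v, a_s, t_s) for v in row]
            for row in conv2d_same(mask, kernel)]

# A 5x5 block plus one isolated pixel that should be treated as noise.
mask = [[1.0 if 3 <= i <= 7 and 3 <= j <= 7 else 0.0 for j in range(11)]
        for i in range(11)]
mask[0][10] = 1.0
filtered = mask_filter(mask, gaussian_kernel(5, 1.0), a_s=300.0, t_s=0.5)
print(round(filtered[5][5], 3), round(filtered[0][10], 3))
```

In this toy case the interior of the block survives while the isolated pixel is suppressed, which is the behavior the filter is designed for.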

Combining Eqs. (19) and (28), the gradient of J(S[M]) with respect to M is

(29)

$$\nabla_M J = \frac{\partial J}{\partial S}\,\frac{\partial S}{\partial M} = \nabla_S J\cdot\nabla_M S.$$

With the gradient Eq. (29), we apply a steepest descent method to solve this problem.11 The optimization procedure is

  • Iteration 0: Since the value of the mask is bound constrained to [0, 1], we use the following parametric transformation as

    (30)

    $$M = \frac{1 + \cos(\Theta)}{2}, \quad \Theta \in (-\infty, \infty).$$

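    Then, given a desired output pattern Z*(r), we compute the initial input mask M_0, the corresponding Θ_0, and the filtered mask S_0

    (31)

    $$M_0(r) = \kappa_1\cdot\left[H(r) \otimes Z^{*}(r)\right] + \kappa_2,$$

    (32)

    $$\Theta_0 = \cos^{-1}(2M_0 - 1),$$

    (33)

    $$S_0 = S[M_0],$$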
    where κ1 and κ2 are parameters to adjust the initial value of the mask; for example, κ1 = 0.90 and κ2 = 0.05 in this paper. These values keep the initial mask away from 0 and 1, because M(i,j) = 0 or 1 would make the gradient at location (i,j) vanish and therefore reduce the optimization freedom.15 H(r) is a Gaussian function that makes the initial mask continuous so that the gradient with respect to the initial mask is smooth. H(r) is defined as

    (34)

    $$H(r) = \eta^{-1}\cdot e^{-(1/2)\left(\|r - r_0\|/\sigma_H\right)^2},$$
    where r_0 is the center point and ‖r − r_0‖ means the distance from r to r_0. Ω_2 is the number of pixels in H(r) and η is the normalized weight

    (35)

    $$\eta = \int_{\Omega_2} e^{-(1/2)\left(\|r - r_0\|/\sigma_H\right)^2}\,\mathrm{d}r.$$

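    Finally, we calculate the initial gradient

    (36)

    $$\nabla_{\Theta_0} J = \frac{\partial J}{\partial S_0}\,\frac{\partial S_0}{\partial M_0}\,\frac{\partial M_0}{\partial \Theta_0},$$
    where

    (37)

    $$\frac{\partial M}{\partial \Theta} = -\frac{\sin(\Theta)}{2}.$$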

  • Iteration k:

    • Step 1: Search the step length γ_k ∈ ℝ in the direction ∇_{Θ_k} J,

      (38)

      $$\gamma_k = \arg\min_{\gamma}\left[J\left(\Theta_k - \gamma\cdot\nabla_{\Theta_k} J\right)\right].$$

    • Step 2: Update Θ_{k+1}, M_{k+1}, and S_{k+1}

      (39)

      $$\Theta_{k+1} = \Theta_k - \gamma_k\cdot\nabla_{\Theta_k} J,$$

      (40)

      $$M_{k+1} = \frac{1 + \cos(\Theta_{k+1})}{2},$$

      (41)

      $$S_{k+1} = S[M_{k+1}].$$

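    • Step 3: Calculate the gradient for the next iteration,

      (42)

      $$\nabla_{\Theta_{k+1}} J = \frac{\partial J}{\partial S_{k+1}}\,\frac{\partial S_{k+1}}{\partial M_{k+1}}\,\frac{\partial M_{k+1}}{\partial \Theta_{k+1}}.$$

      If ‖∇_{Θ_{k+1}} J‖ < Λ or J(Θ_{k+1}) < Ξ or k > Ψ, go to Stop.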

      Else, return to Step 1.

    • Stop: Obtain the optimized mask,

      (43)

      $$M^{*}(r) = S_{k+1}.$$

In the above procedure, the iteration is terminated when ‖∇_{Θ_{k+1}} J‖ < Λ or J(Θ_{k+1}) < Ξ or k > Ψ, where Λ is the threshold on the norm of the gradient, Ξ is the target value of the statistical EDE, and Ψ is the prescribed upper limit of the number of iterations. The termination criterion ‖∇_{Θ_{k+1}} J‖ < Λ means that the iteration stops when the gradient is zero or rather small.
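The procedure above can be sketched end to end on a one-dimensional toy problem. This is a simplified illustration under several assumptions, not the implementation used for the simulations: the mask is 1-D, a single real, symmetric Gaussian stands in for the optical kernels (so flipped kernels equal the kernels themselves), the sigmoid parameters are softened toy values, the cost is the unscaled pattern error (EDE differs from it only by a constant factor), the iteration count is fixed, and a simple backtracking search replaces the exact line search of Eq. (38).

```python
import math

def sig(x, a, t):
    """Numerically stable sigmoid 1 / (1 + exp(-a (x - t)))."""
    z = a * (x - t)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def gauss(size, sigma):
    """Normalized, symmetric 1-D Gaussian kernel."""
    c = size // 2
    g = [math.exp(-0.5 * ((i - c) / sigma) ** 2) for i in range(size)]
    total = sum(g)
    return [v / total for v in g]

def conv(x, k):
    """Zero-padded 'same' convolution; self-adjoint because k is symmetric."""
    c, n = len(k) // 2, len(x)
    return [sum(k[j] * x[i - j + c] for j in range(len(k)) if 0 <= i - j + c < n)
            for i in range(n)]

H = gauss(9, 2.0)       # toy stand-in for one coherent optical kernel
O = gauss(5, 1.0)       # mask-filter kernel O
A, T = 20.0, 0.5        # toy resist sigmoid parameters
A_S, T_S = 10.0, 0.5    # toy filter sigmoid parameters

def forward(theta):
    m = [(1.0 + math.cos(t)) / 2.0 for t in theta]   # Eq. (30)
    s = [sig(v, A_S, T_S) for v in conv(m, O)]       # Eq. (25)
    field = conv(s, H)
    z = [sig(f * f, A, T) for f in field]            # Eq. (6) with Q = 1
    return s, field, z

def cost(theta, target):
    z = forward(theta)[2]
    return sum((zi - ti) ** 2 for zi, ti in zip(z, target))

def gradient(theta, target):
    s, field, z = forward(theta)
    d_field = [2.0 * (zi - ti) * A * zi * (1.0 - zi) * 2.0 * f
               for zi, ti, f in zip(z, target, field)]
    d_s = conv(d_field, H)                                               # dJ/dS
    d_m = conv([g * A_S * si * (1.0 - si) for g, si in zip(d_s, s)], O)  # Eq. (28)
    return [g * (-math.sin(t) / 2.0) for g, t in zip(d_m, theta)]        # Eq. (37)

def optimize(target, iterations=40, first_step=0.3):
    m0 = [0.90 * v + 0.05 for v in conv(target, gauss(5, 1.0))]  # Eq. (31)
    theta = [math.acos(2.0 * v - 1.0) for v in m0]               # Eq. (32)
    history = [cost(theta, target)]
    for _ in range(iterations):
        g = gradient(theta, target)
        gamma = first_step
        while gamma > 1e-6:            # backtracking stand-in for Eq. (38)
            trial = [t - gamma * gi for t, gi in zip(theta, g)]
            if cost(trial, target) < history[-1]:
                theta = trial          # Eq. (39)
                break
            gamma *= 0.5
        history.append(cost(theta, target))
    return forward(theta)[0], history  # Eq. (43): the filtered mask is returned

target = [1.0 if 12 <= i < 20 else 0.0 for i in range(32)]
mask, history = optimize(target)
print(history[0], history[-1])
```

Because a step is accepted only when it lowers the cost, the recorded history never increases; the constant step of 0.3 used in the simulations corresponds to skipping the backtracking loop.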

3. Simulations

Simulations were performed on a partially coherent imaging system with an annular source illumination whose outer radius was σ_out = 0.7 and whose inner radius was σ_in = 0.4. The wavelength in the simulations was set at 193 nm, and the numerical aperture (NA) was 1.35. The resist effect was approximated by a Sigmoid function with a = 100 and t = 0.7. The Gaussian filter O(r) consisted of 21×21 pixels with σ_O = 4. The parameters of the Sigmoid function in the proposed filter S[·] were a_S = 300 and t_S = 0.5. The parameters κ1 and κ2 of the initial mask in Eq. (31) were 0.90 and 0.05, respectively; H(r) consisted of 21×21 pixels with σ_H = 2. The window function w(r) had the same size as the mask image, and all the values were set at 1. Instead of computing the step length γ_k in Eq. (38) accurately, we set γ_k at a constant 0.3 in each iteration. Since this paper focuses on developing a new regularization framework, process variations are not taken into consideration in the simulations. That means v is the nominal process condition and therefore ψ(v) = 1. All the simulations were carried out with in-house MATLAB codes on an HP Z800 (3.47 GHz Xeon) workstation using a Windows 7 (64 bit) operating system.

3.1. Edge Distance Error

Figure 4 depicts an example of a desired pattern and its output pattern on the wafer. In this case, the true absolute area between the desired pattern contour and its output pattern contour is 1.853×10⁴ nm², the perimeter of the desired pattern is 1.70×10³ nm, and therefore, the true EDE is 10.90 nm. Table 1 summarizes the relative error compared to the true EDE when using different pixel grid sizes. The EDE in Table 1 is calculated by Eq. (11) and the pattern error is calculated by Eq. (7). From Table 1, it is observed that the magnitude of pattern error varies strongly with the pixel grid size, whereas EDE changes only slightly. When the pixel grid size is small enough (e.g., 0.5 nm), the EDE calculated by the proposed method is approximately equal to the true EDE. As the pixel grid size increases to 3 nm, the relative EDE error grows to 7.1%, which remains acceptable. So, the EDE calculated by the proposed method can be used to guide mask synthesis.

Fig. 4. Comparison of the desired pattern and its output pattern on the wafer.

Table 1. Results of pattern error and edge distance error (EDE) when using different pixel grid sizes.

| Pixel grid size (nm) | Pattern error | EDE (nm) | Relative EDE error (%) |
|---|---|---|---|
| 0.5 | 7.410×10⁵ | 10.897 | 0.28 |
| 1 | 1.831×10⁵ | 10.768 | 1.2 |
| 1.5 | 8.017×10⁴ | 10.610 | 2.7 |
| 2.5 | 2.816×10⁴ | 10.352 | 5.0 |
| 3 | 1.912×10⁴ | 10.124 | 7.1 |
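As a worked check of the quoted reference value, the true EDE follows from Eq. (8) with the area and perimeter given above. The second line assumes that the relative error is defined as the absolute deviation divided by the true EDE, which is not stated explicitly; under that assumption the 1 nm row of the table is reproduced.

```latex
\mathrm{EDE}_{\mathrm{true}} = \frac{S_{\mathrm{shadow}}}{L}
  = \frac{1.853\times10^{4}\ \mathrm{nm}^2}{1.70\times10^{3}\ \mathrm{nm}}
  \approx 10.90\ \mathrm{nm},
\qquad
\frac{\left|10.768 - 10.90\right|}{10.90} \approx 1.2\%.
```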

3.2. Mask Filter

As shown in Eq. (25), the proposed mask filter consists of two parts: a Gaussian convolution operation and a Sigmoid (or thresholding) operation. Figure 5 demonstrates these filtering operations, where O(r) is a Gaussian filter with a size of 21×21 pixels and σ_O = 4, M is an intermediate mask pattern of the kind commonly encountered during the optimization process of ILT, with a size of 321×321 pixels and a grid resolution of 2.5 nm, and S[M] is the filtered pattern calculated by Eq. (25). As expected, the Gaussian convolution operation, O ⊗ M, weakens the weight of the small details in M, and the Sigmoid operation leads to a sharper contour. As a result, the filtered pattern S[M] has a lower complexity, and its mask quadratic error (denoted as QE in Fig. 5) is reduced from 4.28×10⁴ to 531.

Fig. 5. Demonstration of the mask filtering operations. O(r) is defined as a Gaussian filter, M is an intermediate mask pattern, O ⊗ M denotes its convolution with the Gaussian filter O, S[M] represents its filtered pattern, and Γ(·) is the corresponding output pattern on the wafer. The horizontal axis and vertical axis denote x position and y position of the patterns in nanometers, respectively.

Figure 6 presents another set of simulations for the proposed filter, where M is an input mask pattern with a size of 321×321 pixels and a grid resolution of 2.5 nm. Some objects that are difficult to manufacture in practice are artificially introduced into this input mask pattern, including small isolated holes, protrusions, hollows, and irregular features, shown inside the red circles. From the perspective of signal processing, these details can be considered as high-frequency noise in the mask and can be evaluated with total variation.11,12 As revealed in Fig. 6, the total variation (denoted as TV in Fig. 6) of M is reduced from 3080 to 2416 by the Gaussian convolution operation, which removes these small details. Subsequently, the Sigmoid operation yields a close-to-binary mask with a total variation of 2677, a 13.1% reduction compared to the original mask M. On the other hand, it is interesting to find that the EDE values of the output patterns of the mask M, the Gaussian filtered mask, and the filtered mask S[M] are almost the same. That is because the optical lithography system, with its low-pass nature, does not deliver high-frequency details to the output pattern on the wafer. Similar to the optical lithography system, the mask filter acts as a low-pass filter that removes the details produced in ILT without distorting the output pattern on the wafer. As demonstrated in Figs. 5 and 6, the proposed filter reduces the mask complexity and achieves a close-to-binary mask, so that the filtered mask S[M] is manufacturable in practice.

Fig. 6. Performance of the proposed mask filter. M is an input mask pattern, O ⊗ M denotes its convolution with the Gaussian filter O, S[M] represents its filtered pattern, and Γ(·) is the corresponding output pattern on the wafer. The horizontal axis and vertical axis denote x position and y position of the patterns in nanometers, respectively.

3.3. Results of Mask Filtering Technique

Figure 7 shows the simulated images obtained with the proposed method for a desired pattern with a CD of 45 nm. The optimization is terminated after 200 iterations. The desired pattern M*, which is commonly encountered in the design of static random access memory circuits, consists of 321×321 pixels with a grid resolution of 2.5 nm. As expected, the mask patterns optimized by the proposed method achieve a much smaller EDE than that obtained by simply using the desired pattern M* as the mask pattern. It is also observed that the optimized gray-level mask MS is very close to the postprocessed mask MP and reaches an almost identical output pattern and EDE. This demonstrates the validity of the proposed method in synthesizing a regular mask pattern and reaching a considerably low EDE.

Fig. 7. Simulation results for the desired pattern M* by the proposed method. MS is the optimized mask, MP is the binary mask by postprocessing of MS with a global threshold 0.5, I(·) represents the corresponding optical image, and Γ(·) is the output pattern on the wafer. The horizontal axis and vertical axis denote x position and y position of the patterns in nanometers, respectively.

Figure 8 presents some intermediate results obtained during the iterations of the proposed method. M#0 denotes the initial mask, calculated using Eq. (31); M#n denotes the mask obtained after the nth iteration, calculated using Eq. (41). The EDE reported for each M#n is measured between its output pattern and the desired pattern. Each intermediate mask obtained by the proposed method is very close to binary and has a low mask complexity. This demonstrates that the proposed method filters (regularizes) the mask to eliminate gray-level transitions and small, unwanted objects. For comparison, Fig. 9 shows some intermediate results obtained during the iterations of the conventional regularization method. The conventional regularization method incorporates different penalty terms into the cost function, each with a corresponding weight, and then seeks the minimum of the weighted cost function. In this case, we take the quadratic term as an example and set the corresponding weight parameter λ to 0.1. Figure 9 shows that the intermediate results of this method possess gray-level transitions. The EDE may satisfy a 5% CD error after 50 iterations, but the mask quadratic error is still high at that point, so extra iterations are needed to reduce the mask quadratic error even though the EDE has already reached the demanded result. This is one of the drawbacks of the conventional regularization method. Comparing Fig. 8 with Fig. 9, the intermediate masks obtained by the proposed method have both a lower mask quadratic error and a lower mask complexity, which is a clear improvement over the conventional regularization method.
Because the EDE and the mask quadratic error do not converge in step under the conventional regularization method, it needs several more iterations to bring both to a low level. In the proposed method, by contrast, the iteration (optimization process) can be stopped as soon as the EDE reaches the demanded result, without concern for manufacturability.
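To make the conventional approach concrete, the following is a minimal sketch of a weighted cost with a quadratic penalty. It assumes the commonly used form 4·m·(1 − m) summed over pixels for the quadratic term; the exact penalty used for Fig. 9 is not given in this section, and the function names and sample masks are hypothetical.

```python
def quadratic_penalty(mask):
    """Assumed quadratic penalty: sum of 4*m*(1-m) over all pixels.

    It is zero only when every pixel is exactly 0 or 1, so gray-level
    transitions are penalized.
    """
    return sum(4.0 * m * (1.0 - m) for row in mask for m in row)


def quadratic_penalty_gradient(mask):
    """Pixel-wise derivative of the assumed penalty: 4*(1 - 2*m)."""
    return [[4.0 * (1.0 - 2.0 * m) for m in row] for row in mask]


def weighted_cost(pattern_error, mask, lam=0.1):
    """Conventional regularized cost: fidelity term plus lam times the penalty."""
    return pattern_error + lam * quadratic_penalty(mask)


# Hypothetical 2x2 masks: one binary, one with gray-level pixels.
binary_mask = [[0.0, 1.0], [1.0, 0.0]]
gray_mask = [[0.5, 0.9], [0.1, 0.5]]

print(quadratic_penalty(binary_mask))       # no penalty for a binary mask
print(quadratic_penalty(gray_mask))         # positive for a gray mask
print(weighted_cost(1.0, gray_mask, 0.1))   # fidelity and penalty compete via lam
```

In this formulation the optimizer has to drive both terms down at once, which is why the choice of λ matters; the mask filter avoids the extra term altogether.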

Fig. 8

Some intermediate results obtained in the iteration process by the proposed method. The horizontal axis and vertical axis denote x position and y position of the patterns in nanometers, respectively.

Fig. 9

Some intermediate results obtained in the iteration process by the conventional regularization method. The horizontal axis and vertical axis denote x position and y position of the patterns in nanometers, respectively.

Figure 10 depicts the convergence properties of the different methods. The results of the conventional regularization method with different weight parameters λ show that a small weight yields fast convergence of the EDE but slow convergence of the mask quadratic error, whereas a large weight yields fast convergence of the mask quadratic error but a higher final EDE. In other words, a smaller λ does not achieve the regularization effect, whereas a larger λ may result in a large EDE. For this reason, it is difficult to choose a value of the weight parameter λ that satisfies both requirements. This is the second drawback of the conventional regularization method. In contrast, the EDE of the proposed method converges rapidly while the mask quadratic error remains at a low level, which demonstrates that all the intermediate masks satisfy the mask quadratic error constraints.

Fig. 10

Convergence properties with different methods.

Two further sets of simulations under different illumination conditions are shown in Fig. 11. The designed mask pattern M1 is simulated on a partially coherent imaging system with annular source illumination (σout/σin = 0.7/0.4) and an NA of 0.85. The mask M1S obtained by the proposed method consists of 401×401 pixels with a grid resolution of 2.5 nm. The designed mask pattern M2 is simulated on a partially coherent imaging system with quasar source illumination (σout/σin/deg = 0.9/0.6/45°) and an NA of 1.25. The mask M2S obtained by the proposed method consists of 361×361 pixels with a grid resolution of 2.5 nm. Figure 11 demonstrates that the proposed method can synthesize a mask pattern under different imaging conditions and can reach a considerably low EDE.
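The two source shapes can be pictured with a small sketch that rasterizes them on a normalized pupil grid. This is only an illustration under stated assumptions: the function name `source_map`, the grid size, and the choice to centre the four quasar poles on the diagonals with 45° read as the opening angle of each pole are assumptions, not details taken from the paper.

```python
import math


def source_map(n, sigma_out, sigma_in, opening_deg=None):
    """Rasterize a source on an n x n grid of normalized pupil coordinates.

    With opening_deg=None the result is an annulus sigma_in <= r <= sigma_out.
    Otherwise only four poles are kept, each centred on a diagonal and
    spanning opening_deg degrees (an assumed quasar orientation).
    """
    half = (n - 1) / 2.0
    grid = []
    for j in range(n):
        row = []
        for i in range(n):
            sx = (i - half) / half
            sy = (j - half) / half
            r = math.hypot(sx, sy)
            lit = sigma_in <= r <= sigma_out
            if lit and opening_deg is not None:
                # Fold the polar angle into [0, 90) so all four poles share one test.
                theta = math.degrees(math.atan2(sy, sx)) % 90.0
                lit = abs(theta - 45.0) <= opening_deg / 2.0
            row.append(1 if lit else 0)
        grid.append(row)
    return grid


annular = source_map(41, 0.7, 0.4)        # sigma_out/sigma_in = 0.7/0.4
quasar = source_map(41, 0.9, 0.6, 45.0)   # sigma_out/sigma_in/deg = 0.9/0.6/45

# Print the quasar map as a coarse picture.
for row in quasar:
    print("".join("#" if v else "." for v in row))
```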

Fig. 11

Simulation results for the desired patterns M1 and M2 by the proposed method. M1S and M2S are the optimized masks, and Γ(·) represents the corresponding output pattern on the wafer. The horizontal axis and vertical axis denote x position and y position of the patterns in nanometers, respectively.

We also performed simulations for more complicated patterns using the proposed method. Figure 12 depicts the results for one desired pattern M*, which is a contact layer of the benchmark AND-OR-INVERT gate circuit layout,32 consisting of 601×1081 pixels with a grid resolution of 2.5 nm, i.e., the simulation area is 1500×2700 nm². The proposed method results in a smooth mask pattern with an EDE of 2.11 nm, compared with 8.37 nm when the desired pattern is used directly as the mask pattern. These results further demonstrate that the proposed method is capable of achieving a small EDE while ensuring the regularity of the synthesized mask.

Fig. 12

Simulation results for the desired pattern M* by the proposed method. MS is the optimized mask, I(·) represents the corresponding optical image, and Γ(·) is the output pattern on the wafer. The horizontal axis and vertical axis denote x position and y position of the patterns in nanometers, respectively.

Table 2 summarizes the average runtime per iteration of the different methods for different mask patterns. As Table 2 shows, the average runtime per iteration of the proposed method is almost the same as that of the conventional method. This is because the proposed method only adds the computation of Eq. (28), whose runtime is far smaller than the total calculation time of an iteration. In other words, the proposed method enhances the mask manufacturability at an almost equal runtime per iteration. From this perspective, the proposed method is more efficient than the conventional regularization method.

Table 2

The average runtime of each iteration by different methods with different mask patterns.

| Mask pattern | Runtime, conventional method (s) | Runtime, proposed method (s) |
| --- | --- | --- |
| M* in Fig. 7 | 1.175 | 1.183 |
| M1 in Fig. 11 | 1.944 | 1.957 |
| M2 in Fig. 11 | 1.517 | 1.528 |
| M* in Fig. 12 | 6.672 | 6.717 |
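
As a quick arithmetic check of the "almost the same" claim, the relative per-iteration overhead can be computed directly from the values in Table 2. The dictionary layout and function name below are only illustrative.

```python
# Average runtime per iteration in seconds, copied from Table 2:
# (conventional method, proposed method)
runtimes = {
    "M* in Fig. 7": (1.175, 1.183),
    "M1 in Fig. 11": (1.944, 1.957),
    "M2 in Fig. 11": (1.517, 1.528),
    "M* in Fig. 12": (6.672, 6.717),
}


def relative_overhead(conventional, proposed):
    """Extra time of the proposed method as a fraction of the conventional runtime."""
    return (proposed - conventional) / conventional


overheads = {name: relative_overhead(c, p) for name, (c, p) in runtimes.items()}

for name, value in overheads.items():
    print(f"{name}: {100.0 * value:.2f}% overhead per iteration")
```

For all four patterns the overhead stays below 1% of the conventional per-iteration runtime.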

4.

Conclusions

In this paper, we have demonstrated the application of a mask filtering technique and the metric EDE to solve the inverse lithography problem. The mask filtering technique interprets gray-level transitions and small, unwanted block objects as noise in the mask and employs a filter to remove this noise so that manufacturing constraints are satisfied. The proposed filter consists of two parts: a Gaussian convolution operation that weakens the weight of small details in the mask, and a thresholding operation that produces a sharper contour. The advantage of this approach is that it enhances the manufacturability of each intermediate mask without raising the computational complexity and avoids having to choose weight parameters for various regularization terms.
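
The two-part filter can be sketched in a few lines, assuming it has the form sig[O ⊗ M] suggested by the Appendix, with O a normalized Gaussian kernel and sig a sigmoid acting as a soft threshold. The kernel size, Gaussian width, steepness, threshold, and the test mask below are illustrative assumptions, not the values used in the paper.

```python
import math


def gaussian_kernel(radius, sigma):
    """Normalized (2*radius+1) x (2*radius+1) Gaussian kernel."""
    span = range(-radius, radius + 1)
    weights = [[math.exp(-(x * x + y * y) / (2.0 * sigma * sigma)) for x in span]
               for y in span]
    total = sum(map(sum, weights))
    return [[w / total for w in row] for row in weights]


def convolve_same(image, kernel):
    """Zero-padded convolution that returns an output the size of the image."""
    rows, cols = len(image), len(image[0])
    radius = len(kernel) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            acc = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y - dy, x - dx
                    if 0 <= yy < rows and 0 <= xx < cols:
                        acc += kernel[dy + radius][dx + radius] * image[yy][xx]
            out[y][x] = acc
    return out


def sigmoid(x, steepness, threshold):
    return 1.0 / (1.0 + math.exp(-steepness * (x - threshold)))


def mask_filter(mask, kernel, steepness=50.0, threshold=0.5):
    """Blur with the Gaussian kernel, then apply a soft threshold."""
    blurred = convolve_same(mask, kernel)
    return [[sigmoid(v, steepness, threshold) for v in row] for row in blurred]


# Hypothetical 11x11 mask: a 5x5 feature plus one isolated pixel in the corner.
mask = [[1.0 if 3 <= y <= 7 and 3 <= x <= 7 else 0.0 for x in range(11)]
        for y in range(11)]
mask[0][0] = 1.0

kernel = gaussian_kernel(radius=2, sigma=1.0)
filtered = mask_filter(mask, kernel)

print(f"isolated pixel after filtering: {filtered[0][0]:.3g}")
print(f"feature centre after filtering: {filtered[5][5]:.3g}")
```

In this toy case the isolated pixel is suppressed while the interior of the larger feature is kept close to 1, which is the behaviour the filter is meant to provide.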

In addition, we have introduced a metric called EDE to guide mask synthesis and to establish the correlation between pattern error and EPE. EDE is defined as the absolute area between the desired pattern contour and its output pattern contour divided by the perimeter of the desired pattern. It can be interpreted as the mean EPE and can be approximated by the pattern error multiplied by a constant factor that depends only on the simulation resolution and the desired pattern. Therefore, EDE has the same dimension as EPE and has a continuous expression like pattern error. The mask filtering technique and the metric EDE are expected to have direct applications in mask optimization and synthesis for optical lithography in the semiconductor industry.
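
The definition of EDE can be illustrated on binary pixel patterns. The sketch below assumes that the area between the contours is the number of differing pixels times the pixel area, and that the perimeter is the number of exposed pixel edges of the desired pattern times the pixel size; this staircase discretization and all names are assumptions made for illustration.

```python
def pattern_error(desired, output):
    """Number of pixels where the two binary patterns differ."""
    return sum(d != o
               for row_d, row_o in zip(desired, output)
               for d, o in zip(row_d, row_o))


def perimeter_edges(pattern):
    """Number of pixel edges separating a filled pixel from an empty one."""
    rows, cols = len(pattern), len(pattern[0])

    def filled(y, x):
        return 0 <= y < rows and 0 <= x < cols and pattern[y][x] == 1

    edges = 0
    for y in range(rows):
        for x in range(cols):
            if pattern[y][x] == 1:
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    if not filled(y + dy, x + dx):
                        edges += 1
    return edges


def edge_distance_error(desired, output, pixel_nm):
    """Area between the patterns divided by the desired perimeter, in nm."""
    area = pattern_error(desired, output) * pixel_nm ** 2
    perimeter = perimeter_edges(desired) * pixel_nm
    return area / perimeter


def make_rect(rows, cols, y0, y1, x0, x1):
    """Binary pattern with ones in rows y0..y1 and columns x0..x1 (inclusive)."""
    return [[1 if y0 <= y <= y1 and x0 <= x <= x1 else 0 for x in range(cols)]
            for y in range(rows)]


# Hypothetical example: a 4x4-pixel square whose printed image lost one column.
desired = make_rect(8, 8, 2, 5, 2, 5)
output = make_rect(8, 8, 2, 5, 2, 4)

ede = edge_distance_error(desired, output, pixel_nm=2.5)
print(ede)  # 4 pixels * 6.25 nm^2 / (16 edges * 2.5 nm) = 0.625 nm
```

Under this discretization EDE equals the pattern error times pixel_nm divided by the edge count, that is, the pattern error scaled by a constant that depends only on the grid resolution and the desired pattern.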

Appendices

Appendix:

Derivation of Eq. (28)

To derive Eq. (28), we first give two useful intermediate results:

(44)

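$$
\frac{\partial\,\mathrm{sig}[x]}{\partial x}
= \frac{\partial}{\partial x}\left[\frac{1}{1+e^{-a(x-t)}}\right]
= a\cdot\left[\frac{1}{1+e^{-a(x-t)}}\right]^{2}\cdot\left[e^{-a(x-t)}\right]
= a\cdot\mathrm{sig}[x]\cdot\left[1-\mathrm{sig}(x)\right]
$$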
and

(45)

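$$
\frac{\partial\left[h(\rho)\otimes M(\rho)\right]}{\partial M(r)}
= \frac{\partial\left[\sum_{r_1} M(r_1)\,h(\rho-r_1)\right]}{\partial M(r)}
= h(\rho-r),
$$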
where r, r_1, and ρ denote the spatial coordinates (x,y). Noticing that Ω_1 is the number of pixels in O(r) and a_S is the steepness of the Sigmoid function in S[M], finally we define and derive the gradient of the mask filter S[M] with respect to M as

(46)

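$$
\begin{aligned}
\nabla_{M} S[M]
&= \frac{\partial \sum_{\rho\in\Omega_1} \mathrm{sig}\left[O(\rho)\otimes M(\rho)\right]}{\partial M(r)} \\
&= \left\{a_S\cdot\mathrm{sig}[O\otimes M]\cdot\left[1-\mathrm{sig}(O\otimes M)\right]\right\}\cdot
   \frac{\partial \sum_{\rho\in\Omega_1}\left[O(\rho)\otimes M(\rho)\right]}{\partial M(r)} \\
&= \left\{a_S\cdot\mathrm{sig}[O\otimes M]\cdot\left[1-\mathrm{sig}[O\otimes M]\right]\right\}\cdot
   \left[\sum_{\rho\in\Omega_1} O(\rho-r)\right] \\
&= \sum_{\rho\in\Omega_1}\left[O_{\mathrm{flip}}(r-\rho)\cdot
   \left\{a_S\cdot\mathrm{sig}[O\otimes M]\cdot\left[1-\mathrm{sig}[O\otimes M]\right]\right\}\right] \\
&= a_S\cdot O_{\mathrm{flip}}\otimes\left\{\mathrm{sig}[O\otimes M]\cdot\left[1-\mathrm{sig}[O\otimes M]\right]\right\}.
\end{aligned}
$$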
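
The closed form in Eq. (46) can be checked numerically against finite differences. The sketch below works in one dimension for brevity and treats the filter output summed over all pixels as the scalar being differentiated; the kernel, mask values, steepness, and threshold are hypothetical, and an asymmetric kernel is chosen so that the flip of O actually matters.

```python
import math


def sigmoid(x, a, t):
    return 1.0 / (1.0 + math.exp(-a * (x - t)))


def convolve(kernel, signal):
    """Zero-padded 1-D convolution with the kernel centred on each sample."""
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for p in range(n):
        acc = 0.0
        for r in range(n):
            k = p - r + radius
            if 0 <= k < len(kernel):
                acc += signal[r] * kernel[k]
        out.append(acc)
    return out


def filter_sum(mask, kernel, a, t):
    """Sum over all pixels of sig[O convolved with M]."""
    return sum(sigmoid(v, a, t) for v in convolve(kernel, mask))


def filter_gradient(mask, kernel, a, t):
    """Analytic gradient: a * O_flip convolved with {sig * (1 - sig)}."""
    s = [sigmoid(v, a, t) for v in convolve(kernel, mask)]
    inner = [si * (1.0 - si) for si in s]
    flipped = kernel[::-1]
    return [a * g for g in convolve(flipped, inner)]


def numerical_gradient(mask, kernel, a, t, eps=1e-6):
    """Central finite differences of filter_sum with respect to each pixel."""
    grad = []
    for r in range(len(mask)):
        up, down = mask[:], mask[:]
        up[r] += eps
        down[r] -= eps
        diff = filter_sum(up, kernel, a, t) - filter_sum(down, kernel, a, t)
        grad.append(diff / (2.0 * eps))
    return grad


kernel = [0.2, 0.5, 0.3]           # asymmetric on purpose
mask = [0.1, 0.8, 0.6, 0.3, 0.9]   # gray-level test mask
a_s, t_s = 4.0, 0.5                # assumed steepness and threshold

analytic = filter_gradient(mask, kernel, a_s, t_s)
numeric = numerical_gradient(mask, kernel, a_s, t_s)
max_diff = max(abs(x - y) for x, y in zip(analytic, numeric))
print(f"largest analytic/numeric mismatch: {max_diff:.2e}")
```

A small mismatch indicates that the sigmoid derivative of Eq. (44) and the kernel flip introduced through Eq. (45) have been combined consistently.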

Acknowledgments

This work was funded by the National Natural Science Foundation of China (Grant Nos. 91023032, 51005091, and 51121002), the Specialized Research Fund for the Doctoral Program of Higher Education of China (Grant No. 20120142110019), the National Science and Technology Major Project of China (Grant No. 2012ZX02701001), and the National Instrument Development Specific Project of China (Grant No. 2011YQ160002).

References

1. A. K. Wong, Resolution Enhancement Techniques in Optical Lithography, SPIE Press, Bellingham, Washington (2001).

2. L. W. Liebmann et al., “TCAD development for lithography resolution enhancement,” IBM J. Res. Dev. 45(5), 651–665 (2001). http://dx.doi.org/10.1147/rd.455.0651

3. F. Schellenberg, “Resolution enhancement technology: the past, the present, and extensions for the future,” Proc. SPIE 5377, 1–20 (2004). http://dx.doi.org/10.1117/12.548923

4. N. B. Cobb, “Fast optical and process proximity correction algorithms for integrated circuit manufacturing,” Ph.D. Thesis, University of California at Berkeley (1998).

5. L. Pang, Y. Liu, and D. Abrams, “Inverse lithography technology (ILT): a natural solution for model-based SRAF at 45 nm and 32 nm,” Proc. SPIE 6607, 660739 (2007). http://dx.doi.org/10.1117/12.729028

6. Y. Shen, N. Wong, and E. Y. Lam, “Level-set-based inverse lithography for photomask synthesis,” Opt. Express 17(26), 23690–23701 (2009). http://dx.doi.org/10.1364/OE.17.023690

7. Y. Shen et al., “Robust level-set-based inverse lithography,” Opt. Express 19(6), 5511–5521 (2011). http://dx.doi.org/10.1364/OE.19.005511

8. W. Lv et al., “Level-set-based inverse lithography for mask synthesis using the conjugate gradient and an optimal time step,” J. Vac. Sci. Technol. B 31(4), 041605 (2013). http://dx.doi.org/10.1116/1.4813781

9. S. Shen, P. Yu, and D. Z. Pan, “Enhanced DCT2-based inverse mask synthesis with initial SRAF insertion,” Proc. SPIE 7122, 712241 (2008). http://dx.doi.org/10.1117/12.801409

10. Y. Granik, “Fast pixel-based mask optimization for inverse lithography,” J. Micro/Nanolith. MEMS MOEMS 5(4), 043002 (2006). http://dx.doi.org/10.1117/1.2399537

11. A. Poonawala and P. Milanfar, “Mask design for optical microlithography – An inverse imaging problem,” IEEE Trans. Image Process. 16(3), 774–788 (2007). http://dx.doi.org/10.1109/TIP.2006.891332

12. A. Poonawala and P. Milanfar, “A pixel-based regularization approach to inverse lithography,” Microelectron. Eng. 84(12), 2837–2852 (2007). http://dx.doi.org/10.1016/j.mee.2007.02.005

13. X. Ma and G. R. Arce, “Binary mask optimization for inverse lithography with partially coherent illumination,” J. Opt. Soc. Am. A 25(12), 2960–2970 (2008). http://dx.doi.org/10.1364/JOSAA.25.002960

14. J. C. Yu and P. Yu, “Impacts of cost functions on inverse lithography patterning,” Opt. Express 18(22), 23331–23342 (2010). http://dx.doi.org/10.1364/OE.18.023331

15. X. Ma and G. R. Arce, “Pixel-based OPC optimization based on conjugate gradients,” Opt. Express 19(3), 2165–2180 (2011). http://dx.doi.org/10.1364/OE.19.002165

16. N. B. Cobb and Y. Granik, “Using OPC to optimize for image slope and improve process window,” Proc. SPIE 5130, 838–846 (2003). http://dx.doi.org/10.1117/12.504379

17. N. B. Cobb and Y. Granik, “Model-based OPC using the MEEF matrix,” Proc. SPIE 4889, 1281–1292 (2002). http://dx.doi.org/10.1117/12.467435

18. Y. Granik, “Generalized mask error enhancement factor theory,” J. Micro/Nanolith. MEMS MOEMS 4(2), 023001 (2005). http://dx.doi.org/10.1117/1.1898066

19. W. Lv, Q. Xia, and S. Y. Liu, “Pixel-based inverse lithography using a mask filtering technique,” Proc. SPIE 8683, 868325 (2013). http://dx.doi.org/10.1117/12.2011469

20. B. Bourdin, “Filters in topology optimization,” Int. J. Numer. Methods Eng. 50(9), 2143–2158 (2001). http://dx.doi.org/10.1002/(ISSN)1097-0207

21. M. P. Bendsøe and O. Sigmund, Topology Optimization: Theory, Methods and Application, Springer, Berlin (2003).

22. P. Yu, S. X. Shi, and D. Z. Pan, “True process variation aware optical proximity correction with variational lithography modeling and model calibration,” J. Micro/Nanolith. MEMS MOEMS 6(3), 031004 (2007). http://dx.doi.org/10.1117/1.2752814

23. N. Jia and E. Y. Lam, “Machine learning for inverse lithography: using stochastic gradient descent for robust photomask synthesis,” J. Opt. 12(4), 045601 (2010). http://dx.doi.org/10.1088/2040-8978/12/4/045601

24. N. Jia and E. Y. Lam, “Pixelated source mask optimization for process robustness in optical lithography,” Opt. Express 19(20), 19384–19398 (2011). http://dx.doi.org/10.1364/OE.19.019384

25. S. Choy et al., “A robust computational algorithm for inverse photomask synthesis in optical projection lithography,” SIAM J. Imaging Sci. 5(2), 625–651 (2012). http://dx.doi.org/10.1137/110830356

26. S. Li, X. Wang, and Y. Bu, “Robust pixel-based source and mask optimization for inverse lithography,” Opt. Laser Technol. 45, 285–293 (2013). http://dx.doi.org/10.1016/j.optlastec.2012.06.033

27. S. Y. Liu et al., “Convolution-variation separation method for efficient modeling of optical lithography,” Opt. Lett. 38(13), 2168–2170 (2013). http://dx.doi.org/10.1364/OL.38.002168

28. A. K. Wong, Optical Imaging in Projection Microlithography, SPIE Press, Bellingham, WA (2005).

29. H. H. Hopkins, “On the diffraction theory of optical images,” Proc. R. Soc. A 217, 408–432 (1953). http://dx.doi.org/10.1098/rspa.1953.0071

30. Y. C. Pati and T. Kailath, “Phase-shifting masks for microlithography: automated design and mask requirements,” J. Opt. Soc. Am. A 11(9), 2438–2452 (1994). http://dx.doi.org/10.1364/JOSAA.11.002438

31. P. Gong et al., “Fast aerial image simulations for partially coherent systems by transmission cross coefficient decomposition with analytical kernels,” J. Vac. Sci. Technol. B 30(6), 06FG03 (2012). http://dx.doi.org/10.1116/1.4767442

32. The NanGate 45 nm Open Cell Library, v1.3, http://www.si2.org/openeda.si2.org/projects/nangatelib (1 July 2013).

Biography

Wen Lv is currently a PhD candidate at Huazhong University of Science and Technology under the guidance of Prof. Shiyuan Liu. He received his BS degree from the School of Mechanical Science and Engineering of the same university in 2011. His research involves various issues in optical lithography, including inverse lithography, fast optical image simulation, and mask writing technique. He is a student member of SPIE and IEEE.

Qi Xia is an associate professor of the School of Mechanical Science and Engineering at the Huazhong University of Science and Technology, China. He received his PhD degree in mechanical engineering from the Chinese University of Hong Kong (CUHK), China, in 2007. His current interests include structural and material design optimization for microelectromechanical sensors and actuators, mechatronics and automation. He is a member of IEEE.

Shiyuan Liu is a professor of mechanical engineering at Huazhong University of Science and Technology, leading his Nanoscale and Optical Metrology Group with research interest in metrology and instrumentation for nanomanufacturing. He also actively works in the area of optical lithography, including partially coherent imaging theory, wavefront aberration metrology, optical proximity correction, source mask optimization, and inverse lithography technology. He received his PhD in mechanical engineering from Huazhong University of Science and Technology in 1998. He is a member of SPIE, OSA, AVS, IEEE, and Chinese Society of Micro/Nano Technology (CSMNT). He holds 30 patents and has authored or coauthored more than 100 technical papers.

Wen Lv, Qi Xia, Shiyuan Liu, "Mask-filtering-based inverse lithography," Journal of Micro/Nanolithography, MEMS, and MOEMS 12(4), 043003 (8 November 2013). http://dx.doi.org/10.1117/1.JMM.12.4.043003
JOURNAL ARTICLE
14 PAGES


Keywords: Photomasks, Lithography, Manufacturing, Gaussian filters, Image processing, Semiconducting wafers, Filtering (signal processing)
