${10}^{-10}$) is indispensable. We discuss a method for predicting stochastic defect probabilities from a histogram of feature sizes measured on a number of patterns several orders of magnitude smaller than the number of features to be inspected. Based on our previously introduced probabilistic model of stochastic pattern defects, the defect probability is expressed as the product sum of the probability of the edge position and the probability that a film defect covers the area between edges, and we describe the latter as a function of edge position. Defect probabilities on the order of ${10}^{-7}$ to ${10}^{-5}$ were predicted from ${10}^{5}$ measurement data for real EUV-exposed wafers, suggesting the effectiveness of the model and its potential for defect inspection.

## 1. Introduction

Projection lithography using extreme ultraviolet (EUV) light at the 13.5-nm wavelength is expected to enable production of integrated circuits (ICs) below 7-nm design rules.^{1} In pursuit of further miniaturization of semiconductor integrated circuit devices by EUV lithography, stochastic pattern defect problems have arisen.^{2–4} Stochastic pattern defects are fatal patterning failures, such as bridging between neighboring pattern features or breakage of features, and their probability is extremely low (down to ${10}^{-12}$ or even below). Because cutting-edge integrated circuit devices today have more than ${10}^{12}$ critical features per device layer on a 300-mm wafer, even such a low defect probability results in an unacceptable defect density.

While suppressing stochastic defects themselves is indispensable for EUV lithography, monitoring and controlling these defects is another crucial issue.^{4–7} When applying EUV lithography to IC manufacturing, design rules and nominal mask/process conditions should be set so that the stochastic defect probability is within a tolerable range (e.g., ${10}^{-12}$). However, because the stochastic defect probability is very sensitive to resist feature size and to the mask and process conditions, small deviations from the nominal condition can cause catastrophic wafer failure^{3} (e.g., a change in exposure dose of a few percent can in some cases change the defect probability by an order of magnitude). Detecting changes in stochastic defect probability in this extremely low range will be necessary but is a challenge. To directly inspect a huge number (e.g., ${10}^{12}$) of features for defects below 10 nm in size, present electron-beam-based inspection tools require an unacceptably long inspection time,^{5} whereas the resolution capability of optical inspection tools is marginal.^{6} In contrast, it has been reported that conventional indices, such as critical dimension (CD) and line edge roughness (LER), correlate with defect probabilities, although these correlations are empirical and lack a theoretical basis.^{7}

Here, we propose an approach to predict an extremely low stochastic defect probability from local CD uniformity (LCDU) data, or a CD histogram, for a limited number of pattern features, typically several orders of magnitude smaller than the number of features to be inspected. We previously introduced a probabilistic model for stochastic defect generation based on two mechanisms, cascading shot noise and long-range scattered photoelectrons.^{8,9} In this paper, we apply this model to predict an extremely low probability of stochastic defect generation on real wafers.

## 2. Probabilistic Model of Pattern Defects

Before discussing the defect prediction, we briefly review our model.^{8,9} We start by generating the numbers of physical/chemical events in a resist film, such as photon absorption, secondary electron generation, chemical reaction, and solubility flipping of resist polymers/molecules, using a coupled Monte-Carlo simulation, which combines simulations for optical imaging, photoelectron scattering, and chemical amplification with acid diffusion [Fig. 1(a)]. We divide the resist film with a three-dimensional grid and count the number of reactions in each voxel defined by the grid. We assume that the solubility of a particular voxel flips if the number of reactions in that voxel exceeds a certain threshold, and we further count the number, ${n}_{\mathrm{SF}}$, of solubility-flipped voxels through the thickness, which represents the degree of solubility change at a particular spot of the resist film. From the histogram of this number ${n}_{\mathrm{SF}}$ under the same exposure dose, we obtain the probability density functions (PDFs) ${\mathit{pdf}}_{\mathrm{SF}}(\overrightarrow{r},{n}_{\mathrm{SF}})$ for ${n}_{\mathrm{SF}}$ at location $\overrightarrow{r}$. Here, we focus on bridge-type defects in negative-tone resist processes. We define a local spot pattern and a local spot defect so that they are generated when the number ${n}_{\mathrm{SF}}$ of solubility-flipped polymers/molecules through the film thickness exceeds a certain threshold $N{c}_{\mathrm{SF}\_X}$ ($X$ = main pattern or film defect). Thus, the probabilities of a local spot pattern/defect $P{1}_{X}$ per unit area (e.g., $1\,{\mathrm{nm}}^{2}$) are expressed as

$$P{1}_{X}({\overrightarrow{r}}_{i},N{c}_{\mathrm{SF}\_X})={\int}_{N{c}_{\mathrm{SF}\_X}}^{\infty}{\mathit{pdf}}_{\mathrm{SF}}({\overrightarrow{r}}_{i},{n}_{\mathrm{SF}})\,\mathrm{d}{n}_{\mathrm{SF}}, \tag{1}$$

$${P}_{\text{defect}\,\mathrm{A}}({x}_{d})=\int {P}_{\text{edge}}({x}_{\text{edge}})\cdot P{2}_{\text{defect}}({x}_{d}|{x}_{\text{edge}})\,\mathrm{d}{x}_{\text{edge}}, \tag{2}$$

$$P{2}_{\text{defect}}({x}_{d}|{x}_{\text{edge}})={\prod}_{{x}_{\text{edge}}<x<{x}_{d}}P{1}_{\text{defect}}(x,N{c}_{\mathrm{SF}\_\text{defect}}). \tag{3}$$

Equation (2) shows that the probability of defect generation between ${x}_{d}$ and ${x}_{\text{edge}}$ depends on the horizontal location of the edge ${x}_{\text{edge}}$. Although the actual edge location also varies in the depth direction along the resist sidewall, the variations of edge location in the vertical direction are usually smaller than those in the horizontal direction (so-called LER), and we ignore the former in the present model. The above explanation assumed defect generation mechanism A in Ref. 8 for simplicity, but the form of Eq. (2) also holds for mechanism B in the same reference. Optimization of exposure and material parameters to minimize the defect probability showed a clear trade-off relationship between defect probabilities and delineated pattern feature sizes, as shown in Fig. 1(c), which is qualitatively consistent with experimental observations in Ref. 3. The exponential relationships between defect probabilities and the exposure dose required to obtain the designed size, observed among a variety of resist materials,^{4} are also explained by the model.^{9}
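The structure of Eqs. (1) to (3) can be illustrated with a small discretized sketch. It is not the coupled Monte-Carlo simulation of Refs. 8 and 9: the function names, the one-dimensional grid of cell indices, and all numerical values below are hypothetical and chosen only to show how the tail integral, the product, and the product sum fit together.

```python
def p1_tail(pdf, threshold):
    """Eq. (1), discretized: probability that n_SF reaches the threshold.
    `pdf` maps an integer n_SF to its probability (values sum to 1)."""
    return sum(p for n, p in pdf.items() if n >= threshold)


def p2_bridge(p1_along_x, i_edge, i_d):
    """Eq. (3), discretized: product of local spot-defect probabilities over
    the grid cells strictly between the edge cell and the cell at x_d.
    If no cell lies in between, the empty product is 1."""
    prod = 1.0
    for i in range(i_edge + 1, i_d):
        prod *= p1_along_x[i]
    return prod


def p_defect(p_edge, p1_along_x, i_d):
    """Eq. (2), discretized: sum over edge positions of P_edge * P2."""
    return sum(pe * p2_bridge(p1_along_x, i_edge, i_d)
               for i_edge, pe in p_edge.items())


# Hypothetical inputs: a uniform spot-defect probability of 0.1 per cell and
# an edge that sits at cell 2 with probability 0.9 or at cell 4 with 0.1.
p1_along_x = [0.1] * 10
p_edge = {2: 0.9, 4: 0.1}
print(p_defect(p_edge, p1_along_x, i_d=6))
```

In this toy case the rare edge position closer to $x_d$ (cell 4) contributes most of the defect probability, which is the qualitative behavior that Eq. (2) expresses.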

## 3. Method of Defect Probability Estimation

Here, we apply the above-mentioned model to predicting defect probability on real wafers. In our method, the stochastic defect probability is expressed by the product sum of two probabilities, ${P}_{\text{edge}}({x}_{\text{edge}})$ and $P2(x|{x}_{\text{edge}})$, in Eq. (2). Our basic approach is to predict the defect probability by evaluating ${P}_{\text{edge}}$ and $P2$ in Eq. (2), not by directly inspecting full-pattern features. Evaluating a probability on the order of $P$ generally requires more than $1/P$ samples. Since both ${P}_{\text{edge}}$ and $P2$ are larger than ${P}_{\text{defect}}$ by orders of magnitude, we expect a reduction in measurement time of the same order. Here, ${P}_{\text{edge}}$ is a histogram of local edge position and is directly measurable using SEM; thus, we focus on how to evaluate $P2$.
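The $1/P$ rule of thumb can be checked with elementary probability. The sketch below assumes independent features with identical defect probability, which is an idealization introduced here; the function names and the 95% confidence level are illustrative choices, not values from this work.

```python
import math


def prob_no_defect_seen(p, n_samples):
    """Probability that n_samples independent features contain no defect."""
    return (1.0 - p) ** n_samples


def samples_for_detection(p, confidence=0.95):
    """Smallest sample count for which at least one defect is observed
    with the requested confidence (independent-feature assumption)."""
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-p))


p = 1e-6
print(prob_no_defect_seen(p, 10**5))   # most samples of 1e5 features show no defect
print(samples_for_detection(p))        # roughly 3/p features for 95% confidence
```

Under these assumptions, a sample of ${10}^{5}$ features almost always contains no defect when the defect probability is ${10}^{-6}$, so direct counting cannot resolve such probabilities at that sample size.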

Suppose that the defect probability increases due to some process variations and that we need to detect this change. According to the above model, these variations change the defect probability through ${P}_{\text{edge}}$ and $P2$ along the following three pathways. First, process variations change the locations ${x}_{\text{edge}}$ of pattern edges and their distribution ${P}_{\text{edge}}$. Second, the change in ${x}_{\text{edge}}$ changes the value of $P2$ because $P2$ is a function of ${x}_{\text{edge}}$. Third, process variations change the function $P2$ itself because $P2$ is determined by the chemical reaction density, as explained by Eqs. (1) and (3).

We examined the changes in ${P}_{\text{edge}}$ and $P2$ along each pathway using the defect probability model described above. Figure 2 shows the profiles of ${P}_{\text{edge}}({x}_{\text{edge}})$, $P2({x}_{\text{center}}|{x}_{\text{edge}})$, and ${P}_{\text{defect}}(x)$ for two exposure conditions, nominal and 20% overexposure. Here, we assumed one of the exposure/material parameter sets optimized to minimize the defect probability for 16-nm lines and spaces with 0.33-NA optics; see Ref. 8 for details. A 20% increase in exposure dose shifts the mean CD by 20% (corresponding to a 1.5-nm shift in edge position) and changes the histogram profile [Fig. 2(a)]. It also changes the profile of $P2$, but this change is small compared with the exponential dependence of $P2$ on ${x}_{\text{edge}}$ [Fig. 2(b)]. In contrast, a 20% increase in dose changes ${P}_{\text{defect}}$ by 2 orders of magnitude at the same location $x$ [Fig. 2(c)]. This is because the linear change in ${x}_{\text{edge}}$ is magnified by the exponential dependence of $P2$ on ${x}_{\text{edge}}$. Consequently, for exposure dose variations of this magnitude, the defect probability depends exponentially on the dose through the first and second pathways. If we assume, as an approximation, that the shape of the function $P2$ (its dependence on ${x}_{\text{edge}}$ and $x$) is unchanged within the above range of exposure variations, we can calculate the value of $P2$ from the measured ${x}_{\text{edge}}$, and then ${P}_{\text{defect}}$ as the product sum of ${P}_{\text{edge}}$ and $P2$. Note, however, that $P2$ is in general a function of the imaging and resist material/process conditions, and the above assumption needs to be examined when these conditions are changed.

In practice, two approaches can be taken for determining $P2$. In the first, analytical approach, we directly calculate $P2$ using the probabilistic defect model, as explained with Fig. 2. This requires model calibration, as in any conventional lithography simulation. The other is an empirical approach, in which we determine $P2$ so as to satisfy Eq. (2) with the observed ${P}_{\text{edge}}$ and ${P}_{\text{defect}}$. In Sec. 4, we examine the feasibility of our method using the latter approach.

## 4. Experimental Results and Discussions

We predict defect probabilities on the order of ${10}^{-7}$ to ${10}^{-5}$ from ${10}^{5}$ measurement data on real EUV-exposed wafers. Mask patterns containing a two-dimensional array of more than ${10}^{7}$ holes (24-nm diameter at 48-nm pitch) were exposed on a wafer ($\lambda =13.5\,\mathrm{nm}$, $\mathrm{NA}=0.33$) with varying exposure dose to modulate the defect probability. For each of the resist pattern groups exposed at 20 different exposure doses, the size of each hole pattern was measured by CD-SEM (Hitachi High-Technologies). The size of each feature was calculated from the area of the ellipse best fitted to the shape defined by a 50% threshold of signal intensity after applying a Gaussian filter to the SEM images. With a 1-nm pixel size, about 50 edge pixels contribute to the measurement, and the estimated error due to SEM noise is lower than 0.2 nm at the probe current ($>100\,\mathrm{pA}$) used in the experiment.^{10} We judge features below 9.5 nm as defects and calculate histograms of measured CD excluding these defects. CD histograms [1-nm bin, Fig. 3(a)] and defect probabilities (red diamonds in Fig. 4) were obtained for $2\times {10}^{5}$ holes for pattern groups #1 to #12, which have relatively high ($>{10}^{-5}$) defect probability, and for ${10}^{7}$ holes for pattern groups #13 to #20, which have relatively low ($<{10}^{-5}$) defect probability. The defect probabilities decrease exponentially from ${10}^{-3}$ in group #1 to ${10}^{-7}$ in group #19 as the average hole diameter increases from 16.2 to 19.1 nm. Thus, a 3-nm decrease in feature size increases the defect probability by 4 orders of magnitude.
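The sensitivity quoted above can be expressed as a slope in decades of defect probability per nanometer of mean CD. The short calculation below uses only the endpoint values stated in the text (groups #1 and #19) and assumes a single exponential trend between them; the function name is illustrative.

```python
import math


def decades_per_nm(p_small_cd, p_large_cd, cd_small, cd_large):
    """Average change in log10(defect probability) per nm of mean CD."""
    return math.log10(p_small_cd / p_large_cd) / (cd_large - cd_small)


# Endpoints quoted in the text: 1e-3 at 16.2 nm and 1e-7 at 19.1 nm.
slope = decades_per_nm(1e-3, 1e-7, 16.2, 19.1)
print(round(slope, 2))  # about 1.4 decades per nm
```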

Here, we focus on the relationship between CD variations and pattern defect probabilities without discussing their root causes. In this experiment, we observed no definitive mask defect that prints on wafers regardless of exposure dose. Although some defects observed in this experiment may originate from the mask, their probabilities increase exponentially with decreasing exposure dose (or delineated hole size), as would be expected for other root causes, such as the photon shot noise and stochastic variations in resist reactions discussed previously. We regard them equally as defects due to local variations in the amount of reactions, include them in the ${P}_{\text{edge}}$ distribution, and apply the same $P2$ function in Eq. (2) regardless of whether their locations are fixed on the mask or random.

Our strategy is to determine the probability function $P2$ in Eq. (2) so that it best explains the observed defect probabilities ${P}_{\text{defect}}$ and CD histograms ${P}_{\text{edge}}$ for every exposure condition (pattern group). In real application environments, it is desirable to minimize the number of measurement points (i.e., the time required for measurement) both in determining $P2$ and in predicting ${P}_{\text{defect}}$ for unknown samples. Here, however, we utilized all the data in groups #1 to #20 for determining $P2$.

As a rough approximation of our simulated profiles for $P2$ [Fig. 2(b)], we assume that $P2$ decreases exponentially with the distance from the edge of the main pattern and describe it in the form $P{2}_{0}\,\mathrm{exp}(-a\cdot {x}_{\text{width}})$. Here, we use the width of each feature (${x}_{\text{width}}={x}_{\text{right edge}}-{x}_{\text{left edge}}$) instead of ${x}_{\text{edge}}$ to eliminate the influence of variation in pattern center positions. We calculate $P2$ ($P{2}_{0}$ and $a$) so that $\mathrm{log}(\int {P}_{\text{edge}}\cdot P2\,\mathrm{d}{x}_{\text{width}})$ best fits $\mathrm{log}({P}_{\text{defect}})$ for 19 groups (#1 to #19), and the obtained profile of $P2$ is shown in Fig. 3(c). Although $P2$ has no influence on the calculated ${P}_{\text{defect}}$ for ${x}_{\text{width}}<9.5\,\mathrm{nm}$, where we judge features as defects (${P}_{\text{edge}}=0$), $P2$ is set to 1 in this region. From a statistical viewpoint, $P2$ can be regarded as an extreme-value cumulative distribution function that expresses the distribution of the maximum distance over which defects continuously extend from the main pattern edge. Here, we leave open the relationship between our assumption for $P2$ and the variety of functions used in this area.
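One way such a fit could be carried out is sketched below. The fitting procedure actually used is not specified in the text, so this is only an assumption: for each trial decay coefficient $a$, the least-squares optimum of $\log P2_0$ is the mean residual, which leaves a one-dimensional scan over $a$. The Gaussian histograms, the parameter values, and all function names are synthetic and hypothetical, and the cap $P2=1$ below 9.5 nm is omitted for brevity.

```python
import math


def log_sum(hist, a):
    """log of sum_w P_edge(w) * exp(-a * w) for one normalized CD histogram."""
    return math.log(sum(p * math.exp(-a * w) for w, p in hist.items()))


def fit_p2(hists, p_defect_obs, a_grid):
    """Scan a over a_grid; for each a the best log(P2_0) is the mean residual.
    Returns (P2_0, a, sum of squared errors in log P_defect)."""
    y = [math.log(p) for p in p_defect_obs]
    best = None
    for a in a_grid:
        s = [log_sum(h, a) for h in hists]
        log_p20 = sum(yi - si for yi, si in zip(y, s)) / len(y)
        sse = sum((yi - si - log_p20) ** 2 for yi, si in zip(y, s))
        if best is None or sse < best[2]:
            best = (math.exp(log_p20), a, sse)
    return best


def gaussian_hist(mean, sigma, widths):
    """Synthetic normalized histogram on the given bin centers (nm)."""
    raw = [math.exp(-0.5 * ((w - mean) / sigma) ** 2) for w in widths]
    total = sum(raw)
    return {w: r / total for w, r in zip(widths, raw)}


# Synthetic check: generate "observed" probabilities from known parameters
# and confirm that the scan recovers them.
widths = [float(w) for w in range(10, 26)]
hists = [gaussian_hist(m, 1.0, widths) for m in (16.0, 17.0, 18.0, 19.0)]
true_p20, true_a = 1e5, 1.5
p_obs = [true_p20 * math.exp(log_sum(h, true_a)) for h in hists]
a_grid = [0.5 + 0.1 * k for k in range(26)]
p20_fit, a_fit, sse = fit_p2(hists, p_obs, a_grid)
print(p20_fit, a_fit, sse)
```

Fitting in the logarithm, as the text describes, weights each pattern group equally even though their defect probabilities span several orders of magnitude.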

Next, we predicted the defect probabilities of groups #13 to #20 from ${10}^{5}$ CD measurement data in each group using the $P2$ obtained above. To examine the repeatability of the method, we repeated random sampling of ${10}^{5}$ CDs from ${10}^{7}$ CDs 100 times. Since the defect probabilities for these groups range between ${10}^{-7}$ and ${10}^{-5}$, each sampled CD data set rarely contains defects (on average, one defect in 10 samplings for ${P}_{\text{defect}}={10}^{-6}$). Predicted probabilities are shown by boxplots in Fig. 4, and they are in good agreement with the results of direct inspection of ${10}^{7}$ features (red diamonds).
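The resampling test can be mimicked on synthetic data. The sketch below assumes a Gaussian CD population and hypothetical $P2$ parameters, and it uses much smaller population and sample sizes than the experiment so that it runs quickly; it only illustrates how the spread of the predicted $\log_{10}$ probability across repeated samplings is obtained.

```python
import math
import random
import statistics


def predict_p_defect(cds, p2_0, a, bin_nm=1.0):
    """Histogram the CDs (bin centers in nm) and return
    sum_w P_edge(w) * P2_0 * exp(-a * w), a discretized form of Eq. (2)."""
    counts = {}
    for cd in cds:
        center = math.floor(cd / bin_nm) * bin_nm + bin_nm / 2
        counts[center] = counts.get(center, 0) + 1
    n = len(cds)
    return sum(c / n * p2_0 * math.exp(-a * w) for w, c in counts.items())


random.seed(0)
# Hypothetical population: 2e5 CDs with mean 18 nm and sigma 1 nm.
population = [random.gauss(18.0, 1.0) for _ in range(200_000)]
p2_0, a = 1e5, 1.5  # hypothetical P2 parameters

full = math.log10(predict_p_defect(population, p2_0, a))
preds = [math.log10(predict_p_defect(random.sample(population, 2_000), p2_0, a))
         for _ in range(20)]
spread = statistics.stdev(preds)
print(full, statistics.mean(preds), spread)
```

The standard deviation of the predicted $\log_{10}$ values plays the role of the repeatability expressed in digits.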

For probabilities above ${10}^{-5}$, the data used for prediction contain some defects, and the box plots are regarded as the results of regression rather than of prediction. Between ${10}^{-7}$ and ${10}^{-5}$, the data used for prediction usually contain no defects, and the predicted results (box plots) are verified by the directly inspected results. Predicted results below ${10}^{-7}$ cannot be verified because they are beyond the limit of direct measurement. These results show a reduction of 2 orders of magnitude in the time required for evaluating the defect probability.

Predicted probabilities fitted to a normal distribution are plotted for each of the seven groups in Fig. 4, and the prediction repeatability is in the range between 0.2 and 0.4 digit. Histograms of ${10}^{5}$ measured CDs are shown for three groups (#13, #16, and #19) by circles in Fig. 3(b), together with those for ${10}^{7}$ measurements (solid lines). The frequencies of CDs in the ${10}^{5}$ histograms begin to scatter in the tail regions, and this limits the precision of the prediction.

To examine the range of edge positions contributing to defect generation, the integrands in Eq. (2) [the product of Figs. 3(a) and 3(c)] are shown in Fig. 3(d) for the histograms of the full-pattern measurement in every pattern group. The peaks of the integrands extend into the range below 10 nm. Although the histograms should cover this range, this often requires an unacceptably large number of measurement points (and thus a long measurement time) in a real manufacturing environment with low stochastic defect probability. For such cases, we next extrapolate the tail of the histogram to cover the desired range.

It has been reported that CD histograms often deviate from the normal distribution and show exponential or multiple-Gaussian distributions in their tails,^{3,5,7} and the relation of this behavior to image profiles has also been pointed out.^{11} This is also observed in our results [Fig. 3(a)]. Figure 5(a) shows histograms of ${10}^{5}$ measured CDs randomly sampled from ${10}^{7}$ CDs 100 times (blue circles), the histogram for ${10}^{7}$ measurements (red lines), and its normal distribution fit (black dotted line). The observed distribution starts to deviate from the normal distribution for ${P}_{\text{defect}}$ lower than ${10}^{-3}$ and decreases approximately exponentially with decreasing ${x}_{\text{width}}$. Thus, we extrapolate the tail of the distribution for ${10}^{5}$ measured CDs using an exponential function.

To suppress the influence of data scattering near the tail of the distribution, we reject the data in the smallest CD bin of the histogram, calculate the slope (decay coefficient) as the average of the slope between the second and third smallest CD bins and that between the second and fourth smallest CD bins, and connect the exponential function to the measured histogram at the second smallest CD bin. To examine the repeatability of the method, we again repeated random sampling of ${10}^{5}$ CDs from ${10}^{7}$ CDs 100 times. The extrapolation results for the 100 samplings are shown by black solid lines in Fig. 5(a). Predicted probabilities for groups #1 to #20 are shown by boxplots in Fig. 5(b), and they show better agreement with the results of full-pattern inspection [red in Fig. 5(b)] than those obtained without the extrapolation (Fig. 4). The prediction repeatability is in the range between 0.2 and 0.3 digit.
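The extrapolation rule described above can be sketched as follows. The data layout (a dictionary from bin center to frequency), the function name, and the choice not to renormalize the histogram afterwards are assumptions made for illustration.

```python
import math


def extrapolate_tail(hist, lowest_cd, bin_nm=1.0):
    """Extend the small-CD tail of a histogram with an exponential.

    hist: dict mapping CD bin center (nm) to frequency. The smallest populated
    bin is rejected, the log-slope is the average of the (2nd, 3rd) and
    (2nd, 4th) smallest-bin slopes, and the exponential is attached at the
    2nd smallest bin. The result is not renormalized."""
    bins = sorted(w for w, f in hist.items() if f > 0)
    if len(bins) < 4:
        raise ValueError("need at least four populated bins")
    w2, w3, w4 = bins[1], bins[2], bins[3]
    s23 = (math.log(hist[w3]) - math.log(hist[w2])) / (w3 - w2)
    s24 = (math.log(hist[w4]) - math.log(hist[w2])) / (w4 - w2)
    slope = 0.5 * (s23 + s24)

    out = {w: f for w, f in hist.items() if w >= w2}  # drop the smallest bin
    w = w2 - bin_nm
    while w >= lowest_cd - 1e-9:
        out[w] = hist[w2] * math.exp(slope * (w - w2))
        w -= bin_nm
    return out


# Synthetic example: an exact exponential tail with a noisy smallest bin.
hist = {13.0: 5e-4, 14.0: math.exp(-6), 15.0: math.exp(-5),
        16.0: math.exp(-4), 17.0: math.exp(-3)}
extended = extrapolate_tail(hist, lowest_cd=10.0)
print(sorted(extended.items()))
```

Averaging two slopes that share the second smallest bin reduces the sensitivity to scatter in any single sparsely populated bin, at the cost of a slight bias if the tail is not strictly exponential.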

Within the range of this study, it is reasonable to approximate ${P}_{\text{edge}}$, $P2$, and ${P}_{\text{defect}}$ by exponential functions in the tail region of ${P}_{\text{edge}}$. However, the distributions below ${10}^{-7}$ need to be examined using various candidate statistical functions for modeling them. Finally, we comment on the relation of the present method to the reported dependence of defect probability on tail CDs (e.g., defined as the CD corresponding to the $3\sigma $ limit).^{7} Assuming the exponential function ${P}_{\text{edge}}\propto \mathrm{exp}(b\cdot {x}_{\text{edge}})$ for ${x}_{\text{edge}}$ in the tail region, suppose that the distribution of ${P}_{\text{edge}}$ shifts by $-\delta x$ to ${P}_{\text{edge}}^{\prime}\propto \mathrm{exp}[b({x}_{\text{edge}}+\delta x)]$ due to, for example, a change in exposure dose. Then, ${P}_{\text{defect}}$ changes to ${P}_{\text{defect}}^{\prime}=\mathrm{exp}(b\cdot \delta x)\,{P}_{\text{defect}}$, that is, $\mathrm{log}({P}_{\text{defect}}^{\prime})=b\,\delta x+\mathrm{log}({P}_{\text{defect}})$, since the integrand of Eq. (2) is practically determined by the tail region. Thus, the defect probability changes exponentially with the tail CD, and the present model explains the tail CD dependence of the defect probability.
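The step from the shifted edge distribution to the scaled defect probability can be written out explicitly. The following is a sketch under two assumptions: the exponential tail form holds over the whole range where the integrand of Eq. (2) is significant, and $P2$ itself is unchanged by the shift. The prefactor $C$ is a symbol introduced here for illustration.

$${P}_{\text{defect}}^{\prime}=\int {P}_{\text{edge}}^{\prime}({x}_{\text{edge}})\,P2\,\mathrm{d}{x}_{\text{edge}}\approx \int C\,\mathrm{exp}[b({x}_{\text{edge}}+\delta x)]\,P2\,\mathrm{d}{x}_{\text{edge}}=\mathrm{exp}(b\,\delta x)\int C\,\mathrm{exp}(b\,{x}_{\text{edge}})\,P2\,\mathrm{d}{x}_{\text{edge}}\approx \mathrm{exp}(b\,\delta x)\,{P}_{\text{defect}}.$$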

In conclusion, applying the present method to multiple spots on a chip or on a wafer visualizes the risk distribution of stochastic defects. Direct full inspection is needed only for the extracted risky areas, and this is expected to reduce the area requiring such full inspection. Further, the verification results can be used for updating the model (function $P2$). In this study, we predicted stochastic defect probabilities from large-size LCDU data for a specific resist material/process. Note that any change in resist materials/processes can affect the stochastic defect probability through the function $P2$ as well as through the edge distributions (LCDU or LER).

## Acknowledgments

The authors acknowledge P. De Bisschop and IMEC for the sample preparation and for their support of this work.

## References

## Biography

**Hiroshi Fukuda** joined Hitachi Central Research Laboratory in 1985, where he has engaged in various fields of lithography as well as nanodevices, MEMS, and hard disk drives, including research activities at Stanford University and Hitachi Europe Ltd. He has been with Hitachi High-Technologies since 2012. He received his BS, MS, and PhD degrees from Tokyo Institute of Technology in 1983, 1985, and 1994, respectively. He has published more than 30 journal papers and more than 80 conference papers and holds over 20 patents.