Projection lithography using extreme ultraviolet (EUV) light at the 13.5-nm wavelength is expected to achieve production of integrated circuits (ICs) below 7-nm design rules.1 In pursuit of further miniaturization of semiconductor integrated circuit devices by EUV lithography, stochastic pattern defect problems have arisen.2–4 Stochastic pattern defects are fatal patterning failures, such as bridging between neighboring pattern features or breakage of features, and their probability is extremely low (down to or even below). Because cutting-edge integrated circuit devices today have more than critical features per device layer on a 300-mm wafer, even such a low defect probability results in an unacceptable level of defect density.
While suppressing the stochastic defects themselves is indispensable for EUV lithography, monitoring and controlling these defects is another crucial issue.4–7 When applying EUV lithography to IC manufacturing, design rules and nominal mask/process conditions should be set so that the stochastic defect probability is within a tolerable range (e.g., ). However, since stochastic defect probability is very sensitive to resist feature size and to the mask and process conditions, small deviations from the nominal condition can cause catastrophic wafer failure3 (e.g., a change in exposure dose of a few percent can in some cases change the defect probability by an order of magnitude). Detecting changes in stochastic defect probability in this extremely low range will be necessary but is a challenge. For directly inspecting a huge number (e.g., ) of features to detect sub-10-nm defects, present electron-beam-based inspection tools require an unacceptably long inspection time,5 whereas the resolution capability of optical inspection tools is marginal.6 In contrast, it has been reported that conventional indices, such as critical dimension (CD) and line edge roughness (LER), correlate with defect probabilities, although these correlations are empirical and lack a theoretical basis.7 Here, we propose an approach to predict an extremely low probability of stochastic defects from local CD uniformity (LCDU) data or a CD histogram for a limited number of pattern features, typically several orders of magnitude smaller than the number of features to be inspected. We previously introduced a probabilistic model for stochastic defect generation based on two mechanisms, cascading shot noises and long-range scattered photoelectrons.8,9 In this paper, we apply this model to predict an extremely low probability of stochastic defect generation on real wafers.
Probabilistic Model of Pattern Defects
Before discussing the defect prediction, here, we briefly review our model.8,9 We start from generating numbers of physical/chemical events in a resist film, such as photon absorption, secondary electron generation, chemical reaction, and solubility flipping of resist polymer/molecule using coupled Monte-Carlo simulation, which combines simulations for optical imaging, photoelectron scattering, and chemical amplification with acid diffusion [Fig. 1(a)]. We divide the resist film by three-dimensional grids and count the number of reactions in each voxel produced by the grids. We assume that the solubility of a particular voxel flips if the number of reactions in that voxel exceeds a certain threshold, and further, count the number, , of solubility-flipped voxels through thickness, which represents the degree of solubility change in a particular spot of resist film. From the histogram of this number under the same exposure dose, we obtain the probability density functions (PDFs) for at location . Here, we focus on bridge-type defects in negative-tone resist processes. We define a local spot pattern and a local spot defect so that they are generated when the number of solubility-flipped polymer/molecule through the film thickness exceeds a certain threshold ( = main pattern or film defect). Thus, the probabilities of local spot pattern/defect per unit area (e.g., ) are expressed as8) is obtained as the probability that the spot film defects cover the area between the main pattern edge at and the point representing defect area as Figure 1(b) illustrates how we obtain from and . A periodic structure with 32-nm pitch is assumed with the center of exposed and unexposed area located at and 16 nm, respectively, and the mask edge at .
Equation (2) shows that the probability of defect generation between and depends on the horizontal location of edge . Although the actual edge location also varies in the depth direction along the resist sidewall, the variations of edge location in the vertical direction are usually smaller than those in the horizontal direction (so-called LER), and we ignore the former in the present model. The above explanation assumed the defect generation mechanism A in Ref. 8 for simplicity, but the form of Eq. (2) also holds for mechanism B in the same reference. Optimization of exposure and material parameters to minimize defect probability showed a clear trade-off between defect probabilities and delineated pattern feature sizes, as shown in Fig. 1(c), which is qualitatively consistent with the experimental observations in Ref. 3. The exponential relationships between defect probabilities and the exposure dose required for obtaining the designed size, observed among a variety of resist materials,4 are also explained by the model.9
Method of Defect Probability Estimation
Here, we apply the above-mentioned model for predicting defect probability on real wafers. In our method, the stochastic defect probability is expressed by the product sum of two probabilities and in Eq. (2). Our basic approach is to predict defect probability by evaluating and in Eq. (2), not by directly inspecting full-pattern features. Evaluating probability in the order of requires more than samples in general. Since both and are larger than by orders of magnitude, we expect the same order of measurement time reduction. Here, is a histogram of local edge position and directly measurable using SEM, and thus, we focus on how we evaluate .
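The product sum described above can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: the function name `defect_probability` and the argument names are hypothetical, and the sketch assumes that the measured edge-position (or CD) histogram and the conditional defect probability are tabulated on the same bins.

```python
import numpy as np

def defect_probability(hist_counts, p2_values):
    """Product sum of a normalized histogram and a conditional defect probability.

    hist_counts : raw counts of the measured edge-position (or CD) histogram.
    p2_values   : assumed probability, per bin, that a defect forms given an
                  edge (or CD) in that bin. Both names are hypothetical.
    """
    counts = np.asarray(hist_counts, dtype=float)
    p2 = np.asarray(p2_values, dtype=float)
    if counts.shape != p2.shape:
        raise ValueError("histogram and P2 must be tabulated on the same bins")
    # Normalize counts so that they form a probability distribution.
    p1 = counts / counts.sum()
    # Weighted sum over bins: each bin contributes (frequency) x (defect probability).
    return float(np.sum(p1 * p2))
```

For example, `defect_probability([10, 80, 10], [1e-2, 1e-4, 1e-6])` weights each bin's conditional probability by its relative frequency. The rare bins with a high conditional probability dominate the sum, which is why the tail of the histogram matters most.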
Let us suppose that defect probability increases due to some process variations, and we need to detect this change. According to the above model, these variations change the defect probability through and in the following three pathways. First, process variations change the locations of pattern edges and their distribution . Second, the change in changes the value of because is a function of . Third, process variations change the function itself because is determined from chemical reaction density as explained from Eqs. (1) and (3).
We examined the changes in and along each pathway using our above-described defect probability model. Figure 2 shows the profiles of (), (), and () for two exposure conditions, nominal and 20% overirradiation. Here, we assumed one of the exposure/material parameter sets optimized to minimize defect probability for 16-nm lines and spaces with 0.33 NA optics (see Ref. 8 for details). A 20% increase in exposure dose shifts the mean CD by 20% (corresponding to a 1.5-nm shift in edge position) and changes the histogram profiles [Fig. 2(a)]. While it also changes the profile of , this change is small compared to its exponential dependence on [Fig. 2(b)]. In contrast, a 20% increase in dose changes by 2 orders of magnitude at the same location [Fig. 2(c)]. This is because the linear change in is magnified by the exponential dependence of on . Consequently, defect probability depends exponentially on the above amount of exposure dose variation through the first and second pathways. If we assume, as an approximation, that the shape of function (its dependences on and ) is unchanged within the above range of exposure variations, we can calculate the value of from measured , and further as a product sum of and . Note, however, that is a function of imaging and resist material/process conditions in general, and the above assumption needs to be examined when these conditions are changed.
Practically, two approaches can be taken for determining . In the first, analytical approach, we directly calculate using the probabilistic defect model as explained in Fig. 2. This requires model calibration, as in any conventional lithography simulation. The other is an empirical approach, where we determine so as to satisfy Eq. (1) with observed and . In Sec. 4, we examine the feasibility of our method using the latter approach.
Experimental Results and Discussions
We predict the defect probabilities in the order between and from measurement data on real EUV-exposed wafers. Mask patterns containing a two-dimensional array of more than holes (24-nm diameter in 48-nm pitch) were exposed on a wafer (, ) with varying exposure dose to modulate defect probability. For each of the resist pattern groups exposed under 20 different exposure doses, the size of each hole pattern was measured by CD-SEM (Hitachi High-Technologies). The size of each feature was calculated from the area of the ellipse best fitted to the shape defined by a 50% threshold of signal intensity after applying a Gaussian filter to the SEM images. With a 1-nm pixel size, about 50 pixels on the edge contribute to the measurement, and the estimated error due to SEM noise is lower than 0.2 nm at the probe current () used in the experiment.10 We judge features below 9.5 nm as defects and calculate histograms of measured CD excluding these defects. CD histograms [1-nm bin, Fig. 3(a)] and defect probabilities [red diamonds in Fig. 4] were obtained for holes for the pattern groups #1 to #12 with relatively high () defect probability and for holes for the pattern groups #13 to #20 with relatively low () defect probability. The defect probabilities decrease exponentially from in group #1 to in group #19 as the average diameter of holes increases from 16.2 to 19.1 nm. Thus, a 3-nm decrease in feature size increases the defect probability by 4 orders of magnitude.
Here, we focus on the relationship between CD variations and pattern defect probabilities without discussing their root causes. In this experiment, we observed no definitive mask defect that prints on wafers regardless of exposure dose. Although some defects observed in this experiment may originate from the mask, their probabilities increase exponentially with decreasing exposure dose (or delineated hole size), as expected for other root causes, such as photon shot noise and stochastic variations in resist reactions discussed previously. We regard them equally as defects due to local variations in the amount of reactions, include them in the distribution, and apply the same function in Eq. (2) regardless of whether their locations are fixed on the mask or random.
Our strategy is to determine the probability function in Eq. (2) so that it best explains the observed defect probabilities and CD histograms for every exposure condition (pattern group). In real application environments, it is desirable to minimize the number of measurement points (and thus the time required for measurement) both in determining and in predicting for unknown samples. Here, however, we utilized all the data in groups #1 to #20 for determining .
As a rough approximation of our simulated profiles for [Fig. 2(b)], we assume that decreases exponentially with the distance from the edge of the main pattern and describe it in the form of . Here, we use the width of each feature () instead of to eliminate the influence of variation in pattern center positions. We calculate ( and ) so that best fits to for 19 groups (#1 to #19), and the obtained profile of is shown in Fig. 3(c). Although has no influence on calculated in where we judge features as defects (), is set to 1 for this region. From a statistical viewpoint, can be regarded as the extreme-value cumulative distribution function that expresses the distribution of the maximum distance over which defects continuously extend from the main pattern edge. Here, we leave open the relationship between our assumption for and the varieties of functions used in this area.
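The empirical fit described above can be sketched as a small grid search. The exact functional form and parameter names are not recoverable from the text here, so this sketch assumes a two-parameter exponential, `10**log10_a * exp(-width / lam)`, capped at 1 and set to 1 at or below the 9.5-nm defect threshold. The names `predict_defect_probability`, `fit_p2`, `log10_a`, and `lam` are all hypothetical, and the grid search stands in for whatever fitting procedure was actually used.

```python
import numpy as np

def predict_defect_probability(p1, widths, log10_a, lam, w_defect=9.5):
    """Product sum of a normalized CD histogram p1 with an assumed exponential P2."""
    widths = np.asarray(widths, dtype=float)
    # Assumed form: P2 decays exponentially with feature width, capped at 1.
    p2 = np.minimum(1.0, 10.0**log10_a * np.exp(-widths / lam))
    # Features at or below the defect threshold count as defects (P2 = 1).
    p2 = np.where(widths <= w_defect, 1.0, p2)
    return float(np.sum(np.asarray(p1, dtype=float) * p2))

def fit_p2(hists, widths, observed, log10_a_grid, lam_grid):
    """Grid search for the (log10_a, lam) pair that best matches observed
    defect probabilities over all pattern groups, in log10 space."""
    log_obs = np.log10(np.asarray(observed, dtype=float))
    best_err, best_a, best_lam = np.inf, None, None
    for a in log10_a_grid:
        for lam in lam_grid:
            pred = np.array([predict_defect_probability(h, widths, a, lam)
                             for h in hists])
            # Squared error on a log scale, since probabilities span decades.
            err = float(np.sum((np.log10(pred) - log_obs) ** 2))
            if err < best_err:
                best_err, best_a, best_lam = err, float(a), float(lam)
    return best_a, best_lam
```

Fitting in log space keeps the low-probability groups from being swamped by the high-probability ones. A finer grid or a gradient-based optimizer could replace the grid search without changing the idea.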
Next, we predicted the defect probabilities of groups #13 to #20 from CD measurement data in each group with the above obtained P2. To examine the repeatability of the method, we repeated random sampling of CDs from CDs 100 times. Since the defect probabilities for the above groups range between and , each sampled CD data set rarely contains defects (on average, one defect in 10 samplings for ). Predicted probabilities are shown by boxplots in Fig. 4, and they are in good agreement with the results of direct inspection of features (red diamonds).
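The repeatability test described above (repeated random subsampling followed by prediction) can be sketched as below. This is an illustration under stated assumptions: `prediction_spread` and `p2_func` are hypothetical names, `p2_func` is any vectorized function returning the conditional defect probability for a CD, and the spread is reported as the standard deviation of log10 of the prediction, which corresponds to the "digit" unit used for repeatability.

```python
import numpy as np

def prediction_spread(cds, p2_func, n_sub, n_repeat=100, seed=0):
    """Repeat random subsampling of measured CDs and summarize the predictions.

    Returns the mean and sample standard deviation of log10(predicted defect
    probability) over n_repeat subsamples of size n_sub.
    """
    rng = np.random.default_rng(seed)
    cds = np.asarray(cds, dtype=float)
    log_preds = np.empty(n_repeat)
    for i in range(n_repeat):
        # Draw n_sub CDs without replacement from the full measurement set.
        sample = rng.choice(cds, size=n_sub, replace=False)
        # Averaging P2 over the sample equals the product sum of the
        # normalized (unbinned) CD histogram with P2.
        log_preds[i] = np.log10(np.mean(p2_func(sample)))
    return float(log_preds.mean()), float(log_preds.std(ddof=1))
```

A call such as `prediction_spread(cds, p2_func, n_sub=1000)` returns the typical predicted order of magnitude and its scatter across subsamples. Sparse sampling of the histogram tail shows up directly as a larger scatter.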
For the probabilities above , the data used for prediction contain some defects, and the box plots are regarded as the results of regression rather than of prediction. Between and , the data used for prediction usually contain no defects, and the predicted results (box plots) are verified by the directly inspected results. Predicted results below cannot be verified because they are beyond the limit of direct measurement. These results show a reduction of 2 orders of magnitude in the time required for evaluating defect probability.
Predicted probabilities fitted into normal distribution are plotted for each of the seven groups in Fig. 4, and the prediction repeatability is in the range between 0.2 and 0.4 digit. Histograms of measured CDs are shown for three groups (#13, 16, and 19) by circles in Fig. 3(b) with those for measurement (solid lines). The frequencies of CDs in histograms begin to scatter in the tail regions, and this limits the precision of the prediction.
To examine the range of edge positions contributing to defect generation, the integrands in Eq. (2) [the product of Figs. 3(a) and 3(c)] are shown in Fig. 3(d) for the histograms of full-pattern measurement in every pattern group. Peaks of the integrands spread to the range below 10 nm. Although histograms should cover this range, this often requires an unacceptably large number of measurement points (and thus a long measurement time) in a real manufacturing environment with low stochastic defect probability. Next, we extrapolate the tail of the histogram to cover the desired range for such cases.
It has been reported that CD histograms often deviate from the normal distribution and show exponential or multiple Gaussian distributions in their tails,3,5,7 and their relation to image profiles has also been pointed out.11 This is also observed in our results [Fig. 3(a)]. Figure 5(a) shows histograms of measured CDs randomly sampled from CDs 100 times (blue circles), the histogram of for measurement (red lines), and its normal distribution fit (black dotted line). The observed distribution starts deviating from the normal distribution for lower than and decreases approximately exponentially with decreasing . Thus, we extrapolate the tail of the distribution for measured CDs using the exponential function.
To suppress the influence of data scattering near the tail of distribution, here, we reject the data at the smallest CD bin of histogram, calculate the slope (decay coefficient) by averaging the slope between the second and the third smallest CD bins and that between the second and fourth smallest CD bins, and connect the exponential function to the measured histogram at the second smallest CD bin. To examine the repeatability of the method, we repeated random sampling of CDs from CDs 100 times. Results of extrapolation are shown by black solid lines for the 100 samplings in Fig. 5(a). Predicted probabilities for groups #1 to #20 are shown by boxplots in Fig. 5(b), and they showed better agreement with the results of full-pattern inspection [red in Fig. 5(b)] than without using the extrapolation [Fig. 4]. The prediction repeatability is in the range between 0.2 and 0.3 digit.
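The tail extrapolation rule described above can be sketched as follows. This is a hedged illustration rather than the authors' code: the function name `extrapolate_tail` and its arguments are hypothetical, the bins are assumed to be uniformly spaced and sorted by increasing CD, and "smallest CD bins" is interpreted as the smallest populated (nonzero) bins.

```python
import numpy as np

def extrapolate_tail(bin_centers, freqs, cd_min):
    """Extend the small-CD tail of a histogram with an exponential function.

    The smallest populated bin is rejected. The log-slope is the average of
    the slope between the 2nd and 3rd and that between the 2nd and 4th
    smallest populated bins. The exponential is connected at the 2nd bin.
    """
    centers = np.asarray(bin_centers, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    populated = np.flatnonzero(freqs > 0)
    if populated.size < 4:
        raise ValueError("need at least four populated bins")
    i2, i3, i4 = populated[1], populated[2], populated[3]
    # Slopes of log-frequency versus CD (decay coefficient of the tail).
    s23 = (np.log(freqs[i3]) - np.log(freqs[i2])) / (centers[i3] - centers[i2])
    s24 = (np.log(freqs[i4]) - np.log(freqs[i2])) / (centers[i4] - centers[i2])
    slope = 0.5 * (s23 + s24)
    # New bins from cd_min up to (but excluding) the connection bin.
    step = centers[1] - centers[0]
    n_new = int(round((centers[i2] - cd_min) / step))
    tail_centers = centers[i2] - step * np.arange(n_new, 0, -1)
    tail_freqs = freqs[i2] * np.exp(slope * (tail_centers - centers[i2]))
    # Measured data are kept from the connection bin upward.
    return (np.concatenate([tail_centers, centers[i2:]]),
            np.concatenate([tail_freqs, freqs[i2:]]))
```

The returned histogram can then be normalized and used in the product sum with the fitted probability function. Averaging two slope estimates is a simple way to damp the scatter of the sparsely populated tail bins.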
Within the range of this study, it is reasonable to approximate , , and by exponential functions in the tail region of . However, the distributions below need to be examined with various possibilities for statistical functions for modeling them. Finally, we comment on the relation of the present method to the reported defect probability dependence on tail CDs (e.g., defined as CD corresponding to limit).7 Assuming the exponential function for in the tail region, suppose that the distribution of shifts by to due to change in exposure dosage for example. Then, changes to since the integrand of Eq. (2) is practically determined by the tail region. Thus, defect probability changes exponentially with the tail CD, and the present model explains the tail CD dependence of the defect probability.
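The tail-CD argument above can be written out as a short derivation. The symbols are assumptions introduced only for this sketch, since the original notation is not recoverable here: $w$ is the feature CD, $f(w)$ the normalized CD histogram, $P_2(w)$ the conditional defect probability, $a > 0$ the decay coefficient of the histogram tail, and $\Delta$ the shift of the distribution toward larger CD.

```latex
\begin{align}
  P_{\mathrm{defect}} &= \sum_{w} f(w)\, P_2(w), \\
  f(w) &\approx f_0\, e^{\,a (w - w_0)} \quad \text{(tail region, } a > 0\text{)}, \\
  f'(w) &= f(w - \Delta) \approx e^{-a\Delta} f(w), \\
  P'_{\mathrm{defect}} &= \sum_{w} f'(w)\, P_2(w) \approx e^{-a\Delta}\, P_{\mathrm{defect}}.
\end{align}
```

The last step uses the observation that the sum is practically determined by the tail region, where the exponential approximation holds. Since a tail CD defined by a fixed cumulative fraction shifts by the same $\Delta$, $\ln P_{\mathrm{defect}}$ decreases linearly with the tail CD with slope $-a$ under these assumptions.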
In conclusion, applying the present method to plural spots on a chip or on a wafer visualizes the risk distribution of stochastic defects. Direct full inspection is needed only for the extracted risky area, and this is expected to reduce the required area of such a full inspection. Further, the verification results can be used for updating the model (function ). In this study, we predict stochastic defect probabilities from large-size LCDU data for a specific resist material/process. Note that any change in resist materials/processes can affect stochastic defect probability through the function as well as through the edge distributions (LCDU or LER).
The authors acknowledge P. De Bisschop and IMEC for the sample preparation and for their support of this work.
Hiroshi Fukuda joined Hitachi Central Research Laboratory in 1985, where he has engaged in various fields of lithography as well as nanodevices, MEMS, and hard disk drives, including research activities at Stanford University and Hitachi Europe Ltd. He has been with Hitachi High-Technologies since 2012. He received his BS, MS, and PhD degrees from Tokyo Institute of Technology in 1983, 1985, and 1994, respectively. He has published more than 30/80 journal/conference papers and holds over 20 patents.