The feasibility of measuring overlay using small targets has been demonstrated in an earlier paper<sup>1</sup>. If the target is small ("smallness" being relative to the resolution of the imaging tool) then only the symmetry of its image changes with overlay offset. For our purposes the targets must be less than 5μm across, but ideally much smaller, so that they can be positioned within the active areas of real devices. These targets allow overlay variation to be tested in ways that are not possible using larger conventional target designs. In this paper we describe continued development of this technology.
In our previous experimental work the targets were limited to relatively large sizes (3×3μm) by the available process tools. In this paper we report experimental results from smaller targets (down to 1×1μm) fabricated using an e-beam writer.
We compare experimental results for the change of image asymmetry of these targets with overlay offset against model simulations. The image of a target depends on the film properties, and the target design should be optimized to provide the maximum variation of image symmetry with overlay offset. Implementation of this technology on product wafers will be simplified by using an image model to optimize the target design for specific process layers. Our results show the necessary good agreement between the experimental data and the model.
The determination of asymmetry from the images of targets as small as 1μm allows the measurement of overlay with total measurement uncertainty as low as 2nm.
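To illustrate the principle, the sketch below uses a toy one-dimensional model (an assumption for illustration, not the paper's actual imaging model) in which the target image is the sum of two Gaussian intensity peaks, one per layer; an overlay offset displaces the upper-layer peak, breaks the mirror symmetry of the combined profile, and shifts a simple asymmetry metric with the offset. The grid, peak widths and offsets are invented values:

```python
import numpy as np

# Toy 1-D model (illustrative assumption): the small-target image is two
# overlapping Gaussian intensity peaks, one per layer. An overlay offset
# displaces the upper-layer peak and breaks mirror symmetry.
x = np.linspace(-2.0, 2.0, 401)              # position across image, microns

def target_image(offset_um, width=0.6):
    lower = np.exp(-(x / width) ** 2)                  # lower-layer feature
    upper = np.exp(-((x - offset_um) / width) ** 2)    # displaced upper layer
    return lower + upper

def asymmetry(profile):
    # First moment of the antisymmetric part of the profile: zero for a
    # symmetric image, and (to first order) linear in a small offset.
    anti = (profile - profile[::-1]) / 2.0
    return float(np.sum(anti * x) / np.sum(profile))

signals = [asymmetry(target_image(d)) for d in (-0.02, 0.0, 0.02)]
# signals[1] is ~0 at zero offset; the sign of the metric follows the offset.
```

In a real system the mapping from asymmetry to overlay would be calibrated (or modeled) per layer stack, since film properties change the image.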
Currently, overlay measurements are characterized by a “recipe”, which defines both physical parameters, such as focus and illumination, and software parameters, such as the algorithm to be used and the regions of interest. Setting up these recipes requires both engineering time and wafer availability on an overlay tool, so reducing these requirements will result in higher tool productivity.
One of the significant challenges to automating this process is that the parameters are strongly correlated in complex ways. At the same time, a high level of traceability and transparency is required in the recipe creation process, so a technique that expresses its decisions in terms of well-defined physical parameters is desirable. Running time should be short, given that the system (automatic recipe creation) is being implemented to reduce overheads. Finally, a failure of the system to determine acceptable parameters should be obvious, so a certainty metric is also desirable. The complex, nonlinear interactions make solution by an expert system difficult at best, especially in the verification of the resulting decision network. The transparency requirements tend to preclude classical neural networks and similar techniques. Genetic algorithms and other global-minimization techniques require too much computational power (given the system footprint and cost requirements). A Bayesian network, however, satisfies these requirements. Such a network, with appropriate priors, can be used during recipe creation and optimization not just to select a good set of parameters, but also to guide the direction of the search, by evaluating the network state while only incomplete information is available. Because a Bayesian network maintains an estimate of the probability distribution of nodal values, a maximum-entropy approach can be used to obtain a working recipe in a minimum or near-minimum number of steps. In this paper we discuss the potential use of a Bayesian network in such a capacity, reducing the amount of engineering intervention required. We discuss the benefits of this approach, especially improved repeatability and traceability of the learning process, and quantification of the uncertainty in the decisions made.
We also consider the problems associated with this approach, especially the detailed construction of the network topology, validation of the Bayesian network and the recipes it generates, and issues arising from the integration of a Bayesian network with a complex multithreaded application; these issues primarily relate to maintaining the integrity of the Bayesian network and the system architecture.
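The maximum-entropy search idea can be illustrated with a deliberately small toy model: choose which recipe parameter to fix next by how much, in expectation, it reduces our uncertainty about whether the recipe will work. The two binary parameters A and B and all probabilities below are invented for illustration; a real network would model physical parameters such as focus and illumination with far richer structure:

```python
import math

# Assumed toy prior: a recipe "works" (W = 1) depending on two binary
# settings A and B. All numbers are illustrative, not from any tool.
prior = {
    # (A, B): P(W=1 | A, B)
    (0, 0): 0.10, (0, 1): 0.60,
    (1, 0): 0.55, (1, 1): 0.95,
}
p_setting = 0.5  # uniform prior over each binary setting

def entropy(p):
    """Shannon entropy (bits) of a Bernoulli(p) outcome."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def p_work(fixed):
    """P(W=1) after fixing a subset of settings, marginalising the rest."""
    total, count = 0.0, 0
    for (a, b), pw in prior.items():
        if fixed.get('A', a) != a or fixed.get('B', b) != b:
            continue
        total += pw
        count += 1
    return total / count

def expected_entropy_after(param):
    """Expected posterior entropy of W if we commit `param` next."""
    return sum(p_setting * entropy(p_work({param: v})) for v in (0, 1))

# Maximum-entropy-reduction choice: evaluate next the parameter whose
# setting most reduces uncertainty about whether the recipe works.
best = min(('A', 'B'), key=expected_entropy_after)
```

The same greedy criterion, applied at each step of recipe creation over the network's current (possibly incomplete) evidence, is what allows a working recipe to be reached in a near-minimal number of steps.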
Pattern matching has long been a cornerstone of industrial inspection. For example, in order to achieve high accuracy, modern overlay metrology tool optics are optimized to ensure symmetry around the central axis. For best performance the metrology target should be as close as possible to that axis, so a pattern recognition stage is usually used to verify the target position before measurement. Most of the work performed to date, however, has concentrated on situations where the imaging process can be described by simple ray tracing, with the image formed by albedo differences between surfaces rather than by interference. Current semiconductor technology requires optical identification of targets less than 30 microns (i.e. about 50 wavelengths) across and of order 1 wavelength deep, where this description is no longer valid; interference and focusing effects become dominant. In this paper we examine these effects and their impact on a number of different techniques. We compare image-based and CAD-derived models in the training of the pattern recognition system; CAD-derived models are of particular interest due to their use in “imageless” recipe creation techniques. Our chief metrics are precision and reliability. We show that for both types of pattern matching approach, submicron precision and high reliability are achievable even in very challenging optical environments. We show that models derived from design data, while generally inferior to image-based models, are more robust to changes caused by process variation, namely changes in illumination, contrast and focus.
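For reference, the target-location step can be sketched with a brute-force normalised cross-correlation search, the generic textbook formulation rather than any tool's actual algorithm; the image sizes and planted template below are synthetic assumptions:

```python
import numpy as np

def ncc_locate(image, template):
    """Best match position of `template` in `image` by normalised
    cross-correlation (brute force, for clarity rather than speed)."""
    th, tw = template.shape
    t = template - template.mean()
    tn = np.sqrt((t ** 2).sum())
    best, pos = -2.0, (0, 0)
    H, W = image.shape
    for i in range(H - th + 1):
        for j in range(W - tw + 1):
            window = image[i:i + th, j:j + tw]
            wz = window - window.mean()
            denom = np.sqrt((wz ** 2).sum()) * tn
            score = (wz * t).sum() / denom if denom > 0 else -1.0
            if score > best:
                best, pos = score, (i, j)
    return pos, best

# Synthetic demonstration: plant the template in a noisy image and
# recover its position.
rng = np.random.default_rng(1)
img = rng.normal(0.0, 0.05, (40, 40))
tpl = rng.normal(0.0, 1.0, (8, 8))
img[12:20, 25:33] += tpl
pos, score = ncc_locate(img, tpl)
```

Because the score is normalised, it doubles as a reliability metric: a low peak score can flag that the target was not found with sufficient confidence, which matters in the low-contrast, interference-dominated images discussed above.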
Determining the focal position of an overlay target with respect to the objective lens is an important prerequisite of overlay metrology. At best, an out-of-focus image will provide less than optimal information for metrology; the focal depth of a high-NA imaging system at the required magnification is of the order of 5 microns. In most cases poor focus will lead to poor measurement performance. In some cases, being out of focus will cause apparent contrast reversal and similar effects, because optical wavelengths (about half a micron) are being used; this can cause measurement failure with some algorithms. In the worst case, being out of focus can cause pattern recognition to fail completely, leading to a missed measurement.
Systems developed to date have taken one of two forms. In the first, a scan through focus is performed, and the optimal position is selected using a direct, image-based focus metric, such as the high-frequency component of a Fourier transform. This always gives an optimal or near-optimal focus position, even under wide process variation, but can be time consuming, requiring a relatively large number of images to be captured at each site visited. It also requires the optimal position to be included in the range of the scan; if the initial uncertainty is large, the focus scan needs to be longer, taking even more time.
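A direct, image-based focus metric of the kind described can be sketched as the fraction of spectral energy above a cutoff spatial frequency; a defocused image loses high-frequency content, so the metric peaks at best focus. The cutoff choice and the synthetic sharp/blurred test images below are assumptions for demonstration, not values from any system:

```python
import numpy as np

def focus_metric(image):
    """Fraction of spectral power above a radial cutoff frequency.
    Sharper (better-focused) images score higher."""
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    power = np.abs(spectrum) ** 2
    h, w = image.shape
    y, x = np.ogrid[:h, :w]
    r = np.hypot(y - h / 2, x - w / 2)   # radial spatial frequency index
    cutoff = min(h, w) / 8               # assumed cutoff; tune per optics
    return power[r > cutoff].sum() / power.sum()

# Synthetic demonstration: a sharp vertical edge versus a blurred copy.
sharp = np.zeros((64, 64))
sharp[:, 32:] = 1.0
blurred = sharp.copy()
for _ in range(8):                       # repeated 3-tap averaging = low-pass
    blurred = (blurred + np.roll(blurred, 1, 1) + np.roll(blurred, -1, 1)) / 3
scores = [focus_metric(img) for img in (sharp, blurred)]
```

Evaluating such a metric at every step of the focus scan is exactly why the approach is robust but slow: each candidate position costs an image capture plus a transform.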
The second approach is to monitor some property that has a known relationship to focus. This is typically calibrated against a scan through focus; on subsequent measurements the output of this secondary system is taken as the focus position. The secondary system may be completely separate from the imaging system; the only requirement is that the two are coupled. Such systems are generally fast: only one measurement per site is required, and they are typically designed so that only limited image and signal processing is needed. However, these techniques are less precise and less accurate than performing a scan through focus, and they are also susceptible to effects caused by variations of the wafer under test, e.g. variations in stack depth.
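The calibration step can be sketched as an ordinary least-squares line mapping the secondary signal to focus positions found by full scans; thereafter one fast sensor reading replaces a scan. The sensor readings and focus positions below are invented numbers for illustration:

```python
# Hypothetical calibration of a secondary focus signal against
# scan-through-focus results (all values are illustrative).
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = m*x + c."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

# Calibration pairs: (sensor reading, focus position found by a scan), microns.
sensor = [0.10, 0.35, 0.62, 0.88]
focus  = [-2.0, -0.5,  1.1,  2.6]
m, c = fit_line(sensor, focus)

def focus_from_sensor(reading):
    return m * reading + c   # one fast reading replaces a full scan
```

The weakness noted above follows directly from this structure: if stack depth or other wafer properties change the sensor's response, the fitted line no longer holds and the reported focus position drifts until recalibration.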
A fast, precise system for measuring focus position using the imaging optics has been developed. This new system achieves better accuracy than previous indirect techniques while being significantly faster than executing a scan through focus. Its output is linear with respect to focus position, and it has a very high dynamic range, providing a direct estimate of focal position even at large focus offsets. It also has an advantage over indirect systems in being an integral part of the imaging system, eliminating calibration drift over extended periods. In this paper we discuss the mathematical background, optical arrangement and imaging algorithms. We present initial performance results, including data on repeatability and the time taken to measure focus.