Role of wafer geometry in wafer chucking

Abstract. Wafer chucks are used in advanced lithography systems to hold and flatten wafers during exposure. To minimize defocus and overlay errors, it is important that the chuck provide sufficient pressure to completely chuck the wafer and remove flatness variations across a broad range of spatial wavelengths. Analytical and finite element models of the clamping process are presented here to understand the range of wafer geometry features that can be fully chucked with different clamping pressures. The analytical model provides a simple relationship to determine the maximum feature amplitude that can be chucked as a function of spatial wavelength and chucking pressure. Three-dimensional finite element simulations are used to examine the chucking of wafers with various geometries, including cases with simulated and measured shapes. The analytical and finite element results both demonstrate that geometry variations with short spatial wavelengths (i.e., high-frequency wafer shape features) present the greatest challenge to achieving complete chucking. The models and results presented here can be used to provide guidance on wafer geometry and chuck designs for advanced exposure tools.


Introduction
Wafer chucks are used to clamp wafers in various processes during semiconductor device manufacturing. In particular, wafer chucks are critical components in lithography scanners because they are used not only to hold the wafer but also to flatten it, minimizing defocus problems that result from wafer geometry, such as bow and warp. More recently, electrostatic chucks have been used to clamp and flatten reticles in extreme ultraviolet (EUV) lithography scanners during exposure. Similarly, electrostatic chucks will be used for clamping wafers during EUV lithography processes. Traditionally, vacuum chucks that provide approximately 80 to 90 kPa of clamping pressure have been used in photolithography tools. These pressures are generally sufficient to flatten typical wafers with modest amounts of bow and warp and achieve complete chucking. If wafers are not chucked completely, overlay and defocus issues may arise in lithography processes [1][2][3]. While wafer chucking has not typically been considered a key challenge, recent and anticipated changes to lithography systems and processes are increasing its importance. These changes include: (1) the development of EUV lithography systems that use electrostatic chucks with lower clamping pressures, (2) a move to smaller feature sizes with tighter requirements on defocus and overlay that make complete chucking, down to the nanometer level, critical, and (3) a transition to larger diameter, 450-mm wafers that are thicker and thus stiffer and more difficult to chuck. As described in Ref. 4, which reports experiments that involved the chucking of wafers and masks, the clamping pressure that is typically generated by electrostatic chucks is significantly lower (5 to 20 kPa) than that produced by vacuum chucks (>80 kPa).
The development of EUV lithography systems has driven increased interest in characterization of different electrostatic chucking mechanisms and their capabilities, but previous reports have largely focused on mask chucking. [5][6][7] As a result, there is a critical need to better understand the mechanics of the wafer chucking process and the role of wafer geometry in chuck performance.
In this paper, we report an analytical model to establish a basic understanding of wafer chucking and a finite element analysis-based parametric study of chucking wafers with various realistic geometries. The analytical model of the chucking process establishes the minimum clamping pressure required to completely chuck a wafer as a function of geometry and elastic properties of the wafer. The computational model examines chucking of wafers with realistic geometries through 3-D finite element simulations. The geometries of the wafers in the computational study are either generated by combining shape features of multiple wavelengths or are based on KLA-Tencor patterned wafer geometry (PWG) measurements of product wafers. The simple analytical model and the wafer-level simulations both show that high-frequency (short spatial wavelength) features are most likely to lead to chucking problems.

Analytical Model of Chucking
Wafer shape is defined as the geometry variation of the medial plane of the wafer in a free state [8]; thus, the analytical model considers the chucking of a wafer with a constant thickness and a sinusoidal variation of the medial plane (Fig. 1). A simple wafer with a single wavelength and variation in only one dimension is considered in order to develop a model that provides fundamental insight into the role of wavelength, amplitude, thickness, and elastic properties in chucking. While real wafers have geometries that are substantially more complex, the essential scaling obtained from a 1-D analytical model provides insight into critical factors in chucking. As shown in Fig. 1, the wafer is assumed to be in contact with a rigid chuck and a uniform pressure is applied by the chucking load (e.g., vacuum or electrostatic). Before pressure is applied, the wafer shape results in a sinusoidal gap at the interface with maximum height A: [Eq. (1)] For complete chucking, the applied pressure must be sufficient to completely close this interface gap. To calculate the pressure required to achieve complete chucking, we consider two analytical models for different wafer geometry regimes: long wavelengths (λ ≫ h) and short wavelengths (λ ≪ h).
In the long wavelength regime (λ ≫ h), a beam theory-based model is employed. Specifically, a shear-corrected beam theory model [9], often referred to as Timoshenko beam theory, is employed. The governing equation for shear-corrected beam theory can be written as [Eq. (2)] where ϕ and w are functions of x that represent the slope due to bending and the transverse deflection, respectively. The constant k is a function of the thickness, elastic properties, and cross-sectional shape of the beam. For a rectangular cross-section, k is defined as [Eq. (3)] where E is the Young's modulus, G is the shear modulus, and ν is Poisson's ratio. For an isotropic material, G = E/[2(1 + ν)], and Eq. (3) reduces to [Eq. (4)] By setting w = s, substituting Eq. (1) into Eq. (2), solving Eq. (2) for ϕ, and applying two boundary conditions, we obtain [Eq. (5)] The resulting pressure distribution required to clamp the wafer can be obtained from this as [Eq. (6)] The required clamping pressure (the minimum pressure that can be applied to achieve complete chucking) is equal to the maximum pressure in Eq. (6) [i.e., P(x) at x = 0], which is [Eq. (7)] From this equation, it is evident that the required clamping pressure scales linearly with amplitude and increases with decreasing spatial wavelength. This expression can be simplified by assuming a specific value for Poisson's ratio. Here, we assume ν = 0.18, which is typical of silicon, and Eq. (7) reduces to [Eq. (8)] Rewriting this equation in terms of the maximum amplitude that can be chucked completely, we obtain [Eq. (9)]
In the short wavelength regime (λ ≪ h), we consider the deformation of an elastic half-space with a wavy surface with a profile defined by Eq. (1). The general solution for this problem is well known and can be found in Ref. 10. From the general solution in Ref. 10, the pressure required to chuck the surface flat is obtained as [Eq. (10)] Again, this can be simplified by assuming a typical value of Poisson's ratio (ν = 0.18), leading to [Eq. (11)] Rewriting this equation in terms of the maximum amplitude that can be chucked yields [Eq. (12)]
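As a rough illustration of the long-wavelength scaling only, the following sketch evaluates the classical Euler-Bernoulli (thin-beam) limit rather than the shear-corrected model described above. It assumes a gap profile s(x) = (A/2)[1 + cos(2πx/λ)] and a strip of unit width with flexural rigidity Eh³/12. The profile, the function names, and the resulting numerical coefficient are assumptions made for illustration and are not the coefficients of the equations in this section.

```python
import math

def bending_pressure(amplitude, wavelength, thickness, youngs_modulus):
    """Peak pressure (Pa) needed to flatten the ASSUMED cosine gap profile.

    Euler-Bernoulli strip of unit width, no shear correction:
    P(x) = D * s''''(x), evaluated at x = 0 where it is largest.
    All lengths in meters, modulus in Pa.
    """
    k = 2.0 * math.pi / wavelength                 # spatial angular frequency
    rigidity = youngs_modulus * thickness**3 / 12  # D per unit width (assumption)
    return rigidity * (amplitude / 2.0) * k**4

def max_amplitude_bending(pressure, wavelength, thickness, youngs_modulus):
    """Largest gap height A (m) that the given pressure flattens (same assumptions)."""
    k = 2.0 * math.pi / wavelength
    rigidity = youngs_modulus * thickness**3 / 12
    return 2.0 * pressure / (rigidity * k**4)

if __name__ == "__main__":
    # Wafer values quoted in the text: h = 775 um, E = 150 GPa, P = 80 kPa.
    for lam_mm in (3.0, 10.0, 30.0):
        a = max_amplitude_bending(80e3, lam_mm * 1e-3, 775e-6, 150e9)
        print(f"wavelength {lam_mm:5.1f} mm -> thin-beam estimate {a:.3e} m")
```

Under these assumptions the chuckable amplitude grows with the fourth power of the wavelength, so a tenfold reduction in wavelength reduces it by a factor of 10⁴. The thin-beam limit is not expected to hold as λ approaches h.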

2-D Finite Element Model of Chucking a Wavy Wafer
To validate the analytical models in Sec. 2 and better understand the transition between the short and long wavelength regimes, a finite element model of the chucking problem in Fig. 1 was developed. The model was developed and solved in the commercial finite element software ANSYS [11]. The wafer was assumed to have isotropic elastic properties (E = 150 GPa, ν = 0.18) and was meshed with 8-noded 2-D continuum elements. A layer of node-to-node contact elements was defined on the lower surface of the wafer. The number of elements in the mesh varied for different geometries and convergence studies were used to ensure sufficient mesh density. In general, there were at least 20 elements in the thickness direction and the aspect ratio of the elements was less than 10:1. A uniform pressure was applied on the top of the wafer. For each geometry (combination of amplitude and wavelength), the model was run multiple times at different pressures and the remaining gap between the wafer and chuck was determined by examining the displacements on the lower surface of the wafer. The relative remaining gap, R, was calculated at the point on the wafer surface where the gap is initially largest as R = (A − u_y)/A, where A is the initial height of the gap and u_y is the displacement at the node on the lower surface of the wafer where the gap is initially the largest. The wafer was considered to be fully chucked when R < 0.01. The "chucking pressure" is defined as the lowest pressure that achieves R < 0.01.
Fig. 1 Schematic of wafer geometry and loading considered in the analytical and 2-D finite element models. The wafer has uniform thickness, h, and the medial-plane of the wafer varies sinusoidally with position in one dimension. As a result of this wafer shape, an initial gap with height, A, and wavelength, λ, exists between the elastic wafer and rigid chuck. The wafer is loaded uniformly with applied pressure, P.
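The relative remaining gap and the chucking pressure defined in this section can be evaluated from a pressure sweep as in the minimal sketch below. The function names and the sample displacements are hypothetical and serve only to illustrate the R < 0.01 criterion; they are not results from the finite element model.

```python
def relative_gap(initial_gap, displacement):
    """R = (A - u_y) / A at the node where the gap is initially largest."""
    return (initial_gap - displacement) / initial_gap

def chucking_pressure(pressures, displacements, initial_gap, tol=0.01):
    """Lowest applied pressure for which R < tol, or None if no run qualifies.

    pressures and displacements are paired results from repeated runs of the
    same geometry at different applied pressures.
    """
    chucked = [p for p, u in zip(pressures, displacements)
               if relative_gap(initial_gap, u) < tol]
    return min(chucked) if chucked else None

if __name__ == "__main__":
    # Made-up sweep: initial gap of 1000 nm, closing displacement in nm.
    pressures_kpa = [10.0, 20.0, 40.0, 80.0]
    closure_nm = [400.0, 800.0, 995.0, 1000.0]
    print(chucking_pressure(pressures_kpa, closure_nm, 1000.0))
```

In this made-up sweep the 40 kPa run is the first to leave less than 1% of the initial gap, so it is reported as the chucking pressure.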

3-D Finite Element Analysis of Chucking Wafers with Arbitrary Geometries
A 3-D finite element model was developed to simulate the chucking of wafers with realistic geometries. The wafer was meshed with 8-node continuum solid elements and the geometry was defined either from imported wafer geometry measurements or through artificial wafer geometries created by superimposing wafer geometry features with varying spatial wavelengths and amplitudes. All wafers were assumed to have a uniform thickness of 775 μm and the elastic properties of silicon (E = 150 GPa, ν = 0.18). A rigid chuck surface was defined and node-to-node contact elements were defined at the interface between the wafer and chuck. The contact elements were configured to have a stiffness that is substantially higher than that of the wafer surface, such that the chuck was effectively rigid. The chucking pressure was uniformly applied across the wafer; applied pressures varied from 5 to 80 kPa. From the finite element solution, the gap remaining at the interface between the wafer and chuck was calculated by adding the calculated displacements to the initial coordinates of the bottom surface of the wafer.
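The post-processing step described above, in which nodal displacements are added to the initial bottom-surface coordinates, can be sketched as follows. The function name, the flat chuck plane at z = 0, and the clipping of small negative values (which a finite contact stiffness can produce) are assumptions made for illustration.

```python
def remaining_gap(initial_z_nm, displacement_z_nm, chuck_z_nm=0.0):
    """Gap (nm) at each bottom-surface node after chucking.

    initial_z_nm: free-state height of each node above the chuck plane.
    displacement_z_nm: computed out-of-plane displacement of each node.
    """
    gaps = []
    for z0, uz in zip(initial_z_nm, displacement_z_nm):
        gap = z0 + uz - chuck_z_nm
        # Assumption: treat slight contact penetration as zero gap.
        gaps.append(max(gap, 0.0))
    return gaps

if __name__ == "__main__":
    # Made-up nodal values in nanometers.
    gaps = remaining_gap([0.0, 2000.0, 5000.0], [0.0, -2000.0, -4000.0])
    print(gaps, "max gap:", max(gaps))
```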

Localized Wafer Shape Quantification
Localized wafer shape quantification is an emerging concept and has benefits in the monitoring and control of various semiconductor manufacturing processes. In this work, we have evaluated local curvatures of the initial wafer shape and compared them to the calculated gap remaining at the interface. The local x- and y-curvatures were computed at every pixel location on the wafer using the nearest neighboring points. The local curvatures in the x and y directions were combined to obtain curvature in the radial direction. To aggregate pixel-level curvature values over length scales that are meaningful for lithography and other applications, the full wafer curvature maps were divided into a grid of rectangular areas, or sites. The dimensions of the sites (site size) may be selected based on a specific process of interest. Local curvature metrics, such as the peak-to-valley value, mean value, and standard deviation, were then computed within each site (10 × 10 mm² square sites were chosen for illustration in this paper). This curvature-based local shape metric belongs to a family of local shape quantification metrics that have been developed recently at KLA-Tencor to monitor advanced semiconductor processes. Because the sites are user defined and process dependent, the metric is easily adapted to monitor and control other processes such as chemical mechanical polishing and rapid thermal processing. In addition to curvature, the high-frequency components of the wafer may also be characterized in the nanotopography (NT) regime [12]. The NT of a wafer surface (front and/or back) is derived by applying a double Gaussian filter to the surface, which filters out long wavelength components while retaining the short wavelength components of the wafer surface (λ ≤ 20 mm). Local-site NT, quantified as peak-to-valley, mean, and standard deviation values within user-defined rectangular sites, serves as an additional metric that may be effective at flagging local regions of the wafer that may sustain chucking problems.
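A minimal sketch of the pixel-level curvature calculation and the site-wise statistics is given below. It assumes a regularly gridded height map, approximates curvature by central second differences (valid for small slopes), and uses hypothetical function names; the radial combination of the x- and y-curvatures and the exact neighborhood used in this work are not reproduced.

```python
import statistics

def curvature_maps(z, dx, dy):
    """Second differences of a height map z[row][col] at interior pixels.

    Returns (kxx, kyy), each with one pixel trimmed from every edge.
    """
    n_rows, n_cols = len(z), len(z[0])
    kxx = [[(z[r][c + 1] - 2.0 * z[r][c] + z[r][c - 1]) / dx**2
            for c in range(1, n_cols - 1)]
           for r in range(1, n_rows - 1)]
    kyy = [[(z[r + 1][c] - 2.0 * z[r][c] + z[r - 1][c]) / dy**2
            for c in range(1, n_cols - 1)]
           for r in range(1, n_rows - 1)]
    return kxx, kyy

def site_metrics(values, site_px):
    """Peak-to-valley, mean, and standard deviation within square sites.

    site_px is the site edge length in pixels; partial sites at the map
    boundary are kept as they are.
    """
    n_rows, n_cols = len(values), len(values[0])
    sites = {}
    for r0 in range(0, n_rows, site_px):
        for c0 in range(0, n_cols, site_px):
            block = [values[r][c]
                     for r in range(r0, min(r0 + site_px, n_rows))
                     for c in range(c0, min(c0 + site_px, n_cols))]
            sites[(r0 // site_px, c0 // site_px)] = {
                "pv": max(block) - min(block),
                "mean": statistics.fmean(block),
                "std": statistics.pstdev(block),
            }
    return sites

if __name__ == "__main__":
    # Made-up parabolic test surface z = x^2 on a 6 x 6 grid with unit spacing.
    z = [[float(c * c) for c in range(6)] for _ in range(6)]
    kxx, kyy = curvature_maps(z, 1.0, 1.0)
    print(site_metrics(kxx, 2))
```

For a 10 × 10 mm² site, site_px would be chosen as the site edge length divided by the pixel pitch of the measurement.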
6 Results and Discussion
6.1 Chucking a Wafer with Waviness in 1-D
Figure 2 shows the maximum amplitude that can be chucked as a function of spatial wavelength for a 775-μm thick Si wafer that is being clamped with 80 kPa of pressure. Included in Fig. 2 are the results of the analytical models described in Sec. 2 as well as the results of the 2-D finite element model described in Sec. 3. The finite element and analytical models agree reasonably well over the range of spatial wavelengths examined and show that there are two regimes, long wavelength and short wavelength, in which the chuckable amplitude depends on wavelength in different ways. The difference in the slope between the two regimes is due to a change in deformation mode; at wavelengths less than about twice the wafer thickness, the gap is accommodated by bulk deformation of the wafer, while at larger wavelengths the deformation is bending dominated. Most wafer shape features have spatial wavelengths longer than 1 mm and thus will be primarily accommodated by bending deformation during chucking. Because the deformation is bending dominated, there is a strong dependence on wavelength and the maximum amplitude of the feature that can be chucked scales with the wavelength to the fourth power [see Fig. 2 and Eq. (9)]. As a result of this fourth-order dependence, small-wavelength (high-frequency) wafer geometry features are substantially more difficult to chuck. For example, the results in Fig. 2 show that for spatial wavelengths of 30 mm, a gap as large as 20 μm can be completely chucked, while at spatial wavelengths of 3 mm, the maximum gap that can be chucked is approximately 2.5 nm. Therefore, in order to identify wafers with potential chucking problems, it is essential that wafer metrology tools have the ability to characterize higher-order shape features.
To facilitate the use of the results in Fig. 2 for quick assessments of chuckability (the ability to chuck a wafer fully flat), an equation was fit to the finite element results shown. The form of the fit is based on the expressions derived in the analytical modeling in Sec. 2. The amplitude that can be chucked as a function of spatial wavelength, applied pressure, wafer thickness, and Young's modulus is [Eq. (13)] where Poisson's ratio is assumed to be 0.18. This fit covers the entire range of spatial wavelengths presented in Fig. 2 and describes both the bulk and bending deformation regimes. This expression can be used to approximately describe the magnitudes of shape features that are significant in chucking processes.

3-D Finite Element Simulations of Chucking Simulated Wafer Shapes
The 3-D finite element model was first used to assess the chucking of wafers with simple simulated shapes. Specifically, a 1-D sinusoidal variation was incorporated in the center of the wafer and the shape amplitude was tapered such that the wafer was nominally flat at the edge. This model allows for comparisons between the results of a simple strip with 1-D sinusoidal waviness discussed in Sec. 6.1 and the 3-D wafer model. Results are shown in Fig. 3 for a series of wafers with a fixed wavelength (λ = 20 mm) and pressure (P = 80 kPa) and varying amplitude. At the larger amplitudes of 10 and 5 μm, the gap remaining after chucking is significant, while the wafer chucks nearly completely when the amplitude is 2 μm. Equation (13), which describes the 1-D wavy results in Fig. 2, predicts a maximum chuckable amplitude of 4 μm; thus, the 3-D finite element results are generally consistent with the simple 2-D analysis. This suggests that the simple prediction of chucking amplitude from Eq. (13) can be used as a first-order estimate of whether or not a feature will chuck.
A second set of simulated wafer shapes is shown in Fig. 4. Both wafers in this set have a wafer-scale bow with an amplitude of tens of micrometers, as would be caused by a residually stressed film on one surface, as well as features at shorter spatial wavelengths that fall in the NT regime [13]. Specifically, both wafers in Fig. 4 have a wafer-scale bow with an amplitude of 40 μm, but have different NT: wafer (a) has a wavelength of 10 mm and an amplitude of 50 nm, while wafer (b) has a wavelength of 10 mm and an amplitude of 100 nm. Note that the NT features are not visible in the wafer shape maps (Fig. 4, left) as the amplitude of the wafer-scale variations is much larger than the NT. We see that wafer (a) chucks nearly completely (gap <1 nm), while wafer (b) fails to chuck in multiple areas. The height of the NT features in the two cases is 100 nm or less, suggesting that high-resolution wafer geometry measurements are needed to capture such features. Also, as the NT features are at a spatial wavelength of 10 mm, the measurement tool would need to have a spatial resolution of at least 5 mm (and preferably finer) to detect such wafer geometry features.
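The resolution requirement follows from the sampling (Nyquist) criterion: a sinusoidal feature must be sampled at least twice per wavelength to be detected. The sketch below, which uses hypothetical function names and an idealized cosine feature, shows that a 10 mm, 50 nm feature is invisible when sampled every 10 mm, can be captured at 5 mm spacing, and can still be missed at 5 mm spacing if the sample points fall on the zero crossings, which is why a finer spacing is preferable.

```python
import math

def sample_feature(wavelength_mm, amplitude_nm, spacing_mm, length_mm, phase=0.0):
    """Sample an idealized cosine feature along a line at a fixed spacing."""
    n = int(length_mm / spacing_mm) + 1
    return [amplitude_nm * math.cos(2.0 * math.pi * (i * spacing_mm) / wavelength_mm + phase)
            for i in range(n)]

def peak_to_valley(samples):
    """Peak-to-valley height of the sampled profile."""
    return max(samples) - min(samples)

if __name__ == "__main__":
    for spacing, phase in ((10.0, 0.0), (5.0, 0.0), (5.0, math.pi / 2), (1.0, math.pi / 2)):
        pv = peak_to_valley(sample_feature(10.0, 50.0, spacing, 50.0, phase))
        print(f"spacing {spacing:4.1f} mm, phase {phase:.2f} rad: measured PV = {pv:.3f} nm")
```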

3-D Finite Element Simulations of Chucking Real Wafers
Here, the 3-D finite element model was used to examine the chucking of example real wafers. The wafers, 300 mm in diameter, were measured on a KLA-Tencor PWG tool and were selected to represent several important cases. Wafer 1 has higher-order shape features near the edges. Wafer 2 is a typical wafer with a simple "bow" shape and moderate shape variations at shorter spatial wavelengths. Wafers 3 and 4 are wafers with local geometry features on the wafer back surface due to processing. This set of wafers represents several realistic cases of wafer geometry that could potentially impact chucking. The free shapes of wafers 1 and 2 are shown in Fig. 5. Note the steep change in wafer shape at the edges of wafer 1, which is visible in the contour plot and 2-D profile. To demonstrate the effect of clamping pressure, chucking of wafer 1 was simulated at several different pressures. The maximum gap between the wafer and chuck as a function of pressure is shown in Fig. 6. Chucking pressure clearly has a significant effect, with nearly a five-fold increase in the gap as the pressure is decreased from 80 to 5 kPa. It is significant that, for this wafer, complete chucking is not achieved even at 80 kPa, which is the pressure of a standard vacuum chuck. Furthermore, the large remaining gap at the interface at lower pressures (<15 kPa) suggests that this wafer would not be suitable for processing in EUV lithography systems that are expected to use electrostatic chucks with lower clamping pressures. In contrast, wafer 2 chucks completely at all pressures from 5 to 80 kPa. The fact that wafers 1 and 2 exhibit such different chucking behaviors even though the wafer-scale maps (Fig. 5) appear qualitatively similar demonstrates that simple inspection of wafer-scale shape is not suitable for making judgments on the chuckability of wafers. The same behavior is observed in the simulated wafer results shown in Fig. 4. The difference in chucking between wafers 1 and 2, as well as the chucking mechanics discussed earlier in this paper, suggests that a local metric is needed to assess chuckability.
The need to consider local wafer geometry metrics is further reinforced by realizing that incomplete chucking is usually limited to relatively small areas on a wafer. Figure 7 shows the peak-to-valley (PV) gap per site (10 × 10 mm² sites) for wafer 1 at clamping pressures of 5 and 80 kPa. At both pressures, nearly complete chucking is achieved over the center portion of the wafer, and the regions with chucking problems are located near the wafer edges. As expected, the nonchucked regions and magnitudes of the gaps are larger in the lower-pressure, 5 kPa case. We also note that the region with the largest remaining gap is located in the lower left corner of the wafer. As seen in Fig. 5, this region of the wafer also has a large local shape variation.
To examine a suitable local metric for chuckability, local chucking behavior was compared to various standard local wafer geometry metrics (e.g., site flatness). This analysis revealed that local curvature-based descriptors of the wafer geometry exhibit reasonable correlation with the locations of poor chucking. The correlation between local curvature (described in Sec. 5) and the remaining gap after chucking for wafer 1 at 80 kPa is shown in Fig. 8. This correlation was performed on a site-by-site basis where curvature and gap are calculated at each 10 × 10 mm² site on the wafer. A large number of sites across the wafer chuck completely; thus, sites with a remaining gap of less than 2 nm (shown in Fig. 8 in red) were excluded from the correlation calculation. For both the 5 and 80 kPa cases, there is a clear positive correlation between the remaining gap and site curvature. The correlation is not perfect, but it is clear that higher local curvature is associated with a larger remaining gap after chucking. The connection between curvature and poor chucking is expected based on the analysis in Sec. 2. For the ideal surface considered, the curvature is proportional to A/λ² [obtained by taking the second derivative of Eq. (1)]; thus, curvature increases with increasing amplitude and decreasing spatial wavelength. As seen in Eqs. (8) and (10), the pressure required to fully chuck a wafer scales with A/λ⁴ and A/λ in the long and short wavelength regimes, respectively. Thus, the pressure required to chuck would increase as curvature increases. While the curvature and required chucking pressure do not scale with amplitude and wavelength in the same manner, one must remember that the wafer geometry assumed in the analytical model is highly simplified, and changes to the assumed geometry will lead to different dependencies on amplitude and wavelength.
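The site-by-site comparison described above can be sketched as follows. The 2 nm exclusion threshold is taken from the text, whereas the use of the Pearson correlation coefficient, the function names, and the sample values are assumptions made for illustration, since the correlation statistic is not specified here.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def gap_curvature_correlation(site_curvature, site_gap_nm, min_gap_nm=2.0):
    """Correlate site curvature with remaining gap over non-chucked sites only.

    Sites with a remaining gap below min_gap_nm are treated as fully chucked
    and excluded. Returns (r, number of sites kept).
    """
    kept = [(c, g) for c, g in zip(site_curvature, site_gap_nm) if g >= min_gap_nm]
    if len(kept) < 2:
        raise ValueError("need at least two non-chucked sites")
    curvature, gap = zip(*kept)
    return pearson_r(curvature, gap), len(kept)

if __name__ == "__main__":
    # Made-up site values: curvature in arbitrary units, gap in nm.
    r, n_kept = gap_curvature_correlation([1.0, 2.0, 3.0, 4.0, 5.0],
                                          [0.5, 1.0, 4.0, 6.0, 8.0])
    print(f"r = {r:.3f} over {n_kept} sites")
```

Excluding the fully chucked sites prevents the many near-zero gaps in the wafer center from dominating the statistic.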
The final wafers examined, wafers 3 and 4, have local geometry features on the backside of the wafers that are a result of multiple wafer processing steps. The overall shapes of the wafers are shown in Fig. 9(a) and 9(d). Figure 9(b) and 9(e) show the local curvature for specific regions of the two wafers. The chucking of these wafers was simulated using 3-D finite element analysis, and the remaining gap for the same regions of the wafers shown in Fig. 9(b) and 9(e) is shown in Fig. 9(c) and 9(f). In general, reasonable correlation is observed between the geometry features seen in the curvature map and the gap remaining after chucking. Figure 9(g) and 9(h) show the correlation between curvature and remaining gap for wafers 3 and 4. In a previous study, it was experimentally verified that local backside features of magnitude 16 nm interact with the lithography scanner chuck to result in a defocus of 20 nm [14]. From the current study, it is clear that wafer backside features interact with the chuck to cause contact gaps that may manifest as defocus on the wafer frontside. These example wafers again demonstrate that short wavelength features that are not visible in wafer-scale shape maps are crucial in chucking processes.
Fig. 6 Maximum remaining gap as a function of clamping pressure for wafer 1. The gap decreases with increasing clamping pressure, but the wafer fails to chuck completely, even at 80 kPa.
Fig. 7 Maps of the peak-to-valley (PV) of remaining gap per site (10 × 10 mm² sites) for wafer 1 at 5 and 80 kPa. The wafer chucks nearly completely over the center of the wafer, but fails to chuck near the edges.
Fig. 8 [...] Fig. 7) and the curvature for the corresponding site at a clamping pressure of 80 kPa. Sites that have a gap less than 2 nm (shown in red in the plots) are not included in the correlation as they are considered to be fully chucked.
Turner, Ramkhalawon, and Sinha: Role of wafer geometry in wafer chucking

Conclusions
In this paper, we have established the essential mechanics of semiconductor wafer chucking. Analytical models that predict the maximum amplitude that can be chucked as a function of spatial wavelength and clamping pressure have been established and can be used to identify wafers that may pose problems in chucking for lithographic processes. Key findings from the analytical and finite element models presented are that the spatial wavelength of the features is crucial in determining chucking performance, and that features with shorter spatial wavelengths (<20 mm) are substantially more difficult to chuck and most likely to lead to incomplete chucking. Finite element simulations of a number of simulated and measured wafer geometries demonstrate the importance of short wavelength features. Such short wavelength features may be identified through filtering or calculation of local curvature-based metrics. The importance of short-wavelength geometry features means that high-resolution (in-plane and out-of-plane) wafer geometry measurements are crucial for identifying wafers with regions that may not chuck completely (a z-resolution of several nanometers and a lateral resolution of hundreds of microns are needed).