Critical dimension scanning electron microscopes (CD-SEMs) are widely used for measuring the actual size, shape, and roughness of device patterns in semiconductor manufacturing. To obtain an image with a high signal-to-noise ratio (SNR), a higher electron beam current or the averaging of scanning frames (i.e., a longer observation time) is employed. Such methods, however, can increase sample charging effects.
In recent years, the application of deep learning to image-to-image translation problems has attracted attention. Isola et al. have proposed conditional generative adversarial networks (cGANs) as a general-purpose solution to such translation problems. The cGANs are shown in figure 1. The generative adversarial network (GAN), which forms the basis of the cGAN, consists of a generator G and a discriminator D. The generator G translates input images into plausible images. The discriminator D tries to distinguish real images from the images synthesized by the generator G, while the generator aims to produce images plausible enough to fool the discriminator. Whereas GANs learn a generative model of the data, cGANs learn a conditional generative model, which makes them suitable for image-to-image translation tasks. For the generator, a "U-Net"-based architecture is used. For the discriminator, a convolutional "PatchGAN" classifier is used, which only penalizes structure at the scale of image patches. Isola et al. have shown that this approach is effective on a wide range of problems.
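Following Isola et al., the adversarial objective of the cGAN can be written as follows, where x is the input image, y the target image, and z a noise vector; the full objective adds an L1 reconstruction term weighted by a hyperparameter λ:

```latex
\mathcal{L}_{cGAN}(G, D) =
  \mathbb{E}_{x,y}\!\left[\log D(x, y)\right]
  + \mathbb{E}_{x,z}\!\left[\log\left(1 - D\!\left(x, G(x, z)\right)\right)\right],
\qquad
G^{*} = \arg\min_{G}\max_{D}\; \mathcal{L}_{cGAN}(G, D) + \lambda\,\mathcal{L}_{L1}(G)
```

The L1 term encourages the output to stay close to the ground-truth high-quality image, while the adversarial term pushes it toward the distribution of realistic images.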
In this paper, we try to improve the quality (resolution and SNR) of a low-quality SEM image using the cGAN. In the training process, two kinds of images are used: low-quality images observed at low magnification with low SNR, and corresponding high-quality images at high magnification with high SNR. In the estimation process, an input low-quality SEM image is translated into a high-quality image. Here we report the simulation results. Figure 2 shows a part of the training dataset, which includes an isolated single line. The high-quality (upper row) and low-quality (lower row) SEM images consist of 512 x 512 and 256 x 256 pixels, respectively. In the training, 3600 image pairs (high and low quality) were used. Figure 3 shows an example of the results: (a) is the low-quality input image, (b) the output image produced by the cGAN, and (c) the corresponding high-quality image (not used in the training). It can be seen that the quality of the input image is improved to a level close to that of the high-quality image while the outline of the line edge roughness is maintained.
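Since the low- and high-quality images differ in pixel dimensions (256 x 256 vs. 512 x 512), the low-quality frame must be brought to the target size before it can be paired with its high-quality counterpart for training. The paper does not state how this is done; the sketch below shows one plausible preprocessing step, nearest-neighbor upsampling with NumPy (the function name and factor are illustrative assumptions, not the authors' method):

```python
import numpy as np

def upsample_nearest(img, factor=2):
    """Nearest-neighbor upsampling by an integer factor.

    Hypothetical preprocessing: repeats each pixel `factor` times
    along both axes, so a 256x256 low-quality frame matches the
    512x512 high-quality target size.
    """
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)

# Stand-in for one low-quality SEM frame (random noise here,
# purely to demonstrate the shape transformation).
low = np.random.rand(256, 256).astype(np.float32)
high_sized = upsample_nearest(low)
print(high_sized.shape)  # → (512, 512)
```

Alternatively, the generator itself could include learned upsampling layers that map 256 x 256 inputs to 512 x 512 outputs; either choice fixes the resolution mismatch before the paired training described above.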
P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros, arXiv preprint arXiv:1611.07004 (2016).