Presentation + Paper
22 February 2021
A quantified approach of dataset selection for training ML models on hard-to-classify patterns
Abstract
In the semiconductor fabrication process, yield is negatively impacted by defects that appear systematically within specific patterns of the physical layout design. These defective patterns are commonly known as hotspots, and they can arise from various causes. Several approaches to hotspot detection are known. One is machine learning (ML), in which known hotspot and non-hotspot patterns are used to train a model that is then used to predict new hotspots. The objective in ML approaches is to maximize the hit rate (i.e., find all potential hotspots) and to minimize the false alarm rate (i.e., reduce the overhead of false positives). The model's ability to distinguish hotspots from non-hotspots depends on the coverage of the training dataset. The real-world challenge in training an ML system to classify hotspots and non-hotspots is the imbalanced nature of the problem: the known hotspot patterns are always the minority class. Another challenge specific to hotspot classification is the difficulty of correctly classifying non-hotspots that are similar to hotspots. These "hard-to-classify" patterns are ones with a high mask error enhancement factor (MEEF), where small variations in the pattern can flip it between hotspot and non-hotspot. Together, these two challenges make conventional methods of handling imbalanced training datasets inadequate for hotspot detection. This paper presents a flow for quantified training dataset selection that puts extra focus on the patterns that are hard to classify because of their close similarity to known hotspots. The paper illustrates improved model accuracy when adopting the quantified sampling approach compared with conventional sampling approaches.
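The two objectives named in the abstract, and the idea of favoring non-hotspots that resemble hotspots, can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the abstract does not define the false alarm rate precisely (here it is taken as false positives divided by all true non-hotspots), nor does it describe how the paper quantifies similarity. The function names, the feature-vector representation of patterns, and the Euclidean nearest-hotspot distance are hypothetical stand-ins, not the authors' method.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def hit_and_false_alarm_rates(y_true, y_pred):
    """Return (hit_rate, false_alarm_rate) for binary labels (1 = hotspot).

    Assumed definitions:
      hit rate         = TP / (TP + FN)
      false alarm rate = FP / (FP + TN)
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    hit_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return hit_rate, false_alarm_rate


def select_hard_non_hotspots(hotspots, non_hotspots, k):
    """Return the k non-hotspot feature vectors closest to any known hotspot.

    Hypothetical stand-in for a similarity-aware selection step: patterns are
    assumed to be fixed-length numeric feature vectors, and "similar" is
    assumed to mean a small Euclidean distance to the nearest hotspot.
    """
    ranked = sorted(
        non_hotspots,
        key=lambda v: min(dist(v, h) for h in hotspots),
    )
    return ranked[:k]


# Toy usage with made-up labels and 2-D feature vectors.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
preds = [1, 1, 0, 1, 0, 0, 0, 0]
rates = hit_and_false_alarm_rates(labels, preds)

known_hotspots = [(0.0, 0.0)]
candidates = [(5.0, 5.0), (0.1, 0.0), (1.0, 1.0)]
hard_negatives = select_hard_non_hotspots(known_hotspots, candidates, k=2)
```

Ranking the majority class by distance to the minority class, rather than sampling it uniformly, is one simple way to put more of the training budget on borderline patterns; the paper's own quantification may differ.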
© (2021) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Mohamed Ismail, Mohamed Bahnas, Tiago Reimann, Ilhami Torunoglu, and Kareem Madkour "A quantified approach of dataset selection for training ML models on hard-to-classify patterns", Proc. SPIE 11614, Design-Process-Technology Co-optimization XV, 116140A (22 February 2021); https://doi.org/10.1117/12.2586265
PROCEEDINGS
8 PAGES + PRESENTATION
