Nanophotonic components manipulate the light-matter interaction at the wavelength scale to perform complex optical functions. The scope of photonics continues to expand from the stronghold of optical communications to a wide range of other applications in sensing and life science, metrology, astrophotonics and quantum computing. However, photonic component design has largely remained a time-consuming process that relies on theoretical knowledge and designer intuition. Historically, an initial design is first proposed based on known physical effects, drawing from basic building blocks, and the device performance is then evaluated using computationally expensive electromagnetic simulations. Design optimization is carried out through sweeps of key design parameters, with the choice of the sweep structure and the variable ranges guided by knowledge and intuition. This approach is limited to structures governed by only a few parameters, for which the evaluation process can be decomposed into several weakly interdependent or nested parameter sweeps carried out sequentially. As the structures of nanophotonic devices become more complex, the governing parameters are often strongly interdependent and the number of variables is large. Such is the case for devices employing metamaterials or geometries generated by inverse design [4,5]. In these scenarios, sequential optimization is no longer possible and simultaneous optimization of multiple variables is required.
The complexity of photonics design optimization problems has been increasing for some time. Optimization tools such as genetic algorithms and particle swarm optimization are increasingly used to search for high-performance designs by varying many design parameters simultaneously. More recently, artificial neural networks have been explored to speed up the search process by circumventing the computationally expensive simulation steps. So far, most of these efforts on handling high-dimensional design spaces focus on finding a single optimized design with regard to a pre-selected performance target as the optimization objective. The outcome gives little insight into the relative influence and interaction of the design variables in determining device performance; the process can be regarded as a 'black box'. While there can be many designs that meet the same primary performance criteria, they are treated as isolated solutions. A global perspective on the design of any given photonic device is missing.
In this paper, we first briefly review the current state of the art in the different approaches for tackling high-dimensional design problems in nanophotonics. We then present our work applying the powerful pattern-recognition capability of machine learning to facilitate the identification and visualization of patterns representing the interplay of all the design parameters, allowing the photonic designer to understand and balance the various competing concerns that are common in practical implementations. We demonstrate that this knowledge-assisted optimization approach not only vastly reduces the computation cost by limiting the investigation to a small subspace containing good designs; its outcome also inspires new design ideas.
‘BLACK-BOX’ OPTIMIZATION METHODS
Research on optimization of high-dimensional design problems is wide-ranging, covering nearly every science and engineering discipline. Here we attempt to categorize the main approaches currently used in nanophotonic design (Fig. 1). We emphasize that this is neither exhaustive coverage nor a reflection of the chronological development. Rather, it is a sparse sampling of recent work, intended to reflect the state of the art.
Photonic design processes can be divided into forward design and inverse design categories. Traditionally, photonic device design is carried out in the forward direction. A device structure is inspired by known physical effects and a library of basic building blocks that have the potential to meet the desired functionality and performance target. When a device is controlled by a small set of parameters, exhaustive evaluation by sweeping across all the variables is feasible. When the number of variables becomes large, such approaches become prohibitive if the evaluation relies on computationally expensive photonic simulations using e.g. FEM or FDTD methods, or if the relationships between variables become too complex for human interpretation. Global optimization methods become useful tools to tackle such problems and generate design candidates optimized with regard to a selected performance target (often called an objective function). Stochastic optimization methods such as simulated annealing are one such tool set. More commonly used in the nanophotonic field are genetic algorithms and particle swarm optimization, gradient-free methods inspired by natural phenomena [7,8]. In these cases, the device evaluation is generally based on numerical simulations which are computationally expensive. The optimization tools can significantly reduce the number of simulation runs compared to brute-force sweeps. Following the impressive success of deep learning in tasks such as automated language translation and image recognition, artificial neural networks (ANNs) have received surging interest for evaluating photonic structures efficiently [9,10]. An ANN builds a complex surrogate model of the original structure from a (large) set of training data; subsequent device evaluations then become a fast query process. An ANN is not an optimization method per se.
In fact, an ANN can be employed in conjunction with any optimization method, with the benefit of circumventing time-consuming numerical simulations. We consider it a separate category as it is distinctly different from the optimization methods themselves.
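To illustrate how such gradient-free optimizers operate, the following is a minimal particle swarm sketch. The `coupling_efficiency` function is a hypothetical analytic stand-in for the expensive electromagnetic evaluation (in practice an FDTD or FEM call, or an ANN surrogate), and all parameter values are illustrative assumptions, not the settings used in the works cited.

```python
import numpy as np

rng = np.random.default_rng(0)

def coupling_efficiency(lengths):
    """Placeholder objective standing in for an electromagnetic simulation.
    Hypothetical analytic form; the real evaluation would call a solver."""
    target = np.array([0.30, 0.10, 0.25, 0.08, 0.27])  # assumed 'good' segment lengths (um)
    return float(np.exp(-np.sum((lengths - target) ** 2) / 0.05))

def particle_swarm(objective, dim=5, n_particles=20, n_iter=60,
                   lo=0.05, hi=0.40, w=0.7, c1=1.5, c2=1.5):
    # Initialize particle positions (candidate segment lengths) and velocities.
    x = rng.uniform(lo, hi, size=(n_particles, dim))
    v = np.zeros_like(x)
    pbest = x.copy()                                   # each particle's best position
    pbest_val = np.array([objective(p) for p in x])
    g = pbest[np.argmax(pbest_val)].copy()             # global best position
    for _ in range(n_iter):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        # Velocity update: inertia + attraction to personal and global bests.
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)                     # keep lengths within bounds
        vals = np.array([objective(p) for p in x])
        improved = vals > pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        g = pbest[np.argmax(pbest_val)].copy()
    return g, float(objective(g))

best, best_ce = particle_swarm(coupling_efficiency)
```

Note that the number of objective evaluations here is n_particles x n_iter, far fewer than a brute-force sweep over a 5D grid at comparable resolution.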
Inverse design, in other words 'design-by-specification', is a powerful approach when the desired outcome lacks an intuition-based structural starting point. There are two representative approaches. The first is gradient-based optimization, applied to a structure in which the material permittivity is initially allowed to vary continuously within a range [11–14]. The other makes use of discrete variables: a structure is divided into pixels on a regular grid, and the optimal structure is explored through combinatorial optimization methods [14,15], many of which are similarly nature-inspired (including genetic algorithms and particle swarm optimization). As in forward design, artificial neural networks have also been shown to be useful for inverse design.
All the optimization approaches discussed above can be considered 'black-box' methods in the sense that the mathematical relations between the inputs and the outputs are unknown to the user, whether such relations are based on first principles or empirical formulations. Furthermore, single or isolated designs are obtained at the end of the optimization process. It is often contended that such optimization methods require little user intervention or domain knowledge, and therefore have the advantage of being generally applicable. On the other hand, these algorithms have several drawbacks. First, the number of simulation runs remains very large. Second, the simulation/optimization process must be completely repeated whenever even a small change to the objective is required. Third, they offer little physical insight into the influence and inter-dependence of the design parameters in determining the device performance. Although it is recognized that there may be many designs offering similar performance in terms of the primary criterion (i.e. degenerate designs), their relationship has not, to the best of our knowledge, been investigated.
Recently, we have taken the first steps to incorporate machine learning (ML) methods including supervised learning, dimensionality reduction techniques, and global optimization, into the photonic component design process [17–19]. The ultimate objective is to create a methodology for identifying the subspaces in parameter space that encompass all good designs (with respect to a performance objective), and then building a complete and validated global map of subspaces, using readily available resources and within a reasonable amount of time. These subspaces can be represented by simple vectors with reduced dimensionality, and they are generally a small fraction of the entire design space. Consequently, further study can be efficiently carried out by focusing on only these lower-dimension subspaces. All good designs can be fully characterized with regard to many performance metrics. Careful balancing of different performance considerations – an indispensable task prior to fabrication, integration and system design – becomes possible.
Realization of next-generation nanophotonic devices with potentially new physics enabled through light-matter interaction at the nanoscale requires significant knowledge about the role of different design parameters in the functionality of a nanostructure, and potential limitations of a particular proposed structure. We show that results and visualization of the subspace mapping can give intuitive understanding in this regard, and inspire new designs.
GLOBAL MAPPING OF THE DESIGN SPACE
Our strategy for mapping out a high-dimensional design space is shown in Fig. 2. Details of the strategy, implementation and validation have been reported elsewhere [17–19]. Here we only describe the main steps of the process, and then focus on highlighting the impact such a mapping strategy brings to the fore. A vertical grating coupler is taken as the study case. Vertical grating couplers are well suited for high-density interfaces between integrated photonic circuits and fibers or lasers. They are, however, more challenging to design, since the desired upward diffraction must be optimized while simultaneously suppressing the second-order diffraction that couples back into the waveguide. Recently, Watanabe et al. proposed and demonstrated a coupler in silicon-on-insulator with five segments per period, as shown in Fig. 3. An optimized design, obtained using particle swarm optimization, provides good fiber-chip coupling efficiency and a fairly low level of back-reflection. The modest complexity of this structure serves as a good starting point for developing our strategy.
Our exploration strategy can be divided into three steps (Fig. 2). First, global optimization is used to search for designs, with maximization of the vertical coupling efficiency (CE) as the objective. A set of designs with coupling efficiency surpassing a selected target (74% in this case) are considered good designs and retained. We have used an in-house optimization algorithm for this step, but other global optimization tools, including genetic algorithms and particle swarm optimization, can also be used. For the second and key step, machine learning pattern recognition tools are used to find the relationship between these good designs. The goal of this step is to reduce the dimensionality of the design space. Dimensionality reduction transforms a set of correlated variables into a smaller set of new uncorrelated variables that retain most of the original information. This reduces both the range of variables in the evaluation domain and the number of variables in the optimization domain, so that the evaluation can be focused on the more attractive region, vastly reducing the computational load. In this study, we used principal component analysis (PCA), an unsupervised machine learning pattern recognition technique that has been used widely and successfully across various engineering and science disciplines and is implemented in most scientific computing platforms (e.g. Matlab, R). It finds a sequence of best linear approximations to the dataset (minimizing the least-squared error), and the results explicitly show how many orthogonal linear projections are needed to represent the dataset within a certain level of accuracy. If a lower-dimensional subspace is found and validated to contain all good designs, the rest of the design space can be excluded from further investigation.
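The dimensionality reduction step can be sketched as follows. The 'good designs' here are synthetic stand-ins generated, by construction, near a 2D plane in the 5D space of segment lengths; the basis vectors, centroid and noise level are illustrative assumptions, not values from the grating study.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for the set of 'good designs': 5 segment lengths per
# design, assumed (for illustration) to lie near a 2D plane in the 5D space.
n_designs = 200
alpha = rng.uniform(-1, 1, n_designs)
beta = rng.uniform(-1, 1, n_designs)
V1 = np.array([0.5, -0.2, 0.1, 0.3, -0.4])           # hypothetical basis vector
V2 = np.array([0.1, 0.4, -0.3, 0.2, 0.2])            # hypothetical basis vector
centroid = np.array([0.30, 0.10, 0.25, 0.08, 0.27])  # hypothetical centroid (um)
designs = centroid + np.outer(alpha, V1) + np.outer(beta, V2)
designs += 0.002 * rng.standard_normal(designs.shape)  # small scatter off the plane

# Principal component analysis: center the data, then take the SVD.
X = designs - designs.mean(axis=0)
_, s, Vt = np.linalg.svd(X, full_matrices=False)  # rows of Vt are principal axes
explained = s**2 / np.sum(s**2)                    # fraction of variance per component

# How many components are needed to retain, say, 99% of the variance?
n_keep = int(np.searchsorted(np.cumsum(explained), 0.99) + 1)
```

For data that truly lie near a 2D subspace, `n_keep` comes out as 2, and the first two rows of `Vt` span the plane of good designs.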
We complete the investigation by performing an exhaustive evaluation for all the designs included in the lower-dimensional sub-space to elucidate the interplay between different structural and performance parameters, including the coupling efficiency, back-reflections, minimum feature size and fabrication tolerance.
The original design space is defined by the 5 segment lengths (L1, L2, …, L5) shown in Fig. 3. Through the PCA analysis, we found a compact representation of the subspace of all good designs, characterized by two principal components V1 and V2. Each good design candidate is now represented by a pair of coefficients (α, β) as Lk = α·V1,k + β·V2,k + Ck, where Ck is the k-th component of the centroid of the good-design set. Such a reduction in the number of dimensions makes it feasible to adopt a more classical design approach and perform an exhaustive exploration of the reduced parameter space. Fig. 4 summarizes some performance parameters that can be easily obtained once the 2D subspace of good designs is identified.
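Once the two principal components and the centroid are known, any point (α, β) on the hyperplane maps back to a full set of segment lengths without any further simulation, which is what makes the exhaustive exploration cheap. A minimal sketch, with hypothetical values for V1, V2 and C:

```python
import numpy as np

# Hypothetical principal components and centroid (units: micrometres);
# the real values come from the PCA of the good-design set.
V1 = np.array([0.5, -0.2, 0.1, 0.3, -0.4])
V2 = np.array([0.1, 0.4, -0.3, 0.2, 0.2])
C = np.array([0.30, 0.10, 0.25, 0.08, 0.27])

def lengths(alpha, beta):
    """Map hyperplane coordinates (alpha, beta) back to the five segment
    lengths via L_k = alpha * V1_k + beta * V2_k + C_k."""
    return alpha * V1 + beta * V2 + C

# Exhaustive exploration: enumerate a regular (alpha, beta) grid.
grid = [(a, b) for a in np.linspace(-0.2, 0.2, 41) for b in np.linspace(-0.2, 0.2, 41)]
candidates = np.array([lengths(a, b) for a, b in grid])
```

Each row of `candidates` is a complete 5-segment design that can then be simulated or post-processed.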
Below we discuss each aspect in more detail. The maps of the coupling efficiency and back-reflection are shown in Fig. 5 as a function of the principal component coefficients, for all designs with a coupling efficiency larger than 0.7. Each division of the axis represents 100 nm in Manhattan distance. There is clearly a wide range of structures that all offer high coupling efficiency. On the other hand, the attainable back-reflection can be quite different, ranging from -15 dB to -44 dB at a wavelength of 1550 nm. Table I lists the structural and performance parameters for three selected designs marked in Fig. 5.
Table I. Structural and performance parameters of the three grating designs marked in Fig. 5 on the hyperplane. These three designs offer similar coupling efficiencies (higher than 74%), but their achievable back-reflections are quite different.
| Design | L1 | L2 | L3 | L4 | L5 | Vertical coupling efficiency | Back reflection [dB] |
With even the most advanced fabrication technologies today, fidelity in pattern transfer is still a challenge and dimensional variation is inevitable. It is of particular interest to investigate the robustness of design candidates against such uncertainties, which often strongly affect the device performance. However, fully evaluating the statistical behavior of a device requires a large number of computations, with Monte Carlo simulation as the standard 'brute-force' method. Stochastic techniques such as polynomial chaos expansion have emerged as efficient alternatives for assessing performance robustness and expected fabrication yield [19,21,22]. Even with these advances, carrying out robustness assessment during the device optimization process requires prohibitive computation resources and is also wasteful, since most evaluations would be carried out on structures that do not meet the primary objective. Our approach makes such analysis feasible: once the subspace of good designs in terms of the primary objective is identified, analysis of the fabrication robustness can be focused on these structures. Fig. 6 shows a few examples of such analysis, with the criteria set as coupling efficiency larger than 0.7 and back-reflection less than -25 dB. Details of the method and analysis have been reported in ref. . As can be observed from Fig. 6, all three designs offer higher than 50% yield in terms of the coupling efficiency. For some applications, such as coupling the photonic circuit to a laser, low back-reflection is a paramount requirement. In this case, the yield for achieving a low back-reflection of -25 dB is very limited. This finding indicates that the periodic grating design investigated here is insufficient for such applications, and design modifications such as apodization are necessary.
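A brute-force Monte Carlo yield estimate of the kind described above can be sketched as follows. The `coupling_efficiency` function is again a hypothetical analytic stand-in for the electromagnetic evaluation (or a surrogate model of it), and the 5 nm dimensional error is an assumed value, not the process statistics of the cited work.

```python
import numpy as np

rng = np.random.default_rng(2)

def coupling_efficiency(lengths):
    """Placeholder for the electromagnetic evaluation. Hypothetical analytic
    form; in practice this is a simulation or a surrogate model."""
    nominal = np.array([0.30, 0.10, 0.25, 0.08, 0.27])
    return float(0.78 * np.exp(-np.sum((lengths - nominal) ** 2) / 0.002))

def fabrication_yield(design, sigma=0.005, n_samples=2000, ce_min=0.70):
    """Monte Carlo yield: perturb every segment length with Gaussian
    dimensional error (standard deviation sigma, in micrometres) and count
    the fraction of samples still meeting the coupling-efficiency criterion."""
    noise = sigma * rng.standard_normal((n_samples, design.size))
    ces = np.array([coupling_efficiency(d) for d in design + noise])
    return float(np.mean(ces >= ce_min))

nominal_design = np.array([0.30, 0.10, 0.25, 0.08, 0.27])
y = fabrication_yield(nominal_design)
```

Because the subspace mapping restricts this expensive analysis to designs already known to be good, the Monte Carlo budget is spent only where it matters.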
Another important aspect for design implementation is the minimum feature size. To have the potential for volume production, designs should be reproducible by deep-UV photolithography, for which 100 nm is considered the lower limit. Even though immersion deep-UV lithography can handle smaller features, it is beyond the reach of most research and manufacturing facilities. Pattern transfer for features smaller than 100 nm is generally carried out using electron-beam lithography, a sequential and therefore time-consuming process. From Table I, we observe that the minimum feature sizes for the selected gratings are approximately 80 nm. Is it possible to find designs that break this limit while keeping acceptable performance? This information is easily obtained as a query process on the hyperplane, as all design dimensions can be retrieved from the relation Lk = α·V1,k + β·V2,k + Ck. This process does not require further photonic simulations. The outcome is shown in Fig. 7(a). We observe that the largest attainable minimum feature size is below 88 nm. Investigating the distribution of the minimum feature size, together with a machine-learning-assisted search, we discovered a new class of designs illustrated in Fig. 7(b). By incorporating a subwavelength grating (SWG) metamaterial to reduce the effective refractive index of the medium, we are able to simplify the design to only 4 segments. It is further possible to obtain grating structures with a minimum feature size larger than 100 nm in both the propagation and transverse directions, while maintaining a similar level of performance. Further details of these designs will be reported elsewhere.
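The minimum-feature-size query is a pure lookup on the hyperplane: for each (α, β), compute the smallest segment length from the linear relation and record it; no photonic simulation is involved. A sketch with hypothetical values for V1, V2 and C:

```python
import numpy as np

# Hypothetical PCA basis and centroid (micrometres); the real values
# come from the subspace-mapping step.
V1 = np.array([0.5, -0.2, 0.1, 0.3, -0.4])
V2 = np.array([0.1, 0.4, -0.3, 0.2, 0.2])
C = np.array([0.30, 0.10, 0.25, 0.08, 0.27])

def min_feature_um(alpha, beta):
    """Smallest segment length of the design at hyperplane point (alpha, beta)."""
    return float(np.min(alpha * V1 + beta * V2 + C))

# Query the whole (alpha, beta) grid -- a fast lookup, no further simulation.
grid = np.linspace(-0.1, 0.1, 81)
features = np.array([[min_feature_um(a, b) for b in grid] for a in grid])
largest_min_feature = float(features.max())
```

Thresholding `features` at 0.1 um would directly identify the (α, β) region, if any, compatible with deep-UV lithography.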
Nanophotonic research involves increasingly complex structures to meet the demands in functionality and performance. This trend is accompanied by challenges in high-dimensional design optimization. Black-box global optimizers such as particle swarm and genetic algorithms have gained popularity, but their outcome provides only isolated designs. They do not allow the analysis and visualization of the impact of design parameters on device performance, which is often crucial for gaining deeper insight into the design structure. Furthermore, the lack of such analysis and insight hinders the designer's ability to effectively balance the multiple criteria determined by specific applications.
In our work we introduce the powerful capabilities of machine learning tools to bring new approaches to the device design flow. Here we demonstrate that dimensionality reduction techniques such as principal component analysis can reduce the large number of correlated design variables to a smaller set of orthogonal variables, significantly simplifying and clarifying the design problem. In the study case of a vertical grating coupler, all good designs with high coupling efficiency (>74%) can be projected onto a 2D hyperplane.
By so doing, exhaustive mapping of this subspace becomes achievable with modest computation resources. Multiple performance metrics, including the coupling efficiency, back-reflection, fabrication robustness and yield, can be clearly visualized. The intuitive understanding gained through this procedure gives the designer guidance in navigating the complex design space. The mapping exercise also gives insight into the achievable minimum feature size and where the design bottlenecks are. These findings, together with a machine-learning-assisted search algorithm, enabled the discovery of new designs that increase the minimum feature size to above 100 nm by incorporating a subwavelength grating metamaterial. This is an important improvement, since deep-UV lithography, a volume manufacturing tool, can be used for fabrication.