Digital twin modeling and controlling of optical power evolution enabling autonomous-driving optical networks: a Bayesian approach

Abstract
Optical networks are evolving toward ultrawide bandwidth and autonomous operation. In this scenario, it is crucial to accurately model and control optical power evolutions (OPEs) through optical amplifiers (OAs), as they directly affect the signal-to-noise ratio and fiber nonlinearities. However, a fundamental contradiction arises between the complex physical phenomena in optical transmission and the required precision in network control. Traditional theoretical methods underperform due to ideal assumptions, while data-driven approaches entail exorbitant costs associated with acquiring massive amounts of data to achieve the desired level of accuracy. In this work, we propose a Bayesian inference framework (BIF) to construct the digital twin of OAs and control OPE in a data-efficient manner. Only the informative data are collected to balance the exploration and exploitation of the data space, thus enabling efficient autonomous-driving optical networks (ADONs). Simulations and experiments demonstrate that the BIF can reduce the data size for modeling erbium-doped fiber amplifiers by 80% and Raman amplifiers by 60%. Within 30 iterations, the optimal controlling performance can be achieved to realize target signal/gain profiles in links with different types of OAs. The results show that the BIF paves the way to accurately model and control OPE for future ADONs.


Introduction
Optical fiber has been widely utilized in the fields of communication 1 and sensing, [2][3][4] playing a critical role in daily-life communications, the military, scientific research, and other fields. Specifically, in the field of communication, fiber-optic communication carries a majority of global data traffic due to its low attenuation and large capacity. 5 In recent years, due to the information explosion caused by high-definition video streaming, cloud computing, artificial intelligence, and so forth, global communication traffic has grown exponentially. 6,7 To sustain such dramatic traffic growth, optical transmission technologies such as coherent transceivers with multilevel modulation formats 8 and advanced digital signal processing 9 have been developed, pushing the fiber capacity toward the Shannon limit 10 for a given spectrum bandwidth.
Today, to further increase the fiber capacity, multiband 11 and spatial-division multiplexing 12 systems have been introduced to exploit more spectrum resources. The multiband solution is more appealing, as it upgrades existing fiber infrastructures in a cost-effective manner. Recently, commercial systems have been extended from C-band to C+L-band, scaling the available bandwidth from ∼4 to ∼12 THz. 13 On the other hand, autonomous-driving optical networks (ADONs) 14,15 are being investigated and developed to improve network performance and lower operational expenditures (OpEx). By leveraging extensive physical layer data, ADONs are expected to achieve automated service provisioning, power optimization, and failure management, eliminating the need for human intervention.
In a fiber-optic communication system, the optical power of signals evolves over fiber and also varies at different wavelengths, exhibiting a complex two-dimensional process. Modeling and controlling the optical power evolution (OPE) are crucial for enabling multiband ADONs. 16 First, optical power determines the optical signal-to-noise ratio (OSNR) and fiber nonlinear effects, 17 both of which significantly affect the signal transmission quality and achievable information rate. This is especially crucial for multiband systems, since wider bandwidth usually involves more severe Kerr nonlinearity 18 and inter-channel stimulated Raman scattering (ISRS). 19 Furthermore, in future ADONs, the static fiber channel should be upgraded into a programmable paradigm, which allows for dynamic configuration of the optical powers for signals with different transmission requirements, enabling more flexible and efficient utilization of network resources.
In multiband ADONs, OPE is mainly influenced by the fiber propagation and amplification processes. Since the fiber propagation process, including attenuation and ISRS, can be accurately calculated, [20][21][22] the main challenge in modeling and controlling OPE lies in optical amplifiers (OAs). However, modeling and controlling an erbium-doped fiber amplifier (EDFA) 23 in current C-band systems are already difficult tasks [24][25][26][27] due to the complex wavelength-dependent gain characteristics of EDFAs in dynamic link conditions. 28,29 In multiband systems, the complexity further increases due to the adoption of multiple homogeneous and/or heterogeneous OAs, 30 which are used to provide broader gain bandwidth and mitigate severe ISRS. In this case, different types of OAs amplify signals through various nonlinear effects, such as stimulated emission or SRS. These nonlinear effects can be expressed by a set of ordinary differential equations (ODEs) with no closed-form solutions. Consequently, modeling and controlling OPE with diverse OAs pose considerable challenges.
In the past decades, extensive research has been carried out to achieve precise modeling and controlling of OPE through OAs in fiber-optic communication systems. The first direction is to rely on human intelligence. 31,32 However, such approaches are limited in accuracy, as they fail to consider the distinctive characteristics of each OA caused by manufacturing discrepancies and operating conditions. 33,34 For instance, with the center of mass model, 35 the gain profile of an EDFA can be modeled with merely one measurement of a baseline spectrum. However, this approach only achieves a root-mean-squared error (RMSE) of about 0.4 dB. 26 Such a level of accuracy is insufficient for assisting the ADON.
Recently, the concept of the digital twin (DT) has emerged to mirror the real-time status of each optical device based on collected data. This is especially crucial for OA modeling due to the diverse designs, manufacturing discrepancies, uncertain parameters, and device aging impacts in OAs. 7][38] However, a bottleneck in implementing data-driven models is the requirement for large training data sets to achieve high levels of accuracy. For example, with about 12,000 pieces of measured data, the RMSE of the gain model for an EDFA can be reduced to 0.1 dB. 26 In some cases, the training data set includes 40,000 data samples. 24 In a real system, the procedure for collecting such a large amount of data can be costly and time-consuming. Even though some methods utilize transfer learning to reduce the needed data size, 36 the data size of the pretraining is still high, which is hard to achieve in a real system with various types of OAs from multiple vendors. Therefore, a reliable method that can assist the accurate modeling and controlling of OAs with as little data as possible is desired for future ADONs.
In this paper, we propose a Bayesian inference framework (BIF) to construct the DT of OAs and control OPE in a reliable and data-efficient manner. In this framework, a few initially collected data are used to train a Gaussian process regression (GPR) 39 surrogate model, which can provide the mean and variance of the estimation. By designing the acquisition functions based on the surrogate model, only the most informative data are sequentially collected. This approach can help obtain accurate digital models for modeling OAs and find the optimal system configurations for controlling OA systems. Compared with the traditional methods for modeling and controlling OAs, the BIF can achieve higher accuracy and significantly reduce the amount of needed data. In this paper, the performance of the BIF is evaluated in both the EDFA system and RA system through simulations or experiments. In terms of modeling, the BIF can reduce the data needed to model the EDFA and RA by more than 80% and 60%, respectively. For the online controlling, the target gain/signal power spectra can be realized within 30 iterations. The optimal performance can be achieved with an RMSE of less than 0.5 dB in most cases. In the next section, the architecture of the BIF is introduced. The real-time experiment and simulation investigations for modeling and controlling are then demonstrated.

Architecture of BIF for OA Modeling and Controlling
When modeling and controlling OAs, the aim of BIF is to optimize the sampling or operating strategies to achieve the best modeling or controlling performance. Based on prior observations, the BIF sequentially selects the next to-be-measured signal spectra or amplifier configurations. As shown in Fig. 1(a), first, a training data set containing fewer than five sets of precollected spectra with the corresponding initial amplifier configurations is constructed. Afterwards, a surrogate model is trained based on the GPR to quantify the optimization performance. According to the output of the GPR, the next to-be-sampled data are selected by an acquisition function and then measured automatically. After adding the newly measured data to the training data set, the surrogate GPR model can be updated to decide the next data for sampling. If the BIF is used for modeling, an accurate model for the OA can be obtained iteratively.
If the BIF is used for online controlling, the optimal system configuration to achieve the target signal/gain spectra can be realized.
When constructing the GPR surrogate model, the training dataset can be written as $D = [x_i, y_i]$, where $x_i$ and $y_i$ represent the $i$'th input and output of the GPR model, respectively. The mapping between the input and output, denoted as $f(x)$, is described by the Gaussian process (GP) of

$$f(x) \sim \mathcal{GP}\big(m(x), k(x, x')\big), \tag{1}$$

where $m(x)$ is the mean function of the GP and $k(x, x')$ is the covariance function, which is the "kernel" for evaluating the similarity between samples. $m(x)$ and $k(x, x')$ can be learned from data during training. For a new input, which is written as $x_*$, the estimation of the output, denoted as $f(x_*)$, follows the joint Gaussian distribution of

$$\begin{bmatrix} y \\ f(x_*) \end{bmatrix} \sim \mathcal{N}\left( \begin{bmatrix} m(x) \\ m(x_*) \end{bmatrix}, \begin{bmatrix} K + \beta I & k \\ k^T & k(x_*, x_*) \end{bmatrix} \right),$$

where $K$ is the covariance matrix of the training dataset, $k$ and $k^T$ are the covariances between the training set $x$ and $x_*$, and $\beta$ is the hyperparameter representing the noise of the measurement. Therefore, the estimated mean $\mu_*$ and variance $\sigma_*^2$ are

$$\mu_* = m(x_*) + k^T (K + \beta I)^{-1} \big(y - m(x)\big), \qquad \sigma_*^2 = k(x_*, x_*) - k^T (K + \beta I)^{-1} k.$$

Based on the GPR surrogate model, the next to-be-sampled data are decided by an acquisition function. Since the optimization target differs among tasks, the sampling strategies differ as well, which calls for a customized design of the acquisition functions.

When modeling OAs, the surrogate GPR model is constructed as a DT of an OA. The input of the GPR-based DT is the input signal spectra and amplifier configuration parameters. The output is the gain spectra of the OA under test. To achieve data-efficient modeling, as shown in Fig. 1(d), the BIF focuses on the exploration (sampling the places with a high $\sigma_*^2$) of the whole feature space. Therefore, the acquisition target is to sample the most informative data, which can be categorized as a problem of Bayesian active learning. 40 In our work, the acquisition function is designed by uncertainty sampling, which means sampling where the uncertainty is high. The sampling strategy can be written as

$$\arg\max_x \sigma^2(x),$$

where $\sigma^2(x)$ represents the estimation variance of the sample $x$.
In this way, the most uncertain candidate spectra are selected for the next round of measurement. After several iterations, the surrogate model can converge to a high accuracy.
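The uncertainty-sampling loop described above can be illustrated with a minimal sketch. It assumes NumPy is available, a zero-mean GP with a fixed RBF kernel, and a one-dimensional toy function `measure` standing in for an automated OA measurement. The kernel length scale, the candidate pool, and all function names are hypothetical choices for illustration, not the configuration used in this work.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two sets of row vectors."""
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)

def gp_posterior(x_train, y_train, x_query, noise=1e-5):
    """Posterior mean and variance of a zero-mean GP at x_query."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k = rbf_kernel(x_train, x_query)              # shape (n_train, n_query)
    mean = k.T @ np.linalg.solve(K, y_train)
    # The RBF kernel gives k(x*, x*) = 1 for every query point.
    var = 1.0 - np.sum(k * np.linalg.solve(K, k), axis=0)
    return mean, np.maximum(var, 0.0)

def measure(x):
    """Hypothetical stand-in for one automated OA measurement."""
    return np.sin(6.0 * x[:, 0])

rng = np.random.default_rng(0)
pool = np.linspace(0.0, 1.0, 201)[:, None]        # candidate repository
idx = list(rng.choice(len(pool), size=3, replace=False))  # initial data set

for _ in range(12):
    _, var = gp_posterior(pool[idx], measure(pool[idx]), pool)
    var[idx] = -np.inf                            # never re-measure a sample
    idx.append(int(np.argmax(var)))               # uncertainty sampling

mean, var = gp_posterior(pool[idx], measure(pool[idx]), pool)
rmse = float(np.sqrt(np.mean((mean - measure(pool)) ** 2)))
```

Each iteration refits the surrogate on everything measured so far and then picks the candidate with the largest predictive variance, so no measurement is spent in regions the model already predicts confidently.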
2][43][44] In this case, the surrogate model is the objective function, which quantifies the controlling performance. The input of the surrogate model is the OA configuration. The output is the estimated controlling performance, such as the error between the current system value and the target. The commonly used acquisition functions are expected improvement (EI), 41,45 probability of improvement (PI), 41 and upper confidence bound. 46 In our study, we find that these acquisition functions achieve a similar performance, and we choose the EI for the following evaluations. The acquisition formulation of the EI can be written as

$$\mathrm{EI}(x) = \big(\mu(x) - f(x^+) - \xi\big)\,\Phi(Z) + \sigma(x)\,\phi(Z),$$

where

$$Z = \frac{\mu(x) - f(x^+) - \xi}{\sigma(x)}$$

for $\sigma(x) > 0$, and $\mathrm{EI}(x) = 0$ otherwise. Here $f(x^+)$ is the best objective value observed so far, and $\mu(x)$ and $\sigma(x)$ represent the current estimated mean and standard deviation of the sample $x$, respectively. $\Phi(Z)$ is the cumulative distribution function of the standard normal distribution, which represents the probability of improvement, and $\phi(Z)$ is the probability density function of the standard normal distribution. $\xi$ is a hyperparameter controlling the balance between searching for the global optimum and exploring the whole data space. Based on EI, the acquisition strategy can be represented as

$$\arg\max_x \mathrm{EI}(x). \tag{7}$$

After iterative searches, the most suitable system configuration, i.e., $x$, is obtained.
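As an illustration of how an EI-driven search proceeds, the following sketch runs a small Bayesian optimization loop. It assumes NumPy is available and uses the widely used EI form for maximization, with the best observed value as the incumbent. The zero-mean GP, the fixed kernel length scale, the candidate grid, and the toy `objective` (a negative squared error with its optimum at 0.7) are all hypothetical stand-ins, not the surrogate or system used in this work.

```python
import math
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two 1-D arrays of inputs."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length_scale**2)

def gp_posterior(x_train, y_train, x_query, noise=1e-5):
    """Posterior mean and standard deviation of a zero-mean GP."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k = rbf_kernel(x_train, x_query)
    mean = k.T @ np.linalg.solve(K, y_train)
    var = 1.0 - np.sum(k * np.linalg.solve(K, k), axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

def expected_improvement(mean, std, best, xi=0.01):
    """EI for maximization; zero wherever the predictive std vanishes."""
    ei = np.zeros_like(mean)
    for i, (m, s) in enumerate(zip(mean, std)):
        if s > 0.0:
            z = (m - best - xi) / s
            cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
            pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
            ei[i] = (m - best - xi) * cdf + s * pdf
    return ei

def objective(x):
    """Hypothetical controlling score: negative squared error to a target."""
    return -(x - 0.7) ** 2

grid = np.linspace(0.0, 1.0, 101)                 # candidate configurations
x_obs = np.array([0.1, 0.9])                      # initial samples
y_obs = objective(x_obs)
for _ in range(15):
    mean, std = gp_posterior(x_obs, y_obs, grid)
    ei = expected_improvement(mean, std, y_obs.max())
    x_next = grid[int(np.argmax(ei))]             # arg max of EI
    x_obs = np.append(x_obs, x_next)
    y_obs = np.append(y_obs, objective(x_next))

best_x = float(x_obs[int(np.argmax(y_obs))])
```

A larger `xi` makes the search favor uncertain regions, while a value close to zero concentrates samples near the current best configuration.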

Constructing DT of EDFAs Based on BIF
To further improve the performance of OA modeling, other methods have been proposed previously with some data sets. 47,48 For example, the transfer-learning-based modeling scheme 36 proposes to initially train a basic model, which is then transferred to the specific EDFA. In addition, the hybrid modeling method 49 integrates the estimations of the analytical models into the input for higher precision. These methods have successfully improved the modeling performance by designing the training scheme and the input features. However, the design of the data selection scheme has not been thoroughly investigated. For modeling OAs, we aim to build an accurate DT with as little data as possible. In this case, the BIF pays more attention to exploration, and the uncertainty-based acquisition function is utilized. To evaluate the performance of the BIF, an experimental validation for modeling an EDFA is conducted.
First, we show the performance of modeling a commercial EDFA based on the BIF in an experiment. As shown in Fig. 2(a), an automatic EDFA measuring system is built. An ASE noise source is used for simulating the flat full C-band spectrum. After being filtered by a Finisar WaveShaper 4000A optical spectrum processor, 80 channels with 50 GHz spacing in the C-band from 192.1 to 196.1 THz are generated. Among them, 40 odd-numbered channels are selected to establish signals while the other channels are filtered out. Two optical spectrum analyzers (OSAs) are used to measure the input and output power spectra of the EDFA. The operating mode of the EDFA under test is set as the automatic gain control (AGC) mode with a gain of 16 dB. When generating the repository of the gain spectra, the 40 odd-numbered channels are assumed to be occupied or idle randomly. A random deviation of each signal power from −2 dB to 2 dB with a step size of 1 dB is introduced by setting the attenuation of the WaveShaper. In total, 9578 to-be-measured input spectra are generated as the candidate repository. Additionally, to evaluate the forward modeling performance, a testing data set containing 1002 pairs of input and output spectra is generated randomly by the automatic measuring system.
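The construction of the candidate repository described above can be sketched as follows. The 50% occupancy probability, the use of `None` to mark an idle channel, and the repository size of 100 are assumptions made only for illustration; the text specifies only that channels are occupied or idle at random and that the power deviation ranges from −2 dB to 2 dB in 1 dB steps.

```python
import random

ALLOWED_DEVIATIONS_DB = [-2, -1, 0, 1, 2]

def random_candidate(num_channels=40, rng=random):
    """One candidate input spectrum: a power deviation in dB, or None if idle."""
    spectrum = []
    for _ in range(num_channels):
        if rng.random() < 0.5:            # assumed 50% occupancy probability
            spectrum.append(rng.choice(ALLOWED_DEVIATIONS_DB))
        else:
            spectrum.append(None)         # idle channel
    return spectrum

rng = random.Random(0)                    # fixed seed for reproducibility
repository = [random_candidate(rng=rng) for _ in range(100)]
```

In a data-selection setting, such a repository only lists spectra that could be measured; the acquisition function then decides which of them are actually sent to the measuring system.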
When building the DT for the EDFA, the input is a vector representing the power spectrum of the signals before amplification. The output is the corresponding gain spectrum. In this experiment, we use traditional NN-based models trained with randomly sampled data as the baseline. To investigate the performance, the RMSEs of each model on the testing data set are calculated. Considering the impact of various initial random data sets, each training process is repeated five times to mitigate performance fluctuations. The mean RMSEs are then plotted in Fig. 3(a), while the error bars represent the performance fluctuations across the repeated training runs. For the NN-based models with random data selection, the estimation errors are large when the data size is small. The best accuracy achieved with 500 training instances is about 0.12 dB. In contrast, the proposed BIF-based model converges at a high speed and can reduce the RMSE to about 0.1 dB with fewer than 200 instances, demonstrating its significant learning ability. To achieve the same RMSE on the same testing data set, the proposed method can reduce the training data size by 80%, making it possible to prepare a customized tiny data set for building the precise gain model for each EDFA. Moreover, the performance of the proposed model is relatively stable, as indicated by the smaller error bars when the data size is large. The violin plot of the errors of the models trained by different methods with different amounts of data is shown in Fig. 3(b), together with the maximum and minimum errors. The results show that the model trained by the proposed BIF has lower estimation errors and converges faster, demonstrating its strong ability to select data and learn.
As mentioned before, to achieve data-efficient forward modeling, the BIF employs the data-efficient GPR modeling method and data selection strategy. Here we further investigate the contributions of the GPR algorithm and the data selection strategy individually. In Fig. 3

Constructing DT of RAs Based on BIF
Besides EDFA modeling, the performance of the BIF for RA modeling is also investigated through simulations. We consider a more complex situation by modeling the generalized signal-to-noise ratio (GSNR) of arbitrary signals under a certain pump configuration of an RA. The GSNR can be expressed as

$$\mathrm{GSNR} = \frac{P_S^{Rx}}{P_{ASE}^{Rx} + P_{NLI}^{Rx}},$$

where $P_S^{Rx}$, $P_{ASE}^{Rx}$, and $P_{NLI}^{Rx}$ denote the power of the signal, ASE noise, and NLI noise before the receiver, respectively. In such a situation, both the ASE noise and fiber nonlinearity are modeled, which is more challenging, since they are strongly related to both the signal and pump configurations. Similar to the EDFA modeling, we apply the exploration-preferred BIF using synthetic data for evaluation.
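As a small worked example of the GSNR definition, the sketch below computes the GSNR in dB from received signal, ASE, and NLI powers given in dBm. The function names and the numerical values are illustrative assumptions only.

```python
import math

def dbm_to_mw(p_dbm):
    """Convert a power from dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)

def gsnr_db(p_signal_dbm, p_ase_dbm, p_nli_dbm):
    """GSNR in dB from received signal, ASE, and NLI powers in dBm."""
    noise_mw = dbm_to_mw(p_ase_dbm) + dbm_to_mw(p_nli_dbm)
    return 10.0 * math.log10(dbm_to_mw(p_signal_dbm) / noise_mw)

# Equal ASE and NLI contributions of -23 dBm on a 0 dBm signal:
# the two noise powers add to about -20 dBm, so the GSNR is about 20 dB.
example = gsnr_db(0.0, -23.0, -23.0)
```

The noise terms have to be added in linear units before converting back to dB, which is why equal ASE and NLI powers cost about 3 dB relative to either contribution alone.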
The data set for training is generated by simulations based on the GNPy, 50 which is a commonly utilized Python tool for calculating the fiber nonlinearity based on the Gaussian noise model. 22,51 The simulation setup is shown in Fig. 2(b). Arbitrary C+L-band signal spectra are generated by randomly selecting a flat launch power from −3 to 4 dBm, with a ripple of ±3 dB for each channel. For each signal, the baud rate is 142.8 GBaud and the channel spacing is 150 GHz. The transmission fiber is the standard single-mode fiber (SSMF). The pump number in the RA is 6, of which the wavelengths are 1513, 1496, 1477, 1458, 1432, and 1420 nm, respectively. As with the EDFA modeling, the pump powers are fixed to set a relatively flat gain spectrum of 10 dB. The pump powers are 40, 30, 20, 120, 300, and 300 mW, respectively. In total, 3997 data are generated. We use 500 data as the testing data set, and 3497 are used as the data repository for the BIF-based data selection.
Similar to EDFA modeling, the compared baseline model is based on NN and trained by a data set generated randomly. For both the NN-based model and the BIF-based model, the input features are the 80-dimensional vector representing the signal power of each channel. The outputs are the GSNR of each signal. Considering the influence caused by the randomness of the initial data set, the training process for each method is conducted 5 times to reduce the performance fluctuations. The mean RMSEs of the models obtained under different data sizes are plotted in Fig. 3(c). The differences among these training processes are shown as the error bar. The results show that the proposed method can achieve higher accuracy with different data sizes. To achieve a similar accuracy, the BIF can reduce more than 60% of the training data, demonstrating its efficient learning ability. Additionally, the error bar of the BIF-based model is much smaller, demonstrating its higher stability. The violin plot of the estimation error is shown in Fig. 3(d) with the maximum and minimum errors. The results show that the BIF-based model can converge faster with smaller extreme errors, proving its better learning ability.

Controlling EDFAs Based on BIF
For ADONs, efficiently shaping the signal/gain power spectrum is desired to assist dynamic network optimizations. To achieve this, BIF-based online controlling is proposed. For EDFA, the flat or tilted signal spectrum after amplification can be realized by adjusting the input signal power spectrum. For RA, the target gain spectrum can be realized by adjusting the pump powers. For both EDFA and RA systems, we perform experiments to demonstrate the effectiveness of the BIF for controlling the OPE.
First, Fig. 4(a) shows the experimental verification with a C-band EDFA. We employ an experimental setup similar to the one used in the previous section for modeling EDFA. The EDFA is configured in the AGC mode with a gain of 17 dB. During online controlling, we adjust the signal spectrum prior to amplification to obtain the target signal spectrum after amplification. Specifically, we adjust the attenuation factors of the WaveShaper as the control parameters, with one attenuation factor for every five consecutive WDM signals. So, in total, there are eight parameters to control. This type of adjustment can be realized in real systems by controlling the wavelength-selective switch (WSS) at the beginning of each optical multiplex section (OMS).
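The mapping from the eight control parameters to the 40 channels can be sketched as below. The function names, the flat −20 dBm input spectrum, and the attenuation values are hypothetical and serve only to show how one attenuation factor is shared by five consecutive channels.

```python
def expand_group_attenuation(group_attenuation_db, channels_per_group=5):
    """Repeat each group value so that every channel gets an attenuation."""
    return [a for a in group_attenuation_db for _ in range(channels_per_group)]

def apply_attenuation(input_spectrum_dbm, group_attenuation_db):
    """Attenuate a per-channel spectrum (dBm) by the expanded group values (dB)."""
    per_channel = expand_group_attenuation(group_attenuation_db)
    if len(per_channel) != len(input_spectrum_dbm):
        raise ValueError("spectrum length must match the expanded attenuation vector")
    return [p - a for p, a in zip(input_spectrum_dbm, per_channel)]

# Eight control parameters shape a 40-channel input spectrum.
shaped = apply_attenuation([-20.0] * 40, [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5])
```

Grouping channels in this way keeps the search space at eight dimensions instead of 40, at the cost of not being able to correct ripple within a group.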
Figure 4(b) shows the amplified signal power spectra with and without the BIF-based online control. The first row shows the measured spectra controlled with traditional simple adjustments. Specifically, the mean value and tilt values of the measured signal spectrum are calculated through linear fitting. Subsequently, the differences in mean and tilt values between the measured and the target spectra are calculated and adjusted by the optical spectrum processor. The second row shows the spectra achieved by the BIF. Three scenarios are considered with different target spectra. The corresponding RMSEs are shown in Fig. 4(c). Both the signal spectra and the error histograms show that, compared with the traditional spectrum controlling method, the BIF can achieve a better performance. To illustrate the changes of the spectrum during online control, Fig. 4(d) shows the changes of the RMSEs between the measured spectra and the target spectra when the target spectrum is flat with a value of -3 dB. Results show that the BIF can quickly adjust the signal spectrum within 30 iterations, demonstrating the efficiency of the BIF for online controlling.
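The traditional mean-and-tilt adjustment used as the baseline can be sketched with an ordinary least-squares line fit. This is an illustrative reading of the procedure described above, not the exact implementation used in the experiment; the function names and the example spectrum are assumptions.

```python
def fit_mean_and_tilt(freqs, powers):
    """Least-squares line through a spectrum: returns (mean, slope)."""
    n = len(freqs)
    f_mean = sum(freqs) / n
    p_mean = sum(powers) / n
    num = sum((f - f_mean) * (p - p_mean) for f, p in zip(freqs, powers))
    den = sum((f - f_mean) ** 2 for f in freqs)
    return p_mean, num / den

def mean_tilt_correction(freqs, measured, target):
    """Per-channel adjustment (dB) that matches the target's mean and tilt."""
    m_mean, m_slope = fit_mean_and_tilt(freqs, measured)
    t_mean, t_slope = fit_mean_and_tilt(freqs, target)
    f_mean = sum(freqs) / len(freqs)
    return [(t_mean - m_mean) + (t_slope - m_slope) * (f - f_mean) for f in freqs]

# A purely tilted measured spectrum is flattened exactly by this correction.
freqs = [192.1 + 0.1 * i for i in range(40)]
measured = [-2.0 + 0.5 * (f - 194.0) for f in freqs]
target = [-3.0] * 40
correction = mean_tilt_correction(freqs, measured, target)
corrected = [m + c for m, c in zip(measured, correction)]
```

Because only two quantities are corrected, any wavelength-dependent ripple in the measured spectrum survives this adjustment, which is consistent with the larger residual errors reported for the traditional method.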

Controlling RAs Based on BIF
The performance of the BIF for online controlling is also evaluated in systems with RA. Previous strategies for controlling RAs can be categorized into two types. 3][54] For this type of method, one round of generation could include tens of candidates, necessitating several rounds of measurements for one iteration. 6][57][58][59] However, this type of method requires a pretrained model that needs a substantial number of measured spectra (hundreds to thousands). In our work, the BIF is utilized to adjust the power of each pump to obtain the target gain profile in a data-efficient manner. The experimental setup of the RA system is shown in Fig. 5(a). First, an ASE source is used to emulate the C+L-band signal spectra. After attenuation, the total signal power is set as 15.5 dBm. The transmission fiber length is 82.8 km. In our experiment, we consider counter-Raman amplification, which means only the backward pumps are utilized for amplification. The RA has four pumps, of which the wavelengths are 1428, 1454, 1490, and 1509 nm. Since the power of each Raman pump cannot be directly controlled, we control the pump power by adjusting the current of the digital-to-analog converter (DAC). Then, the on-off gain spectra are collected by an OSA. All the control and data processing are performed through a host computer.
In Fig. 5(d), 460 sets of pump configurations are generated randomly, and the corresponding on-off gain spectra are plotted. As shown in Fig. 5(d), the on-off gain constructed by different pump powers can range from 0 to 10 dB with different shapes. This result shows that, if the pump configuration is searched directly, the best configuration may be hard to find. Therefore, an efficient online controlling method is desired to realize various gain spectra without human intervention.
By applying the BIF in this use case, the size of the precollected initial data set is set as five. Afterwards, the exploitation-preferred BIF is employed with the EI optimizer. The RMSE of each iteration when setting a flat 7-dB gain spectrum as the target gain is plotted in Fig. 5(e). The results show that the BIF can quickly find the correct direction to adjust the pump power combinations and then converge to a low RMSE of ∼0.3 dB. In Fig. 5(f), the on-off gain spectra of each iteration are plotted. We observe that the gain spectra gradually converge to the target gain, which demonstrates the effectiveness of the proposed method. In most cases, the BIF can approach the target spectrum within 10 iterations and then slightly fine-tune the pump configurations to obtain the optimum performance.
As shown in Figs. 5(g)-5(j), we plot some gain spectra during the online controlling as examples. First, the gain spectra are far away from the target gain spectrum before the fifth iteration. They then quickly approach the target by the tenth iteration as the powers of all pumps are increased. Afterwards, the BIF starts fine-tuning the pump configuration and gradually achieves the optimal design. In Figs. 5(b) and 5(c), the online controlling performance based on the BIF under different target gain spectra is shown. The BIF can work well in multiple scenarios, and the convergence speed is relatively stable.
To further investigate the generalization of the proposed BIF, we validate the online controlling performance by considering situations with different pump numbers and wavelengths. We conduct simulations considering four, five, and six pumps in an RA system based on the GNPy. In simulations, more scenarios with different types of target gain spectra are investigated. First, the flat on-off gain spectra of 6, 8, 10, 12, and 14 dB are set for evaluation. In addition, the tilted gain spectra for compensating the SRS are set as the target gain. As shown in Fig. 6, the dashed lines are the target spectra and the solid lines are the spectra obtained by the BIF. The results show that the BIF can identify the best pump configurations to generate the desired gain spectra with various pump configurations.

Discussion
In this work, Bayesian inference is utilized for modeling and controlling the OPE in optical fiber communication systems. The incorporation of Bayesian probability enriches the model's output by estimating both the mean and variance. Therefore, more comprehensive information is available for data selection. Additionally, the online iterative sampling scheme ensures that each piece of data is utilized to guide the subsequent data collection. Consequently, the efficiency of the data collection is enhanced to reduce the needed data size. Moreover, the BIF allows for flexible design of diverse data collecting objectives for both DT modeling and OA control. It shows the potential in constructing a DT during OA controlling, thereby facilitating future autonomous network operations.
The next critical step involves modeling the OPE during signal transmission across cascaded fiber spans and OAs. Considering a practical long-haul transmission system, the optical power is attenuated by fibers, connectors, and other devices, such as WSSs, and amplified by OAs. Therefore, modeling the OPE over a long-haul link requires the accurate modeling of each optical device and its cascaded effects.
Moreover, the complexity of OPE control arises due to the heterogeneous parameters from various devices in these long-haul links. This complexity is further compounded in scenarios with different wavelength loadings in each OMS. The control sequence and step size among different parameters in different OMSs should be carefully designed.
Additionally, frequent network operations impose higher reliability requirements. The control of OPE should not disrupt existing services, highlighting the need for a precise assessment of reliability in both modeling and controlling processes.
The BIF holds the potential to address the above challenges effectively. Its inherent ability to estimate probabilities makes it well suited for reliable assessment and data selection. Therefore, it can contribute to achieving efficient and highly reliable autonomous OPE modeling and control, aligning with the demands of future ADONs.

Fig. 5 (caption fragment): Different colors denote different iterations. The dashed black line is the target gain spectrum. Some gain spectra obtained during the online controlling, i.e., the red points in panel (e), are plotted in (g), (h), (i), and (j), respectively. The pump DAC values of each iteration are plotted in the second row accordingly.

Conclusion
Constructing the DT and controlling OPE are crucial for enabling multiband ADONs. To accomplish these goals, modeling and controlling OAs are the main challenges. In this work, we propose the BIF to model and control OAs in a data-efficient manner. The BIF employs a selective data collection strategy that effectively balances exploration and exploitation in the search space.
Simulations and experiments have demonstrated the effectiveness of the BIF in modeling OAs. Compared to traditional NN-based models that use randomly selected data, the proposed BIF significantly reduces the data requirements for accurate modeling. Specifically, it can reduce the required data by 80% for EDFA and 60% for RA. This reduction in data requirements enhances the feasibility of deploying data-driven models in commercial OAs.
In terms of controlling, the BIF assists network controllers in adjusting OA configurations and transmitted signal profiles to achieve a target profile. Within a maximum of 30 iterations, the BIF successfully realizes the desired signal/gain profiles with RMSEs of <0.5 dB in most cases. Importantly, the proposed BIF is not limited to specific link conditions or OA types, making it applicable to a wide range of scenarios in various ADON systems.

Simulation Details for RA Systems
The power propagation along a fiber with RA can be described by ODEs, 60 which can be written as

$$\frac{dP_s}{dz} = -\alpha_s P_s + C_R P_p P_s, \qquad \pm\frac{dP_p}{dz} = -\alpha_p P_p - \frac{f_p}{f_s} C_R P_s P_p,$$

where $P_p$ and $P_s$ represent the Raman pump power and signal power, respectively, and the upper (lower) sign applies to a co-propagating (counter-propagating) pump. $z$ is the transmission distance. $C_R$ is the Raman gain efficiency. $f_s$ and $f_p$ are the frequencies of the signal and pump, respectively. $\alpha_s$ and $\alpha_p$ are the attenuation of the signal and pump, respectively. The on-off gain can be calculated by

$$G_{\mathrm{ON-OFF}} = \frac{P_{s,\mathrm{pumps\ ON}}(L)}{P_{s,\mathrm{pumps\ OFF}}(L)},$$

where $P_{s,\mathrm{pumps\ ON}}(L)$ and $P_{s,\mathrm{pumps\ OFF}}(L)$ represent the signal power at the distance of $L$ with Raman pumps on and off, respectively. The simulation verifications are conducted using the GNPy. 50 The fiber in simulations is SSMF with an attenuation of 0.2 dB/km, a nonlinearity coefficient of 1.3 W$^{-1}$ km$^{-1}$, and a chromatic dispersion coefficient of 16.7 ps nm$^{-1}$ km$^{-1}$.
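To make the on-off gain definition concrete, the following sketch integrates a simplified pair of pump-signal power equations with a fourth-order Runge-Kutta scheme and compares the signal power with the pump on and off. It is not the GNPy solver: it assumes a single co-propagating pump (so the problem is a plain initial-value problem), the same attenuation for pump and signal, and hypothetical values for the Raman gain efficiency `c_r` (in W$^{-1}$ km$^{-1}$) and the pump-to-signal frequency ratio `freq_ratio`.

```python
import math

def on_off_gain_db(pump_w, length_km, c_r=0.4, alpha_db_km=0.2,
                   freq_ratio=1.08, signal_w=1e-9, steps=8000):
    """On-off gain (dB) of one co-propagating pump via RK4 integration."""
    alpha = alpha_db_km * math.log(10.0) / 10.0   # dB/km -> 1/km

    def deriv(ps, pp):
        # Assumed form: signal gains from the pump, pump is depleted.
        dps = -alpha * ps + c_r * pp * ps
        dpp = -alpha * pp - freq_ratio * c_r * ps * pp
        return dps, dpp

    def propagate(ps, pp):
        h = length_km / steps
        for _ in range(steps):
            k1 = deriv(ps, pp)
            k2 = deriv(ps + 0.5 * h * k1[0], pp + 0.5 * h * k1[1])
            k3 = deriv(ps + 0.5 * h * k2[0], pp + 0.5 * h * k2[1])
            k4 = deriv(ps + h * k3[0], pp + h * k3[1])
            ps += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
            pp += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
        return ps                                  # signal power at z = L

    p_on = propagate(signal_w, pump_w)             # pumps on
    p_off = propagate(signal_w, 0.0)               # pumps off
    return 10.0 * math.log10(p_on / p_off)

gain_db = on_off_gain_db(pump_w=0.3, length_km=80.0)
```

For a very weak signal the pump is essentially undepleted, so the result can be checked against the well-known small-signal expression $\exp(C_R P_p(0) L_{\mathrm{eff}})$ with $L_{\mathrm{eff}} = (1 - e^{-\alpha L})/\alpha$. A counter-pumped configuration, as used in the experiment, turns the problem into a boundary-value problem and would need a different solver.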

Training Details for EDFA DT Models
In the use case of EDFA modeling, two types of models are trained. First, the baseline NN-based model has two hidden fully connected layers with 40 neurons in each layer. The activation functions are sigmoid. 61 The optimizer is Adam. 62 During training, 80% of the available data are used for training, and 20% are used for validation. The total epoch number is set as $10^{6}$, and early stopping is employed with a patience of 100. For the proposed GPR-based model, the utilized kernel is the radial basis function (RBF). 61 The noise hyperparameter, i.e., alpha, is set as $10^{-5}$. To conduct a fair comparison, the two models share the same input and output. Specifically, the input feature is a vector representing the power value of each WDM signal before amplification. Since 40 channels are considered, the feature vector has 40 dimensions. The output is the gain value of the signal in each WDM channel. Min-max normalization is conducted for both the input features and labels because we found that this type of preprocessing can achieve the best performance. RMSE is calculated for accuracy evaluation. Moreover, since some of the channels are idle, the estimations of these channels are excluded from the RMSE calculation.
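The preprocessing and evaluation steps described above can be sketched as follows. The function names, the normalization bounds, and the example values are illustrative assumptions; the only behavior taken from the text is min-max normalization and the exclusion of idle channels from the RMSE.

```python
import math

def min_max_normalize(values, lo, hi):
    """Scale values linearly so that lo maps to 0 and hi maps to 1."""
    return [(v - lo) / (hi - lo) for v in values]

def masked_rmse(predicted, measured, occupied):
    """RMSE over occupied channels only; idle channels are skipped."""
    sq = [(p - m) ** 2 for p, m, on in zip(predicted, measured, occupied) if on]
    return math.sqrt(sum(sq) / len(sq))

# The second channel is idle, so its (meaningless) values do not count.
rmse = masked_rmse([16.1, 0.0, 15.8, 16.0],
                   [16.0, 99.0, 16.0, 16.0],
                   [True, False, True, True])
normalized = min_max_normalize([-2.0, 0.0, 2.0], lo=-2.0, hi=2.0)
```

Masking matters here because an idle channel carries no signal, so any gain value reported for it would distort the error statistics.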

Training Details for RA DT Models
The training processes for RA modeling are similar to those for EDFA modeling but with different input/output data dimensions and hyperparameters. For the training data, the input features are the signal powers of the WDM channels, represented as an 80-dimensional vector. The outputs are the GSNRs of the WDM signals, also represented as an 80-dimensional vector. The NN-based baseline models have two hidden layers with 80 neurons in each layer. The activation function is sigmoid. The optimizer is Adam. Of the data, 80% are used for training and 20% for validation. The maximum number of epochs is set to $10^6$, and early stopping is employed with a patience of 1000. The GPR-based surrogate model employs an RBF kernel. Min-max normalization is applied to both the input and output of the data set.
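To make the role of the RBF kernel and the noise hyperparameter concrete, the following is a minimal, dependency-free sketch of GPR prediction for a scalar output. It is not the authors' implementation: the fixed unit length scale, the zero prior mean, and the function names are assumptions, and a production model would typically rely on an established library and optimize the kernel hyperparameters.

```python
import math

def rbf(a, b, length_scale=1.0):
    """RBF kernel between two feature vectors."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-0.5 * sq_dist / length_scale ** 2)

def cholesky(A):
    """Lower-triangular L with L L^T = A; this O(N^3) step dominates the cost."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(A[i][i] - s) if i == j else (A[i][j] - s) / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution for L y = b."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def solve_lower_transposed(L, y):
    """Back substitution for L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(L[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (y[i] - s) / L[i][i]
    return x

def gpr_fit(X, y, alpha=1e-5):
    """Factorize K + alpha*I and precompute the weights (K + alpha*I)^-1 y."""
    n = len(X)
    K = [[rbf(X[i], X[j]) + (alpha if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    L = cholesky(K)
    weights = solve_lower_transposed(L, solve_lower(L, y))
    return L, weights

def gpr_predict(X, L, weights, x_new):
    """Posterior mean and standard deviation at x_new (zero prior mean)."""
    k = [rbf(x, x_new) for x in X]
    mean = sum(ki * wi for ki, wi in zip(k, weights))
    v = solve_lower(L, k)
    variance = max(rbf(x_new, x_new) - sum(vi * vi for vi in v), 0.0)
    return mean, math.sqrt(variance)

X_train = [[0.0], [0.5], [1.0]]   # normalized inputs (toy data)
y_train = [0.0, 0.25, 1.0]        # normalized targets (toy data)
L, w = gpr_fit(X_train, y_train, alpha=1e-5)
mean_near, std_near = gpr_predict(X_train, L, w, [0.5])
mean_far, std_far = gpr_predict(X_train, L, w, [5.0])
```

The predictive standard deviation is small next to observed points and approaches the prior value far from them, which is the uncertainty signal that a data-selection strategy can exploit.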

Parameters for Controlling the EDFA System
The controlled parameters are the attenuation factors of the WaveShaper, one for every five consecutive WDM channels. For the 40 channels in total, there are eight parameters. The online controlling is realized with the Bayesian-optimization Python tool. 63 During online controlling based on the BIF, the surrogate GPR model has a noise hyperparameter of $10^{-4}$. The number of initial samples is 2, and an EI optimizer with a $\xi$ of $10^{-9}$ is employed. Domain reduction 64 with a minimum window length of 0.2 is used to speed up convergence.
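The effect of $\xi$ in the EI acquisition can be seen in a short sketch. It uses the standard closed-form expected improvement for a Gaussian posterior under a maximization convention (for example, maximizing the negative RMSE); it is not the API of the Bayesian-optimization tool, and the function name and candidate values are hypothetical.

```python
import math

def expected_improvement(mean, std, best, xi=1e-9):
    """Closed-form EI for maximization, given the GPR posterior mean and std.

    `best` is the largest objective value observed so far; a larger xi
    favors exploration, a smaller xi favors exploitation.
    """
    improvement = mean - best - xi
    if std <= 0.0:
        return max(improvement, 0.0)  # no uncertainty: plain improvement
    z = improvement / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return improvement * cdf + std * pdf

# Hypothetical candidates as (posterior mean, posterior std) of negative RMSE.
candidates = [(-0.50, 0.01), (-0.55, 0.20), (-0.80, 0.05)]
best_so_far = -0.52
scores = [expected_improvement(m, s, best_so_far) for m, s in candidates]
chosen = max(range(len(candidates)), key=lambda i: scores[i])
```

A very small $\xi$ such as $10^{-9}$ leaves the trade-off almost entirely to the posterior itself, which matches the exploitation-preferred setting used for controlling.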

Parameters for Controlling the RA System
To control the Raman pump powers, the configured parameters are the DAC values of the four pumps. The Bayesian-optimization Python tool 63 is utilized for online controlling, and the number of initial samples is 5. The noise hyperparameter of the surrogate GPR model is $10^{-5}$. An EI optimizer with a $\xi$ of $10^{-5}$ is employed. Domain reduction with a minimum window length of 0.01 is used.
Xiaomin Liu received her BE degree in information engineering from Shanghai Jiao Tong University (SJTU) in 2020. She is currently pursuing a PhD in the Department of Electronic Engineering, SJTU. Her current research interests include modeling, monitoring, and optimization in optical networks.
Yihao Zhang received his BS degree in information engineering from Shanghai Jiao Tong University (SJTU), Shanghai, China, in 2021. He is currently pursuing a PhD in electronic and information engineering at SJTU. His research interests include optical amplifier modeling and optimization, optical network modeling and optimization, and fiber nonlinearity modeling.
Yuli Chen received his BS in material science and engineering from Shanghai Jiao Tong University (SJTU), Shanghai, China, in 2023. He is currently working toward his MS degree in electronic and information engineering at SJTU. His research interests include optical amplifier modeling and optimization, optical network modeling and optimization, and optical performance monitoring.

Fig. 1 The architecture of the proposed BIF for modeling and controlling OAs. (a) The general workflow of the BIF. (b) The data collection method employed in real systems and simulations. (c) The input and output of the surrogate GPR model. (d) The exploration-preferred acquisition workflow for OA modeling. (e) The exploitation-preferred acquisition workflow for OA controlling.

Fig. 2 The diagram of collecting training data for OA modeling. (a) The experimental system for measuring EDFA spectra. (b) The simulation workflow for calculating the GSNR of an RA system.
(e), we plot the RMSEs of the GPR-based and NN-based models using data selected either by the BIF or randomly. Four cases are considered: (1) the baseline NN, (2) the NN with the BIF-based data selection, (3) the GPR model without the BIF-based data selection, and (4) the proposed BIF-based model. The error histograms of the models trained with the different methods on the testing data set are shown in Fig. 4. The results show that the GPR-based model has a better learning ability than the NN. With the training data selected by the BIF, both the NN and the GPR achieve higher accuracy than the same models trained on randomly selected data. Regarding training time, we train these models with 48 GB of 2400 MHz RAM and an Intel Core i9-9900K 3.6 GHz CPU. For the NN-based and GPR-based EDFA models, the training time with 500 training samples is 28 and 15 s, respectively. The training time of the NN depends primarily on the configuration of the training procedure, including factors such as the batch size, the number of epochs, and the patience threshold. The training time of the GPR model mainly depends on the size of the covariance matrix, whose inversion has a complexity of $O(N^3)$ for $N$ training samples. If the training data size is large, the calculation time of the GPR can be long. Nevertheless, since the BIF significantly reduces the required data size, the training time of the GPR in our experiment remains within a reasonable range.

Fig. 3 The modeling performance of the proposed BIF for EDFA and RA. (a) and (c) The RMSEs of the models under different data sizes for EDFA modeling and RA modeling, respectively. (b) and (d) The violin plots of the modeling errors under different data sizes for EDFA modeling and RA modeling, respectively. The maximum and minimum errors are plotted. (e) The error histograms of the EDFA gain models trained with 70 data samples with or without BIF-based modeling design and data selection.

Fig. 4 The experimental setup for online controlling in an EDFA system. (a) The controlling workflow of the automatic EDFA measuring system. (b) The signal power of each channel after amplification with different controlling targets. The solid lines are the results obtained by the BIF, and the dashed lines are the target signal spectra. The first row shows the measured spectra when only the mean power and the tilt of the transmitted signal spectra are adjusted. The second row shows the measured spectra when the BIF is utilized. (c) The comparison of RMSEs between the traditional adjustment and the BIF-based online controlling. (d) The evolution of the RMSE between the measured spectra and the target spectra during BIF-based online controlling when the target spectrum is flat, with each channel set to −2 dBm.

Fig. 5 The experimental setup for controlling the gain spectrum of an RA. (a) The controlling workflow of the automatic RA controlling and gain measurement system. (b) The flat gain spectra of the RA obtained with the proposed BIF. (c) The tilted gain spectra of the RA obtained with the proposed BIF. (d) The randomly generated gain spectra of the RA under various combinations of pump powers. (e) The RMSEs for each iteration during the online controlling. (f) The measured gain spectra after each iteration during the online controlling. The gray lines are the precollected gain spectra, and the solid lines are the gain spectra collected during the online controlling. Different colors denote different iterations. The dashed black line is the target gain spectrum. Some of the gain spectra obtained during the online controlling, i.e., the red points in panel (e), are plotted in (g), (h), (i), and (j), respectively. The pump DAC values of each iteration are plotted in the second row accordingly.

Fig. 6 The simulation performance of the BIF for online controlling with different on-off gain spectra. (a)-(c) The performance of the BIF with a flat target gain. The simulated RAs have four, five, and six pumps, respectively. The solid lines are the experimental results, and the dashed lines are the target spectra. (d)-(f) The performance of the BIF with a tilted target gain. The simulated RAs have four, five, and six pumps, respectively. The solid lines are the results obtained by the BIF, and the dashed lines are the target gain spectra.