Nonconvex optimization for optimum retrieval of the transmission matrix of a multimode fiber

Abstract. Transmission matrix (TM) allows light control through complex media, such as multimode fibers (MMFs), gaining great attention in areas, such as biophotonics, over the past decade. Efforts have been taken to retrieve a complex-valued TM directly from intensity measurements with several representative phase-retrieval algorithms, which still see limitations of slow or suboptimum recovery, especially under noisy environments. Here, we propose a modified nonconvex optimization approach. Through numerical evaluations, it shows that the optimum focusing efficiency is approached with less running time or sampling ratio. The comparative tests under different signal-to-noise levels further indicate its improved robustness. Experimentally, the superior focusing performance of our algorithm is collectively validated by single- and multispot focusing; especially with a sampling ratio of 8, it achieves a 93.6% efficiency of the gold-standard holography method. Based on the recovered TM, image transmission through an MMF is realized with high fidelity. Due to parallel operation and GPU acceleration, our nonconvex approach retrieves a 8685  ×  1024 TM (sampling ratio is 8) with 42.3 s on average on a regular computer. The proposed method provides optimum efficiency and fast execution for TM retrieval that avoids the need for an external reference beam, which will facilitate applications of deep-tissue optical imaging, manipulation, and treatment.


Introduction
Different from ordinary ballistic optics, light propagation in complex media is highly disordered 1, 2 due to the multiple scattering occurring in media like biological tissues or mode dispersion in multimode fiber (MMF). Finding an order out of such disorders has been long pursued. Over the past decade, enormous progresses have been made via wavefront shaping 3, 4, 5, 6, 7, 8 and especially the transmission matrix (TM) method 9, 10, 11, 12, 13 in controlling light to focus and image through complex media. The TM of a disordered medium describes the complex output responses for an arbitrary point-source input, which is regarded as the transfer function of the medium under the shift-invariance assumption 11. The measurement of TM offers a versatile tool to control light delivery in spite of scattering 6, 10, as well as recovering object information from acquired speckle patterns 14,15. The TM method has spurred a wide range of MMF-based applications, such as focusing 16, 17, glare suppression 18, 19, endoscopic imaging 20, 21, 22, manipulation 23, optogenetics 24, and communication 14, 25, 26. TM measurement of a scattering medium was first introduced by Popoff et.al. 9, 10 using coaxial holography with internal reference. Since then, various forms of TM measurement have been developed. Typically, the TM is measured by recording the complex output fields under a sequence of input modulations. The modulation basis is usually orthogonal, which can be of diverse forms, including Hadamard matrix 9, 10, Fourier transform matrix 27, point source 28, 29, and random phase 13. Regardless of the form, the measured TM relates all input modes to each output mode by linear superposition. Depending on the type of input modulation and output measurement, the TM could be complex-valued 9, 29, real-valued 30, 31 or even binary 32. Among them, complex TM is used most as it supports both amplitude and phase modulation of light, which, however, usually entails holographic setup. Off-axis holography based on either phase-shifting 33 or spatial filtering 34 can acquire the complex TM accurately. Nevertheless, effective off-axis interferometry with an additional reference beam and high system stability it demands could be unpracticable in some scenarios. As an example, the coherence length of pulse laser could be too short to be used for interferometry-based TM measurement. With coaxial holography, the above issues might be alleviated, but the dark spot problem with the measured TM caused by speckle reference field 35 is still unsatisfying.
Recent efforts have sought to accurately retrieve the complex TM from intensity-only measurements by using advanced phase retrieval algorithms 13,27,36,37,38,39,40,41,42, which started with a Bayesian inference approach (i.e., prVBEM) 13. This was followed by prSAMP 36 and prVAMP 37. Although robust to noise, a prior knowledge of noise statistics is a must for these approaches. Semidefinite programming (SDP) that uses convex relation has also been introduced for solving the TM retrieval problem 38, but it usually requires N ln N (N is the input size) measurements and tends to be computationally inefficient. Additionally, works based on extended Kalman filter (EKM) 39 or generalized Gerchberg-Saxton (GGS) algorithm 40 claim retrieving TM with measurements could be enough. That said, EKM is computationally burdened and hard for parallelization. GGS is efficient in computation, but its performance is still suboptimum in real practice. Most recently, the area also sees the birth of a smoothed Gerchberg-Saxton algorithm 42 and a nonlinear optimization method 27 for TM retrieval.
To overcome the aforementioned limitations, in this study a state-of-the-art nonconvex optimization approach is adopted and modified for TM retrieval with optimum performance. Compared with existing TM retrieval algorithms, the proposed nonconvex method can achieve optimal efficiency numerically with less running time or sampling rate. In the experiment, by focus-scanning across the field-of-view (FOV) of an MMF with the acquired TM, the performance of the proposed method is validated to approach the golden standard, i.e., off-axis holography with a sampling rate of 8. Moreover, with the assistance of parallel operation and GPU acceleration, multiple rows of TM can be recovered rapidly. Our method for TM retrieval is featured with optimum efficiency and fast implementation in a reference-less and robust setting, which holds potential for many deep-tissue imaging and focusing applications with the usage of MMF.

Formulation of the TM retrieval problem
The theoretical model of retrieving a TM under a sequence of input modulations is formulated as follows. Suppose the number of discrete modulation units (i.e., input size) and speckle field pixels (i.e., output size) is N and M , respectively. Given P calibration patterns such that the probe matrix X ∈ C N ×P and the amplitude measurements Y ∈ R M ×P + , the TM A ∈ C M ×N that needs to be estimated shall follow where takes the absolute value for the elements inside. By taking the conjugate transpose of both sides of Eq. (1), we have where (·) H is the element-wise conjugate transpose operator. Column-wisely, Y H = [y 1 , y 2 , · · · , y M ], where y i ∈ R P + that constitutes the measurements associated with the i th output mode; A H = [a 1 , a 2 , · · · , a M ] where a i ∈ C N that denotes the i th row of TM (after conjugate transpose), i = 1, · · · , M . In this case, the TM retrieval problem is decomposed into M independent sub-problems given by Due to the operation of taking absolute values in Eq. (3), the above problem of estimating one row of TM is nonlinear and falls in the category of phase retrieval.
Phase retrieval problem has been studied intensively in mathematics as it is commonly encountered in practice, with representative algorithms including alternating projection 43 (e.g., Gerchberg-Saxton and Fineup), SDP 44 (e.g., PhaseLift, PhaseMax, PhaseCut), approximate message passing (e.g., GAMP 45,VAMP 46), and nonconvex optimization 47, 48, 49, 50, 51 etc. Among these methods, nonconvex approaches are proven to be superior and have been developed rapidly in the past years. There are mainly two categories of nonconvex approaches, the intensity-flow 48, 52 and amplitude-flow models 49, 50, 51, with the latter being better in both empirical success rate and convergence property. In particular, the amplitude-flow models have been proven to converge linearly to the true solution under O(n) Gaussian measurements for a signal with dimension 51.

The modified RAF algorithm
Herein, the cutting-edge reweighted amplitude flow (RAF) algorithm 51 is adopted for the TM retrieval problem. Solving Eq. (3) can be recast as an optimization problem where L (a i ) is an amplitude-based least square error (LSE) loss function. While most nonconvex algorithms contain two stages, i.e., spectral initialization and gradient descent, RAF applies reweighting techniques in both stages that accelerates the signal recovery significantly. Considering Eq. (4), the signal, i.e., one row of TM a (the row index i is omitted for brevity, same below) is first estimated with the weighted maximum correlation initialization. A subset of the row vectors in the probe matrix (i.e., S ⊂ {1, · · · , P } of the row vectors in the probe matrix (X H = x H 1 ; x H 2 ; · · · x H p ) that correspond to the |S| largest entries among the measurements y = {y j } 1≤j≤P are selected. These are called direction vectors, as they are more correlated to the true signal. The signal estimate can be constructed by maximizing its correlation to the direction vectors x H j |j ∈ S such that By weighting more to the selected x H j vectors corresponding to larger y j values with the weights y α j , ∀j ∈ S (exponent α = 0.5, by default), the solutionã 0 of Eq. (5) is the unit-norm principle eigenvector of the Hermitian matrix: whereỹ α j = y α j ,j ∈ S 0, otherwise .ã 0 is then scaled to obtain the signal initial guess a 0 = P j=1 y 2 j /P ·ã 0 . The initialized signal a 0 is further refined by reweighted "gradient-like" iterations. The gradient of the LSE loss in Eq. (4) with respect to a is where • denotes element-wise multiplication. Let t be the iteration index, then the gradient descent is described by where µ is the step size. One can reweight the gradients in Eq. (7) that have larger X H a t ⊘ y (⊘ represents element-wise division), which are deemed more reliable in directing to the true signal. The adaptive weight vector is: where β is a pre-selected parameter. The above descriptions show the reweighted gradient flow for TM retrieval.
Inspired by Ref. 40, herein we modify the gradient-descent process of the original RAF algorithm, which is divided into two steps. In Step 1, the normalized measurement vector y is replaced with its square y 2 for gradient computation, which enlarges the gradient and functions as the artificial heat data. In Step 2, still the amplitude measurement y is used.
Step 1 contains the first 2/3 total iterations, which is set empirically (the rationale is referred to Appendix A). The modified algorithm is simple yet surprisingly effective, which is named RAF 2-1 and shown to reduce the measurement error to a much lower level than the original RAF (see Appendix B). The pseudocode of retrieving one row of TM with RAF 2-1 is summarized in Algorithm 1. Note that multiple rows of TM can easily be retrieved in a parallel way.

Numerical evaluation
Numerical evaluations were conducted at first to assess the efficiency of the proposed RAF 2-1, with comparisons with several representative TM retrieval algorithms including the pioneering prVBEM and the recent GGS 2-1 that outperformed previous ones. Note that unless otherwise specified, all the following simulations were conducted using a PC with an Intel Xeon CPU (3.50 GHz, 6 cores) and 64 GB RAM in the environment of MATLAB 2022a. For each algorithm, the performance is evaluated by the efficiency of focusing with the retrieved TM, which is indicated by the peak-to-background ratio (PBR). This is performed by taking the conjugate of one row of TM to construct the phase mask for focusing light onto the corresponding position. The TM was modelled using the speckle field produced by random phase mask with zero-padding in the Fourier domain, whose elements obeyed a circular Gaussian distribution. According to Ref. 3, the theoretical PBR of focusing is linearly proportional to the input size N , as given by The focusing efficiency a certain TM retrieval algorithm can achieve is typically determined by the iterations (or the running time) it has taken and the sampling rate (γ = P/N ).
We first examined the focusing performance of different TM retrieval algorithms under a range of running times when the sampling rate was fixed to be γ = 4. The input phase mask had the size of 24 × 24. A total of 20 × 20 foci were produced sequentially that corresponded to 400 rows of TM to be retrieved, with the average PBRs of the foci as the focusing PBR. Fig. 1a presents the average focusing results achieved with a running time ranging from 1 to 60 s based on 30 repeated tests. It can be seen RAF 2-1 reaches the optimum efficiency after running for ∼ 8 s, while GGS 2-1 requires much longer running time (∼ 40 s) and prVBEM cannot fully approach the optimum within 60 s. This indicates our nonconvex method is superior in algorithm convergence given the same condition. Fig. 1b additionally shows the number of iterations versus running times, which reveals the seconds per iteration for prVBEM, GGS 2-1, and RAF 2-1 are roughly 5:1:1 in such a case. Hence, the nonconvex approach is at least as highly efficient as GGS 2-1 in computation time, and both are much better than prVBEM.
The influence of sampling rate was also explored for TM retrieval algorithms, by fixing the running time to be 20 s when N = 576. As seen in Fig 1c, all algorithms can achieve a higher efficiency of focusing with a larger γ, as more measurements allow more accurate phase retrieval. Moreover, our RAF 2-1 outperforms its competing peers as it averagely realizes more than 95% efficiency when γ = 3 and 100 % when γ = 4. By comparison, GGS 2-1 requires a sampling rate of 5, while prVBEM requires 7 for the optimal performance under the same condition.
After the study on the effects of running time and sampling rate, the focusing performances of different algorithms under a series of input sizes were further investigated. For fair comparison, the sampling rate was fixed at γ = 4. The running time for all algorithms was the same at each case of N , while also increased accordingly with a larger N = 576 from 144 to 1600, so that the number of iterations taken by prVBEM remained the same as the case of N = 576 with a running time of 20 s. We can observe from Fig. 1d that RAF 2-1 always approach the optimum at any input size, while GGS 2-1 and especially prVBEM gradually see a larger and larger distance from the optimum with increased N . This is reasonable as the accurate recovery of TM is more demanding for the running time and particularly the sampling rate when at a larger dimension. Since γ = 4 is fixed, prVBEM may step into a plateau and decays slightly in the focusing efficiency starting from N = 784, as it could not handle a larger signal recovery problem with limited (and insufficient) measurements.
Since measurements are inevitably contaminated with noise in practice, a good noise-robustness is preferred for a TM retrieval algorithm. Thus, the algorithm performances were also evaluated under a range of signal-to-noise (SNR) levels using simulated noisy measurements. In the simulation, a multiplicative noise is added to the output field intensity I ∈ R P + . The SNR is defined as where ε = I noise − I, which denotes the noise vector, I noise is the noisy measurement vector, and ⟨·⟩ takes the average for the elements inside. For the focusing results of multiple foci, uniformity is an important metric to measure the focus quality. The focusing uniformity (µ) is given by where K is a set of the indexes of foci. With parameter settings that N = 576, γ = 4 and a running time of 20 s, the results of the normalized PBR and the uniformity of 400 foci produced using different TM retrieval algorithms are given in Fig 2(a-b). Note the original RAF was also included to highlight the improved anti-noise capability by our modification. It is found that a maximum improvement of ∼ 9% in terms of the focusing efficiency can be realized by RAF 2-1 over RAF under noise. Such a difference between RAF 2-1 and RAF weakens when SNR increases, and their focusing efficiencies are almost the same when SNR is about 30. This explains why RAF was excluded in the previous noiseless tests. Besides, an obvious improvement of the focus uniformity is achieved by RAF 2-1, which is at best ∼ 18% higher than RAF under noisy conditions, and ∼ 5% better when SNR is sufficiently large. Overall, algorithm performances in both focusing PBR and uniformity follows the order: RAF 2-1 > RAF > GGS 2-1 > prVBEM. The difference between GGS 2-1 and RAF is relatively small, whereas prVBEM falls behind GGS 2-1 considerably.

Experiment
The experimental configuration for retrieving the TM of an MMF is illustrated in Fig. 3. A beam from a 532 nm continuous-wave laser (EXLSR-532-300-CDRH, Spectra Physics, USA) was expanded by a 40x objective lens and collimated by a lens (L1). The linearly polarized beam was then modulated into circularly polarized by a quarter-wave plate (λ/4) before impinging onto a digital micromirror device (DMD, DLP9500, Texas Instruments Inc, USA). Based on the Lee hologram technique, the DMD, combined with a 4f system composed of L2, iris, and L3, allowed for phase modulation at a high-speed pattern rate of up to 23 KHz, rendering fast data acquisition for TM calibration. The phase-encoded and shrunk beam was subsequently coupled into a MMF of 0.22 numerical aperture (NA) and diameter of 105 µm (SUH105, Xinrui, China) by a collimator (C1). The transmitted light was imaged with a collimator (C2), too. The beam then passed through a lens (L4) for convergence and a polarizer before being captured by a CMOS camera (BFS-U3-04S2M, FLIR, USA). For TM retrieval, there was a sequence of speckle intensity measurements under the input modulations of random phase.
In experiment, the performances of using different TM retrieval algorithms to control light delivery despite scattering were compared from the aspects of single-spot and multi-spot focusing through MMF. For single-spot focusing, a total of 20 × 20 foci were generated sequentially on the working plane of the MMF, which meant 400 rows of TM were to be retrieved. The sampling rate was set to be 5 for all algorithms to ensure the quality of TM retrieval, given that the acquired speckle data suffered from noises such as shot noise, dark current noise, and readout-noise. Fig. 4a presents the histogram distribution of focusing PBR with different algorithms. Since the experimentally acquired TM of the MMF also obeyed the circular Gaussian distribution, it could be reasonable to use Eq. (10 for normalizing the empirical focusing PBRs and calculating the focusing efficiency. The average focusing efficiencies (denoted by the crosses in the boxplots) were 16.45%, 26.01%, 37.46%, and 55.17% for prVBEM, GGS 2-1, RAF, and RAF 2-1 respectively. Surprisingly, RAF 2-1 achieved a considerably higher efficiency than RAF, i.e., ∼ 17.72%, in comparison to the simulation results in Fig. 2a. The reasons may include the noise type difference in the simulation and experiment, the larger TM dimension in the experiment (more challenging for recovery so that the gap is more obvious), etc. According to the boxplots of Fig 4a, quite a few foci approached or even surpassed the theoretical PBR for GGS 2-1, RAF, and especially RAF 2-1. However, their overall focusing efficiencies of 400 different foci on the working plane of MMF still saw a distance from the ideal level, using the retrieved TMs when γ = 5.
In addition to single-spot focusing, multi-spot focusing experiment was also conducted under the same condition. This was achieved by superposing multiple phase-conjugate rows of the retrieved TM to construct a phase mask. The results of light focusing onto a pentagram pattern composed of 20 spots by different algorithms are shown in Fig. 4b  that only RAF 2-1 produced a high-quality pentagram pattern by focusing light on all the pre-selected positions, thanks to its superior performance of TM retrieval.
The accuracy of TM retrieval by our RAF 2-1 was further compared with the off-axis holography, i.e., the golden standard for the measurement of TM. To do so, we scanned across the whole fiber region on the working plane of the MMF, so that a map of focusing PBR could be synthesized, which was used to fully evaluate the accuracy of TM. The fiber region was determined by identifying the largest connected region in the binarized image taken when all pixels on the DMD were turned on. Using circular fitting of the fiber region, the center and radius of the fiber region were obtained, which were then used to create a binary mask of the fiber region. In the experiment, there were 8685 pixels inside the circular fiber region of the 140 × 140 output field, which correspond to 8685 rows of TM to be retrieved. Fig. 5 summarizes the results of focusing PBR maps with the TM measured by off-axis holography and the TMs retrieved by RAF 2-1 under a range of γ from 3 to 8. The mean PBR by the golden standard method is 608.4. Compared with the theoretical PBR (i.e., η = 804 ), it is reasonable given that the focusing quality degraded in the fiber fringe area due to the inhomogeneous mode excitation inside the MMF. As for RAF 2-1, there are many dark spots in the PBR map synthesized under small γ, indicating poor accuracy of the corresponding rows of TM being retrieved. With a larger γ, the PBR map becomes more homogeneous with less dark spots, which means an overall improvement of the TM accuracy. Notably, when γ = 8, the PBR map by RAF 2-1 is comparable to that of the holography, with a mean PBR of 569.4 reaching ∼ 93.6% efficiency of the golden standard experimentally. The key advantage is RAF 2-1 does not require holographic setup for the calibration of TM, compatible with more applications. Additionally, with parallel operation and GPU implementation, the TM retrieval process by RAF 2-1 was fast enough. For example, under γ = 8, retrieving an 8685×1024 TM by RAF 2-1 averagely took 42.3 s, when using a computer with an Intel Xeon CPU E5-1650 v3 @3.50 GHz, a NVIDIA RTX2080 Ti GPU, and 128 GB RAM.
Using the retrieved TM by RAF 2-1, one can further reconstruct an input object from intensity-only speckle measurement with one more phase retrieval. The reconstruction result relies heavily on the quality of the recovered TM, which acts as the measurement matrix. Fig. 6a shows the results of reconstructing a 32 × 32 phase object of a Chinese character (meaning "light"), by taking 100 iterations with the TM of MMF retrieved by RAF 2-1 when γ increased from 3 to 8. Pearson correlation coefficient (PCC) is used to quantify the similarity between the reconstructed image I R and the ground truth I G , which is given by

Discussion and conclusion
There have been various phase retrieval algorithms used for solving the TM retrieval problem, as introduced earlier.
RAF, as one of the best in the nonconvex family, has been reported previously [53] to be highly competitive for image restoration from speckle measurement. To the best of our knowledge, we first adopted it for non-holographic calibration of TM [41]. More importantly, our modified version, RAF 2-1, with an additional step of gradient heating, has shown remarkable improvement in the robustness against noise and the TM retrieval accuracy in both simulations and experiments. The numerical evaluation of RAF and RAF 2-1 can be further seen in Appendix B. Besides the above modification, we resort to speeding up the convergence of RAF for TM retrieval. Efforts include employing adaptive step size in the gradient descent process or other gradient acceleration schemes, such as Limited memory-BFGS (L-BFGS) [54] and nonlinear conjugate gradient (NCG) [55] methods. However, the improvements are not very impressive, with details referred to Appendix C.
There are also several limitations in the study. In the experimental setup, the MMF output field was relayed by a collimator instead of an objective lens. Consequently, the working plane of the MMF was immovable, which had a certain distance (about tens of micrometers) away from the fiber end. That said, the setup was sufficient for retrieving a reliable TM and focusing on the working plane for demonstration. An objective lens is needed only for measuring the TMs corresponding to different working planes. In addition, since there is phase ambiguity for the formulated LSE objective function in Eq. (4), a phase offset exists for each row of the retrieved TM. However, it does not affect the intensity of the generated 2D foci. Further phase correction [27] is indispensable when it comes to MMF 3D volumetric focusing and imaging.
In conclusion, we have proposed a modified nonconvex approach, RAF 2-1, for retrieving the TM of MMF based on speckle intensity measurements. Theoretically, RAF 2-1 can achieve optimum focusing efficiency with less running time or sampling rate than the previously reported TM retrieval methods. The experimental results of light control through a MMF confirm a comparable performance of RAF 2-1 to the golden standard holography method for TM measurement. RAF 2-1 is also computationally efficient that took averagely 42.3 s to recover an 8685×1024 TM (γ = 8) on a regular computer under parallel operation and GPU implementation. Endowed with the advantages of optimum efficiency, fast execution, and a reference-less setup, RAF 2-1 allows for broad applications in MMF-based imaging, manipulation, and treatment etc.
Appendix A: The best iteration ratio for the two-step gradient iteration process of RAF 2-1 There are two steps in the gradient iteration process regarding the proposed RAF 2-1. In order to determine the number of iterations in Steps 1 and 2 (with the total number fixed) for the best performance, numerical experiments were conducted to compare the focusing efficiency achieved by RAF 2-1 under a series of iteration ratios. Moreover, since GGS 2-1 also involved the two-step gradient descent, it inspires this work and used for performance comparison. Therefore, the best ratio of iteration of GGS should also be determined. Fig. 7 gives the results of both RAF 2-1 and GGS 2-1, with a total iteration of 300. It can be observed for RAF 2-1, there are minor differences of focusing efficiency among different iteration ratios, while the one at 2/3 is chosen as the best iteration ratio due to a slight advantage. As for GGS 2-1, the best focusing efficiency is around an iteration ratio of 9/10, which is also consistent with the original research that adopted 287 and 34 iterations in Steps 1 and 2 for GGS 2-1, respectively.
Appendix B: Numerical evaluation of RAF and RAF 2-1 As mentioned in the Methods section, the modified version, RAF 2-1, is more effective for TM retrieval. To evaluate how the performance of RAF 2-1 is better than the original RAF, a numerical experiment in a noiseless condition was performed for retrieving the TM that corresponds to 400 output modes. The curves of the averaged errors after normalization are presented in Fig. 8. The measurement error for the i th row of TM is defined as whereâ i is the estimate of the i th row of TM and other notations are with the same meaning as in the Methods section. It can be observed that the error of RAF 2-1 can finally decline to a level of as low as 10 −4 , much lower than that of RAF. This indicates a more accurate result of TM retrieval by RAF 2-1.
Appendix C: Accelerated gradient descent for RAF As discussed earlier, the ways to accelerate the convergence of RAF were also studied, by using an adaptive step size and a more advanced gradient descent solver. First, we adopted the Barzilai-Borwein method for non-monotonic  backtracking line-search of step size, which was compared with the fixed one (µ = 3). As seen in Fig. 9a, the measurement error of using adaptive µ drops slightly more rapidly than that of fixed µ within the first 20 s of running time, while the latter eventually declines to a lower level. This shows the adaptive step size method is not very effective, although it could be better with parameter finetuning. As for the gradient descent solver, besides the regular steep descent using the negative first derivative (i.e., the gradient) as the descent direction, acceleration methods such as NCG and L-BFGS were employed for comparison. Since NCG and especially L-BFGS entail more computations per iteration than SD, for fair comparison, the curves of error as a function of running time (instead of iterations) for different optimization methods were compared, as shown in Fig. 9b. We can see that NCG has the fastest convergence with the same running time. The reason that L-BFGS method is even slower in the declining trend of error is attributed to the far more seconds per iteration it requires. In fact, the average number of iterations taken by SD, NCG, and L-BFGS are 671, 656, and 411, respectively. The convergence for L-BFGS could be the fastest if compared from the perspective of error declining versus iterations.

Disclosures
The authors declare that there are no conflicts of interest fo this work.

Data Availability
The code demo of RAF 2-1 is available in https://github.com/Ford666/RAF21_TM_retrieval, and the data that support the findings of this study could be available by reasonable request to the corresponding author.