Field programmable gate array-based coherent lidar employing the ordinal statistics method for fast Doppler frequency determination

Abstract. We report the development of a portable coherent Doppler lidar (CDL) system incorporating onboard field programmable gate array for real-time data processing. An ordinal statistic-based mean Doppler frequency estimation algorithm with high accuracy and low computational cost is proposed. The basic characteristics of the proposed method were specified with the use of mathematical modeling through the algorithm, realized as MATLAB computer code that includes parts simulating lidar signals and data processing. A comparison of the proposed method with established and well-used determination methods of the maximum Doppler frequency is accomplished. The ordinal statistics method is easy to implement and facilitates obtaining computational results swiftly. To demonstrate the performance of this lidar system, an intercomparison field campaign between lidar and ultrasonic anemometer is carried out. The excellent degree of correlation experimentally confirmed the feasibility of developing affordable, low-power consumption, compact, and robust CDL systems for various industrial and scientific applications.


Introduction
Coherent Doppler lidar (CDL), a powerful tool for wind velocity measurements, has demonstrated its efficacy in many applications, including wind shear monitoring, 1,2 turbulence tracking, 3,4 aircraft wake vortices characterization, 5,6 and extreme weather forecasting. 7 For wind energy application, our understanding of the flow around wind turbines is mainly achieved through field measurements. 8,9 The CDL is one of the sensors for measuring atmospheric wind speed; it can provide more detailed information about the impact of turbulent flows approaching, hitting, and propagating through each wind turbine, which facilitates precisely pitching the blades before wind gusts front arrives and may eventually form a new basis for real-time adaptive control of wind turbines. 10 High-quality wind measurements are critical to the potential location of wind power plants. [11][12][13] The CDL has matured rapidly in recent years owing to the dramatic growth of worldwide wind energy. Even though CDLs are currently commercially available with tens of meters spatial and seconds temporal resolution as well as decimeters per second precision, they remain relatively bulky and expensive, which greatly limits their industrial and scientific applications. Therefore, to address the need for an effective industrial lidar solution, developing cost-affordable, power consumption low, size compact, and robust CDLs is still urgently required. 14,15 2 Theory In this study, a portable CDL system with just 180 × 230 × 260 mm dimensions, 15-kg weight, and no more than 200-watts power consumption for wind profiling in the atmospheric boundary layer is reported. The lidar can work using only a battery. The system adopts a fast frequency processing algorithm to improve the processing speed of the Doppler frequency shift. Compared with the traditional processing method, the processing time of the radial wind speed is reduced by three times. A schematic diagram of the pulsed CDL system is shown in Fig. 1(a), together with a photograph shown in Figs. 1(b) and 1(c). The lidar employs an ultranarrow master oscillator power amplifier structured fiber laser with a pulse energy of ∼40 μJ at a repetition rate of 10 kHz. By chopping the continuous wave seed laser utilizing an acousto-optic modulator, Gaussian-shaped laser pulses are generated with ∼200-ns duration and shifted by 80 MHz in the frequency domain. These laser pulses are amplified through an erbium-doped fiber amplifier and then transmitted to the atmosphere via a monostatic transceiver with a diameter of 50 mm. After reflection from the mirror, the laser enters a rotatable wedge lens, which refracts the laser to tilt the optical path, and by rotating the wedge lens, the direction of emission can be continuously adjusted. The backscattered lidar signal is guided to a 2 × 2 fiber coupler to optically mix with the local oscillator. The heterodyne signals are connected to a balanced detector and subsequently digitalized at a 1-GHz sampling rate using an analog-to-digital converter (ADC). By performing a fast Fourier transform (FFT), the line of sight wind velocity can be accurately retrieved from the proper Doppler frequency estimator applied to the incoherently accumulated power spectral densities in the frequency domain.
A notable feature of this CDL system is that the ADC card is equipped with an onboard field programmable gate array (FPGA) for real-time data preprocessing. Recent developments have demonstrated its feasibility for CDL to accelerate data processing. [16][17][18] The CDLs usually employ high-frequency repetition rate laser sources and high-speed digitizers with sampling rates even going up to gigahertz, which leads to an extremely large amount of data and imposes high requirements on the digital processing hardware. For example, for a 12-bit ADC with a sampling rate of 1 GHz utilized in this work, the data transfer rate from the data acquisition card to the host PC would be 1.5 Gbyte/s. Furthermore, more than 129.6 Tbyte additional hard-disk space is required for only 1 day of raw data storage, which makes data processing on the host PC impossible. In the as-presented lidar system, raw data preprocessing is carried out on the hardware level using an onboard FPGA, which is programmed to calculate the FFT of time-resolved signals in real-time and then accumulate the resulting power spectra for further wind speed retrieval immediately after the lidar signals are acquired by the ADC. The superiority of this configuration is that the data transferred from the ADC to the host PC are significantly reduced to greatly reduce the burden of the host PC. The FPGA-based data processing block diagram is illustrated in Fig. 2. The whole implementation is based on a Kintex-7 xc7k325t FPGA chip from Xilinx company on board a 12-bit, 1 G samples per second ADC card. To precisely position the acquired lidar signal for incoherent accumulation, the laser, ADC, and FPGA must be synchronized to a common reference clock. The clock source employed in this work is a differential output programmable oscillator SiT9102 using a phase-locked loop to provide AE10 ppm frequency stability clock signals with sub-picosecond phase jitter. SiT9102 generates 200-MHz and 80-MHz clock signals for the FPGA chip as the master clock of data processing and ADC as the communication reference signal, respectively. The clock generation module on the FPGA makes use of the logic gates to derive a synchronous 10-kHz trigger signal for the pulsed fiber laser. In this way, the laser transmitter, data acquisition, and data processing of the CDL system run synchronously.
After passing through a first in first out buffer in parallel, the raw data are streamed from ADC to FPGA via an advanced extensible interface (AXI) bus for pipelined FFT with 400 points, corresponding to a spatial resolution of 60 m. However, to take advantage of the speed and efficiency of the FFT algorithm made for FPGA, the FFT size is set to 512 points to match the data length to a power of 2 by zero padding, i.e., each bin contains 400 actual samples and an additional 112 zeros. The power spectra of each range gate can be estimated from the complex output of the FFT by calculating the sum of squares of its real and imaginary parts and then incoherently accumulated over 10,000 laser shots, which corresponds to an integration time of 1 s. The accumulated power spectra are stored in an onboard RAM memory and then transferred from the FPGA to the host PC across the peripheral component interconnect (PCI) express bus for postprocessing. Based on the proposed data preprocessing scheme, the data transfer rate is reduced to only 500 kByte/s. Data postprocessing on the host PC is mainly concerned with retrieving the radial wind velocity from the mean Doppler frequency of the lidar signal. Various estimators have been successfully used to determine the mean Doppler frequency, such as the pulse-pair estimator, 19 maximum-likelihood (ML) estimator, 20 periodogram ML (PML) estimator, 21 Gaussian fitting estimator, and many more. 22 Theoretically, the ideal performance of an unbiased estimator is given by the Cramer-Rao bound. 23 Spectral-domain estimators may fail where the spectral feature of the signal is buried in the noise. For covariance-based estimators, the results are usually biased owing to the effects of uncorrelated noise (white noise). 24 More recently, a convolutional neural network trained on real and simulated coherent lidar signals to achieve a robust estimate of the Doppler frequency shift method has been proposed and demonstrated for low signal-tonoise ratio regimes. 18

Algorithm Based on the Ordinal Statistics Method
Here, we suggest an ordinal statistic-based novel estimator with a low computational cost for fast Doppler frequency determination, which was first proposed by Akhmetyanov and Mishina for estimating the maximum position of a bell-shaped function, as shown in Fig. 3. 25 The basic principle of the method is to simply take the arithmetic mean of two characteristic positions (denoted as P1 and P2 in Fig. 3) as the maximum estimation, which are symmetrically located at the rising and falling edge of the discrete power spectrum.
Typically, the undistorted power spectra of CDL can be expressed in analytical form as a Gaussian function, which is reasonable for low altitudes both theoretically and experimentally. 26 For higher altitudes or a turbulent atmosphere, the non-Gaussian component may become significant in fluctuations of the Doppler lidar signal. 27 However, it does not matter too much because the proposed method is well suited for various bell-shaped distributions. We take a Gaussian distribution as an example here, and the power spectra of CDL are expressed as E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 1 ; 1 1 6 ; 3 2 7 (1) where v 0 is the center frequency of the Gaussian distribution, which characterizes the position of the function maximum, i.e., the mean Doppler frequency, and Δv is the spectra width. To minimize the estimation uncertainty of v 0 caused using biased φ 1 and φ 2 to deduce v 1 and v 2 , P1 and P2 are chosen to be situated at the points with maximum steepness. Upon basic mathematical analysis, it is easy to conclude that for a Gaussian distribution, the largest steepness occurs at v ¼ v 0 AE Δv. In this case, the values of the power spectra at P1 and P2 are given as E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 2 ; 1 1 6 ; 6 6 3 where φ max ¼ φðv 0 Þ. Although the power spectra of CDL are usually discrete, one can easily obtain v 1 and v 2 from φ 1 and φ 2 by linear interpolation between adjacent frequency points. Therefore, a general procedure for fast Doppler frequency estimation by means of an ordinal statistics algorithm is given as follows.
Search for the maximum value φ max from the accumulated power spectra of each range gate and identify the flag value φ 1;2 using Eq. (2).
Determine the elements of the power spectra series (ordinal statistics) satisfying the inequalities E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 3 ; 1 1 6 ; 5 3 4 where m ≤ l ≤ n.
Calculate the symmetrical frequencies of v 1 and v 2 by linear interpolation from fðv m−1 ; φ m−1 Þ; ðv m ; φ m Þg and fðv n ; φ n Þ; ðv nþ1 ; φ nþ1 Þg, respectively; then, the Doppler frequency can be estimated as The choice of the φ 1;2 level is based on the fact that the value of the bell-shaped function, distorted by noise, exceeds the values of φ m and, respectively, the values of φ m−1 and φ nþ1 are minimal. For the φ 1;2 ¼ y level, it is necessary to find the intersection points with the bell-shaped function, the values of which are interpolated at the intervals ðx m − 1; x m Þ and ðx n ; x n þ 1Þ, respectively, by straight lines y ¼ a 1 x þ b 1 and y ¼ a 2 x þ b 2 . Therefore, solutions should be found for the following systems of equations: E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 5 ; 1 1 6 ; 3 0 9 From these systems, the found estimates of x 1 and x 2 are used to estimate the position of the maximum of the bell-shaped function given in the following equation: E Q -T A R G E T ; t e m p : i n t r a l i n k -; e 0 0 6 ; 1 1 6 ; 2 6 0 Thus, the algorithm for estimating the average wind speed, following the method of ordinal statistics, is determined by Eqs. (4), (5), and (6).

Comparison
The general algorithm for CDL data processing is as follows. To find the maximum of the bell-shaped function, mathematical modeling of the signal, measured by the CDL, was carried out with subsequent processing by the considered methods.
In addition, compare the execution times of comparing the method side-by-side to the equivalent calculation maximum of frequency. The calculating used MATLAB R2019 a 9.6.0.1072779 running on a Sony VIVO Windows 10 i7 2.39Ghz tower. Processed big data was realized in a cloud cluster using MATLAB capabilities, so the running speed math to the ability of compact CDL. MATLAB has an integrated editor, code analysis, debugging, the ability to edit variables, etc. Table 1 summarizes the results of finding the maximum frequency and estimating the time of determining the maximum. Table 1 shows that the estimation results by Gaussian approximation and ordinal statistics methods are characterized by the smallest spread. Regarding the lower computational costs, when the ordinal statistics method is implemented, it is recommended to use it as the main method for estimating wind speed in a continuous CDL with conical scanning. Owing to the discreteness of the amplitude-frequency response, using the cubic smoothing spline method initially appears to be a methodological error.
The data accumulation time is 1 s, use 1 s of data for each FFT. For comparison of the results, a discretization frequency of 500 MHz, 512 points of an FFT, and 100 sampling points per remote gate with a resolution of 30 m are used.
While investigating the estimation process, it should be noted that the computational costs of the algorithm consist of seven addition-subtraction operations, six multiplication-division operations, and 2N enumeration-type operations (N is the data length) for each range gate. For a traditional Doppler frequency estimator such as the centroid method, the Doppler frequency is typically determined by averaging all points with equal weight given as The calculation cost includes 2ðN − 1Þ operations of the addition-subtraction type and N þ 1 operations of the multiplication-division type. It is obvious that with the increase in N, the computational cost of the central method increases disproportionately faster than that of the proposed method.
To evaluate the performance of the CDL employing the ordinal statistics method, comparison experiments between lidar and meteorological masts were carried out from January 11 to 13, 2022. The 72-hr continuous observations of horizontal wind speed and direction are illustrated in Fig. 4. At an altitude of 1.2 km, the planetary boundary layer ends here, so the Mie backscatter signal disappears, however, at higher altitudes, such as 1.8 and 2.4 km, the signal reappears as the laser hits the bottom of the cloud, thus producing a stronger Mie backscatter signal.
The lidar is placed approximately 50 m from the meteorological mast. The meteorological mast consists of 13 ultrasonic anemometers unevenly distributed from 10-to 350-m elevations. As an example, the measurements at a height of 350 m are chosen for the evaluation tests. The ultrasonic anemometers provide horizontal wind velocity and direction with an update rate of 1 min, whereas for the CDL working with the velocity azimuth display scanning strategy, it requires several minutes to complete a whole scan cycle to obtain the horizontal wind information, according to the advantage of speed and efficiency of the FFT algorithm made for FPGA. Eventually, we process the datasets of both instruments into a 10 min average. Figure 5 plots the comparison results of 72 hrs of horizontal winds between the lidar and the ultrasonic anemometer at 350-m altitudes, together with their difference, which uses the right-hand y-axis scale. Both the horizontal wind speed and the wind direction agree quite well during most of the time of the campaign except for the chosen period from 12:00 to 18:00 LT on January 12, 2012, where a large bias occurred in the wind direction owing to the near zero wind speed. The mean and standard deviations of the difference plots are −0.28 and 0.32 m/s for wind velocity and 0.57 deg and 7.55 deg for wind direction, respectively. Excellent agreement of the two instruments is represented in an alternative way in the regression analysis, as shown in Fig. 6. It should be mentioned that the data sets used here involve measurements from all 13 different altitudes, rather than 350 m only. A total of 290 data pairs were observed for comparison, and these samples are sufficiently statistically representative to  illustrate the accuracy of this algorithm. For both the wind velocity and wind direction, the least squares fit of the CDL versus ultrasonic anemometer to y ¼ kx þ b yields slopes close to unity and R of almost 100%, demonstrating that the lidar operates with acceptable accuracy.
To implement each algorithm on the FPGA, we need to adjust the algorithm to meet the pairing characteristics of the FPGA.
Gaussian fitting algorithm: transformed into the least square solution and realized by the improved Cholesky decomposition method.
Centroid method: the core Eq. (7), using the pairing characteristics of FPGA, the upper and lower parts are divided and calculated simultaneously. After one multiplication module is eliminated from the numerator part, the rest of the process is consistent with the denominator part; then, the two parts are transformed into the same addition tree module to replace sequential accumulation.
Cubic spline method: analyze from the algorithm principle. After obtaining the quadratic differential value m, obtain Am ¼ b in which the A matrix is a strictly diagonal matrix, and the equation has a unique solution. Using the lower-upper (LU) decomposition method, m can be obtained and then brought into each subinterval to carry out the cubic polynomial of each subinterval.
The ordinal statistics method is grounded based on the calculation method; then, the peak part can be related in pairs. Because the number of data samples is exactly 256, the comparison can be completed in 8 cycles, after the difference is partly to build the subtraction module for parallel calculation. Solving a 1 ; b 1 , a 2 ; b 2 is essentially the same process and can be calculated in the meantime. Finally, when solving the equations, because the equations are only twodimensional, the inverse and transpose matrixes can be calculated directly, and the solution can be obtained directly by the least square method.
The above algorithm is realized on Vivado 2017.04 of Xilinx company. The working clock of the FPGA setting is 200 MHz, and the single-precision floating-point IP core is based on the Institute of Electrical and Electronics Engineers (IEEE) 754 standard. Using these settings, each operation in the algorithm was realized. Under the condition of meeting the working time, the delay time of the floating-point operation IP core is described in the Table 2: A comparison of the theoretical clock cycles of the four algorithms is demonstrated in Table 3.

Conclusion
In summary, we have demonstrated the performance of an FPGA-based CDL system employing the ordinal statistics method for fast Doppler frequency determination. The hardware configuration significantly reduced the burden of the host PC and enabled real-time data processing for wind measurements. A mean Doppler frequency estimator with high estimation accuracy, stable performance, and low computational cost is highly suggested. In the test campaign, the intercomparison results between the lidar and the ultrasonic anemometer experimentally confirmed that the lidar is operating with satisfactory performance. Therefore, we believe the proposed solution has a strong potential for developing affordable, low-power consumption, compact, and robust CDL systems for various industrial and scientific applications. Zheng Qin is a student at the University of Science and Technology of China.
Xianghui Xue is a professor at the University of Science and Technology of China. He is a renowned scientist in areas such as atmospheric remote sensing based on optical and radio techniques and middle atmospheric modeling.
Tingdi Chen is an engineer at the University of Science and Technology of China.
He is a renowned scientist in areas such as stratosphere, mesosphere, and lower thermosphere