Fast ℓ1-regularized space-time adaptive processing using alternating direction method of multipliers

Abstract. Motivated by the sparsity of filter coefficients in full-dimension space-time adaptive processing (STAP) algorithms, this paper proposes a fast ℓ1-regularized STAP algorithm based on the alternating direction method of multipliers to accelerate the convergence and reduce the calculations. The proposed algorithm uses a splitting variable to obtain an equivalent optimization formulation, which is addressed with an augmented Lagrangian method. Using the alternating recursive algorithm, the method can rapidly result in a low minimum mean-square error without a large number of calculations. Through theoretical analysis and experimental verification, we demonstrate that the proposed algorithm provides a better output signal-to-clutter-noise ratio performance than other algorithms.


Introduction
Space-time adaptive processing (STAP) can effectively suppress strong ground/sea clutter and improve the moving target indication performance for airborne/spaceborne radar systems. 1 In full-dimension STAP algorithms, however, a large number of independent and identically distributed (I.I.D.) training snapshots are required to yield an average signal-to-clutter-noise ratio (SCNR) loss of ∼3 dB. 2 Moreover, full-dimension STAP algorithms have a high system complexity and require many memory elements. 3 In practical applications, it is generally difficult to satisfy these requirements.
To date, many algorithms have been proposed to overcome the drawbacks of full-dimension STAP algorithms. Reduced-rank STAP algorithms can reduce the clutter space while maintaining the performance of full-dimension STAP algorithms. 4,5 Consequently, the required number of snapshots can be reduced. However, these algorithms rely on eigenvalue decomposition, which is computationally expensive. To reduce the computational expense and the number of training snapshots simultaneously, some typical reduced-dimension STAP algorithms have been proposed, such as the joint domain localized approach and auxiliary channel processing. [6][7][8] However, the nonadaptive selection of the reduced-dimension projection matrix, which relies on intuitive experience, degrades the performance to a certain extent. 2
The sparsity of the filter coefficients in STAP has recently been studied, and the theoretical framework for sparsity-based STAP algorithms using the ℓ1-regularized constraint, the so-called least absolute shrinkage and selection operator (LASSO), has been established. [9][10][11][12] The classical algorithms for solving the LASSO problem adopt convex optimization, e.g., the interior point algorithm, to obtain a sparse solution. The complexity of these algorithms can be very high when the problem is large, which makes them impractical. To solve the optimization problem efficiently, the ℓ1-regularized recursive least-squares STAP (RLS-STAP) algorithm, 13 the ℓ1-regularized least-mean-square STAP algorithm, 14 and the homotopy-STAP algorithm 15 have been proposed. Sparsity-based STAP techniques have been shown to provide high resolution and better performance than conventional STAP algorithms. 16
The alternating direction method of multipliers (ADMM) is a technique that combines the decomposability of dual ascent with the rapid convergence of the method of multipliers. 17,18 This technique is well suited for solving optimization problems with an ℓ1 constraint, particularly large-scale problems. 19 The ADMM technique can converge within a few tens of iterations, which is acceptable in practical use. 20 In this study, according to the optimal criterion of minimizing the mean-square error, we propose an algorithm based on the ADMM technique to solve the ℓ1-regularized STAP problem. The proposed method provides better performance with a small number of I.I.D. training snapshots and without a large number of calculations.
The remainder of this paper is organized as follows. The system model of the generalized side-lobe canceler (GSC) form of the sparsity-based STAP is introduced in Sec. 2. In Sec. 3, the theory of the ADMM algorithm is introduced, and the ℓ1-regularized ADMM-STAP algorithm is proposed. The associated optimization problem is formulated and solved analytically. The performance improvement of the proposed algorithm is shown in Sec. 4. Section 5 provides the conclusion.
Notation: In this paper, a variable, a column vector, and a matrix are represented by a lowercase letter, a lowercase bold letter, and a capital bold letter, respectively. The operations of transposition, complex conjugation, and conjugate transposition are denoted by $(\cdot)^T$, $(\cdot)^*$, and $(\cdot)^H$, respectively. The symbol $\otimes$ denotes the Kronecker product, and the symbol $\|\cdot\|_n$ denotes the $\ell_n$-norm operator. $E(x)$ denotes the expected value of $x$, $|x|$ indicates the absolute value of $x$, and $(x)_+ \triangleq \max(0, x)$. $\mathrm{sign}(\cdot)$ is the component-wise sign function. 13

Background and Problem Formulation

System Model
The STAP technique is known for its ability to suppress clutter energy interference while detecting moving targets. Consider an airborne radar system equipped with a uniform linear array (ULA) consisting of $N$ receiving elements, as shown in Fig. 1. The radar transmits $K$ identical pulses at a constant pulse repetition frequency (PRF) $f_r \triangleq 1/T_r$ during a coherent processing interval (CPI), where $T_r$ is the pulse repetition interval. The received signal from the range bin of interest is represented as $\mathbf{x} = \mathbf{x}_t + \mathbf{x}_c + \mathbf{n}$, where $\mathbf{x}_t$ is the target vector, $\mathbf{x}_c$ is the clutter vector, and $\mathbf{n}$ is the thermal noise vector with noise power $\sigma_n^2$ on each channel and pulse. The space-time clutter vector can be represented as 21
$$\mathbf{x}_c = \sum_{n=1}^{N_c} \sigma_{c,n}\,\mathbf{v}(f_{d,n}, f_{s,n}), \tag{1}$$
where $N_c$ denotes the number of clutter patches in the range bin of interest and $\sigma_{c,n}$ denotes the random complex reflection coefficient. $f_{d,n} \triangleq 2 v_p T_r \sin\phi_n/\lambda$ and $f_{s,n} \triangleq d \sin\phi_n/\lambda$ are the Doppler frequency and spatial frequency for the $n$th clutter patch, respectively, where $\lambda$ is the wavelength and $d$ is the intersensor spacing of the ULA. $\mathbf{v}(f_{d,n}, f_{s,n}) \in \mathbb{C}^{NK \times 1}$ is the space-time steering vector, which is defined as a Kronecker product of the temporal and spatial steering vectors, i.e., $\mathbf{v}(f_d, f_s) = \mathbf{v}_d(f_d) \otimes \mathbf{v}_s(f_s)$. The target vector is $\mathbf{x}_t = \sigma_t \mathbf{v}(f_{d,t}, f_{s,t})$, where $f_{d,t} \triangleq 2 v_p T_r \sin\phi_t/\lambda + 2 v_t T_r/\lambda$ and $f_{s,t} \triangleq d \sin\phi_t/\lambda$. $v_t$ is the radial velocity of the moving target, and $\phi_t$ represents the angle of arrival (AOA) of the target. Note that in the following, $\mathbf{v}(f_{d,t}, f_{s,t})$ is rewritten as $\mathbf{v}_t$ for convenience.
Fig. 1 Radar platform flies at speed $v_p$ along the azimuth direction ($x$-axis). Without loss of generality, the center of elements is defined as the origin of coordinates. $h_p$ is the flight height, and $\phi$ represents the AOA of the clutter patch in the isorange clutter ring.
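The clutter model of Eq. (1) can be illustrated with a minimal NumPy sketch. The explicit exponential form of the temporal and spatial steering vectors, the function names, and the patch layout (evenly spaced angles, unit-variance reflection coefficients) are assumptions made for illustration and are not taken from the paper; the relation $f_d = \beta f_s$ follows from the definitions of $f_{d,n}$ and $f_{s,n}$ with $\beta = 2 v_p T_r/d$.

```python
import numpy as np

def steering_vector(f_d, f_s, K, N):
    """Space-time steering vector for normalized Doppler f_d and spatial frequency f_s."""
    v_d = np.exp(2j * np.pi * f_d * np.arange(K))  # temporal steering vector (assumed form)
    v_s = np.exp(2j * np.pi * f_s * np.arange(N))  # spatial steering vector (assumed form)
    return np.kron(v_d, v_s)                       # Kronecker product, length N*K

def clutter_snapshot(phis, sigmas, K, N, d_over_lambda=0.5, beta=1.0):
    """Sum of clutter-patch returns with the structure of Eq. (1)."""
    x_c = np.zeros(N * K, dtype=complex)
    for phi, sigma in zip(phis, sigmas):
        f_s = d_over_lambda * np.sin(phi)  # f_s = d sin(phi) / lambda
        f_d = beta * f_s                   # f_d = 2 v_p T_r sin(phi) / lambda
        x_c += sigma * steering_vector(f_d, f_s, K, N)
    return x_c

# Hypothetical example: 50 patches spread over the front half-plane.
rng = np.random.default_rng(0)
N, K, N_c = 4, 6, 50
phis = np.linspace(-np.pi / 2, np.pi / 2, N_c)
sigmas = (rng.standard_normal(N_c) + 1j * rng.standard_normal(N_c)) / np.sqrt(2)
x_c = clutter_snapshot(phis, sigmas, K, N)
```

With this ordering, the entry for pulse $k$ and element $n$ sits at index $kN + n$ of the stacked vector.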
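To clearly illustrate how the STAP method works, the GSC form of the STAP method is shown in Fig. 2. $\mathbf{B} \in \mathbb{C}^{NK \times (NK-1)}$ is the signal blocking matrix, which satisfies $\mathbf{B}^H \mathbf{v}_t = \mathbf{0}$. The main-channel output is $d_0 = \mathbf{v}_t^H \mathbf{x}$, and the auxiliary data vector is $\mathbf{b} = \mathbf{B}^H \mathbf{x}$. The optimal weight vector is $\boldsymbol{\omega}_b = \mathbf{R}_b^{-1} \mathbf{r}_{bd}$, where $\mathbf{R}_b = E(\mathbf{b}\mathbf{b}^H)$ is the clutter covariance matrix, and $\mathbf{r}_{bd} = E(\mathbf{b} d_0^*)$ is the cross-correlation vector between $d_0$ and $\mathbf{b}$. The output clutter power can be computed as
$$P_{\mathrm{out}} = \mathbf{v}_t^H \mathbf{R}_x \mathbf{v}_t - \mathbf{r}_{bd}^H \mathbf{R}_b^{-1} \mathbf{r}_{bd}, \tag{5}$$
where $\mathbf{R}_x = E(\mathbf{x}\mathbf{x}^H)$ is the input covariance matrix. The output SCNR can be expressed as
$$\mathrm{SCNR}_{\mathrm{out}} = \frac{|\sigma_t|^2\, |\mathbf{v}_t^H \mathbf{v}_t|^2}{\mathbf{v}_t^H \mathbf{R}_x \mathbf{v}_t - \mathbf{r}_{bd}^H \mathbf{R}_b^{-1} \mathbf{r}_{bd}}. \tag{6}$$
Maximizing the output SCNR is equivalent to maximizing the detection probability. However, $\mathbf{R}_b$ and $\mathbf{r}_{bd}$ are unknown in practice, and secondary training snapshots are required to estimate these parameters. 15 The best performance can be achieved if there are sufficient I.I.D. training snapshots. However, in many practical cases, it is impossible to obtain sufficient snapshots, and the performance degrades significantly.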
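The GSC structure can be sketched numerically as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the blocking matrix is built from a complete QR factorization (one of several valid constructions with $\mathbf{B}^H\mathbf{v}_t = \mathbf{0}$), the covariance $\mathbf{R}_x$ is a random Hermitian positive-definite matrix rather than radar data, and the function names are hypothetical.

```python
import numpy as np

def blocking_matrix(v_t):
    """Orthonormal basis of the subspace orthogonal to v_t, so that B^H v_t = 0."""
    q, _ = np.linalg.qr(v_t.reshape(-1, 1), mode="complete")
    return q[:, 1:]  # drop the first column, which is aligned with v_t

def gsc_wiener(R_x, v_t):
    """Wiener weights and output power of the GSC for a known covariance R_x."""
    B = blocking_matrix(v_t)
    R_b = B.conj().T @ R_x @ B        # covariance of b = B^H x
    r_bd = B.conj().T @ R_x @ v_t     # E(b d_0^*) with d_0 = v_t^H x
    w_b = np.linalg.solve(R_b, r_bd)  # solves R_b w_b = r_bd
    p_out = np.real(v_t.conj() @ R_x @ v_t - r_bd.conj() @ w_b)
    return B, w_b, p_out

# Hypothetical covariance and steering vector for an 8-dimensional example.
rng = np.random.default_rng(1)
n = 8
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
R_x = A @ A.conj().T + np.eye(n)
v_t = np.exp(2j * np.pi * 0.15 * np.arange(n))
B, w_b, p_out = gsc_wiener(R_x, v_t)
```

Because $\mathbf{B}$ spans the whole subspace orthogonal to $\mathbf{v}_t$, the resulting output power should agree with the minimum-variance value $|\mathbf{v}_t^H\mathbf{v}_t|^2 / (\mathbf{v}_t^H\mathbf{R}_x^{-1}\mathbf{v}_t)$, which gives a convenient sanity check.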

Sparsity-Based STAP
According to the STAP theory, it has been shown that the rank of the clutter covariance matrix is far lower than the degrees of freedom (DOFs) of the system. 22,23 Consequently, some reduced-rank and reduced-dimension STAP algorithms have been used to reduce the filter length, i.e., the filter coefficient vector obtained by full-dimension STAP is sparse. 14 Hence, in the GSC form of the sparsity-based STAP algorithm (see Fig. 2), let $\hat{\boldsymbol{\omega}}_b$ denote a sparse weight vector. Then, we obtain
$$\hat{\boldsymbol{\omega}}_b = \mathbf{R}_b^{-1}\mathbf{r}_{bd} + \boldsymbol{\varepsilon}. \tag{7}$$
The output of the sparsity-based STAP is
$$\hat{y} = d_0 - \hat{\boldsymbol{\omega}}_b^H \mathbf{b}. \tag{8}$$
Hence, the output clutter power for the sparsity-based STAP can be computed as
$$\hat{P}_{\mathrm{out}} = \mathbf{v}_t^H \mathbf{R}_x \mathbf{v}_t - \mathbf{r}_{bd}^H \mathbf{R}_b^{-1} \mathbf{r}_{bd} + \boldsymbol{\varepsilon}^H \mathbf{R}_b \boldsymbol{\varepsilon}, \tag{9}$$
where $\boldsymbol{\varepsilon}$ is the weight error vector caused by the sparsity constraint. Note that the target signal power is not affected by the sparsity constraint. The output SCNR can be expressed as
$$\widehat{\mathrm{SCNR}}_{\mathrm{out}} = \frac{|\sigma_t|^2\, |\mathbf{v}_t^H \mathbf{v}_t|^2}{\mathbf{v}_t^H \mathbf{R}_x \mathbf{v}_t - \mathbf{r}_{bd}^H \mathbf{R}_b^{-1} \mathbf{r}_{bd} + \boldsymbol{\varepsilon}^H \mathbf{R}_b \boldsymbol{\varepsilon}}. \tag{10}$$
Hence, the aim is to minimize the mean-square error $\boldsymbol{\varepsilon}^H \mathbf{R}_b \boldsymbol{\varepsilon}$. The objective function of the minimization problem can be rewritten as
$$J(\hat{\boldsymbol{\omega}}_b) = \boldsymbol{\varepsilon}^H \mathbf{R}_b \boldsymbol{\varepsilon} = \mathbf{r}_{bd}^H \mathbf{R}_b^{-1} \mathbf{r}_{bd} - 2\,\mathrm{Re}(\mathbf{r}_{bd}^H \hat{\boldsymbol{\omega}}_b) + \hat{\boldsymbol{\omega}}_b^H \mathbf{R}_b \hat{\boldsymbol{\omega}}_b. \tag{11}$$
$\hat{\boldsymbol{\omega}}_b$ is sparse, i.e., most of its elements are considerably smaller than the others. Hence, the minimization problem can be expressed as
$$\min_{\hat{\boldsymbol{\omega}}_b}\; J(\hat{\boldsymbol{\omega}}_b) + 2\lambda \|\hat{\boldsymbol{\omega}}_b\|_0, \tag{12}$$
where $\lambda$ is the regularization parameter for regulating the sparseness of $\hat{\boldsymbol{\omega}}_b$. However, the $\ell_0$-norm problem is nonconvex. Consequently, it is intractable even for optimization problems of moderate size. Equation (12) can be relaxed to the LASSO problem
$$\min_{\hat{\boldsymbol{\omega}}_b}\; J(\hat{\boldsymbol{\omega}}_b) + 2\lambda \|\hat{\boldsymbol{\omega}}_b\|_1. \tag{13}$$
In contrast to Eq. (12), Eq. (13) is convex and can be solved by convex optimization algorithms, such as the interior point method (IPM). The complexity of IPM-STAP can be very high when the problem is large, which makes it impractical.
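The statement that a sparsity-induced weight error adds exactly $\boldsymbol{\varepsilon}^H\mathbf{R}_b\boldsymbol{\varepsilon}$ to the output power can be checked numerically. The sketch below is illustrative only: the covariance, cross-correlation vector, and the value of $E|d_0|^2$ are random or hypothetical, and the sparse weights are obtained by simply zeroing the smaller half of the optimal weights, which is not the method proposed in the paper.

```python
import numpy as np

def output_power(w, R_b, r_bd, sigma_d2):
    """E|d_0 - w^H b|^2 for a fixed weight vector w, with sigma_d2 = E|d_0|^2."""
    return np.real(sigma_d2 - 2 * np.real(r_bd.conj() @ w) + w.conj() @ R_b @ w)

rng = np.random.default_rng(2)
n = 6
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
R_b = A @ A.conj().T + np.eye(n)                 # hypothetical covariance of b
r_bd = rng.standard_normal(n) + 1j * rng.standard_normal(n)
sigma_d2 = 50.0                                  # hypothetical value of E|d_0|^2

w_opt = np.linalg.solve(R_b, r_bd)               # Wiener solution R_b^{-1} r_bd
# Crude sparsification for illustration: keep only the larger half of the weights.
w_sparse = np.where(np.abs(w_opt) >= np.median(np.abs(w_opt)), w_opt, 0)

eps = w_sparse - w_opt                           # weight error vector
excess = np.real(eps.conj() @ R_b @ eps)         # predicted extra output power
gap = output_power(w_sparse, R_b, r_bd, sigma_d2) - output_power(w_opt, R_b, r_bd, sigma_d2)
```

The two quantities `gap` and `excess` should agree up to rounding, since the cross terms cancel when the reference weights satisfy $\mathbf{R}_b\boldsymbol{\omega}_b = \mathbf{r}_{bd}$.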

Variable Splitting
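In general, the ADMM algorithm can converge rapidly when a modest-accuracy result is acceptable. Fortunately, this is the case for the parameter estimation problem in the STAP application that we are considering. For statistical problems, solving a parameter estimation problem to a very high accuracy often yields little improvement. 19
The ADMM-STAP algorithm is based on the technique of variable splitting, i.e., we split the variable $\hat{\boldsymbol{\omega}}_b$ into a pair of variables, say, $\hat{\boldsymbol{\omega}}_b$ and $\mathbf{z}$, and add a constraint that the two variables are equal. Moreover, the objective function is split into the sum of two functions, which is then minimized. Explicitly, Eq. (13) can be rewritten in the ADMM form
$$\min_{\hat{\boldsymbol{\omega}}_b,\,\mathbf{z}}\; f_1(\hat{\boldsymbol{\omega}}_b) + f_2(\mathbf{z}) \quad \text{s.t.}\quad \hat{\boldsymbol{\omega}}_b - \mathbf{z} = \mathbf{0}, \tag{14}$$
where $f_1(\hat{\boldsymbol{\omega}}_b) = -2\,\mathrm{Re}(\mathbf{r}_{bd}^H \hat{\boldsymbol{\omega}}_b) + \hat{\boldsymbol{\omega}}_b^H \mathbf{R}_b \hat{\boldsymbol{\omega}}_b$ and $f_2(\mathbf{z}) = 2\lambda\|\mathbf{z}\|_1$. The problems of Eqs. (13) and (14) are clearly equivalent. In many cases, it is easier to solve the constrained problem Eq. (14) than the original unconstrained problem. As in the method of multipliers, the augmented Lagrangian function is formed as 19,20
$$L_\rho(\hat{\boldsymbol{\omega}}_b, \mathbf{z}, \mathbf{y}) = f_1(\hat{\boldsymbol{\omega}}_b) + f_2(\mathbf{z}) + 2\,\mathrm{Re}\{\mathbf{y}^H(\hat{\boldsymbol{\omega}}_b - \mathbf{z})\} + \rho\|\hat{\boldsymbol{\omega}}_b - \mathbf{z}\|_2^2, \tag{15}$$
where $\rho > 0$ is the augmented Lagrangian parameter and $\mathbf{y}$ is a vector of Lagrange multipliers.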

ℓ1-Regularized ADMM-STAP
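Define the residual and the scaled dual variable as $\mathbf{r} = \hat{\boldsymbol{\omega}}_b - \mathbf{z}$ and $\mathbf{d} = (1/\rho)\mathbf{y}$, respectively. Then, we have
$$L_\rho(\hat{\boldsymbol{\omega}}_b, \mathbf{z}, \mathbf{d}) = f_1(\hat{\boldsymbol{\omega}}_b) + f_2(\mathbf{z}) + \rho\|\mathbf{r} + \mathbf{d}\|_2^2 - \rho\|\mathbf{d}\|_2^2. \tag{16}$$
Subsequently, the ADMM-STAP algorithm can be rewritten in a convenient form
$$\begin{aligned}
\hat{\boldsymbol{\omega}}_b^{(k+1)} &= \arg\min_{\hat{\boldsymbol{\omega}}_b} \left\{ f_1(\hat{\boldsymbol{\omega}}_b) + \rho\|\hat{\boldsymbol{\omega}}_b - \mathbf{z}^{(k)} + \mathbf{d}^{(k)}\|_2^2 \right\}, \\
\mathbf{z}^{(k+1)} &= \arg\min_{\mathbf{z}} \left\{ f_2(\mathbf{z}) + \rho\|\hat{\boldsymbol{\omega}}_b^{(k+1)} - \mathbf{z} + \mathbf{d}^{(k)}\|_2^2 \right\}, \\
\mathbf{d}^{(k+1)} &= \mathbf{d}^{(k)} + \hat{\boldsymbol{\omega}}_b^{(k+1)} - \mathbf{z}^{(k+1)},
\end{aligned} \tag{17}$$
where $\mathbf{r}^{(k)} = \hat{\boldsymbol{\omega}}_b^{(k)} - \mathbf{z}^{(k)}$ is the residual at the $k$th iteration and $\mathbf{d}^{(k)} = \mathbf{d}^{(0)} + \sum_{j=1}^{k} \mathbf{r}^{(j)}$ is the running sum of the residuals. In the first line of Eq. (17), the objective is to minimize a strictly convex quadratic function, and the solution can be easily obtained as
$$\hat{\boldsymbol{\omega}}_b^{(k+1)} = (\mathbf{R}_b + \rho\mathbf{I})^{-1}\left[\mathbf{r}_{bd} + \rho(\mathbf{z}^{(k)} - \mathbf{d}^{(k)})\right]. \tag{18}$$
As mentioned, $\mathbf{R}_b$ and $\mathbf{r}_{bd}$ are unknown in practice, and they can be estimated as $\mathbf{R}_b = \sum_{l=1}^{L} \mathbf{b}(l)\mathbf{b}^H(l)/L$ and $\mathbf{r}_{bd} = \sum_{l=1}^{L} \mathbf{b}(l) d_0^*(l)/L$, where $L$ denotes the number of snapshots that are used. Moreover, $\mathbf{b}(l) = \mathbf{B}^H \mathbf{x}(l)$ and $d_0(l) = \mathbf{v}_t^H \mathbf{x}(l)$, where $\mathbf{x}(l)$ denotes the $l$th space-time snapshot. [13][14][15]
The solution of Eq. (18) can be obtained directly, i.e., noniteratively. However, this is impractical because the inversion of $(\mathbf{R}_b + \rho\mathbf{I})$ has a high computational complexity of $O[(NK-1)^3]$. Note that, according to Fig. 3, the clutter covariance matrix constructed by the training snapshots with regard to the current detecting snapshot can be written as in Eq. (19) in terms of $\overset{\frown}{\mathbf{R}}_b$, which is constructed by the training snapshots with regard to the previous detecting snapshot. Denote $\mathbf{P}^{(0)} = (\overset{\frown}{\mathbf{R}}_b + \rho\mathbf{I})^{-1}$; then, according to the matrix inversion lemma, 24 we obtain the recursion of Eq. (20). It is clear that $\mathbf{P}^{(4)} = (\mathbf{R}_b + \rho\mathbf{I})^{-1}$. Hence, the computational complexity can be reduced to $O[8(NK-1)^2]$. A full analysis of the computational complexity is presented in Table 1.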
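The matrix-inversion-lemma step can be sketched as a sequence of rank-one (Sherman-Morrison) updates. The sketch assumes, as suggested by $\mathbf{P}^{(0)}$, $\mathbf{P}^{(4)}$, and the $8(NK-1)^2$ cost quoted in the text, that moving to the next detecting snapshot adds two training snapshots and removes two; which snapshots enter and leave depends on the window layout of Fig. 3, so the indices below are hypothetical, as are the function names.

```python
import numpy as np

def rank_one_update(P, b, sign, L):
    """Inverse of (P^{-1} + sign * b b^H / L) for Hermitian P, via Sherman-Morrison."""
    Pb = P @ b
    denom = L + sign * np.real(b.conj() @ Pb)
    return P - sign * np.outer(Pb, Pb.conj()) / denom

def slide_window(P0, added, removed, L):
    """Apply one rank-one update per snapshot entering or leaving the window."""
    P = P0
    for b in added:
        P = rank_one_update(P, b, +1, L)
    for b in removed:
        P = rank_one_update(P, b, -1, L)
    return P

# Hypothetical data: a window of L snapshots that slides by two positions.
rng = np.random.default_rng(3)
n, L, rho = 5, 12, 0.5
snaps = rng.standard_normal((L + 2, n)) + 1j * rng.standard_normal((L + 2, n))

def sample_cov(S):
    return sum(np.outer(b, b.conj()) for b in S) / L

R_prev = sample_cov(snaps[:L])     # window for the previous detecting snapshot
R_curr = sample_cov(snaps[2:])     # window for the current detecting snapshot
P0 = np.linalg.inv(R_prev + rho * np.eye(n))
P4 = slide_window(P0, added=snaps[L:], removed=snaps[:2], L=L)
```

Each update needs one matrix-vector product and one outer product, i.e., on the order of $2n^2$ multiplications, so four updates are consistent with the quadratic cost stated above. Adding snapshots before removing them keeps every intermediate matrix positive definite.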
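In the second line of Eq. (17), the $\mathbf{z}$-update can be represented as
$$\mathbf{z}^{(k+1)} = \arg\min_{\mathbf{z}} \left\{ 2\lambda\|\mathbf{z}\|_1 + \rho\|\hat{\boldsymbol{\omega}}_b^{(k+1)} - \mathbf{z} + \mathbf{d}^{(k)}\|_2^2 \right\}. \tag{21}$$
Although the absolute value function is not differentiable, a simple closed-form solution can easily be obtained. Explicitly, the solution is
$$\mathbf{z}^{(k+1)} = S_{\lambda/\rho}\left(\hat{\boldsymbol{\omega}}_b^{(k+1)} + \mathbf{d}^{(k)}\right), \tag{22}$$
where $S_\kappa(a) = \mathrm{sign}(a)(|a| - \kappa)_+$ is the soft-thresholding operator, applied component-wise. The soft-thresholding operator is essentially a shrinkage operator, which moves a point toward zero.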
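For complex-valued weights, the soft-thresholding operator shrinks the magnitude of each entry and keeps its phase. A minimal sketch follows; the function name and the threshold symbol `kappa` are illustrative choices, and the complex sign is taken as $a/|a|$.

```python
import numpy as np

def soft_threshold(a, kappa):
    """Component-wise soft thresholding: sign(a) * max(|a| - kappa, 0)."""
    a = np.asarray(a, dtype=complex)
    mag = np.abs(a)
    # Guard against division by zero; zero entries stay zero.
    scale = np.maximum(mag - kappa, 0.0) / np.where(mag > 0, mag, 1.0)
    return a * scale

z = soft_threshold([3 + 4j, 0.5, 0.0, -2.0], 1.0)
```

Entries whose magnitude is at most `kappa` are set exactly to zero, which is what produces sparse weight vectors.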
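In the ADMM-STAP algorithm, $\hat{\boldsymbol{\omega}}_b$ and $\mathbf{z}$ are updated alternately, which accounts for the term alternating direction. Reasonable stopping criteria are that the primal and dual residuals must be small, i.e.,
$$\|\mathbf{r}^{(k)}\|_2 \le \epsilon^{\mathrm{pri}} \quad \text{and} \quad \|\mathbf{s}^{(k)}\|_2 \le \epsilon^{\mathrm{dual}}, \tag{23}$$
where $\mathbf{s}^{(k)} = \rho(\mathbf{z}^{(k)} - \mathbf{z}^{(k-1)})$ is the dual residual, and $\epsilon^{\mathrm{pri}}$ and $\epsilon^{\mathrm{dual}}$ are thresholds that are chosen by absolute and relative criteria
$$\begin{aligned}
\epsilon^{\mathrm{pri}} &= \sqrt{NK-1}\,\epsilon^{\mathrm{abs}} + \epsilon^{\mathrm{rel}} \max\left\{\|\hat{\boldsymbol{\omega}}_b^{(k)}\|_2, \|\mathbf{z}^{(k)}\|_2\right\}, \\
\epsilon^{\mathrm{dual}} &= \sqrt{NK-1}\,\epsilon^{\mathrm{abs}} + \epsilon^{\mathrm{rel}}\, \rho\|\mathbf{d}^{(k)}\|_2.
\end{aligned} \tag{24}$$
A reasonable value for $\epsilon^{\mathrm{rel}}$ is $10^{-4}$ to $10^{-3}$, and the choice of $\epsilon^{\mathrm{abs}}$ depends on the scale of the typical variable values. The detailed iterative procedure of ADMM-STAP is shown in Fig. 4.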
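The alternating updates and the stopping test can be combined into a short loop. The sketch below is a hedged illustration rather than the procedure of Fig. 4: it assumes the objective $\hat{\boldsymbol{\omega}}_b^H\mathbf{R}_b\hat{\boldsymbol{\omega}}_b - 2\,\mathrm{Re}(\mathbf{r}_{bd}^H\hat{\boldsymbol{\omega}}_b) + 2\lambda\|\hat{\boldsymbol{\omega}}_b\|_1$ with a penalty term $\rho\|\cdot\|_2^2$, uses stopping thresholds of the absolute-plus-relative type, and forms the inverse directly instead of using the rank-one recursion. All names and default values are hypothetical.

```python
import numpy as np

def soft_threshold(a, kappa):
    """Component-wise soft thresholding for complex vectors."""
    mag = np.abs(a)
    return a * np.maximum(mag - kappa, 0.0) / np.where(mag > 0, mag, 1.0)

def admm_stap(R_b, r_bd, lam, rho=1.0, eps_abs=1e-6, eps_rel=1e-4, max_iter=500):
    """ADMM iterations for the assumed l1-regularized quadratic objective."""
    n = r_bd.size
    P = np.linalg.inv(R_b + rho * np.eye(n))   # direct inverse, for illustration only
    z = np.zeros(n, dtype=complex)
    d = np.zeros(n, dtype=complex)
    for k in range(1, max_iter + 1):
        w = P @ (r_bd + rho * (z - d))         # quadratic (weight) update
        z_old = z
        z = soft_threshold(w + d, lam / rho)   # shrinkage update
        r = w - z                              # primal residual
        d = d + r                              # scaled dual update
        s = rho * (z - z_old)                  # dual residual
        eps_pri = np.sqrt(n) * eps_abs + eps_rel * max(np.linalg.norm(w), np.linalg.norm(z))
        eps_dual = np.sqrt(n) * eps_abs + eps_rel * rho * np.linalg.norm(d)
        if np.linalg.norm(r) <= eps_pri and np.linalg.norm(s) <= eps_dual:
            break
    return z, k

# Hypothetical check: with R_b = I the minimizer is the soft-thresholded r_bd.
r_bd = np.array([3 + 4j, 0.5, -2.0, 0.2j])
z_hat, iters = admm_stap(np.eye(4), r_bd, lam=1.0, eps_abs=1e-10, eps_rel=1e-8)
```

Returning `z` rather than `w` yields exact zeros in the weight vector; at convergence the two agree to within the primal tolerance.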

Analysis of Convergence
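A proof of the convergence result is presented in this section. We begin by presenting the following theorem.
Theorem 1 (Eckstein-Bertsekas): 25 Consider the problem
$$\min_{\mathbf{u}}\; f_1(\mathbf{u}) + f_2(\mathbf{v}) \quad \text{s.t.}\quad \mathbf{v} = \mathbf{G}\mathbf{u}, \tag{25}$$
in the case where the functions $f_1(\cdot)$ and $f_2(\cdot)$ are closed, proper, and convex and $\mathbf{G}$ has a full column rank. Let $\{\eta_k \ge 0,\ k = 0, 1, \cdots\}$ and $\{\gamma_k \ge 0,\ k = 0, 1, \cdots\}$ be two sequences such that
$$\sum_{k=0}^{\infty} \eta_k < \infty \quad \text{and} \quad \sum_{k=0}^{\infty} \gamma_k < \infty. \tag{26}$$
Assume that there are three sequences $\{\mathbf{u}_k,\ k = 0, 1, \cdots\}$, $\{\mathbf{v}_k,\ k = 0, 1, \cdots\}$, and $\{\mathbf{t}_k,\ k = 0, 1, \cdots\}$ that satisfy, for some $\mu > 0$,
$$\begin{aligned}
\left\| \mathbf{u}_{k+1} - \arg\min_{\mathbf{u}} \left\{ f_1(\mathbf{u}) + \tfrac{\mu}{2}\|\mathbf{G}\mathbf{u} - \mathbf{v}_k - \mathbf{t}_k\|_2^2 \right\} \right\| &\le \eta_k, \\
\left\| \mathbf{v}_{k+1} - \arg\min_{\mathbf{v}} \left\{ f_2(\mathbf{v}) + \tfrac{\mu}{2}\|\mathbf{G}\mathbf{u}_{k+1} - \mathbf{v} - \mathbf{t}_k\|_2^2 \right\} \right\| &\le \gamma_k, \\
\mathbf{t}_{k+1} &= \mathbf{t}_k - (\mathbf{G}\mathbf{u}_{k+1} - \mathbf{v}_{k+1}).
\end{aligned} \tag{27}$$
Then, if Eq. (25) has an optimal solution $\mathbf{u}^\dagger$, the sequence $\{\mathbf{u}_k\}$ converges to this solution, i.e., $\mathbf{u}_k \to \mathbf{u}^\dagger$.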
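First, since Eq. (14) is a particular instance with $\mathbf{G} = \mathbf{I}$, the full-rank condition in Theorem 1 is satisfied. Second, it is clear that $f_1(\hat{\boldsymbol{\omega}}_b) = -2\,\mathrm{Re}(\mathbf{r}_{bd}^H \hat{\boldsymbol{\omega}}_b) + \hat{\boldsymbol{\omega}}_b^H \mathbf{R}_b \hat{\boldsymbol{\omega}}_b$ and $f_2(\mathbf{z}) = 2\lambda\|\mathbf{z}\|_1$ in Eq. (14) are closed, proper, and convex. Moreover, the sequences $\{\hat{\boldsymbol{\omega}}_b^{(k)}\}$, $\{\mathbf{z}^{(k)}\}$, and $\{\mathbf{d}^{(k)}\}$ generated by Eq. (17) satisfy the conditions of Eq. (27) exactly ($\eta_k = \gamma_k = 0$). Hence, the convergence is guaranteed.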

Analysis of Computational Complexity
A comparison of the computational complexities of four STAP algorithms, namely, the conventional sample matrix inversion (SMI) STAP, 2 ℓ1-regularized RLS-STAP, 14 ℓ1-regularized online coordinate descent (OCD) STAP, 26 and the proposed ADMM-STAP algorithm, is presented in Table 1. The computational complexity is measured by the number of complex multiplications and additions. As shown in Table 1, the ADMM-STAP algorithm has a computational complexity of $O[(M+8)(NK-1)^2]$, where $M$ is the number of iterations. According to the simulation in Sec. 4, the algorithm can converge to an acceptable solution within a few tens of iterations, i.e., $M + 8$ would be less than $4L$ and $NK - 1$. Hence, the ADMM-STAP algorithm has the lowest computational complexity of the four algorithms.
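To give a feel for the orders of magnitude, the sketch below compares the quoted ADMM-STAP count $(M+8)(NK-1)^2$ with the cubic cost $(NK-1)^3$ of a direct inversion. The array size is borrowed from the Mountaintop configuration described later in the paper (14 elements, 16 pulses), while the iteration count `M = 30` is a hypothetical stand-in for "a few tens of iterations"; constant factors hidden by the $O[\cdot]$ notation are ignored.

```python
def admm_stap_ops(N, K, M):
    """Leading-order operation count (M + 8) * (NK - 1)^2 quoted for ADMM-STAP."""
    n = N * K - 1
    return (M + 8) * n ** 2

def direct_inverse_ops(N, K):
    """Leading-order operation count (NK - 1)^3 of a direct matrix inversion."""
    n = N * K - 1
    return n ** 3

N, K, M = 14, 16, 30  # M is an assumed iteration count
ratio = direct_inverse_ops(N, K) / admm_stap_ops(N, K, M)
print(f"direct / ADMM operation ratio: {ratio:.2f}")
```

The ratio reduces to $(NK-1)/(M+8)$, so the advantage grows with the number of space-time degrees of freedom as long as the iteration count stays in the tens.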

Simulation Results
The simulation parameters for the ground moving target indication application are listed in Table 2: a radar system equipped with a side-looking ULA is employed, and the elements are spaced half a wavelength apart, i.e., $d = \lambda/2$. Additive noise is modeled as spatially and temporally independent complex Gaussian noise with zero mean and unit variance. $f_r = 4 v_p/\lambda$; hence, $\beta = 2 v_p T_r/d = 1$. All the results are obtained from the average of 100 independent Monte-Carlo simulations.
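The value $\beta = 1$ follows from the stated parameters alone. The short worked step below substitutes $T_r = 1/f_r$, $f_r = 4 v_p/\lambda$, and $d = \lambda/2$; it uses only quantities defined in the text.

$$\beta = \frac{2 v_p T_r}{d} = \frac{2 v_p}{d\, f_r} = \frac{2 v_p}{(\lambda/2)(4 v_p/\lambda)} = \frac{2 v_p}{2 v_p} = 1.$$

Since $f_{d,n} = 2 v_p T_r \sin\phi_n/\lambda = \beta\, d \sin\phi_n/\lambda = \beta f_{s,n}$, this choice places the clutter of a side-looking array on the line $f_d = f_s$ in the normalized angle-Doppler plane.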

Setting of Regularization Parameter
The regularization parameter provides a tradeoff between the SCNR steady-state performance and the convergence speed. Although it is clear that the value of $\lambda$ should be proportional to the noise power and inversely proportional to the rank of the clutter covariance matrix, it is still difficult to determine the optimal value. Adjusting the regularization parameter adaptively is an interesting research area (e.g., Refs. 13 and 14). However, this area is not the main focus of our paper. In this paper, the regularization parameter is selected from a fixed set $\Omega = \{0.1, 1, 10, 50\}$. The output SCNR versus the number of snapshots used, for different values of the regularization parameter $\lambda$, is shown in Fig. 5. In this simulation, we assume that the signal of the moving target impinges on the array from a direction of arrival (DOA) of 90 deg and that the radial velocity of the moving target $v_t$ is 28 m/s (the Doppler frequency of the moving target is nearly 231 Hz). The results in Fig. 5 indicate that (i) the value of $\lambda$ is crucial to the output SCNR performance, and there is a reasonable range of values, i.e., $1 \le \lambda \le 10$, that can improve the convergence speed and the output SCNR steady-state performance simultaneously; (ii) the output SCNR is degraded when $\lambda$ is too large since the filter weight vector is shrunk to zero; and (iii) the output SCNR performance is not considerably improved when $\lambda$ is too small; in this case, the output SCNR performance is nearly the same as that of the conventional STAP algorithm.
The output SCNR performance versus the Doppler frequency of the moving target at a DOA of 90 deg is shown in Fig. 6. The range of potential Doppler frequencies is from −500 to 500 Hz, and 60 snapshots are used to optimize the filter vector. The same conclusion can be obtained. This figure shows that the ADMM-STAP algorithm with $1 \le \lambda \le 10$ provides a satisfactory output SCNR performance.
The number of iterations with different values of $\lambda$ is shown in Fig. 7. As shown, if we choose $\lambda$ from an appropriate range ($0.5 \le \lambda \le 10$), then the ADMM-STAP algorithm can converge rapidly within a few tens of iterations, which is acceptable in practice. Otherwise, the number of iterations increases significantly, and the iteration output cannot converge to the optimal solution, leading to a performance degradation to a certain extent.

Comparison with Other Algorithms
In this section, we compare the output SCNR performance of our proposed algorithm with that of the IPM-STAP, OCD-STAP, and RLS-STAP algorithms. The regularization parameter $\lambda$ is set to 1 for all the algorithms, and the other parameters are the same as in the previous simulations. The output SCNR performances versus the number of used snapshots and the target Doppler frequency are compared in Figs. 8 and 9. These figures show that (i) the output SCNR performance of the IPM-STAP algorithm is superior to that of the RLS-STAP and OCD-STAP algorithms, although this is achieved at a high computational cost, and (ii) the output SCNR performance of the ADMM-STAP algorithm can exceed that of the IPM-STAP algorithm, which supports our previous conclusion that optimizing the problem of parameter estimation to a high accuracy generally yields no improvement.

ℓ1-Regularized STAP with Mountaintop Data
The performance of the ℓ1-regularized STAP approaches is verified here using the Mountaintop data set (data No. t38pre01v1) acquired with the experimental radar system RSTER (radar surveillance technology experimental radar) sponsored by the Advanced Research Projects Agency. The Mountaintop program is devoted to supporting the mission requirements of next-generation airborne early warning platforms and to supporting the evaluation of STAP algorithms. The antenna for the system is a 5-m wide by 10-m high horizontally polarized array composed of 14 column elements. The CPI pulse number is 16, the antenna array spacing is 0.333 m, the PRF is 625 Hz, the carrier frequency is 435 MHz, and the bandwidth is 500 kHz. The transmit beam is steered to illuminate a mountain range (a large clutter scatterer). The data set is divided into two subsets in our experiment. The first subset, including 100 snapshots, is used to train the STAP filters. The second subset, including 100 snapshots, is used to test the performance. Two simulated moving targets are added to the test data subset. The signal of the first target impinges on the array from a DOA of −25 deg, and the Doppler frequency is 62.5 Hz. The signal of the second target impinges on the array from a DOA of 20 deg, and the Doppler frequency is 187.5 Hz. Hence, the first target can essentially be regarded as a ground moving vehicle in the mountain, and the second target can be regarded as an aircraft near the mountain. The minimum variance distortionless response (MVDR) spectra of the two subsets are shown in Fig. 10.
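The radar parameters quoted above fix where the two simulated targets sit in the normalized angle-Doppler plane. The sketch below works this out under two assumptions that are not stated in the text: the DOA is measured from array broadside, so that the spatial frequency is $d\sin\phi/\lambda$ as in the system model, and the wavelength is computed from the carrier frequency with the vacuum speed of light. The function and variable names are illustrative.

```python
import math

C = 299_792_458.0        # speed of light in m/s
prf_hz = 625.0           # pulse repetition frequency
carrier_hz = 435e6       # carrier frequency
spacing_m = 0.333        # antenna array spacing

wavelength_m = C / carrier_hz
d_over_lambda = spacing_m / wavelength_m   # element spacing in wavelengths

def normalized_frequencies(doa_deg, doppler_hz):
    """Normalized Doppler (cycles per pulse) and spatial frequency (cycles per element)."""
    f_d = doppler_hz / prf_hz
    f_s = d_over_lambda * math.sin(math.radians(doa_deg))  # DOA from broadside (assumed)
    return f_d, f_s

target_1 = normalized_frequencies(-25.0, 62.5)   # slow target
target_2 = normalized_frequencies(20.0, 187.5)   # fast target
```

The normalized Doppler frequencies come out as 62.5/625 and 187.5/625 of the PRF, and the element spacing is slightly under half a wavelength.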
The improvement factor (IF) performance, which is defined as the ratio of the output SCNR to the input SCNR, is investigated in Fig. 11. The regularization parameter $\lambda$ is set to 1 for all the algorithms. As shown, the IF performance of the proposed ADMM-STAP approach substantially outperforms that of the other approaches. Hence, the effectiveness of the proposed approach is confirmed with data from the experimental multichannel radar system RSTER.

Conclusions
In this paper, we proposed a sparsity-based approach based on an ℓ1-regularized constraint to accelerate the convergence speed of STAP. The optimization problem with an additional ℓ1-regularized constraint was solved using the ADMM, and the detailed iterative procedure of ADMM-STAP was derived. Through the examples, it was demonstrated that the proposed method can effectively decrease the required number of secondary snapshots and provide better performance than the ℓ1-regularized OCD-STAP and ℓ1-regularized RLS-STAP methods.