## 1.

## Introduction

Through-the-wall radar imaging (TWRI) is an emerging technology with considerable research interest and important applications in surveillance and reconnaissance for both civilian and military missions.^{1–6} To deliver high-resolution radar images in both range and crossrange, TWRI systems use wideband signals and large aperture arrays (physical or synthetic). This leads to prolonged data acquisition and high computational complexity because a large number of samples need to be processed. New approaches for TWRI are therefore needed to obtain high-quality images from fewer data samples at a faster speed. To this end, this paper proposes a new approach using compressive sensing (CS) for through-the-wall radar imaging. CS is used here to reconstruct a full measurement set, which is then employed for image formation using delay-and-sum (DS) beamforming.

CS enables a sparse signal to be reconstructed using considerably fewer data samples than what is required by the Nyquist-Shannon theorem.^{7–9} In through-the-wall radar imaging, the objective of applying CS is to speed up data acquisition and achieve high-resolution imaging.^{10–15} So far, the application of CS in TWRI can be divided into two main categories. In the first category, CS is applied to reconstruct the imaged scene directly by solving an ${\ell}_{1}$ optimization problem or using a greedy reconstruction algorithm.^{10}^{,}^{11}^{,}^{13–15} In the second category, CS is employed in conjunction with traditional beamforming methods. In other words, CS is applied to reconstruct a full data volume, and the conventional image formation methods, such as DS beamforming, are then used to form the image of the scene.^{12} By exploiting CS, the latter approach enables conventional beamforming methods to reconstruct high-quality images from reduced data samples. Moreover, adopting a conventional image formation approach produces images suitable for target detection and classification tasks, which typically follow the image formation step.

In Ref. 12, the full measurement set is recovered from the range profiles obtained by solving a separate CS problem at each sensor location. CS is applied in the temporal frequency domain only, leaving uncompressed sensing in the spatial domain. To recover the full measurement set, several CS problems are solved independently—one for each sensing location. There are also limitations in reducing the measurements along the temporal frequencies. Since the target radar-cross-section depends highly on signal frequency, significant reduction in transmitted frequencies will lead to deficient information about the target.^{13} Thus, to guarantee accurate reconstruction, imaging the scene with extended targets may require an increase in the number of measurements.^{5}^{,}^{16}

The conventional beamforming methods have been shown to be effective for image-based indoor target detection and localization when using a large aperture array and large signal bandwidth.^{17–24} However, a limitation of the traditional beamforming methods is that they require the full data volume to form a high-quality image; otherwise, the image quality deteriorates rapidly with a reduction of measurements. The question is then how to exploit the advantages of the traditional beamforming methods to obtain high-quality images from a reduced set of measurements.

To answer the aforementioned question and address the limitation of existing CS-based imaging methods, this paper proposes a new CS approach for TWRI, whereby a significant reduction in measurements is achieved by compressing both the transmitted frequencies and the sensor locations. First, CS is employed to restore the full measurement set. Then DS beamforming is applied to reconstruct the image of the scene. To increase the sparsity of the CS solution, an overcomplete Gabor dictionary is used for sparse representation of the imaged scene; Gabor dictionaries have been shown to be effective for sparse image decomposition and representation.^{25–27} In the proposed approach, fast data acquisition is achieved by reducing both the number of transceivers and the number of transmitted frequencies used to collect the measurement samples. In Ref. 12, data collection was performed at all antenna locations, using a reduced set of frequencies only. In contrast, the proposed approach achieves further measurement reduction by subsampling both the frequencies and the antenna locations used for data collection. Furthermore, to satisfy the sparsity assumption, a Gabor dictionary is incorporated in the scene representation. In Ref. 14, a wavelet transform was used as a sparsifying basis for the scene. However, our preliminary experiments show that the performance is highly dependent on the particular wavelet function used. We also found that wavelets offer no significant advantage over the Gabor basis in the problem of through-the-wall radar image formation. Finally, we should note that several approaches have been proposed for wall clutter mitigation in TWRI,^{1}^{,}^{28} including recent successful CS-based techniques.^{29}^{,}^{30} In this paper, we assume that wall clutter can be removed using any of those techniques, or that the background scene is available to perform background subtraction.

The remainder of the paper is organized as follows. Section 2 gives a brief review of CS theory. Section 3 presents TWRI using DS beamforming, and describes the proposed approach for TWRI image formation. Section 4 presents experimental results and analysis. Section 5 gives concluding remarks.

## 2.

## Compressive Sensing

CS is a signal acquisition framework that offers joint sensing and compression for sparse signals.^{7–9}^{,}^{31} Consider a $P$-dimensional signal $\mathbf{x}$ to be represented using a dictionary $\mathrm{\Psi}\in {\mathbb{R}}^{P\times Q}$ with $Q$ atoms. The dictionary is assumed to be overcomplete, that is, $Q>P$. Signal $\mathbf{x}$ is said to be $K$-sparse if it can be expressed as

## (1)

$$\mathbf{x}=\mathbf{\Psi}\mathit{\alpha},$$

where $\mathit{\alpha}$ is a coefficient vector with only $K$ nonzero entries.

Using a projection matrix $\mathbf{\Phi}$ of size $L\times P$, where $K<L\ll P$, we can obtain an $L$-dimensional measurement vector $\mathbf{y}$ as follows:

## (2)

$$\mathbf{y}=\mathbf{\Phi}\mathbf{x}=\mathbf{\Phi}\mathbf{\Psi}\mathit{\alpha}.$$

The original signal $\mathbf{x}$ can be reconstructed from $\mathbf{y}$ by exploiting its sparsity. Among all $\mathit{\alpha}$ satisfying $\mathbf{y}=\mathbf{\Phi}\mathbf{\Psi}\mathit{\alpha}$ we seek the sparsest vector, and then obtain $\mathbf{x}$ using Eq. (1). This signal reconstruction requires solving the following problem:

## (3)

$$\min {\Vert \mathit{\alpha}\Vert}_{0}\quad \text{subject to}\quad \mathbf{y}=\mathbf{\Phi}\mathbf{\Psi}\mathit{\alpha}.$$

Equation (3) is known to be NP-hard.^{32} Alternatively, the problem can be cast into an ${\ell}_{1}$ regularization problem:

## (4)

$$\min {\Vert \mathit{\alpha}\Vert}_{1}\quad \text{subject to}\quad {\Vert \mathbf{y}-\mathbf{\Phi}\mathbf{\Psi}\mathit{\alpha}\Vert}_{2}\le \epsilon,$$

where $\epsilon$ is a small constant bounding the residual. Several algorithms, such as Nesta,^{33} basis pursuit,^{34} and orthogonal matching pursuit,^{35} have been proposed that produce stable and accurate solutions.
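As an illustration of greedy sparse recovery, the following minimal sketch runs orthogonal matching pursuit on a noiseless toy problem. It is not the solver used in this paper. The matrix `[I, H/sqrt(n)]` (identity plus normalized Hadamard) is a hypothetical stand-in for $\mathbf{\Phi}\mathbf{\Psi}$, chosen only because its low mutual coherence (0.25 for $n=16$) makes 2-sparse recovery easy to check, and all function names are illustrative.

```python
import math

def hadamard(n):
    """Sylvester-type Hadamard matrix of order n (n must be a power of two)."""
    return [[-1.0 if bin(i & j).count("1") % 2 else 1.0 for j in range(n)]
            for i in range(n)]

def toy_sensing_matrix(n):
    """n x 2n matrix [I, H/sqrt(n)] with unit-norm columns (hypothetical stand-in)."""
    h = hadamard(n)
    scale = 1.0 / math.sqrt(n)
    return [[1.0 if i == j else 0.0 for j in range(n)]
            + [h[i][j] * scale for j in range(n)] for i in range(n)]

def least_squares(cols, y):
    """Fit y with the given columns via Gaussian elimination on the normal equations."""
    k = len(cols)
    g = [[sum(a * b for a, b in zip(cols[i], cols[j])) for j in range(k)]
         + [sum(a * b for a, b in zip(cols[i], y))] for i in range(k)]
    for p in range(k):
        pivot = max(range(p, k), key=lambda r: abs(g[r][p]))
        g[p], g[pivot] = g[pivot], g[p]
        for r in range(p + 1, k):
            factor = g[r][p] / g[p][p]
            g[r] = [u - factor * v for u, v in zip(g[r], g[p])]
    coef = [0.0] * k
    for p in reversed(range(k)):
        tail = sum(g[p][q] * coef[q] for q in range(p + 1, k))
        coef[p] = (g[p][k] - tail) / g[p][p]
    return coef

def omp(a, y, sparsity):
    """Orthogonal matching pursuit: pick the best-correlated atom, then refit."""
    n_rows, n_cols = len(a), len(a[0])
    atoms = [[a[i][j] for i in range(n_rows)] for j in range(n_cols)]
    support, coef, residual = [], [], list(y)
    for _ in range(sparsity):
        scores = [abs(sum(u * v for u, v in zip(atom, residual))) for atom in atoms]
        for j in support:
            scores[j] = -1.0  # never reselect an atom
        support.append(max(range(n_cols), key=scores.__getitem__))
        chosen = [atoms[j] for j in support]
        coef = least_squares(chosen, y)
        residual = [y[i] - sum(c * atom[i] for c, atom in zip(coef, chosen))
                    for i in range(n_rows)]
    alpha = [0.0] * n_cols
    for j, c in zip(support, coef):
        alpha[j] = c
    return alpha

# 16 measurements, 32 atoms, 2 nonzero coefficients
A = toy_sensing_matrix(16)
alpha_true = [0.0] * 32
alpha_true[3], alpha_true[20] = 1.0, -0.5
y = [sum(A[i][j] * alpha_true[j] for j in range(32)) for i in range(16)]
alpha_hat = omp(A, y, sparsity=2)
```

With noisy data, the ${\ell}_{1}$ formulation of Eq. (4) is generally the more robust choice; the greedy loop above is shown only because it fits in a few lines.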

## 3.

## Proposed Approach

In this section, we introduce the proposed two-stage TWRI approach based on CS. The main steps are as follows. First, compressive measurements are acquired using a fast data acquisition scheme that requires only a reduced set of antenna locations and frequency bins. An additional Gabor dictionary is incorporated into the CS model to sparsely represent the scene. Next, the full TWRI data samples are recovered, and the conventional DS technique is then applied to generate the scene image. We first give a brief review of the conventional DS beamforming method for image formation in Sec. 3.1, before presenting the new image formation approach in Sec. 3.2.

## 3.1.

### TWRI Using Delay-and-Sum Beamforming

Consider a stepped-frequency monostatic TWRI system that uses $M$ transceivers and $N$ narrowband signals to image a scene containing $R$ targets. The signal received at the $m$-th antenna location and $n$-th frequency, ${f}_{n}$, is given by

## (5)

$${z}_{m,n}=\sum _{r=1}^{R}{\sigma}_{r}({f}_{n})\exp \{-j2\pi {f}_{n}{\tau}_{m,r}\},$$

where ${\sigma}_{r}({f}_{n})$ is the reflection coefficient of the $r$-th target for the $n$-th frequency, and ${\tau}_{m,r}$ is the round-trip travel time of the signal from the $m$-th antenna location to the $r$-th target location. In the stepped-frequency approach, the frequency bins ${f}_{n}$ are uniformly distributed over the entire frequency band, with a step size $\mathrm{\Delta}f$:

## (6)

$${f}_{n}={f}_{1}+(n-1)\mathrm{\Delta}f,\quad \text{for}\;\; n=1,2,\dots ,N,$$

where ${f}_{1}$ is the lowest frequency in the band.

The target space behind the wall is partitioned into a rectangular grid, with ${N}_{x}$ pixels along the crossrange direction and ${N}_{y}$ pixels along the downrange direction. Using DS beamforming, a complex image is formed by aggregating the measurements ${z}_{m,n}$. The value of the pixel at location $(x,y)$ is computed as follows:

## (7)

$$I(x,y)=\frac{1}{MN}\sum _{m=1}^{M}\sum _{n=1}^{N}{z}_{m,n}\exp \{j2\pi {f}_{n}{\tau}_{m,(x,y)}\},$$

where ${\tau}_{m,(x,y)}$ is the round-trip focusing delay between the $m$-th antenna location and the pixel at $(x,y)$.

## 3.2.

### Proposed Two-Stage TWRI

Let $\mathbf{z}$ be the column vector obtained by stacking the data samples ${z}_{m,n}$ in Eq. (5), where $m=1,2,\dots ,M$ and $n=1,2,\dots ,N$. Let ${s}_{xy}$ be an indicator function defined as

## (8)

$${s}_{xy}=\begin{cases}{\sigma}_{r},& \text{if a target}\;r\;\text{exists at the}\;xy\text{-th pixel};\\ 0,& \text{otherwise}.\end{cases}$$

The elements ${s}_{xy}$ are then lexicographically ordered into a column vector $\mathbf{s}$. The magnitude of each element in $\mathbf{s}$ reflects the significance of a point in the scene. From Eq. (5), the full measurement vector $\mathbf{z}$ can be represented as

## (9)

$$\mathbf{z}=\mathbf{\Psi}\mathbf{s},$$

where $\mathbf{\Psi}$ is an overcomplete dictionary, which depends on the target scene, the antenna locations, and the transmitted frequencies. More precisely, $\mathbf{\Psi}$ is a matrix with ($M\times N$) rows and (${N}_{x}\times {N}_{y}$) columns. The entry at row $r$ and column $c$ is given as

## (10)

$${[\mathbf{\Psi}]}_{r,c}=\exp \{-j2\pi {f}_{n}{\tau}_{m,(x,y)}\},$$

where $r=(m-1)\times N+n$, and $c=(x-1)\times {N}_{y}+y$.

To reduce the data acquisition time and computational complexity, we propose acquiring only a small number of samples, represented by vector $\mathbf{y}$. The measurements in $\mathbf{y}$ are obtained by selecting only a subset of ${M}_{a}$ antenna locations and ${N}_{f}$ frequencies. In this paper, the reduced antenna locations are uniformly selected, and at each selected antenna location, the same number of frequency bins are regularly subsampled. This fast data acquisition scheme leads to stable image quality and is more suitable for hardware implementation. Figure 1(a) shows the conventional radar imaging where full data samples are acquired. Figure 1(b) illustrates the space-frequency subsampling pattern used in the proposed approach.
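The space-frequency subsampling described above can be sketched as a list of retained row indices of $\mathbf{z}$, which is all a selection matrix with one unit entry per row encodes. The exact selection rule is not spelled out in the text, so the evenly spaced, rounded indices below are an assumption, and the helper names `uniform_subset` and `selected_rows` are hypothetical.

```python
def uniform_subset(total, count):
    """Pick `count` 1-based indices spread evenly over 1..total (assumed rule)."""
    if count == 1:
        return [(total + 1) // 2]
    step = (total - 1) / (count - 1)
    return [1 + round(k * step) for k in range(count)]

def selected_rows(M, N, M_a, N_f):
    """1-based rows r = (m - 1) * N + n of z kept by the subsampling pattern."""
    antennas = uniform_subset(M, M_a)
    freqs = uniform_subset(N, N_f)  # same frequency subset at every kept antenna
    return [(m - 1) * N + n for m in antennas for n in freqs]

# Example sizes: 57 antenna locations, 201 frequencies, keep 3 x 40 samples
rows = selected_rows(57, 201, 3, 40)
```

Each entry of `rows` gives the column position of the single nonzero element in the corresponding row of the selection matrix, so the matrix itself never needs to be stored.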

Mathematically, the CS data acquisition can be represented using a projection matrix $\mathbf{\Phi}$ with (${M}_{a}\times {N}_{f}$) rows and ($M\times N$) columns. Each row of $\mathbf{\Phi}$ has only one non-zero entry with a value of 1 at a position determined by the selected antenna locations and frequency bins. Thus the reduced measurement vector $\mathbf{y}$ can be expressed as

## (11)

$$\mathbf{y}=\mathbf{\Phi}\mathbf{z}=\mathbf{\Phi}\mathbf{\Psi}\mathbf{s}=\mathbf{A}\mathbf{s},$$

where $\mathbf{A}=\mathbf{\Phi}\mathbf{\Psi}$.

In practical situations, the scene behind the wall is not exactly sparse because of multipath propagation, wall reflections, and the presence of extended objects, such as people or furniture. Therefore, the sparsity assumption on vector $\mathbf{s}$ may be violated. To overcome this problem, an additional overcomplete dictionary is employed to sparsely represent $\mathbf{s}$. In our approach, a Gabor dictionary is used. Let $\mathbf{W}$ be the synthesis operator for the signal expansion. Thus, the vector $\mathbf{s}$ can be expressed as

## (12)

$$\mathbf{s}=\mathbf{W}\mathit{\alpha},$$

where $\mathit{\alpha}$ is the vector of expansion coefficients.

Substituting Eq. (12) into Eq. (11) yields

## (13)

$$\mathbf{y}=\mathbf{A}\mathbf{W}\mathit{\alpha}.$$

For noisy radar signals, the compressive measurement vector $\mathbf{y}$ is modeled as

## (14)

$$\mathbf{y}=\mathbf{A}\mathbf{W}\mathit{\alpha}+\mathbf{v},$$

where $\mathbf{v}$ is the noise component.

The full data volume can be recovered by two techniques: the synthesis method and the analysis method. In the synthesis technique, the problem is cast as follows:

## (15)

$$\min {\Vert \mathit{\alpha}\Vert}_{1}\quad \text{subject to}\quad {\Vert \mathbf{y}-\mathbf{A}\mathbf{W}\mathit{\alpha}\Vert}_{2}\le \epsilon.$$

Once the coefficient vector $\mathit{\alpha}$ has been obtained by solving the optimization problem, the full TWRI data samples are obtained, using Eqs. (9) and (12), as

## (16)

$$\mathbf{z}=\mathbf{\Psi}\mathbf{W}\mathit{\alpha}.$$

In the analysis technique, the problem is formulated as

## (17)

$$\min {\Vert {\mathbf{W}}^{-1}\mathbf{s}\Vert}_{1}\quad \text{subject to}\quad {\Vert \mathbf{y}-\mathbf{A}\mathbf{s}\Vert}_{2}\le \epsilon.$$

By solving this optimization problem, we obtain the vector $\mathbf{s}$ directly, which can be used to reconstruct the full measurement vector $\mathbf{z}$, see Eq. (9).

Note that it was suggested in Ref. 37 that the analysis technique is less sensitive to noise, compared to the synthesis technique. In our approach, we use the analysis technique for solving the CS problem. After reconstructing the full measurement vector $\mathbf{z}$, we apply the conventional DS beamforming to generate the scene image as described in Sec. 3.1.
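The final DS beamforming stage can be sketched as follows. This is a minimal illustration under strong simplifying assumptions, not the implementation used in this paper: propagation is treated as free space (wall refraction and attenuation are ignored), the array lies on the line $y=0$, the reflection coefficient is frequency independent, and the grid, array, and function names are all hypothetical. The measurements follow the point-target model of Sec. 3.1, and the image is formed with Eq. (7).

```python
import cmath
import math

C = 299_792_458.0  # free-space propagation speed (m/s); wall effects are ignored

def round_trip_delay(antenna_x, point):
    """Two-way delay between an antenna at (antenna_x, 0) and a scene point (x, y)."""
    x, y = point
    return 2.0 * math.hypot(x - antenna_x, y) / C

def simulate_measurements(antennas, freqs, targets):
    """z[m][n] = sum over targets of sigma * exp(-j 2 pi f_n tau_m)."""
    return [[sum(sigma * cmath.exp(-2j * math.pi * f * round_trip_delay(a, pos))
                 for pos, sigma in targets)
             for f in freqs] for a in antennas]

def ds_beamform(z, antennas, freqs, xs, ys):
    """Delay-and-sum image: phase-compensate every sample and average (Eq. 7)."""
    norm = len(antennas) * len(freqs)
    image = []
    for y in ys:
        row = []
        for x in xs:
            acc = 0j
            for m, a in enumerate(antennas):
                tau = round_trip_delay(a, (x, y))
                for n, f in enumerate(freqs):
                    acc += z[m][n] * cmath.exp(2j * math.pi * f * tau)
            row.append(acc / norm)
        image.append(row)
    return image

# Small hypothetical setup: 11 antennas, 41 frequencies, 21 x 21 pixel grid
antennas = [-0.5 + 0.1 * m for m in range(11)]
freqs = [0.7e9 + 60e6 * n for n in range(41)]
xs = [-1.0 + 0.1 * i for i in range(21)]   # crossrange pixel centres (m)
ys = [1.0 + 0.1 * i for i in range(21)]    # downrange pixel centres (m)
targets = [((xs[14], ys[6]), 1.0)]         # one unit-reflectivity point target

z = simulate_measurements(antennas, freqs, targets)
image = ds_beamform(z, antennas, freqs, xs, ys)
peak = max(((iy, ix) for iy in range(len(ys)) for ix in range(len(xs))),
           key=lambda p: abs(image[p[0]][p[1]]))
```

In the proposed approach, `z` would be the measurement set restored by CS rather than a directly simulated one; the beamforming step itself is unchanged.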

## 4.

## Experimental Results and Analysis

In this section, we evaluate the proposed approach using both synthetic and real TWRI data sets. First, the performance of the proposed approach is investigated in Sec. 4.1 using synthetic data. Then the experimental results on real data are presented in Sec. 4.2, along with the TWRI experimental setup for radar signal acquisition.

## 4.1.

### Synthetic TWRI Data

Data acquisition is simulated for a stepped-frequency radar system operating between 0.7 and 3.1 GHz with a 12-MHz frequency step. Therefore, the number of frequency bins used is $N=201$. The scene is illuminated with an antenna array of length 1.24 m and an inter-element spacing of 0.022 m, which means the number of transceivers used is $M=57$. The full data volume $\mathbf{z}$ comprises $M\times N=57\times 201=11{,}457$ samples. Our goal is to acquire far fewer data samples without degrading the quality of the image.
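The sample counts quoted above can be checked with a few lines of integer arithmetic. The rule for the number of array elements (elements at $0, d, 2d, \dots$ that fit within the aperture) is an assumption that happens to reproduce $M=57$; the function names are illustrative. The 3 antenna by 40 frequency subset is the one used in the comparison later in this section.

```python
def num_frequency_bins(f_start_mhz, f_stop_mhz, step_mhz):
    """Number of stepped frequencies, both band edges included."""
    return (f_stop_mhz - f_start_mhz) // step_mhz + 1

def num_array_elements(aperture_mm, spacing_mm):
    """Assumed rule: elements at 0, d, 2d, ... that fit inside the aperture."""
    return aperture_mm // spacing_mm + 1

N = num_frequency_bins(700, 3100, 12)   # 0.7 to 3.1 GHz in 12-MHz steps
M = num_array_elements(1240, 22)        # 1.24 m aperture, 0.022 m spacing
full_volume = M * N
reduced_fraction = (3 * 40) / full_volume  # 3 antennas x 40 frequencies
```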

The TWRI system is placed in front of a wall at a standoff distance of ${Z}_{\mathrm{off}}=1.5\;\mathrm{m}$. The thickness and relative permittivity of the wall are $d=0.143\;\mathrm{m}$ and ${\epsilon}_{r}=7.6$, respectively. The downrange and crossrange of the scene extend from 0 to 6 m and from $-2$ to 2 m, respectively. The pixel size is equal to the Rayleigh resolution of the radar, which gives an image of size $97\times 65$ pixels. In this experiment, three extended targets (each covering 4 pixels) are placed behind the wall at positions ${p}_{1}=(1.21\;\mathrm{m},-0.78\;\mathrm{m})$, ${p}_{2}=(3.09\;\mathrm{m},1.09\;\mathrm{m})$, and ${p}_{3}=(4.96\;\mathrm{m},-0.16\;\mathrm{m})$. The reflection coefficients are considered to be independent of signal frequency: ${\sigma}_{1}=1$, ${\sigma}_{2}=0.5$, and ${\sigma}_{3}=0.7$. In our experiment, the first-order method Nesta is used to solve the CS optimization problem with the analysis method, owing to the robustness, flexibility, and speed of the solver. More details about the Nesta solver can be found in Ref. 33. Here a dictionary consisting of the complex Gabor functions is used for sparse decomposition of the scene.^{27}
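As a rough consistency check on the image size, the sketch below assumes that the pixel size in both directions equals the downrange Rayleigh resolution $c/(2B)$, with $B$ the signal bandwidth. The text does not state how the crossrange pixel size was chosen, so this is an assumption; it does reproduce the stated $97\times 65$ grid.

```python
import math

C = 299_792_458.0  # speed of light in free space (m/s)

def rayleigh_range_resolution(bandwidth_hz):
    """Downrange Rayleigh resolution c / (2B)."""
    return C / (2.0 * bandwidth_hz)

def pixels_along(extent_m, pixel_m):
    """Grid points spaced pixel_m apart that fit in the extent, end point included."""
    return math.floor(extent_m / pixel_m) + 1

pixel = rayleigh_range_resolution(3.1e9 - 0.7e9)  # about 0.0625 m
n_downrange = pixels_along(6.0, pixel)            # scene spans 0 to 6 m
n_crossrange = pixels_along(4.0, pixel)           # scene spans -2 to 2 m
```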

The peak-signal-to-noise ratio (PSNR) is used to evaluate the quality of the reconstructed images:

## (18)

$$\mathrm{PSNR}=20\,{\log}_{10}\left(\frac{{I}_{\mathrm{max}}}{\mathrm{RMSE}}\right),$$

where ${I}_{\mathrm{max}}$ denotes the maximum pixel value, and RMSE is the root-mean-square error between the reconstructed and the ground-truth images. The performance of the proposed approach in the presence of noise is evaluated by adding white Gaussian noise to the received signal.

Figure 2 shows the ground-truth image and the DS beamforming image reconstructed using the full measurement volume. Note that in this paper, all output images are normalized by the maximum image intensity. The true target position is indicated with a solid white rectangle. Figure 3 illustrates the images formed with reduced subsets of measurements (12% and 1%), using DS beamforming [Figs. 3(a) and 3(b)] and the proposed approach [Figs. 3(c) and 3(d)]. Here, the received signals are corrupted by additive white Gaussian (AWG) noise with $\mathrm{SNR}=20\;\mathrm{dB}$. Compared to the image obtained using DS beamforming with all measurement samples [Fig. 2(b)], the images produced using DS beamforming with reduced data samples [Figs. 3(a) and 3(b)] deteriorate significantly in quality and contain many false targets. By contrast, Figs. 3(c) and 3(d) show that images obtained with the proposed approach, using the same reduced datasets, suffer little or no degradation. These results demonstrate that the proposed approach performs significantly better than the standard DS beamforming when the number of measurements is reduced significantly. Furthermore, the images produced by the proposed approach using the reduced data samples have the same visual quality as the images formed by the standard DS beamforming method using the full data volume.
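The PSNR metric described above, the ratio of the maximum pixel value to the RMSE expressed in decibels, can be sketched as follows. The default `i_max=1.0` reflects the normalization of all images by their maximum intensity; the function name and the nested-list image format are illustrative assumptions.

```python
import math

def psnr(reconstructed, reference, i_max=1.0):
    """PSNR in dB, 20 * log10(i_max / RMSE), for two equally sized magnitude images."""
    rec = [v for row in reconstructed for v in row]
    ref = [v for row in reference for v in row]
    rmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(rec, ref)) / len(ref))
    if rmse == 0.0:
        return math.inf  # identical images
    return 20.0 * math.log10(i_max / rmse)

# Tiny example: one pixel out of four is off by 0.1, so RMSE = 0.05
reference = [[1.0, 0.0], [0.0, 0.0]]
reconstructed = [[1.0, 0.1], [0.0, 0.0]]
value = psnr(reconstructed, reference)
```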

To evaluate the robustness of the proposed approach in the presence of noise, the measurement signals are corrupted with AWG noise at SNRs of 5 and 30 dB. Figure 4 presents the average PSNR of the reconstructed images as a function of the ratio between the reduced measurement set and the full dataset. The figure clearly shows that the images formed with the proposed approach have considerably higher PSNR than the images formed with the standard DS beamforming, using the same measurements. This is because the proposed approach reconstructs the full data samples using CS, before applying DS beamforming.

To compare the performance of different imaging methods, we used three antenna locations and 40 uniformly selected frequencies, which represents 1% of the total data volume. Figure 5 shows the results obtained by different imaging methods. Figure 5(a) shows the CS image reconstructed with the method proposed in Ref. 10. Although the targets can easily be located, there are many false targets in the image. Figure 5(b) illustrates the image formed with the method presented in Ref. 12; this image is considerably degraded by heavy clutter. The reason is that the imaging method in Ref. 12 is not able to restore the full data volume from a reduced set of antenna locations. Figures 5(c) and 5(d) show the images formed with the proposed approach using wavelet and Gabor sparsifying dictionaries, respectively. Here, the wavelet family is the dual-tree complex wavelet transform. It can clearly be observed that the image formed using the Gabor dictionary contains less clutter; however, both dictionaries yield high-quality images even with a significant reduction in the number of collected measurements.

In the next experiment, only the frequency samples are reduced; the data is collected at all antenna locations, using $M=57$ transceivers. The reduced dataset represents 20% of the full data volume. Figure 6 presents the images formed using different approaches: standard CS method,^{10} the temporal frequency CS method,^{12} the proposed method with a wavelet dictionary, and the proposed method with a Gabor dictionary. It can be observed from Fig. 6(b) that there is a substantial improvement in the performance of the temporal frequency CS method.^{12} This is because when using all antenna locations, this imaging method can obtain the full data volume for forming the image. However, the proposed method yields images with less clutter, using both wavelet and Gabor dictionaries.

In summary, experimental results on synthetic TWRI data demonstrate that the proposed approach produces high-quality images using far fewer measurements by applying CS data acquisition in both frequency domain and spatial domain. The proposed approach performs better than the conventional DS and CS-based TWRI methods, especially when the number of measurements is drastically reduced.

## 4.2.

### Real TWRI Data

In this experiment, the proposed approach is evaluated on real TWRI data. The data used in this experiment were collected at the Radar Imaging Laboratory of the Center for Advanced Communications, Villanova University, USA. The radar system was placed in front of a concrete wall of thickness 0.143 m and relative permittivity ${\epsilon}_{r}=7.6$. The imaged scene is depicted in Fig. 7. It contains a 0.4 m high and 0.3 m wide dihedral, placed on a turntable made of two $1.2\times 2.4\;{\mathrm{m}}^{2}$ sheets of 0.013 m thick plywood. A stepped-frequency signal between 0.7 and 3.1 GHz, with a 3-MHz frequency step, was used to illuminate the scene. The antenna array was placed at a height of 1.22 m above the floor and at a standoff distance of 1.016 m from the wall. The antenna array was 1.24 m long, with an inter-element spacing of 0.022 m. Therefore, the number of antenna elements is $M=57$ and the number of frequencies is $N=801$; the full measurement vector $\mathbf{z}$ comprises $M\times N=57\times 801=45{,}657$ samples. The imaged scene, extending from 0 to 3 m in downrange and from $-1$ to 1 m in crossrange, is partitioned into $81\times 54$ pixels.

To quantify the performance of the various imaging methods, we use the target-to-clutter ratio (TCR) as a measure of the quality of the reconstructed images. The TCR is defined as the ratio, in dB, between the maximum magnitude of the target pixels and the average magnitude of the clutter pixels:^{1}

## (19)

$$\mathrm{TCR}=20\,{\log}_{10}\frac{{\max}_{(x,y)\in {R}_{t}}|I(x,y)|}{\frac{1}{{N}_{c}}{\sum}_{(x,y)\in {R}_{c}}|I(x,y)|},$$

where ${R}_{t}$ is the target region, ${R}_{c}$ is the clutter region, and ${N}_{c}$ is the number of pixels in the clutter region.

For reference purposes, Fig. 8(a) presents the image formed by the standard DS beamforming method using the full data volume. If all available data samples are used, the conventional DS beamforming method yields a high-quality image ($\mathrm{TCR}=30.33\;\mathrm{dB}$). However, when the number of samples is significantly reduced, the standard DS beamforming method alone does not yield a high-quality image. Figure 8(b) shows the image formed using 2 antenna locations and 200 frequency bins (i.e., 0.9% of the collected data). Clearly this image contains too much clutter ($\mathrm{TCR}=16.76\;\mathrm{dB}$).
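The TCR of Eq. (19) can be computed as in the sketch below. The text does not specify how the clutter region is delimited, so the code assumes it is every pixel outside the target region; the function name and the nested-list image format are likewise illustrative.

```python
import math

def target_to_clutter_ratio(image, target_pixels):
    """TCR in dB: peak target magnitude over mean clutter magnitude (Eq. 19).

    Assumption: the clutter region is every pixel not listed in target_pixels.
    """
    target = set(target_pixels)
    target_mags = [abs(image[r][c]) for r, c in target]
    clutter_mags = [abs(v) for r, row in enumerate(image)
                    for c, v in enumerate(row) if (r, c) not in target]
    mean_clutter = sum(clutter_mags) / len(clutter_mags)
    return 20.0 * math.log10(max(target_mags) / mean_clutter)

# Tiny example: a unit-magnitude target pixel over a uniform clutter floor of 0.1
image = [[1.0, 0.1], [0.1, 0.1]]
tcr_db = target_to_clutter_ratio(image, [(0, 0)])
```

Because `abs` is applied to every pixel, the same function works on the complex-valued images produced by DS beamforming.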

Using the same reduced dataset, we compare the proposed approach with other CS-based TWRI methods. Figure 9(a) shows the standard CS image formed using the approach in Ref. 10. This is a significantly degraded image, compared to the image in Fig. 8(a) (obtained using DS beamforming with full measurements). The reason is that the imaging method in Ref. 10 directly forms the scene image by solving the conventional CS problem; when the measurements are drastically reduced and the CS solution is only moderately sparse due to the presence of clutter and noise, the reconstructed image becomes less accurate. Because of the appearance of heavy clutter in Fig. 9(a), the TCR of the formed image drops to 21.78 dB. Figure 9(b) presents the image formed by the temporal frequency CS method of Ref. 12. The quality of the formed image deteriorates because this method does not recover the full data volume when the antenna locations are reduced. The background noise and clutter appear with stronger intensity than the target in the reconstructed image ($\mathrm{TCR}=12.13\;\mathrm{dB}$). Figures 9(c) and 9(d) show, respectively, images formed by the proposed approach without and with the sparsifying Gabor dictionary. It can be observed that the image in Fig. 9(c), formed without the Gabor sparsifying basis, contains high clutter ($\mathrm{TCR}=14.40\;\mathrm{dB}$), resulting in false targets. By contrast, Fig. 9(d) shows that the image formed using the proposed approach is considerably enhanced by incorporating the Gabor dictionary; the true target is located accurately and the clutter is considerably suppressed ($\mathrm{TCR}=28.82\;\mathrm{dB}$).

The effectiveness of the proposed approach is partly due to the excellent space-frequency localization of Gabor atoms. The Gabor functions are optimum in the sense that they achieve the minimum space-bandwidth product (by analogy to time-bandwidth product), which gives the best tradeoff between signal localization in space and spatial frequency domains. Figure 10 shows the recovered signal coefficients $\mathbf{s}$ for the dihedral scene shown in Fig. 7. The signal coefficients recovered with the Gabor dictionary, shown in Fig. 10(a), are much more sparse and concentrated on the target location, whereas the signal coefficients recovered without using the Gabor dictionary, Fig. 10(b), are more spread out.
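To make the notion of a localized atom concrete, the sketch below samples a generic one-dimensional complex Gabor function, that is, a Gaussian window multiplied by a complex sinusoid. It is an illustration only: the dictionary used in this paper follows Ref. 27 and is not reproduced here, and the parameterization (`center`, `width`, `freq`) is an assumption.

```python
import cmath
import math

def gabor_atom(length, center, width, freq):
    """Unit-norm sampled complex Gabor atom: Gaussian envelope times a complex sinusoid.

    center and width are in samples; freq is in cycles per sample.
    """
    raw = [math.exp(-0.5 * ((k - center) / width) ** 2)
           * cmath.exp(2j * math.pi * freq * (k - center))
           for k in range(length)]
    norm = math.sqrt(sum(abs(v) ** 2 for v in raw))
    return [v / norm for v in raw]

atom = gabor_atom(length=64, center=32, width=4.0, freq=0.125)
energy = sum(abs(v) ** 2 for v in atom)
peak_index = max(range(len(atom)), key=lambda k: abs(atom[k]))
```

Shifting `center` and varying `freq` and `width` over a grid of values yields a family of such atoms, which is one common way an overcomplete Gabor dictionary is assembled.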

In the final experiment, we use several wavelet families [Daubechies 8, Coiflet 2, and the dual-tree complex wavelet transform (DT-CWT)] as sparsifying bases, and compare their performance with that of the Gabor dictionary. All wavelet transforms use three decomposition levels. Figure 11 illustrates the images formed using different wavelet transforms [Figs. 11(a) to 11(c)], and the image formed with the Gabor dictionary [Fig. 11(d)]. It can be observed from the figure that the images reconstructed with the DT-CWT and the Gabor dictionaries are of higher quality than those obtained with the Daubechies and Coiflet wavelets. The images formed using the DT-CWT and the Gabor dictionary have similar TCRs of 28.71 and 28.82 dB, respectively. The superiority of the DT-CWT and the Gabor dictionaries can be explained by their better directional selectivity and localization in space and spatial frequency. However, we should note that the choice of the best dictionary for a specific TWRI system depends on many factors, such as the scene characteristics, target structure, and the decomposition level.

## 5.

## Conclusion

In this paper, we proposed a new approach for TWRI image formation based on CS and DS beamforming. The proposed approach requires significantly fewer frequency bins and antenna locations for sensing operations. This leads to a considerable reduction in data acquisition time, processing time, and computational complexity, while producing TWRI images of almost the same quality as the DS beamforming approach with full data volume. The experimental results demonstrate that the proposed approach produces images with considerably higher PSNRs and is less sensitive to noise and to the number of data samples used, compared to the standard DS beamforming. Furthermore, experimental results on real TWRI data indicate that the proposed approach produces images with higher TCRs than other CS-based image formation methods. Our approach also produces images with TCRs similar to those of the DS beamforming approach that uses the full data volume. Therefore, it is reasonable to expect that the proposed approach will enhance TWRI target detection, localization, and classification, while allowing a reduction in the number of measurements and data acquisition time.

## Acknowledgments

We thank the Center for Advanced Communications at Villanova University, USA, for providing the real TWRI data used in the experiments. This work is supported in part by a grant from the Australian Research Council. We thank the anonymous reviewers for their constructive comments and suggestions.

## References

## Biography

**Van Ha Tang** received a BEng degree in 2005 and an MEng degree in 2008, both in computer engineering, from Le Quy Don Technical University, Hanoi, Vietnam. He is currently completing his PhD degree in computer engineering from the University of Wollongong, Australia.

**Abdesselam Bouzerdoum** received his MSc and PhD degrees in electrical engineering from the University of Washington, Seattle. In 1991, he joined the University of Adelaide, Australia, and in 1998, he was appointed associate professor at Edith Cowan University, Perth, Australia. Since 2004, he has been professor of computer engineering with the University of Wollongong, where he served as head of School of Electrical, Computer and Telecommunications Engineering (2004 to 2006), and associate dean of research, Faculty of Informatics (2007 to 2013). He is the recipient of the Eureka Prize for Outstanding Science in Support of Defence or National Security (2011), Chester Sall Award (2005), and a Chercheur de Haut Niveau Award from the French Ministry of Research (2001). He has published over 280 technical articles and graduated many PhD and master’s students. He is a senior member of IEEE and a member of the International Neural Network Society and the Optical Society of America.

**Son Lam Phung** received the BEng degree with first-class honors in 1999 and a PhD degree in 2003, all in computer engineering, from Edith Cowan University, Perth, Australia. He received the University and Faculty Medals in 2000. He is currently a senior lecturer in the School of Electrical, Computer and Telecommunications Engineering, University of Wollongong. His general research interests are in the areas of image and signal processing, neural networks, pattern recognition, and machine learning.