## 1.

## Introduction

The frequency range of a millimeter wave (MMW) is 30 to 300 GHz.^{1} The frequency range of terahertz (THz) is normally 100 GHz to 10 THz.^{2} The wavelength of MMW is longer; it has higher atmospheric transmittance, and in the process of propagation, poor environmental conditions have less impact on it.^{1} The frequency of THz is higher; it has higher spatial resolution or longer depth of field when the spatial resolution is consistent.^{2} Therefore, the images of the two bands may contain different information. Since the human body radiates MMW and THz waves, the two bands are often used for human body security imaging. Compared with commonly used MMW or THz single-band imaging systems, MMW and THz dual-band imaging systems can acquire more information because the images obtained using image fusion algorithms contain more accurate descriptions of the scene than single-band images could.^{3}4.^{–}^{5}

In our investigation, we employed an MMW and THz dual-band passive human-body security imaging system composed of 94 and 250 GHz single-element detectors to study MMW and THz dual-band image preprocessing and fusion algorithms.^{6}^{,}^{7} We developed an algorithm called the MMW and THz image preprocessing and fusion integrated algorithm (MMW-THz IPFIA), which is proposed herein. The MMW-THz IPFIA consists of denoising, enhancement, registration, and fusion algorithms and yields fused images that contain most of the information from the corresponding dual-band images, improves the probability of detection, and enhances the performance of the dual-band imaging system. The effectiveness of the algorithm is analyzed by using image signal-to-noise ratio (SNR) and information entropy.^{8}9.^{–}^{10} The performance improvement provided by the MMW-THz IPFIA to the system is also demonstrated in this report. Because the system responds to blackbody radiation, the commonly used thermal imaging system performance evaluation parameter—the minimum detectable temperature difference (MDTD)—was employed to analyze the system’s performance. Finally, we performed an experiment in which the MDTD was measured both before and after applying the MMW-THz IPFIA to obtain objective, quantitative evidence of the improvements it yields.

## 2.

## Millimeter Wave and Terahertz Dual-Band Passive Human-Body Security Imaging System

Figure 1 depicts a schematic diagram of a MMW and THz dual-band passive human-body security imaging system, which is composed of a plane mirror, trihedral focusing scanning mirror, reflector, beam splitter, 94-GHz single-element detector, and 250-GHz single-element detector. The plane mirror collects the MMW and THz radiation emitted by the human body as it is swept along the direction perpendicular to the image plane to conduct a line scan. The trihedral focusing scanning mirror focuses the target radiation on the reflector and rotates along the direction parallel to the image plane to perform a column scan. The reflector reflects the radiation to the beam splitter. The beam splitter is an optical element with a semitransparent, semireflective characteristic, and it can divide the radiation into two beams: the transmitted beam and the reflected beam. The transmitted beam and the reflected beam can be captured by the 94- and 250-GHz detectors, respectively, and used to form image signals and create dual-band images.

As shown in Fig. 2(a), the images obtained by the 94 and 250 GHz detectors depict a test subject hiding a steel ruler in front of the abdomen. Figures 2(b) and 2(c) present similarly generated images of the subject with a THz attenuator in front of the left side of the chest and with papers hidden within the subject’s clothes, respectively. In these original images, there is significant noise, the resolution and contrast are low, and affine problems are obvious. Thus, it is necessary to preprocess such images by performing denoising, enhancement, and registration before fusing them to obtain high SNRs, high contrast, and rich details in the resulting fused images.

## 3.

## Millimeter Wave and Terahertz Image Preprocessing and Fusion Integrated Algorithm

In this report, we propose an MMW-THz IPFIA, which can be used to preprocess and fuse MMW and THz dual-band human-body images.

## 3.1.

### Millimeter Wave-Terahertz Image Preprocessing and Fusion Integrated Algorithm

The MMW-THz IPFIA employs a block matching and three-dimensional (BM3D) filtering denoising algorithm, a contrast-limited adaptive histogram-equalization (CLAHE) enhancement algorithm, a mutual information based registration algorithm, and a wavelet transform-based image-fusion algorithm.

## 3.1.1.

#### Block matching and three-dimension

The BM3D algorithm, a three-dimensional (3-D) transform-domain filtering algorithm based on block matching, is one of the most effective image denoising algorithms currently in use. Figure 3 presents the structure of the BM3D algorithm, which consists of two steps: basic estimation and final denoising.^{11}^{,}^{12}

Basic estimation: Obtaining a basic estimate requires three steps.

1. Block matching and grouping: By sliding a window with dimensions of ${N}_{1}\times {N}_{1}$ matching a certain step length, divide Image I into several blocks. Assume that $P$ is the currently selected block; ${N}_{D}$ is the search diameter; $Q$ is the sliding block in the search region; and ${s}_{P}$ and ${s}_{Q}$, which represent the locations of the blocks, are the values of the pixels in the top left corners or the centers of blocks $P$ and $Q$, respectively. Then the distance between the two blocks is ${D}_{PQ}={\Vert {s}_{P}-{s}_{Q}\Vert}^{2}/{N}_{1}^{2}$. If an appropriate distance threshold $\tau $ is selected, where ${D}_{PQ}<\tau $, then $Q$ is the similarity block of $P$. In addition, the set of the similarity blocks of $P$ forms a 3-D matrix ${S}_{P}$, where ${S}_{P}=\{\mathrm{Q}\in \mathrm{I}|{D}_{PQ}<\tau \}$.

2. 3-D Transform Domain Denoising: First, arrange the elements of ${S}_{P}$ in order of the size of ${D}_{PQ}$. Next, perform a one-dimensional (1-D) Haar wavelet transfer ${\tau}_{3D}$ between the blocks and a two-dimensional (2-D) Bior wavelet hard-threshold filtering within the block. Then the transform coefficient of the hard threshold filtering is $r$, which can be modulated by

## (1)

$$r(x)=\{\begin{array}{ll}0& |x|\le \sigma {\lambda}_{3D}\\ x& |x|>\sigma {\lambda}_{3D}\end{array}.$$After performing that operation, write the denoised information back to the original position in the image by conducting an interblock inverse transformation. The 3-D transform denoising equation is as follows:

In Eq. (1), $x$ is the numeric of ${S}_{P}$ in the matrix, ${\lambda}_{3D}$ is the hard threshold shrinkage parameter, $\sigma $ is the estimated standard deviation of the noise, and ${R}_{P}$ is the denoised 3-D image of the block matrix.

3. Aggregation: After step (2) has been completed, an estimate will have been performed for every block and element in the image, and the similarity block set of $P$, ${S}_{P}$, will have changed to ${R}_{P}$. In ${R}_{P}$, ${N}_{P}$ is the number of nonzero coefficient blocks, ${\omega}_{P}$ is the basic estimated weight of $P$, and the equation used to calculate ${\omega}_{P}$ is as follows:

Because the step length of the sliding window may be less than its size, the blocks may overlap. Thus, an element $i$ may appear in different blocks. Estimating the value of $R$ for the $i$’th element requires reassessing the overlapping blocks by using the weighted average as follows:

## (4)

$$R(i)=\frac{\sum _{{S}_{P}}{\omega}_{p}\sum _{O\in {S}_{P}}{R}_{PO}}{\sum _{{S}_{P}}{\omega}_{p}\sum _{O\in {S}_{P}}{x}_{O}},$$## (5)

$${R}_{PO}=\{\begin{array}{l}\begin{array}{cc}{R}_{PO}& i\in O\end{array}\\ \begin{array}{cc}0& i\notin O\end{array}\end{array}$$## (6)

$${x}_{O}=\{\begin{array}{l}\begin{array}{cc}1& i\in O\end{array}\\ \begin{array}{cc}0& i\notin O\end{array}\end{array}.$$The final denoising process involves three steps.

1. Block matching and grouping: Match and group the denoised images obtained from the basic estimate, generating a new 3-D matrix ${S}_{P2}$.

2. Joint Wiener filtering denoising: Transform ${S}_{P}$ and ${S}_{P2}$ by applying a 3-D transformation ${\kappa}_{3D}$ (2-D DCT cosine transform and 1-D Haar wavelet transform). Use ${S}_{P}$ to perform Wiener filtering of ${S}_{P2}$ to obtain the final weight estimate $\omega $. Write the estimator of every pixel back to the original position by applying the interblock inverse transformation:

3. Aggregation: Re-estimate the overlapped blocks and obtain the result by calculating the weighted average:

## 3.1.2.

#### Image enhancement

Image enhancement technology can improve the visibility of barely observable targets and enhance the level of detail of the information acquired. Some small targets and detailed texture features occupy relatively few pixels in MMW and THz images. Thus, they occupy smaller portions of the images’ grayscales. In the proposed method, weak target loss during the fusion process is avoided since a CLAHE algorithm is used to enhance the denoised images.^{13}^{,}^{14}

The CLAHE algorithm, a classic image enhancement algorithm, involves four steps.

1. Blocking: Use a sliding window to pass over the image. The grayscale of the image is $[0,L-1]$. Acquire a series of local image sub-blocks with dimensions of $m\times n$.

2. Calculate the probability of grayscale $k$, ${p}_{r}({r}_{k})={n}_{k}/N$, and obtain the normalized histogram and probability distribution function (PDF) of every sub-block. In the preceding equation, $0\le {r}_{k}\le 1$ represents the $k$’th grayscale, ${n}_{k}$ is the number of pixels in grayscale ${r}_{k}$, $N=m\times n$ is the number of pixels in the sub-block, and $k=0,1,2,\dots ,L-1$, where $L$ is the gray level.

3. As depicted in Fig. 4, for the PDF of each sub-block, set a threshold: ${\mathrm{PDF}}_{th}$. The part of the PDF that is larger than ${\mathrm{PDF}}_{th}$ is the intercepted portion, the lower and upper limits of the grayscale of the intercepted portion are ${k}_{l}$ and ${k}_{h}$, respectively. If the total number of the gray level in the sub-block is $sL$, when the grayscale is ${k}_{th}\in [{k}_{l},{k}_{h}]$, the PDF corresponding to ${k}_{th}$ is divided into $sL$ copies, that is, ${\mathrm{PDF}}_{\mathrm{ave}}={\mathrm{PDF}}_{{k}_{th}}/sL$, then ${\mathrm{PDF}}_{\mathrm{ave}}$ is added to the PDF corresponding to each grayscale in the sub-block. The PDF below ${\mathrm{PDF}}_{th}$ is increased, and the PDF above ${\mathrm{PDF}}_{th}$ is reduced.

4. Transform the contrast-limited histogram of each sub-block, map the elements of the original grayscale ${r}_{k}$ onto grayscale ${s}_{k}$, where

stretch and disperse the high gray levels, and compress and combine the low gray levels, thereby obtaining an enhanced image with a reasonable contrast distribution and obvious texture details.

## 3.1.3.

#### Image registration

The MMW and THz dual-band, passive human-body security imaging system employed in this investigation uses two different detectors to capture dual-band images. Variations in the detector parameters lead to geometric deviation problems, such as translation, scaling, and rotation between images. It is necessary to correct such geometric deviations to register images. In this study, we employed a mutual-information-based method to register the images, since otherwise the grayscale details in the dual-band images would be minimal and the characteristics of the images would not be obvious.^{15}^{,}^{16}

The mutual-information-based registration method utilizes mutual information to conduct similarity measurements and image registration, and includes the following four basic components:

1. A feature space consisting of the gray values of the images;

2. A search space for global translation, rotation, and scaling;

3. Similarity measurements of mutual information: Mutual information is used to assess the correlation between the information provided by two systems. In the system employed in this study, although the two images in each set were obtained by different detectors, they were based on the same scene information. Thus, when the spatial positions of the two images in a set are identical, the value of the information that one image provides about the other, that is, the mutual information, was maximized. Algorithms typically use the generalized distance between the joint probability distribution and separate probability distributions to estimate the mutual information. Here, this relation is as follows:

## (10)

$$I(A,B)=\sum _{a,b}{P}_{AB}(a,b)\mathrm{log}\text{\hspace{0.17em}}\frac{{P}_{AB}(a,b)}{{P}_{A}(a){P}_{B}(b)}.$$For images $A$ and $B$, the corresponding pixel gray values are $a$ and $b$, respectively, which are linked by a coordinate transformation. The joint distribution ${P}_{AB}(a,b)$ can be calculated using the normalized joint grayscale histogram $h(a,b)$. Thus,

The marginal probability distributions are ${P}_{A}(a)=\sum _{b}{P}_{AB}(a,b)$ and ${P}_{B}(b)=\sum _{a}{P}_{AB}(a,b)$.

4. A search strategy employing the (1 + 1) evolution strategy algorithm [(1 + 1) ES]: This algorithm is a numerical optimization algorithm for solving continuous search spaces that was proposed in the 1960s by German mathematicians, Rechenberg and Schwefel, and occupies an important position in research dedicated to and applications of evolution strategy.

^{17}^{,}^{18}

Each generation in (1 + 1), ES consists of only two individuals: a parent and an offspring. The offspring is generated through a mutation of the parent; the better of the parent and offspring is chosen as the next generation’s parent. This process can be summarized by the following four steps:

1. Randomly generate an initial individual ${\varsigma}_{0}\in S$ as the parent, where $k=0$;

2. Generate an offspring by performing the mutation ${\eta}_{k}={\xi}_{k}+{Z}_{k}$, where ${Z}_{k}$ is a continuous random vector;

3. If $f({\eta}_{k})>f({\xi}_{k})$ and ${\eta}_{k}\in S$, then make ${\xi}_{k+1}={\eta}_{k}$; otherwise, make ${\xi}_{k+1}={\xi}_{k}$, and let $k=k+1$;

4. If the algorithm termination conditions are not met, return to Step (2).

## 3.1.4.

#### Image fusion

Image fusion involves integrating information from two or more source images to obtain a new image that can present the scene exactly, entirely, and reliably. Compared with the prefusion images, a fused image contains more information, more details, and greater clarity. Thus, the potential target detection rate is greater. The wavelet-transform-based fusion method takes full advantage of the wavelet’s multiresolution analysis capabilities by dividing the original image into a series of subimages with different spatial resolutions and frequency domain features. The subimages reflect the local characteristics of the original image. In this method, the subimages are fused according to certain rules, the multiscale description of the fused image is formed, and, finally, the fused image is obtained by performing inverse-wavelet-transform reconstruction. Figure 5 presents a schematic diagram of the steps of the fusion algorithm.^{19}^{,}^{20}

The wavelet base used in this research is the simplest Haar wavelet base:

The original image is decomposed by performing a three-level discrete wavelet transform. The image characteristics are acquired using different resolutions, and after Eq. (13) is employed to complete the computation of the weighted average of the wavelet coefficient, the fused image is obtained:

## 3.2.

### Algorithm Evaluation and Verification

## 3.2.1.

#### Objective evaluation parameters

We use SNR (SNR is a measure of the strength relationship between the signal and the noise) and the information entropy (the larger the value, the richer the details contained in the image) to analyze the performance of the algorithm.

### Signal-to-noise ratio

For an image $I$, it is first divided into several sub-blocks by sliding a window with dimensions of $w\times w$ on it. Then the local mean and the local variance of the sub-bocks are calculated.^{9} Suppose that the central pixel of a block is $(i,j)$, the local mean ${\mu}_{I}(i,j)$ and the local variance ${\sigma}_{I}^{2}(i,j)$ are as follows:

## (15)

$${\sigma}_{I}^{2}(i,j)=\frac{1}{{(2w+1)}^{2}}\sum _{k=-w}^{w}\sum _{l=-w}^{w}{[I(i+k,j+l)-{\mu}_{I}(i,j)]}^{2}.$$The maximum value of the local variance is selected as the signal variance of the image:

A flat area without significant changes in the grayscale of the image is selected, and the mean of the local variance of the flat area is taken as the noise variance of the image:^{8}^{,}^{9}

The SNR of the image is as follows:

### Information entropy

For an image I, the information entropy is as follows:

where $En$ is the information entropy; $l$ is total number of the gray levels in the image; $p(i)$ is the ratio of the number of pixels in grayscale $i$ to the total number of pixels in image.^{10}

## 3.2.2.

#### Algorithm verification

### Subjective verification

In this section, indicators of the effectiveness of applying the MMW-THz IPFIA to the images depicted in Fig. 2 are discussed. Figures 6Fig. 7–8 correspond to Figs. 2(a)–2(c), respectively. From left to right in Figs. 6–8, the profiles are presented in three groups. The first and second groups of four images, which were acquired using the 94 and 250 GHz detectors, respectively, each include the original, denoised, enhanced, and registered images, from left to right. The image on the far right in each figure is the final fused image.

The processed images display reduced noise, improved detail contrast, and effective affine problem correction. The features of the two-band image pairs are synthesized in the fused images, which contain more detail. The probability of detection and capabilities of the dual-band passive THz human-body security imaging system are improved by applying the MMW-THz IPFIA.

### Objective verification

According to Eqs. (14)–(18), the SNR of the original image and the fused image are calculated first. The size of the sliding window is $w=5$. The rows of the flat area are [10 25] and the columns are [30 50] (as shown in Figs. 6–8, the area in the red lines is the flat area). The grayscale distribution of the selected area in the fused images is flat. It does not contain edges or significant changes. Moreover, in this area, there is only noise in the original images, and the grayscale distribution is also flat.

Table 1 shows that the values of SNR of the fused image are several times larger than that of the original image. This indicates that the noise in the image is obviously reduced.

## Table 1

SNRs of the original image and the fused image.

Images | Imaging target | |||||
---|---|---|---|---|---|---|

Fig. 2(a) steel ruler | Fig. 2(b) THz attenuator | Fig. 2(c) original papers | ||||

94 GHz | 250 GHz | 94 GHz | 250 GHz | 94 GHz | 250 GHz | |

Original image | 6.2109 | 4.4267 | 7.0924 | 4.4933 | 5.6664 | 4.8961 |

Fusion images | 29.6308 | 29.6308 | 28.8140 | 28.8140 | 28.1755 | 28.1755 |

Table 2 shows the values of information entropy of the original image and the fused image. The values are increased by 1 basically (for 8-bit digital images, the maximum of the information entropy is 8). This indicates that the detail information contained in the image is enhanced effectively.

## Table 2

Information entropy of the original image and the fused image.

Images | Imaging target | |||||
---|---|---|---|---|---|---|

Fig. 2(a) metal ruler | Fig. 2(b) beam splitter | Fig. 2(c) printing paper | ||||

94 GHz | 250 GHz | 94 GHz | 250 GHz | 94 GHz | 250 GHz | |

Original image | 5.9677 | 6.1851 | 5.9396 | 6.1919 | 5.9288 | 6.1912 |

Fusion image | 7.4597 | 7.4597 | 7.4052 | 7.4052 | 7.4798 | 7.4798 |

The two parameters fully illustrate that the algorithm can effectively reduce the noise and improve the level of detail in the images. The fused image contains more details than the original image, thus it can describe the scene more precisely.

## 4.

## Minimum Detectable Temperature Difference Measurement and Verification

MDTD is an important parameter for evaluating the performances of thermal imaging systems. It can reflect a system's thermal sensitivity characteristics and spatial resolutions. Furthermore, the MDTD does not restrict the duration of the observation. The MDTD can be measured as soon as the position of a square or circular blackbody target in the image displayed on the screen starts to become identifiable, as illustrated in Fig. 9, and equals the temperature difference between the blackbody and background.

The imaging target of the MMW-THz dual band passive human-body security imaging system is the scene at room temperature, therefore, the evaluation of the system performance can be objectively carried out by using the measurement method similar to the thermal imaging system MDTD.

The components of the experimental system designed to measure the MDTD are depicted in Fig. 10. Given that the environmental temperature was 20°C, both the 94 and 250 GHz subsystem unit detectors were 10 mm in size. The focal length of the system was 300 mm; the aperture diameter was 200 mm; the imaging object distance was 1.7 m; the imaging range of the object was $2\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{m}\text{\hspace{0.17em}}(\text{high})\times 0.8\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{m}\text{\hspace{0.17em}}(\text{width})$; and the sample spacing was 10 mm. By calculating the optical diffraction limit, the minimum resolvable sizes of the optical system of the 250 GHz subsystem and the 94-GHz subsystem are 2.196 and 5.84 mm, respectively. They are smaller than the size of the detector, therefore, the spatial resolution of the imaging system is mainly affected by the size of the detector.

Since the imaging object distance was 1.7 m, and the MDTD measurement targets were designed to be circular and to range in diameter from 60 to 130 mm in 10-mm increments, the spatial angular frequency is presented in Table 3. Images of the 60- and 100-mm-diameter targets are illustrated in Fig. 11 as examples. The measured MDTD curve is shown in Fig. 12.

## Table 3

Spatial angular frequencies of the targets f (cyc/mrad).

l/m | D/mm | |||||||
---|---|---|---|---|---|---|---|---|

60 | 70 | 80 | 90 | 100 | 110 | 120 | 130 | |

1.7 | 0.02833 | 0.02429 | 0.02125 | 0.01889 | 0.0170 | 0.01545 | 0.01417 | 0.01308 |

The curve shown in Fig. 12 can be used to compare the original MDTDs with those after preprocessing with the 94 GHz and 250 GHz subsystems. In principle, for the same spatial angular frequency, the smaller the MDTD, the higher the system detection rate. In each case, the MDTD is reduced by preprocessing, indicating that the detection performance of the system is improved. Under identical conditions, the MDTD of the 94-GHz subsystem is larger than that of the 250-GHz subsystem in both the original and processed systems. Thus, the MDTD of the system processed by using MMW-THz IPFIA lies between those of the processed 94- and 250-GHz subsystems, that is, ${\mathrm{MDTD}}_{250\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{GHz}\_\text{after}}<{\mathrm{MDTD}}_{\text{system}}<{\mathrm{MDTD}}_{94\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{GHz}\_\text{after}}$. In addition, the MDTDs obtained using the proposed algorithm are smaller and are favorable compared with those obtained using the two original subsystems. This outcome, attributable to the difference between the performances of the two subsystems, demonstrates that the final MDTD is related to the proportions of the subsystem images in the fused images. These results quantitatively demonstrate that the detection performance of the system is improved by applying the MMW-THz IPFIA.

## 5.

## Conclusion

In this article, a comprehensive algorithm, MMW-THz IPFIA, was proposed for use in MMW and THz dual-band passive human-body security imaging systems. This algorithm, which consists of denoising, enhancement, registration, and fusion algorithms, can be employed to generate fused images while preserving most of the details present in the original dual-band images. The performance of the algorithm was analyzed by calculating the SNR and information entropy of the original images and fused images. The result indicated that the resulting fused images exhibited less noise, greater detail, and more information than the single-band images did.

We used a test method similar to the MDTD of the thermal imaging system to objectively verify the performance improvement of the system provided by applying the MMW-THz IPFIA. The result indicated that the MDTD performance has been improved. Thus, the performance of the system can be improved by employing the MMW-THz IPFIA.

Since the denoising, enhancement, registration, and fusion algorithms for MMW and THz images cannot yet be employed in real-time processing, the objectives of the next phase of research will be to integrate these processes with hardware systems and to achieve real-time fusion of dual-band images to support efficient security monitoring using MMW and THz dual-band, passive human-body security imaging systems.

## Acknowledgments

This work is supported by the project of National Science Foundation of China (Grant No. 61171051).

## References

## Biography

**Li Tian** is a doctoral student at Beijing Institute of Technology in China. Her current research interests include THz imaging theory, THz imaging systems, and THz image processing.