**${\ell}_{1}$ regularization is applied to reconstruct frame differences from a small number of measurements. Then, an edge-detection-based denoising method is employed to reduce the error in the frame-difference image. The experimental results show that the proposed algorithm, together with the single-pixel imaging system, makes compressive video cameras feasible.**

## 1.

## Introduction

Video acquisition captures time-dependent natural scenes and brings real-time images directly to screens for immediate observation. It serves not only live television (TV) production but also security, military, and industrial operations through professional video cameras, camcorders, closed-circuit TV, webcams, camera phones, and special camera systems. In traditional video acquisition, e.g., H.261, H.265, and the MPEG series, the sampling and compression procedures are implemented in sequential order. The Nyquist–Shannon sampling theorem requires the sampling rate to be at least twice the highest signal frequency to guarantee exact recovery. The compression procedure is implemented by video compression chipsets^{1} or separate software.^{2}

Although state-of-the-art video cameras can record most natural scenes, they do not work for very high-resolution images or high-fps videos because the growth in data storage, communication, and processing capacity lags far behind the growth in data generation. In space exploration, an image of the space shuttle Discovery flight deck can reach 2.74 gigapixels,^{3} and bubble dynamics research needs 500-fps video microscopy.^{4} More importantly, commercial high-performance video cameras are extremely expensive; e.g., a basic model with 7500 fps, one-megapixel resolution, and 12-bit color depth (FASTCAM SA5 from Photron) costs around $100,000.

The limitation comes from weak light irradiation and the readout bandwidth when capturing high-speed objects at a high resolution. As shown in Fig. 1 and Eq. (1), the reflected illumination is collected by sensor arrays in a limited space–time volume:

## Eq. (1)

$$\mathbf{J}\propto \frac{t\,{I}_{src}\,R\,q\,{\mathrm{\Delta}}^{2}}{{F}^{2}}.$$

The number of electrons ($\mathbf{J}$) accumulated on each pixel is inversely proportional to the square of the ratio of the focal length to the aperture of the lens ($F$) and proportional to the exposure time ($t$), incident illumination (${I}_{src}$), scene reflectivity ($R$), quantum efficiency ($q$), and pixel size (${\mathrm{\Delta}}^{2}$).^{5} In video sensing, the exposure time ($t$) corresponds to the temporal resolution and the pixel size (${\mathrm{\Delta}}^{2}$) is related to the spatial resolution. In other words, the temporal and spatial resolutions are mutually constrained in conventional video cameras because the imaging sensor requires a minimum number of accumulated electrons per pixel while the total number of electrons is fixed: the spatial resolution must decrease when the temporal resolution increases. Another limitation is the sensor’s readout speed. The readout timing includes analog-to-digital conversion, clearing charge from the parallel register, and shutter delay; e.g., a one-megapixel, 1000-fps, 16-bit color camera needs a $4\text{-}\mathrm{GB}/\mathrm{s}$ readout circuit.
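The space–time trade-off stated above can be checked with a few lines of arithmetic; a minimal sketch using only the proportionality in Eq. (1) (the function and parameter names here are illustrative, not from the paper):

```python
# Relative electron count per pixel from the proportionality in Eq. (1):
# J ∝ t * I_src * R * q * Δ² / F². All names here are illustrative.
def electron_count(t, delta_sq, F, I_src=1.0, R=1.0, q=1.0):
    return t * I_src * R * q * delta_sq / F**2

# Doubling the frame rate (500 fps -> 1000 fps) halves the exposure time t.
# To accumulate the same number of electrons, the pixel area Δ² must double,
# i.e., the spatial resolution drops: the two resolutions are mutually
# constrained.
slow = electron_count(t=1 / 500, delta_sq=1.0, F=2.0)
fast = electron_count(t=1 / 1000, delta_sq=2.0, F=2.0)
# slow == fast
```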

To obtain high-resolution images and high-fps videos, the sampling rate has to be reduced, and the compressive sensing technique can be applied. Compressive sensing^{6} combines the sampling and compression procedures. This paradigm directly samples the signal in a compressed form such that the sampling rate can be significantly reduced. Compressive sensing has attracted considerable interest in imaging,^{7} geophysical data analysis,^{8} control and robotics,^{9} communication,^{10} and medical image processing.^{11}

Compressive sensing has been applied to video sensing since 2006, when the single-pixel camera setup was first used for video sampling.^{12} In this first approach, the three-dimensional (3-D) video was reconstructed from all the measurements together using 3-D wavelets as the sparse representation. This method cannot be used for real-time video streaming without incurring significant latency because all the measurements have to be obtained before the reconstruction starts. Since then, in order to reconstruct the frames one by one for real-time streaming, most approaches reconstruct or sample reference frames with more measurements and find the differences between two consecutive frames with fewer measurements. There are mainly two types of strategies: sampling the frame and sampling the difference between frames. In the first strategy, motion estimation techniques are applied to recover frames from reference frames so as to obtain a continuous video. For example, the evolution of dynamic textured scenes was modeled as a linear dynamical system,^{13} and a multiframe motion estimation algorithm was proposed.^{14} More recent compressive video sensing research learns a linear mapping between video sequences and the corresponding measured frames.^{15} In addition, the correlation between consecutive frames in the frequency domain^{16} and other transform domains^{17} has also been used.

There are also several approaches that sample the difference between two frames. For example, Stankovic et al.^{18} split the video frame into nonoverlapping blocks of equal size and performed compressive sampling on sparse blocks, whose sparsity was predicted from previous reference frames sampled conventionally; the remaining blocks were sampled fully. Determining the sparse blocks is time-consuming because every block has to be tested. In addition, directly sampling the difference between two consecutive frames was employed^{19} to save sampling time.

Although compressive sensing techniques are used in video sensing, most approaches use convex ${\ell}_{1}$ minimization to approximate nonconvex ${\ell}_{0}$ minimization, which is nondeterministic polynomial-time (NP)-hard and difficult to solve. Compressive sensing can reduce the number of measurements using ${\ell}_{1}$ minimization; with nonconvex regularizations, however, the number of measurements, and thus the sampling rate, can be reduced further so as to achieve real-time video capturing. Recently, many nonconvex regularizations have been proposed that outperform the ${\ell}_{1}$ norm in compressive sensing.^{20}^{,}^{21}^{,}^{22}

In this paper, a single-pixel compressive video sensing framework based on the nonconvex sorted ${\ell}_{1}$ regularization is proposed for fast, high-resolution video. In this framework, we sample reference frames using the spatial sparsity (individual image sparsity) and the difference between two frames using the temporal sparsity. In Sec. 2, we first give a short review of compressive sensing and nonconvex solvers and then propose our nonconvex compressive video sensing framework. The experimental results are presented in Sec. 3.

## 2.

## Compressive Video Sensing

## 2.1.

### Compressive Sensing

The core of compressive sensing is recovering the sparse vector $\mathbf{x}\in {\mathbb{R}}^{n}$ from a small number of linear measurements $\mathbf{y}=\mathrm{\Phi}\mathbf{x}$, where $\mathrm{\Phi}\in {\mathbb{R}}^{m\times n}$ is the measurement matrix ($m\ll n$). The underdetermined linear system has many solutions if $\mathbf{y}$ is in the range of $\mathrm{\Phi}$, and we are interested in finding the sparsest one among them. However, finding the sparsest solution is NP-hard, so alternative approaches are considered instead. Convex approaches are of great interest because many algorithms exist for solving convex problems and their solutions are easy to analyze. If $\mathbf{x}$ is sparse and $\mathrm{\Phi}$ satisfies conditions such as the null space property,^{23} the incoherence condition,^{24} or the restricted isometry property,^{25} the following problem is equivalent to finding the sparsest solution:

## Eq. (2)

$$\tilde{\mathbf{x}}=\underset{\mathbf{x}}{\mathrm{arg}\text{\hspace{0.17em}}\mathrm{min}}\text{\hspace{0.17em}}{\Vert \mathbf{x}\Vert}_{1}\phantom{\rule[-0.0ex]{1em}{0.0ex}}\text{subject to}\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{\Phi}\mathbf{x}=\mathbf{y}.$$

When the measurements are contaminated by noise, the equality constraint is relaxed and the following unconstrained problem is solved instead:^{26}^{,}^{27}

## Eq. (3)

$$\tilde{\mathbf{x}}=\underset{\mathbf{x}}{\mathrm{arg}\text{\hspace{0.17em}}\mathrm{min}}\text{\hspace{0.17em}}{\Vert \mathbf{x}\Vert}_{1}+\frac{\lambda}{2}{\Vert \mathrm{\Phi}\mathbf{x}-\mathbf{y}\Vert}^{2},$$

where $\lambda >0$ balances the sparsity of the solution against the data fidelity.

Although ${\ell}_{1}$ minimization is well understood and stable with theoretical guarantees, the number of required measurements is still high, and the performance is poor in many applications when the number of measurements is small. For example, in computed tomography, radiologists want to reduce the number of projections, and thus the radiation dose, below what ${\ell}_{1}$ minimization requires. For the difference between two frames in a video, we want to decrease the number of measurements further so as to realize higher-fps videos than current cameras can produce. In order to recover signals from even fewer measurements, nonconvex regularizations are applied; a short review is given in Sec. 2.2.
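As a baseline, the unconstrained ${\ell}_{1}$ problem in Eq. (3) can be solved by iterative soft thresholding. The sketch below is not the paper's implementation; the parameter choices are illustrative, and the Bernoulli measurement matrix mirrors the one used in the experiments:

```python
import numpy as np

def soft_threshold(v, t):
    # proximal operator of t * ||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(Phi, y, lam=100.0, iters=20000):
    # solves min_x ||x||_1 + (lam/2) ||Phi x - y||^2 by forward-backward steps
    L = lam * np.linalg.norm(Phi, 2) ** 2   # Lipschitz constant of the gradient
    x = np.zeros(Phi.shape[1])
    for _ in range(iters):
        x = soft_threshold(x - lam * Phi.T @ (Phi @ x - y) / L, 1.0 / L)
    return x

rng = np.random.default_rng(0)
n, m, k = 100, 40, 4                        # ambient dim, measurements, sparsity
Phi = rng.choice([-1.0, 1.0], size=(m, n)) / np.sqrt(m)   # Bernoulli matrix
x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = 1.0 + rng.random(k)
y = Phi @ x_true                            # noise-free measurements
x_hat = ista(Phi, y)                        # close to x_true
```

The same forward–backward structure reappears in Sec. 2.2 with nonconvex regularizers in place of the plain ${\ell}_{1}$ term.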

## 2.2.

### Nonconvex Optimization Problems for Compressive Sensing

In this section, we review several nonconvex regularizations for compressive sensing and their corresponding algorithms. Denote $\mathbf{x}=({x}_{1},{x}_{2},\dots ,{x}_{n})\in {\mathbb{R}}^{n}$, the true sparse signal by ${\mathbf{x}}_{0}$, and the $l$’th iterate by ${\mathbf{x}}^{l}$.

The ${\ell}_{p}$ ($0\le p\le 1$) term is commonly used,^{28} and it has ${\ell}_{0}$ and ${\ell}_{1}$ as special cases. Because of its nonconvexity, it recovers sparse signals from even fewer measurements than its convex counterpart, ${\ell}_{1}$. There are several approaches to solving the resulting nonconvex problems; we describe three of them, covering both the noise-free and noisy cases. First, two reweighted algorithms for the following noise-free case are presented:

## Eq. (4)

$$\tilde{\mathbf{x}}=\underset{\mathbf{x}}{\mathrm{arg}\text{\hspace{0.17em}}\mathrm{min}}\text{\hspace{0.17em}}{\Vert \mathbf{x}\Vert}_{p}\phantom{\rule[-0.0ex]{1em}{0.0ex}}\text{subject to}\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{\Phi}\mathbf{x}=\mathbf{y}.$$

The iteratively reweighted ${\ell}_{1}$ minimization (IRL1)^{20} replaces the ${\ell}_{p}$ term with a weighted ${\ell}_{1}$ term whose weights depend on the previous iterate. The iteration is expressed as

## Eq. (5)

$${\mathbf{x}}^{l+1}=\underset{\mathbf{x}}{\mathrm{arg}\text{\hspace{0.17em}}\mathrm{min}}\sum _{i=1}^{n}\frac{1}{{(|{x}_{i}^{l}|+\u03f5)}^{1-p}}|{x}_{i}|\phantom{\rule[-0.0ex]{1em}{0.0ex}}\text{subject to}\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{\Phi}\mathbf{x}=\mathbf{y}.$$

Similarly, the iteratively reweighted least squares^{21}^{,}^{22} replaces the ${\ell}_{p}$ term with a weighted least-squares term whose weights depend on the previous iterate. The iteration is expressed as

## Eq. (6)

$${\mathbf{x}}^{l+1}=\underset{\mathbf{x}}{\mathrm{arg}\text{\hspace{0.17em}}\mathrm{min}}\text{\hspace{0.17em}}\sum _{i=1}^{n}\frac{1}{{({|{x}_{i}^{l}|}^{2}+\u03f5)}^{1-p/2}}{|{x}_{i}|}^{2}\phantom{\rule[-0.0ex]{1em}{0.0ex}}\text{subject to}\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{\Phi}\mathbf{x}=\mathbf{y}.$$

Besides these two reweighted algorithms for solving ${\ell}_{p}$ minimization problems, some algorithms for solving convex optimization problems have been applied to nonconvex problems with general nonconvex regularizations.^{29} One example is the forward–backward iteration. In each forward–backward iteration, for solving

## Eq. (7)

$$\tilde{\mathbf{x}}=\underset{\mathbf{x}}{\mathrm{arg}\text{\hspace{0.17em}}\mathrm{min}}\text{\hspace{0.17em}}r(\mathbf{x})+\frac{\lambda}{2}{\Vert \mathrm{\Phi}\mathbf{x}-\mathbf{y}\Vert}^{2},$$## Eq. (8)

$${\mathbf{x}}^{l+1}=\underset{\mathbf{x}}{\mathrm{arg}\text{\hspace{0.17em}}\mathrm{min}}\text{\hspace{0.17em}}\tau r(\mathbf{x})+\frac{1}{2}{\Vert \mathbf{x}-({\mathbf{x}}^{l}-\tau \lambda {\mathrm{\Phi}}^{\mathrm{T}}(\mathrm{\Phi}{\mathbf{x}}^{l}-\mathbf{y})\Vert}^{2}.$$^{30}
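The reweighted least-squares subproblem in Eq. (6) has a closed-form solution under the constraint $\mathrm{\Phi}\mathbf{x}=\mathbf{y}$, which makes the method easy to prototype. A minimal sketch of a simplified variant (the decreasing schedule for the smoothing parameter $\epsilon$ is an assumption of this sketch, not taken from the paper):

```python
import numpy as np

def irls_lp(Phi, y, p=0.5, iters=30):
    # The x minimizing sum_i w_i x_i^2 s.t. Phi x = y is
    # x = D Phi^T (Phi D Phi^T)^{-1} y, where D = diag(1/w_i) and
    # w_i = (|x_i|^2 + eps)^{-(1 - p/2)} are the weights from Eq. (6).
    x = Phi.T @ np.linalg.solve(Phi @ Phi.T, y)   # least-norm initial point
    eps = 1.0
    for _ in range(iters):
        d = (x**2 + eps) ** (1.0 - p / 2.0)       # entries of D (inverse weights)
        x = d * (Phi.T @ np.linalg.solve((Phi * d) @ Phi.T, y))
        eps = max(eps / 10.0, 1e-10)              # gradually sharpen the weights
    return x

rng = np.random.default_rng(1)
n, m, k = 100, 40, 4
Phi = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = 1.0 + rng.random(k)
y = Phi @ x_true
x_hat = irls_lp(Phi, y)                           # near-exact recovery
```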

The success of ${\ell}_{p}$ minimization and of both iterative algorithms for solving ${\ell}_{p}$ minimization problems indicates that it is better to assign small weights to components with large absolute values and large weights to zero components and components with small absolute values. A nonconvex sorted ${\ell}_{1}$ that assigns weights based on the ranking of absolute values was developed by Huang et al.^{31} Let the coefficients ${\{{\omega}_{i}\}}_{i=1}^{n}$ be a nondecreasing sequence of nonnegative real numbers, i.e., $0\le {\omega}_{1}\le \cdots \le {\omega}_{n}\ne 0$. The nonconvex sorted ${\ell}_{1}$ regularization is defined as

## Eq. (9)

$${r}_{\omega}({x}_{1},{x}_{2},\dots ,{x}_{n})={\omega}_{1}|{x}_{[1]}|+{\omega}_{2}|{x}_{[2]}|+\cdots +{\omega}_{n}|{x}_{[n]}|,$$

where $|{x}_{[1]}|\ge |{x}_{[2]}|\ge \cdots \ge |{x}_{[n]}|$ are the entries sorted by absolute value. The weights are updated at iteration $l$ by

## Eq. (10)

$${\omega}_{i}^{l}=\{\begin{array}{cc}1,& \text{if}\text{\hspace{0.17em}\hspace{0.17em}}i>{K}^{l},\\ {e}^{-r({K}^{l}-i)/{K}^{l}},& \text{otherwise},\end{array}$$

where ${K}^{l}$ is an estimate of the number of significant components at iteration $l$ and $r>0$ is a parameter.^{32}
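The weight schedule in Eq. (10) is straightforward to compute from the ranking of the current iterate; a minimal sketch (variable names are illustrative):

```python
import numpy as np

def sorted_l1_weights(x, K, r):
    # Eq. (10): weight 1 for ranks beyond K, exponentially smaller weights
    # for the top-ranked entries (rank 1 = largest |x_i|, as in Eq. (9)).
    n = x.size
    ranks = np.empty(n, dtype=int)
    ranks[np.argsort(-np.abs(x))] = np.arange(1, n + 1)
    return np.where(ranks > K, 1.0, np.exp(-r * (K - ranks) / K))

x = np.array([5.0, -0.1, 3.0, 0.0])
w = sorted_l1_weights(x, K=3, r=1.0)
# the largest entry (5.0) gets the smallest weight; ranks beyond K get weight 1
```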

## 2.3.

### Video Compressive Sampling

A video can be considered as a series of images, as shown in Fig. 2 (left), where the coordinate space $(x,y,t)$ comprises both the spatial domain $(x,y)$ and the temporal domain $(t)$. Each frame can be regarded as a static natural image, which is redundant because natural images are intrinsically sparse in a specific transform domain.^{24}^{,}^{33} Another redundancy exists between similar frames in the temporal domain. As shown in Fig. 3, more than 85% of the pixels have no significant changes. Therefore, difference coding^{34} in the MPEG and H.265 series reuses existing frames and updates only the pixels with significant changes.
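This temporal redundancy is easy to quantify: count the fraction of pixels whose change between consecutive frames falls below a threshold. A minimal sketch with synthetic frames (the 85% figure in the text refers to Fig. 3, not to this toy example):

```python
import numpy as np

rng = np.random.default_rng(0)
frame1 = rng.random((64, 64))
frame2 = frame1.copy()
frame2[10:20, 10:20] += 0.5          # only a 10x10 region changes

diff = np.abs(frame2 - frame1)
unchanged = np.mean(diff < 0.1)      # fraction of effectively static pixels
# unchanged == 1 - 100/4096 ≈ 0.976
```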

As discussed in Sec. 1, the objective of compressive video sensing is to combine the compression and sampling procedures to achieve signal compression in hardware. In our proposed compressive video sensing, there are two types of image frames: intraframes (I-frames, or reference frames, in H.264) and interframes (P-frames in H.264), shown in Fig. 4. Compressive sampling is applied to both I-frames and P-frames, and each P-frame is reconstructed from the difference between it and its previous frame.

Since I-frames are treated as static images and image compressive sampling has already been studied for single-pixel cameras,^{7}^{,}^{35} a total variation algorithm^{36} is applied to recover intraframes from the I-frame sampling. For the P-frames, because the difference between similar frames is sparse, a nonconvex regularization is adopted to reduce the number of samples and thus increase the compression ratio. We compare the performance of four different nonconvex regularizations numerically and choose the best for the experiments. The four regularizations are: ${\ell}_{p}$ with IRL1, ISD, 2-level, and the nonconvex sorted ${\ell}_{1}$ ($m$-level). In IRL1, the weights are updated by

## Eq. (11)

$${\omega}_{i}^{l}=\frac{1}{|{x}_{i}|+\mathrm{max}({0.5}^{l-1},{0.8}^{8})}.$$

We compare the runtimes, root-mean-square error (RMSE), and peak signal-to-noise ratio (PSNR) for these four algorithms on the difference between two consecutive frames ($64\times 64$) in Fig. 5. The difference between the left and the middle images in Fig. 5 is shown on the right. We choose the measurement matrices to be randomized Bernoulli matrices with $\pm 1$ entries. The sampling rate (the number of measurements divided by the number of pixels) varies from 6% to 35%. The comparison result is shown in Fig. 6, where the $x$-axis represents the sampling rate. When the number of measurements is small, nonconvex algorithms are unstable because they can easily be trapped at stationary points, and the strategy for adaptively updating the weights may not work well. Overall, $m$-level is the most efficient and effective algorithm among these four. Therefore, we choose $m$-level in our experiments in Sec. 3.
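The RMSE and PSNR used in the comparison (and in Tables 1 and 2) can be computed as follows; a minimal sketch assuming an 8-bit peak value of 255:

```python
import numpy as np

def rmse(a, b):
    return np.sqrt(np.mean((np.asarray(a, float) - np.asarray(b, float)) ** 2))

def psnr(a, b, peak=255.0):
    # PSNR = 20 log10(peak / RMSE); e.g., an RMSE of 2.30 at peak 255 gives
    # about 40.9 dB, matching the scale of the values in Tables 1 and 2
    return 20.0 * np.log10(peak / rmse(a, b))

ref = np.zeros((64, 64))
rec = ref.copy()
rec[0, 0] = 64.0                     # a single wrong pixel
# rmse(ref, rec) == 1.0 and psnr(ref, rec) == 20*log10(255) ≈ 48.13 dB
```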

Though nonconvex algorithms are able to recover sparse signals accurately from a small number of linear measurements, error remains due to hardware noise and modeling error: the measurements are noisy, and the algorithms cannot recover the sparse signals exactly. In Fig. 7, we show the exact difference image between two frames on the left and compare it with the one recovered using the nonconvex sorted ${\ell}_{1}$ in the middle. Notice that the recovered difference image contains many isolated pixels with small nonzero values that should be zero. To address this, we develop a simple and effective method to remove these pixels and update only the pixels in areas with significant changes.

We apply the Sobel operator with a pair of $3\times 3$ convolution masks to the recovered difference image to find the edges, since the Sobel kernels compute the gradient with smoothing in both the horizontal and vertical directions. A threshold is then selected to obtain a binary mask indicating the pixels with large gradient values. However, this mask alone does not delineate the outline of the changing area of interest, so the binary gradient mask is dilated using a vertical structuring element followed by a horizontal structuring element to obtain a better outline. Because the mask shows only the edges of the difference image and the areas with significant changes lie inside these edges, the complete areas with significant changes are obtained by filling the holes inside the edges with a flood-fill operation (the MATLAB^{®} function “imfill”). This method keeps the most significant changes and removes error from the difference image so as to reduce the reconstruction error in P-frames. Figure 7(c) shows the performance of this postprocessing (denoising) procedure. The flow chart for this procedure is given in Fig. 8.
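The denoising procedure above can be prototyped with SciPy's ndimage in place of the MATLAB calls; the Sobel filtering, dilation, and hole filling below mirror the steps in the text, while the threshold and structuring-element sizes are illustrative choices, not the paper's settings:

```python
import numpy as np
from scipy import ndimage

def change_mask(diff, thresh=0.25):
    # 3x3 Sobel gradients (with smoothing) in both directions
    grad = np.hypot(ndimage.sobel(diff, axis=0), ndimage.sobel(diff, axis=1))
    mask = grad > thresh * grad.max()                 # binary gradient mask
    # dilate with a vertical, then a horizontal, structuring element
    mask = ndimage.binary_dilation(mask, structure=np.ones((3, 1), bool))
    mask = ndimage.binary_dilation(mask, structure=np.ones((1, 3), bool))
    # fill the interior of the outlined area (MATLAB imfill equivalent)
    return ndimage.binary_fill_holes(mask)

diff = np.zeros((32, 32))
diff[8:16, 8:16] = 1.0               # true change region
diff[25, 25] = 0.05                  # isolated erroneous pixel
mask = change_mask(diff)
denoised = diff * mask               # the isolated pixel is removed
```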

Due to the frame-difference sensing mechanism, the reconstruction error accumulates: each P-frame is reconstructed from the difference between two consecutive frames, so the error in the first P-frame propagates to the second P-frame. Therefore, the reconstruction of the first P-frame after an I-frame is very important, and an improvement in this frame also improves the following P-frames. Alternatively, if the number of P-frames between two consecutive I-frames is small, we can compute the difference image between each P-frame and the previous I-frame to avoid the error accumulated from previous P-frames.

The next numerical experiment shows that the simple denoising procedure improves the reconstruction of the first P-frame and of all the P-frames after it. In this experiment, there are five P-frames after one I-frame. In Fig. 9, all five P-frames are plotted. The first row shows the five ground-truth frames (${\mathrm{P}}_{01}$ to ${\mathrm{P}}_{05}$). The second and third rows show the reconstruction results using the difference image between two consecutive frames, and the fourth and fifth rows show the reconstruction results using the difference image between P-frames and the I-frame. The reconstructions using $m$-level without the denoising step are shown in the second row (${\mathrm{P}}_{11}$ to ${\mathrm{P}}_{15}$) and the fourth row (${\mathrm{P}}_{31}$ to ${\mathrm{P}}_{35}$); the reconstructions with the denoising step are shown in the third row (${\mathrm{P}}_{21}$ to ${\mathrm{P}}_{25}$) and the fifth row (${\mathrm{P}}_{41}$ to ${\mathrm{P}}_{45}$). The PSNR and RMSE values are listed in Tables 1 and 2. Both tables show that, when the difference images between two consecutive frames are used, the PSNR decreases and the RMSE increases over the five P-frames, and the denoising step improves all P-frames, especially the first one. When all P-frames are instead compared with the I-frame, the improvement from the denoising step is large for all five P-frames. This experiment suggests comparing P-frames with the previous I-frame rather than with the previous frame, because the errors in previous P-frames accumulate.

## Table 1

PSNR values for the five reconstructed P-frames with four methods: difference images between two consecutive images without the denoising step (m-level); difference images between two consecutive images with the denoising step (denoising); difference images between P-frames and the I-frame without the denoising step (m-level*); and difference images between P-frames and the I-frame with the denoising step (denoising*).

| | P01 | P02 | P03 | P04 | P05 |
| --- | --- | --- | --- | --- | --- |
| $m$-level | 40.8987 | 37.4587 | 36.2745 | 35.6323 | 35.0012 |
| Denoising | 42.3382 | 37.5839 | 36.6928 | 35.9371 | 35.0856 |
| $m$-level* | 40.8987 | 39.5386 | 40.2128 | 39.3341 | 39.5685 |
| Denoising* | 42.3382 | 40.5984 | 41.5240 | 40.7008 | 41.0858 |

## Table 2

RMSE values for the five reconstructed P-frames with four methods: difference images between two consecutive images without the denoising step (m-level); difference images between two consecutive images with the denoising step (denoising); difference images between P-frames and the I-frame without the denoising step (m-level*); and difference images between P-frames and the I-frame with the denoising step (denoising*).

| | P01 | P02 | P03 | P04 | P05 |
| --- | --- | --- | --- | --- | --- |
| $m$-level | 2.2994 | 3.4167 | 3.9157 | 4.2163 | 4.5340 |
| Denoising | 1.9482 | 3.3678 | 3.7317 | 4.0709 | 4.4901 |
| $m$-level* | 2.2994 | 2.6891 | 2.4883 | 2.7532 | 2.6799 |
| Denoising* | 1.9482 | 2.3802 | 2.1396 | 2.3523 | 2.2504 |

The whole algorithm for P-frame reconstruction is given in Table 3. Steps (a) to (c) show the nonconvex sorted ${\ell}_{1}$ calculation process, while steps (d) and (e) describe the edge-detection denoising procedure that reduces the error in compressive video sensing.

## Table 3

P-frames reconstruction algorithm.

| Algorithm |
| --- |
| Initialize ${\mathbf{x}}_{0}$, $\beta$, $r$, and $\tau$ |
| **for** $l$ = 1: maxit |
| &nbsp;&nbsp;a. Compute ${K}^{l}$ |
| &nbsp;&nbsp;b. Update ${\omega}^{l}$ |
| &nbsp;&nbsp;c. Apply one forward–backward iteration and check the stopping rules |
| **end** |
| d. Find the areas with significant changes |
| e. Reconstruct the P-frame by updating only the pixel values in the areas identified in the previous step |

## 3.

## Experiments

The projection measurement matrices can be implemented by spatial light modulators such as the digital micromirror device (DMD) and the liquid crystal on silicon. Current DMDs run as fast as 32,000 Hz; we use a DMD at 6000 Hz in the experiments. A DMD chip has several thousand microscopic mirrors arranged in a rectangular array on its surface, corresponding to the pixels in the image to be reconstructed. The mirrors can be individually rotated by $\pm 12\text{\hspace{0.17em}\hspace{0.17em}}\mathrm{deg}$ to an on or off state; these two states correspond to the $\pm 1$ entries in the Bernoulli matrix. During the sampling process, the measurement matrix is sent to the DMD controller row by row. The matrices for P-frames are selected from the tail of the matrix for the previous I-frame; e.g., if the previous I-frame measurement matrix is $\mathrm{\Phi}\in {\mathbb{R}}^{m\times n}$, then the P-frame measurement matrix is $\mathrm{\Phi}(m-p+1:m,:)\in {\mathbb{R}}^{p\times n}$ with $p\ll m$. In the experiments, the irradiator (THORLABS LIU850A) is an 850-nm near-infrared source, and a silicon photodiode (THORLABS FDS1010) serves as the receiver sensor.
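Reusing the I-frame measurement pattern for P-frames is just a row slice of $\mathrm{\Phi}$; a minimal sketch using the sampling ratios from the first experiment (18% and 8.5%):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64 * 64                           # pixels per frame
m = int(0.18 * n)                     # I-frame measurements (18%)
p = int(0.085 * n)                    # P-frame measurements (8.5%)

Phi = rng.choice([-1, 1], size=(m, n))   # Bernoulli pattern loaded onto the DMD
# MATLAB-style Phi(m-p+1:m, :) is Phi[m-p:, :] with zero-based indexing
Phi_p = Phi[m - p:, :]
# Phi_p.shape == (p, n); no extra pattern upload is needed for P-frames
```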

We validate the proposed nonconvex compressive video sensing system with two experiments: a linearly moving object and a rotating object. In the first experiment, with a linearly moving airplane in Fig. 10, the frame rate is 10 fps. There is only one P-frame between two consecutive I-frames, i.e., ${t}_{00},{t}_{02},\dots ,{t}_{16}$ are I-frames, while ${t}_{01},{t}_{03},\dots ,{t}_{17}$ are P-frames. The sampling ratios are 18% and 8.5% for I-frames and P-frames, respectively. The proposed system records the whole scene in real time.

The second experiment is to capture the rotation of a fan. As shown in Fig. 11, each blade is designed with a different length for easy identification. There are three P-frames between two consecutive I-frames, and each row in Fig. 11 shows one I-frame on the first column and three P-frames after the I-frame on the last three columns. The frame rate is 18 fps, and the sampling ratios are 20% and 9% for I-frames and P-frames, respectively.

## 4.

## Conclusions

Nonconvex compressive sensing algorithms require fewer linear measurements to reconstruct a sparse signal than convex algorithms do. In this work, the nonconvex sorted ${\ell}_{1}$ approach is employed to reconstruct the difference images, which are sparse, and thereby decrease the sampling rate. Furthermore, an edge-detection-based denoising step is applied to reduce the error in the difference image. Thus, fewer measurements are required compared with conventional compressive video sensing. We tested our algorithm on real-time video reconstruction in the experiments. Though the frame rate in the experiments is only 18 fps, it can reach up to 105 fps based on the current DMD mirror speed (maximum 32,000 Hz).

## Acknowledgments

This research work was partially supported under National Science Foundation Grants Nos. IIS-0713346 and DMS-1621798, Office of Naval Research Grants Nos. N00014-04-1-0799 and N00014-07-1-0935, the U. S. Army Research Laboratory, and the U. S. Army Research Office under Grant No. W911NF-14-1-0327.

## References

## Biography

**Liangliang Chen** received his bachelor’s and master’s degrees in electrical engineering from the Huazhong University of Science and Technology, Wuhan, China, in 2009 and 2007, respectively. Currently, he is pursuing his PhD at Michigan State University, East Lansing. His research interests include infrared sensor and imaging, ultraweak signal detection in nanosensors, signal processing, analog circuits, and carbon nanotube/graphene nanosensors.

**Ming Yan** received his PhD from the University of California, Los Angeles, in 2012. He is an assistant professor at the Department of Computational Mathematics, Science and Engineering and the Department of Mathematics, Michigan State University. His research interests include signal and image processing, optimization, and parallel and distributed methods for large-scale datasets.

**Chunqi Qian** received his BS degree in chemistry from Nanjing University and his PhD in physical chemistry from the University of California, Berkeley, in 2007. Following postdoctoral trainings at the National High Magnetic Field Laboratory and the National Institutes of Health, he joined Michigan State University as an assistant professor in radiology. His research interest includes the development and application of imaging technology in biomedical research.

**Ning Xi** received his DSc degree in systems science and mathematics from Washington University in St. Louis, Missouri, USA, in 1993. Currently, he is the chair professor of robotics and automation at the Department of Industrial and Manufacturing System, and director of Emerging Technologies Institute of the University of Hong Kong. He is a fellow of the Institute of Electrical and Electronics Engineers (IEEE). His research interests include robotics, manufacturing automation, micro/nanomanufacturing, nanosensors and devices, and intelligent control and systems.

**Zhanxin Zhou** received her bachelor’s and master’s degrees in control engineering from the Second Artillery Engineering College, Xi’an, China, in 1992 and 1997, respectively. She received her PhD in control engineering from Beijing Institute of Technology, Beijing, China, in 2008. Her research interests include infrared imaging, image enhancement, nonlinear filtering, and optimal control.

**Yongliang Yang** received his BS degree in mechanical engineering from Harbin Engineering University, Harbin, China, in 2005. He received his MS and PhD degrees from the University of Arizona, Tucson, USA, in 2012 and 2014, respectively. He has been a research associate at Michigan State University since 2014. His research interests include micro/nanorobotics and their application in biomedicine.

**Bo Song** received his BEng degree in mechanical engineering from Dalian University of Technology, Dalian, China, in 2005, and his MEng degree in electrical engineering from the University of Science and Technology of China, Hefei, China, in 2009. Currently, he is pursuing his PhD at the Department of Electrical and Computer Engineering, Michigan State University, East Lansing. His research interests include nanorobotics, nonvector space control, compressive sensing, and biomechanics.

**Lixin Dong** received his BS and MS degrees in mechanical engineering from Xi’an University of Technology, Xi’an, China, in 1989 and 1992, respectively, and his PhD in microsystems engineering from Nagoya University, Nagoya, Japan, in 2003. He is an associate professor at Michigan State University. His research interests include nanorobotics, nanoelectromechanical systems, mechatronics, mechanochemistry, and nanobiomedical devices. He is a senior editor of the *IEEE Transactions on Nanotechnology*.