Proc. SPIE. 10396, Applications of Digital Image Processing XL
KEYWORDS: Infrared imaging, Compressed sensing, Embedded systems, Facial recognition systems, Video processing, Field programmable gate arrays, Detection and tracking algorithms, Digital signal processing, Classification systems
We propose a platform for robust face recognition in Infrared (IR) images using Compressive Sensing (CS). In line with CS theory, the classification problem is solved in a sparse representation framework, where each test image is modeled as a linear combination of the training set. Because the training set constitutes an over-complete dictionary, we identify new images by finding their sparsest representation over the training set, using standard l1-minimization algorithms. Unlike conventional face-recognition algorithms, feature extraction is performed using random projections with a precomputed binary matrix, as proposed in the CS literature. This random sampling reduces the effects of noise and occlusions such as facial hair, eyeglasses, and disguises, which are notoriously challenging in IR images. Thus, the performance of our framework is robust to these noise and occlusion factors, achieving an average accuracy of approximately 90% when the UCHThermalFace database is used for training and testing. We implemented our framework on a high-performance embedded digital system, where the sparse representation of the IR images is computed by dedicated hardware using a deeply pipelined architecture on a Field-Programmable Gate Array (FPGA).
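The sparse-representation pipeline described above can be sketched in software as follows. This is an illustrative stand-in, not the hardware implementation: the projection size, the ISTA solver used for the l1-minimization step, and all data are assumptions made for the example.

```python
import numpy as np

def random_binary_projection(n_features, n_proj, rng):
    # CS-style measurement matrix with random +/-1 entries (one common
    # binary choice; the paper uses a precomputed binary matrix).
    return rng.choice([-1.0, 1.0], size=(n_proj, n_features))

def ista_l1(A, y, lam=0.1, iters=1000):
    # Iterative shrinkage-thresholding (ISTA), a standard l1 solver:
    # minimizes 0.5*||Ax - y||^2 + lam*||x||_1.
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        g = A.T @ (A @ x - y)              # gradient of the quadratic term
        x = x - g / L
        x = np.sign(x) * np.maximum(np.abs(x) - lam / L, 0.0)  # soft threshold
    return x

def src_classify(A, labels, y, x):
    # Sparse-representation classification: pick the class whose training
    # columns reconstruct the test vector with the smallest residual.
    residuals = {}
    for c in set(labels):
        mask = np.array([l == c for l in labels])
        xc = np.where(mask, x, 0.0)
        residuals[c] = np.linalg.norm(y - A @ xc)
    return min(residuals, key=residuals.get)

# Synthetic demo: two "subjects", five training vectors each.
rng = np.random.default_rng(0)
proto = {0: rng.normal(size=100), 1: rng.normal(size=100)}
cols, labels = [], []
for c in (0, 1):
    for _ in range(5):
        v = proto[c] + 0.05 * rng.normal(size=100)
        cols.append(v / np.linalg.norm(v))
        labels.append(c)
train = np.column_stack(cols)              # over-complete dictionary
P = random_binary_projection(100, 40, rng) # random feature extraction
A = P @ train
test_img = proto[1] + 0.05 * rng.normal(size=100)
y = P @ test_img
pred = src_classify(A, labels, y, ista_l1(A, y))
```

In this toy setup `pred` selects subject 1, since its training columns yield the smallest reconstruction residual.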
Broadcasting an outdoor sports event in daytime is a challenging task due to the high contrast between shadowed and brightly lit areas within the same scene. Commercial cameras typically do not handle the high dynamic range of such scenes properly, resulting in broadcast streams with very little shadow detail. We propose a hardware architecture for real-time shadow removal in high-resolution video, which reduces the shadow effect while recovering shadow detail. The algorithm operates only on the shadow portions of each video frame, thus improving the results and producing more realistic images than algorithms that operate on the entire frame, such as simplified Retinex and histogram shifting. The architecture receives an input in the RGB color space, transforms it into the YIQ space, and uses color information from both spaces to produce a mask of the shadow areas present in the image. The mask is then filtered using a connected-components algorithm to eliminate false positives and false negatives. The hardware uses pixel information at the edges of the mask to estimate the illumination ratio between light and shadow in the image, which is then used to correct the shadow area. Our prototype implementation simultaneously processes up to 7 video streams of 1920×1080 pixels at 60 frames per second on a Xilinx Kintex-7 XC7K325T FPGA.
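The mask-then-correct flow can be sketched as follows. This is a simplified software illustration: the threshold-on-luminance mask and the ratio estimate from region means are assumptions for the example, whereas the hardware combines RGB and YIQ color information and samples pixels at the mask edges.

```python
import numpy as np

def rgb_to_yiq(rgb):
    # Standard NTSC RGB -> YIQ conversion matrix.
    M = np.array([[0.299, 0.587, 0.114],
                  [0.596, -0.274, -0.322],
                  [0.211, -0.523, 0.312]])
    return rgb @ M.T

def shadow_mask(rgb, y_thresh=0.3):
    # Simplified mask: flag low-luminance pixels as candidate shadow.
    return rgb_to_yiq(rgb)[..., 0] < y_thresh

def correct_shadow(rgb, mask):
    # Estimate the light/shadow illumination ratio from the mean luminance
    # of lit vs. shadowed pixels, then scale the shadow pixels by it.
    y = rgb_to_yiq(rgb)[..., 0]
    ratio = y[~mask].mean() / max(y[mask].mean(), 1e-6)
    out = rgb.copy()
    out[mask] = np.clip(rgb[mask] * ratio, 0.0, 1.0)
    return out

# Demo frame: lit gray on the left half, shadowed gray on the right half.
img = np.zeros((10, 10, 3))
img[:, :5] = 0.8
img[:, 5:] = 0.2
mask = shadow_mask(img)
out = correct_shadow(img, mask)
```

After correction, the shadowed half is lifted to the luminance of the lit half, which is the intended effect of the illumination-ratio stage.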
We present a striping-noise compensation architecture for hyperspectral push-broom cameras, implemented on a Field-Programmable Gate Array (FPGA). The circuit is fast, compact, and low power, and is capable of eliminating the striping noise in-line during the image acquisition process. The architecture implements a multidimensional neural network (MDNN) algorithm for striping-noise compensation previously reported by our group. The algorithm relies on the assumption that the amount of light impinging on neighboring photo-detectors is approximately the same in the spatial and spectral dimensions. Under this assumption, two striping-noise parameters are estimated using spatial and spectral information from the raw data. We implemented the circuit on a Xilinx Zynq XC7Z010 FPGA and tested it with images obtained from an NIR N17E push-broom camera, with a frame rate of 25 fps and a band-pixel rate of 1.888 MHz. The test setup consists of a loop of 320 samples of 320 spatial lines and 236 spectral bands between 900 and 1700 nanometers, captured in laboratory conditions with a rigid push-broom controller. The noise compensation core can run at more than 100 MHz and consumes less than 30 mW of dynamic power, using less than 10% of the logic resources available on the chip. It also uses one of the two ARM processors available on the FPGA for data acquisition and communication purposes.
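The neighboring-detector assumption can be illustrated with a much simpler classical destriping scheme: match each column's mean and standard deviation to the global statistics. This moment-matching sketch is not the MDNN algorithm the circuit implements, only a minimal software example of estimating per-detector gain and offset under the same assumption.

```python
import numpy as np

def destripe_moment_matching(img):
    # Per-column gain/offset correction under the assumption that
    # neighboring detectors receive nearly the same irradiance:
    # bring every column's mean and std to the frame-wide averages.
    col_mean = img.mean(axis=0)
    col_std = img.std(axis=0)
    g = col_std.mean() / np.maximum(col_std, 1e-9)   # gain correction
    o = col_mean.mean() - g * col_mean               # offset correction
    return img * g + o

# Demo: a scene profile that is identical for every detector column,
# corrupted by per-column (striping) gain and offset.
rng = np.random.default_rng(1)
f = np.sin(np.linspace(0.0, 3.0, 64))
true = np.repeat(f[:, None], 32, axis=1)
gain = 1.0 + 0.2 * rng.normal(size=32)
off = 0.5 * rng.normal(size=32)
raw = true * gain + off
out = destripe_moment_matching(raw)
```

Because the true columns share their statistics, the correction removes the stripes exactly (up to a global affine scaling of the scene).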
We present a custom digital architecture for bruised apple classification using hyperspectral images in the near infrared (NIR) spectrum. The algorithm classifies each pixel in an image into one of three classes: bruised, non-bruised, and background. We extract two 5-element feature vectors for each pixel using only 10 out of the 236 spectral bands provided by the hyperspectral camera, thereby greatly reducing both the requirements of the imager and the computational complexity of the algorithm. We then use two linear-kernel support vector machines (SVMs) to classify each pixel. Each SVM was trained with 504 17×17-pixel windows per class, taken from 14 hyperspectral images of 320×320 pixels each. The architecture then computes the percentage of bruised pixels in each apple in order to adequately classify the fruit. We implemented the architecture on a Xilinx Zynq Z-7010 field-programmable gate array (FPGA) and tested it on images from an NIR N17E push-broom camera with a frame rate of 25 fps, a band-pixel rate of 1.888 MHz, and 236 spectral bands between 900 and 1700 nanometers in laboratory conditions. Using 28-bit fixed-point arithmetic, the circuit accurately discriminates 95.2% of the pixels corresponding to an apple, 81% of the pixels corresponding to a bruised apple, and 96.4% of the background. With the default threshold settings, the highest false-positive (FP) rate, obtained for the bruised-apple class, is 18.7%. The circuit operates at the native frame rate of the camera, consumes 67 mW of dynamic power, and uses less than 10% of the logic resources on the FPGA.
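The per-pixel SVM decision stage and the fruit-level thresholding can be sketched as below. The weights, biases, feature values, and the 5% bruised-fraction threshold are all hypothetical; a linear SVM decision reduces at inference time to a dot product plus bias, which is what the hardware evaluates per pixel.

```python
import numpy as np

def classify_pixels(features, w_bg, b_bg, w_br, b_br):
    # Two linear SVM decision functions (weights hypothetical): the first
    # separates fruit from background, the second bruised from sound fruit.
    is_fruit = features @ w_bg + b_bg > 0
    is_bruised = (features @ w_br + b_br > 0) & is_fruit
    return is_fruit, is_bruised

def classify_apple(is_fruit, is_bruised, bruise_frac=0.05):
    # Label the whole fruit from the fraction of bruised pixels.
    fruit_px = is_fruit.sum()
    if fruit_px == 0:
        return "background"
    return "bruised" if is_bruised.sum() / fruit_px > bruise_frac else "sound"

# Synthetic 5-band features: dark background, bright sound fruit,
# intermediate bruised tissue (values are illustrative only).
features = np.vstack([np.full((20, 5), 0.1),   # background pixels
                      np.full((80, 5), 0.9),   # sound-fruit pixels
                      np.full((10, 5), 0.5)])  # bruised pixels
w_bg, b_bg = np.ones(5), -1.5                  # fruit if sum of bands > 1.5
w_br, b_br = -np.ones(5), 3.5                  # bruised if sum of bands < 3.5
is_fruit, is_bruised = classify_pixels(features, w_bg, b_bg, w_br, b_br)
verdict = classify_apple(is_fruit, is_bruised)
```

With these made-up numbers, 90 pixels are fruit, 10 of them bruised, and the fruit is labeled "bruised" because the bruised fraction exceeds the threshold.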
We present a face-classification architecture for long-wave infrared (IR) images implemented on a Field Programmable
Gate Array (FPGA). The circuit is fast, compact, and low power; it can recognize faces in real time and
can be embedded in a larger image-processing and computer-vision system operating locally on an IR camera. The
algorithm uses Local Binary Patterns (LBP) to perform feature extraction on each IR image. First, each pixel
in the image is represented as an LBP pattern that encodes the similarity between the pixel and its neighbors.
Uniform LBP codes are then used to reduce the number of patterns to 59 while preserving more than 90% of the
information contained in the original LBP representation. Then, the image is divided into 64 non-overlapping
regions, and each region is represented as a 59-bin histogram of patterns. Finally, the algorithm concatenates all
64 regions to create a 3,776-bin spatially enhanced histogram. We reduce the dimensionality of this histogram
using Linear Discriminant Analysis (LDA), which improves clustering and enables us to store an entire database
of 53 subjects on-chip. During classification, the circuit applies LBP and LDA to each incoming IR image in real
time, and compares the resulting feature vector to each pattern stored in the local database using the Manhattan
distance. We implemented the circuit on a Xilinx Artix-7 XC7A100T FPGA and tested it with the UCHThermalFace
database, which consists of 28 81×150-pixel images of each of its 53 subjects, in indoor and outdoor conditions.
The circuit achieves a 98.6% hit ratio, trained with 16 images and tested with 12 images of each subject in the
database. Using a 100 MHz clock, the circuit classifies 8,230 images per second, and consumes only 309mW.
This paper presents a digital architecture for face detection on infrared (IR) images. We use Local Binary
Patterns (LBP) to build a feature vector for each pixel, which represents the texture of the image in a vicinity
of that pixel. We use a Support Vector Machine (SVM), trained with 306 images of 51 different subjects, to
recognize human face textures. Finally, we group the classified pixels into rectangular boxes enclosing the faces
using an algorithm for connected components. These boxes can then be used to track, count, or identify faces
in a scene, for example. We implemented our architecture on a Xilinx XC6SLX45 FPGA and tested it on 306
IR images of 51 subjects, different from the data used to train the SVM. The circuit correctly identifies 100%
of the faces in the images, with a false-positive rate of 4.5%. We also tested the system on a set of IR video
streams featuring multiple faces per image, with varied poses and backgrounds, and obtained a hit rate of 94.5%,
with 7.2% false positives. The circuit uses less than 25% of the logic resources available on the FPGA, and can
process 313 640×480-pixel images per second with a 100 MHz clock, while consuming 266 mW of power.
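The final grouping stage, which turns SVM-classified pixels into face bounding boxes via connected components, can be sketched as follows. The 4-connectivity, BFS traversal, and minimum-area filter for discarding small false-positive blobs are illustrative choices, not the exact hardware algorithm.

```python
import numpy as np
from collections import deque

def face_boxes(mask, min_area=4):
    # Group face-texture pixels into connected components (4-connectivity)
    # and return one bounding box (r0, c0, r1, c1) per component.
    H, W = mask.shape
    seen = np.zeros_like(mask, dtype=bool)
    boxes = []
    for r in range(H):
        for c in range(W):
            if mask[r, c] and not seen[r, c]:
                q = deque([(r, c)])
                seen[r, c] = True
                r0 = r1 = r
                c0 = c1 = c
                area = 0
                while q:
                    y, x = q.popleft()
                    area += 1
                    r0, r1 = min(r0, y), max(r1, y)
                    c0, c1 = min(c0, x), max(c1, x)
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < H and 0 <= nx < W \
                                and mask[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            q.append((ny, nx))
                if area >= min_area:  # small blobs treated as false positives
                    boxes.append((r0, c0, r1, c1))
    return boxes

# Demo: one 3x3 face blob plus an isolated noise pixel.
mask = np.zeros((10, 10), dtype=bool)
mask[1:4, 1:4] = True
mask[8, 8] = True
boxes = face_boxes(mask, min_area=4)
```

Only the 9-pixel blob survives the area filter, yielding a single box that could then be used to track, count, or identify the face.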
We present a model and a signal-processing algorithm for compensating the nonuniformity (NU) noise and the surrounding-temperature self-heating effects on the response of uncooled microbolometer-based infrared cameras. The model for the NU noise considers pixelwise gain and offset parameters. The self-heating dynamics of the camera are represented by an autoregressive moving average (ARMA) model of the camera's internal temperature. The algorithm initially performs a two-point calibration at a known surrounding temperature. Next, without modifying the NU parameters, we dynamically compensate variations in the camera readout using both estimates of the ARMA model and measurements of the surrounding temperature taken by a simple sensor embedded in the camera. Tested on a CEDIP Jade UC33 camera, our system compensates reference black-body images at 30 degrees Celsius, with a peak error below 1.3 and a mean error below 0.3 degrees Celsius, in scenarios where the room temperature varied by up to 14 degrees Celsius. Moreover, the regularity and simplicity of the algorithm enable us to implement it on embedded digital hardware, thereby reducing its cost, size, and power consumption. We implemented the algorithm on a Xilinx XC6SLX45 FPGA using fixed-point arithmetic. The circuit exhibits an arithmetic error of 0.06 degrees compared to a software double-precision implementation. It compensates 320×240-pixel video at up to 1,437 fps and 640×480-pixel video at up to 360 fps, using 1% of the logic resources of the FPGA and less than 1 mW of dynamic power at 110 MHz. Adding Gigabit Ethernet communication, HDMI display, and a pseudocolor map on the chip uses 10% of the resources and consumes 915 mW.
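The static two-point calibration stage of the pixelwise gain/offset model can be sketched as below. This covers only the NU correction at a fixed surrounding temperature; the ARMA-based self-heating compensation is applied on top of it in the full algorithm, and all numbers here are synthetic.

```python
import numpy as np

def two_point_calibration(frame_cold, frame_hot, t_cold, t_hot):
    # Per-pixel gain and offset from two uniform black-body references:
    # invert raw = g*T + o so that corrected = gain*raw + offset = T.
    gain = (t_hot - t_cold) / (frame_hot - frame_cold)
    offset = t_cold - gain * frame_cold
    return gain, offset

def nuc_correct(frame, gain, offset):
    # Apply the pixelwise affine correction to a raw frame.
    return gain * frame + offset

# Synthetic detector array: per-pixel gain g and offset o (NU noise),
# observed against uniform black bodies at 20 and 40 degrees Celsius.
rng = np.random.default_rng(2)
g = 1.0 + 0.1 * rng.normal(size=(8, 8))
o = 5.0 * rng.normal(size=(8, 8))
cold = g * 20.0 + o
hot = g * 40.0 + o
gain, offset = two_point_calibration(cold, hot, 20.0, 40.0)
corrected = nuc_correct(g * 30.0 + o, gain, offset)   # raw frame at 30 C
```

A uniform 30-degree scene comes out flat at 30 after correction; in the full system this result would then be adjusted dynamically using the ARMA estimate of the internal temperature.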
This paper presents a digital hardware filter that estimates the nonuniformity (NU) noise in an Infrared Focal Plane Array (IRFPA) and corrects it in real time. Implementing the algorithm in hardware results in a fast, compact, low-power nonuniformity correction (NUC) system that can be embedded into an intelligent imager at a very low cost. Because it does not use an external reference, our NUC circuit works in real time during normal operation, and can track parameter drift over time. Our NUC system models NU noise as a spatially regular source of additive noise, uses a Kalman filter to estimate the offset in each detector of the array, and applies an inverse model to recover the original information captured by the detector. The NUC board uses a low-cost Xilinx Spartan 3E XC3S500E FPGA operating at 75 MHz. The NUC circuit consumes 17.3 mW of dynamic power and uses only 10% of the logic resources of the FPGA. Despite ignoring the multiplicative effects of nonuniformity, our NUC circuit reaches a Peak Signal-to-Noise Ratio (PSNR) of 35 dB in under 50 frames, referenced to two-point calibration using black bodies. This performance lies within 0.35 dB of a double-precision Matlab implementation of the algorithm. Without the bandwidth limitations currently imposed by the external RAM that stores the offset estimations, our circuit can correct 320×240-pixel video at up to 1,254 frames per second.
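A per-detector scalar Kalman filter for the additive offsets can be sketched as follows. The measurement model (readout minus the spatial frame mean, exploiting the spatially regular noise assumption), the process/measurement noise values, and the synthetic data are all assumptions for this example, not the exact filter the circuit implements.

```python
import numpy as np

def estimate_offsets(frames, q=1e-5, r=0.1):
    # Scalar Kalman filter per detector. State: additive offset, modeled as
    # nearly constant (small process noise q). Pseudo-measurement: readout
    # minus the spatial mean of the frame, which removes the common scene
    # level under the spatially-regular-noise assumption.
    n, p = frames.shape
    x = np.zeros(p)        # offset estimate per detector
    P = np.ones(p)         # estimate variance per detector
    for k in range(n):
        z = frames[k] - frames[k].mean()
        P = P + q          # predict: offset assumed (nearly) constant
        K = P / (P + r)    # Kalman gain
        x = x + K * (z - x)
        P = (1.0 - K) * P
    return x

def correct_frame(frame, offsets):
    # Inverse model for purely additive NU noise.
    return frame - offsets

# Synthetic sequence: a uniform scene level per frame, plus fixed zero-mean
# per-detector offsets (the NU noise) and small temporal read noise.
rng = np.random.default_rng(3)
offs = rng.normal(size=50)
offs -= offs.mean()
frames = np.empty((300, 50))
for k in range(300):
    frames[k] = 10.0 * rng.random() + offs + 0.01 * rng.normal(size=50)
est = estimate_offsets(frames)
corrected = correct_frame(frames[-1], est)
```

After a few hundred frames the estimates converge to the true offsets, so the corrected frame is nearly uniform, which is the behavior the PSNR figures above quantify for the hardware filter.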