This PDF file contains the front matter associated with SPIE Proceedings Volume 11399, including the Title Page, Copyright information, and Table of Contents.
In the field of IR technology, careful regulation of temperature elements, such as blackbodies or temperature targets, is important, particularly for calibration. Feedback-based controllers that do not require a state-space model, such as Proportional-Integral-Derivative (PID) controllers, are a popular and effective way to control these temperature systems. In this paper we explore different types of control for a prototype heated reference target. We show that a combination of PID and Least Mean Squares (LMS) closed-loop adaptive control can determine both the optimal weight proportion and the magnitude of the weights for optimal power draw. This yields a faster and better-tuned controller than manual weight tuning.
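As a rough illustration of the feedback loop described above, a minimal discrete PID step might look like the following sketch. The gains, the first-order plant model, and the setpoint are illustrative assumptions, not the authors' prototype parameters.

```python
# Minimal discrete PID controller sketch; gains and the toy plant
# below are illustrative assumptions, not the paper's target system.
class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

def simulate(steps=100):
    pid = PID(kp=2.0, ki=0.0, kd=0.0, dt=1.0)
    temp = 20.0                       # initial target temperature (deg C)
    for _ in range(steps):
        power = pid.step(50.0, temp)  # drive toward a 50 deg C setpoint
        temp += 0.1 * power           # toy integrator plant: dT = 0.1 * u * dt
    return temp
```

The adaptive scheme in the paper would additionally update the weights (kp, ki, kd) online from the error signal via LMS, which is what replaces manual tuning.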
Recent breakthroughs in EO/IR sensing, real-time signal processing, and deep machine learning technologies have enabled standoff heart rate estimation from facial and body video. This technology is also known as remote photoplethysmography (rPPG). Research and development of rPPG has attracted much attention recently. This paper gives a timely review of this fast-paced field to give researchers, engineers, and graduate students a quick grasp of the recent advancement of rPPG. We first review two rPPG design approaches: color-variation-based and motion-based detection. To enable rPPG for less constrained use cases, various signal processing and machine learning algorithms have been developed to handle signal variabilities introduced by the lighting source, view angle, and subject motion. To help newcomers quickly start work in this field, we then describe some existing rPPG research datasets, open-source rPPG research tools, and some demonstration systems. Six commonly used rPPG algorithm evaluation metrics are described to evaluate and visualize the research advances in this field. As rPPG technology matures, more application domains become possible. We cover six applications of rPPG in the commercial, security, and defense domains, including emerging applications in biometric liveness and video media authenticity. Finally, we outline some challenges yet to be overcome, especially in the security and defense domain: unconstrained outdoor environments, rPPG from airborne platforms, nighttime operation, and moving and non-cooperative subjects. These challenges require special algorithmic considerations.
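The color-variation approach can be sketched in a few lines: average the green channel over the face region per frame, remove the DC component, and pick the dominant frequency in the physiological band. The trace below is synthetic (a 1.2 Hz sinusoid standing in for a real green-channel signal), and the sampling rate and band limits are assumptions for illustration.

```python
import math

def estimate_bpm(trace, fs, lo_hz=0.7, hi_hz=4.0):
    """Estimate heart rate from a mean green-channel trace via a DFT peak
    restricted to the physiological band (color-variation rPPG sketch)."""
    n = len(trace)
    mean = sum(trace) / n
    x = [v - mean for v in trace]            # detrend (remove DC)
    best_bin, best_mag = 0, -1.0
    for k in range(1, n // 2):
        f = k * fs / n
        if not (lo_hz <= f <= hi_hz):
            continue
        re = sum(x[i] * math.cos(2 * math.pi * k * i / n) for i in range(n))
        im = sum(x[i] * math.sin(2 * math.pi * k * i / n) for i in range(n))
        mag = re * re + im * im
        if mag > best_mag:
            best_bin, best_mag = k, mag
    return 60.0 * best_bin * fs / n          # peak frequency in beats per minute

fs, n = 30.0, 300                            # 10 s of synthetic 30 fps video
pulse = [0.01 * math.sin(2 * math.pi * 1.2 * i / fs) for i in range(n)]  # 1.2 Hz = 72 bpm
```

Real pipelines add bandpass filtering, motion compensation, and learned components on top of this core, as the review discusses.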
In recent years, the utility of multi-channel image processing techniques has increased. Applications include color image registration, edge detection, filtering, color space correction, and multi-channel image matching. These techniques represent multiple image channels, such as three color channels, as image functions taking values in a finite-dimensional algebra, for example, the algebra of quaternions. Use of these algebras enables processing a multichannel image holistically, as a single entity. This holistic processing provides the opportunity to exploit cross-channel dependencies for improved performance, for example, improved matching performance in image matching or registration applications. In this paper, we critically analyze the foundations of quaternion-valued image matching. We investigate standard alternatives to using quaternions for four-channel image matching, including image tiling, image averaging, independent component processing, and coordinated component translation approaches. We examine the advantages that quaternion-based processing provides over these alternatives. We interpret the quaternion match metric as an inner product, which motivates image matching using quaternion correlation. We generalize the quaternion match metric to use arbitrary combinations of component correlations to define new match metrics. We show that these new match metrics correspond to new product structures on four-dimensional algebras. We present numerical results demonstrating and validating aspects of the theory.
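The inner-product interpretation can be made concrete with a toy sketch (not the paper's implementation): taking the real part of the sum of q1[i] * conj(q2[i]) over pixels reduces, for real-valued components, to the sum of the four per-channel dot products, i.e. the four component correlations at zero lag.

```python
def qconj(q):
    """Quaternion conjugate of (w, x, y, z)."""
    w, x, y, z = q
    return (w, -x, -y, -z)

def qmul(a, b):
    """Hamilton product of two quaternions (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw*bw - ax*bx - ay*by - az*bz,
            aw*bx + ax*bw + ay*bz - az*by,
            aw*by - ax*bz + ay*bw + az*bx,
            aw*bz + ax*by - ay*bx + az*bw)

def match_metric(img1, img2):
    """Real part of sum_i q1[i] * conj(q2[i]): the quaternion inner product.
    For real components this equals the sum of four per-channel dot products."""
    return sum(qmul(p, qconj(q))[0] for p, q in zip(img1, img2))

# A two-pixel, four-channel toy "image"
img = [(1.0, 2.0, 3.0, 4.0), (0.5, 0.0, -1.0, 2.0)]
```

Matching an image against itself gives the squared norm of all four channels jointly, which is exactly what makes the metric a holistic four-channel correlation.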
Recently, quaternions have been applied to multi-channel image processing tasks such as color image registration, edge detection, and image matching. The general approach uses up to four image components as separate components of a quaternion-valued image. Techniques from quaternion algebra and analysis, such as quaternion Fourier transforms and correlation, are exploited to implement the desired image processing operations. The motivation for using quaternions is the concept of holistic data processing: information in separate channels is processed jointly, thereby leveraging correlations across channels. Furthermore, quaternion transforms, such as the Fourier and wavelet transforms, have been defined and can be applied to process quaternion-valued imagery. For example, analogues of classical signal processing results, such as the convolution theorem, have been derived and can be used to reduce computational burden. Such quaternion-based image processing techniques can provide performance superior to classical approaches that process image components independently. However, under certain conditions, the performance of quaternion-based methods falls short of that of independent channel processing methods. In this paper, we perform a theoretical analysis of the performance of quaternion image matching. We present conditions under which quaternion-based processing outperforms, or falls short of, independent channel processing methods. We produce numerical image matching results validating the theoretical performance analysis using synthetic data. We define and solve a linear system of equations to generate the synthetic data. Receiver Operating Characteristic curves demonstrate the improved performance of the quaternion image matching approach under the favorable conditions predicted by the theory.
This paper offers a new multiple-signal restoration tool for solving the inverse problem in which signals are convolved with a multiple impulse response and then degraded by an additive noise signal with multiple components. Inverse problems arise in practically all areas of science and engineering and refer to methods of estimating data or parameters, in our case multiple signals, that cannot be observed directly. The presented tool is based on mapping multiple signals into the quaternion domain and then solving the inverse problem there. Due to the non-commutativity of quaternion arithmetic, it is difficult to find the optimal filter in the frequency domain for degraded quaternion signals. As an alternative, we introduce an optimal filter that applies special 4×4 matrices to the discrete Fourier transforms of the signal components at each frequency point. The optimality of the solution is with respect to the mean-square error, as in the classical theory of signal restoration by the Wiener filter. An illustrative example of optimal filtering of multiple degraded signals in the quaternion domain is given. Computer simulations validate the effectiveness of the proposed method.
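The per-frequency construction parallels the classical scalar Wiener filter; a one-channel sketch of that classical weight is below (illustrative only; the paper's contribution is the 4×4 matrix acting on the four quaternion components at each frequency, which this sketch does not implement). The frequency response and spectra used are assumed values.

```python
def wiener_weight(H_k, S_k, N_k):
    """Classical per-frequency Wiener weight W(k) = conj(H) S / (|H|^2 S + N),
    which minimizes the mean-square restoration error. The paper generalizes
    this scalar weight to a 4x4 matrix per frequency for quaternion signals."""
    return (H_k.conjugate() * S_k) / (abs(H_k) ** 2 * S_k + N_k)

H = 0.8 + 0.6j                      # blur response at one frequency bin (assumed)
S = 2.0                             # signal power at that bin (assumed)
W_noiseless = wiener_weight(H, S, 0.0)   # reduces to the inverse filter
W_noisy = wiener_weight(H, S, 1.0)       # noise shrinks the weight
```

With no noise the weight exactly inverts the blur (W·H = 1); as the noise power grows, the weight shrinks rather than amplifying noise, which is the behavior the quaternion-domain filter inherits.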
Analysis of point cloud records, in any dimension, has been shown to benefit from analysing the topological invariants of simplicial (or cell) complex shapes obtained with these points as vertices. This approach is based on rapid advances in computational algebraic topology that underpin the innovative Topological Data Analysis (TDA) paradigm. Simplicial complexes (SCs) of a given point cloud are constructed by connecting vertices to their nearest neighbours and gluing the interiors of real k-simplexes (k ≥ 2) to each set of (k+1) pairwise-connected vertices. This process is often done iteratively, over an increasing sequence of distance thresholds, to generate a nested sequence of SCs. The Persistent Homology (PH) of such a nested sequence of SCs records the lifespan of the homology invariants (the number of connected components, 2D holes, 3D tunnels, etc.) over the sequence of thresholds. Despite numerous success stories of TDA and its PH tool for computer vision and image classification, its deployment lags well behind the exponentially growing Deep Learning Convolutional Neural Network (CNN) schemes. The excessive computational cost of extracting PH features beyond small images is widely reported as a major contributor to this shortcoming of TDA. Many methods have been proposed to mitigate this problem, but only with modest success for large images, because the bottleneck lies in representing images by very large point clouds rather than in the computational cost of PH extraction itself. We propose an innovative approach that represents images by point clouds consisting of small sets of texture image landmarks, thereby creating a large number of efficiently extractable PH features for image analysis. We demonstrate the success of this approach on different image classification tasks as case studies.
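The 0-dimensional part of persistent homology (lifespans of connected components) can be computed with a union-find sweep over edge lengths; here is a minimal sketch on a toy 1-D point cloud (for illustration only, not the paper's landmark pipeline, which uses higher-dimensional features as well).

```python
def h0_persistence(points):
    """Death times of connected components as the distance threshold grows:
    sort all pairwise distances, merge components union-find style, and
    record the threshold at which each component dies (0-dim persistence).
    Every component is born at threshold 0; one component never dies."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    edges = sorted((abs(points[i] - points[j]), i, j)
                   for i in range(n) for j in range(i + 1, n))
    deaths = []
    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            deaths.append(d)                # a component dies at threshold d
    return deaths

deaths = h0_persistence([0.0, 1.0, 10.0])   # three points on a line
```

For the three points above, the two nearby points merge at threshold 1.0 and the outlier joins at 9.0, so the long-lived component signals the cluster separation, which is the kind of feature PH feeds to a classifier.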
Human body temperature is an important vital sign, especially for health monitoring and exercise training. In this study, we propose a CNN plus support vector machine (SVM) approach (CNN-SVM) to estimate body temperature from a sequence of facial images. The image sequence may come from multiple shots or from video frames captured with a smartphone camera. First, the facial region is cropped from a digital picture using a face detection algorithm, which can run on the smartphone or on the cloud side. Second, the batch of facial images is normalized and facial features are extracted using a pretrained CNN model. Lastly, a body temperature prediction model is trained on the CNN features using a multiclass SVM classifier. Feature extraction and classification are performed on the cloud side with GPU acceleration, and the predicted temperature is sent back to the mobile app for display. We have a facial sequence database of 144 subjects, with 12-18 facial images per subject. We selected the AlexNet, ResNet-50, VGG-19, and Inception-ResNet-v2 models for feature extraction. Initial results show that the performance of the proposed method is very promising.
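The normalization step in the second stage can be sketched as zero-mean, unit-variance scaling of each image. This is an assumed normalization scheme for illustration; the paper does not state the exact formula it uses.

```python
from statistics import pstdev

def normalize_image(pixels):
    """Zero-mean, unit-variance normalization of a flattened image, a common
    preprocessing step before feeding images to a pretrained CNN."""
    mean = sum(pixels) / len(pixels)
    std = pstdev(pixels) or 1.0        # guard against constant images
    return [(p - mean) / std for p in pixels]

norm = normalize_image([10.0, 20.0, 30.0, 40.0])
```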
Early-stage breast cancers are very challenging for computer-aided detection (CAD) because they are small and often blend in with surrounding tissue. One reason for the current CAD limitations may be the lack of temporal analysis: a radiologist usually examines the current and prior mammograms side by side to evaluate changes over time. We propose a CAD method for breast cancer screening that combines a recurrent neural network (RNN) and a convolutional neural network (CNN) with follow-up scans. First, mammographic images are examined by three cascading object detectors to detect suspicious cancerous regions, similar to generating region proposals. Then all regional images (one channel) are scaled to 224×224×3 and fed to a pretrained CNN (ResNet-50) to extract features. Features are extracted from a registered prior scan, the current scan, and their difference image, each yielding a 2048-dimensional vector before the fully-connected layer. Finally, the features from the three images are combined to train an RNN classifier. The RNN performs the temporal analysis and can factor in multiple follow-up scans. Our digital mammographic database includes 102 cases with cancerous masses or architectural distortions and 27 healthy subjects, each with two scans: a current scan (cancerous or healthy) and a prior scan (healthy, typically from one year earlier). Our experimental results show that the performance of the proposed CAD method is very promising.
With the latest advances in image sensor technology, cameras are able to generate video with tens of megapixels per frame. These high-resolution video streams offer great potential in the surveillance domain. For ground-based systems, gigapixel streams are already used to great effect, as illustrated by the ICME 2019 crowd counting challenge. However, for Unmanned Aerial Vehicles (UAVs), this vast data stream exceeds the transmission bandwidth available to send it back to the ground. On-board data analysis and selection are thus required to benefit from high-resolution cameras. This paper presents a result of the CAVIAR project, in which a combination of hardware and algorithms was designed to answer the question: 'how to exploit a high-resolution, high-frame-rate camera on board a UAV?'. Given the associated size, weight, and power limitations, we implement data reduction by deploying deep learning on hardware to find the relevant information and transmit it to an operator station. The proposed solution applies the high-resolution potential of the sensor only to objects of interest. We encode and transmit the identified regions of interest (ROIs) containing those objects at the original resolution and frame rate, while also transmitting the downscaled background to provide context for an operator. Using a 35 fps, 65-megapixel camera, we demonstrate that this set-up saves considerable bandwidth while retaining all important video data at high quality.
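The bandwidth saving can be illustrated with back-of-the-envelope numbers. The ROI pixel budget and the background resolution below are assumptions for illustration, not the project's actual figures; only the 65 MP / 35 fps sensor parameters come from the text.

```python
# Raw pixel rate of the full sensor stream (65 MP at 35 fps, from the paper)
full = 65_000_000 * 35

# Assumed reduced stream: 2 MP worth of full-resolution ROIs per frame,
# plus a 1920x1080 downscaled background for operator context.
roi = 2_000_000 * 35
background = 1920 * 1080 * 35
reduced = roi + background

reduction = full / reduced   # roughly a 16x pixel-rate reduction here
```

The actual saving depends on scene content (how many objects of interest are present) and on the video codec, so the ratio above is only indicative.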
Enhancement and segmentation of suspicious regions in a thermal breast image are among the most significant challenges facing radiologists when examining and interpreting thermogram images. The proposed work focuses on the following problems: how to increase the contrast between cancer regions and the background; how to adjust the intensity of the breast cancer region so that it is more homogeneous in the infrared image; how to efficiently segment tumors as suspicious regions with very weak contrast to their background; and how to extract the relevant features that separate tumors from the background. The proposed cancer segmentation scheme is composed of three main stages: (i) image enhancement; (ii) detection of the tumor region; and (iii) feature extraction from the segmented tumor area, followed by coloring the segmented region. The performance of the proposed enhancement and segmentation method was evaluated on the DMR-IR database; the average segmentation Accuracy, MCC, Dice, and Jaccard obtained are 98.8%, 47.96%, 43.03%, and 34.8%, respectively, which is better than the FCM, LCV-LSM, and EM-GMM methods. We also investigate the role of thermal image enhancement in tumor characterization and feature extraction.
Image segmentation is a critical step in imaging applications such as video surveillance and security in controlled areas: detection and recognition of objects, their classification, analysis of crowd behavior, identification (face recognition), and remote sensing of critical infrastructure for man-made disasters and other hazards. Several image segmentation tools have been developed recently. However, these tools have limitations and are sometimes inaccurate, since capture devices usually generate low-resolution images that are noisy and blurry. The goals of this study are: (1) to map images optimally into color images to enhance their contrast and the visibility of otherwise obscured details; (2) to perform automated segmentation using a modified Chan and Vese method; and (3) to study the impact of the segmentation evaluation method. Computer simulations on the thermal dataset show that the new segmentation algorithm exhibits better results than state-of-the-art techniques.
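The data term at the core of the Chan and Vese method alternates between two steps: compute the mean intensities c1, c2 inside and outside the contour, then reassign each pixel to the closer mean. A simplified sketch of just that region-competition core follows; the curvature (contour length) term of the full method is deliberately omitted, so this is an illustration, not the modified method of the paper.

```python
def two_phase_segment(pixels, iters=20):
    """Simplified Chan-Vese data term: alternate between computing the mean
    intensity of each region (c1, c2) and reassigning each pixel to the
    closer mean. The curvature (contour length) term is omitted."""
    inside = [p > sum(pixels) / len(pixels) for p in pixels]   # crude init
    for _ in range(iters):
        fg = [p for p, m in zip(pixels, inside) if m]
        bg = [p for p, m in zip(pixels, inside) if not m]
        if not fg or not bg:
            break
        c1, c2 = sum(fg) / len(fg), sum(bg) / len(bg)
        inside = [abs(p - c1) < abs(p - c2) for p in pixels]
    return inside

mask = two_phase_segment([0.0, 0.1, 0.2, 9.8, 10.0, 10.2])
```

On a real image the curvature term smooths the contour against noise, which is why the full level-set formulation is preferred for noisy thermal imagery.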
Often, state-of-the-art techniques for face detection exhibit suboptimal performance under poor illumination conditions. For instance, the absence of ambient light effectively challenges a model's capacity to extract the relevant global and local features needed to localize faces in images. This paper proposes a technique based on generative adversarial networks to improve the visual quality of images captured in low-light conditions and thereby significantly improve face detection and landmark localization in real-world images. An attention-guided cascaded residual generator network (AG-CRN), together with a Markovian discriminator, is trained in an adversarial manner to synthesize enhanced, properly illuminated images from corresponding low-light pairs. Extensive experiments on several datasets, including the DARK FACE dataset, demonstrate that AG-CRN produces visually pleasing images and significantly improves the performance of state-of-the-art models for face detection and landmark localization under very low-light conditions.
Skin lesion segmentation (SLS) plays a vital role in the early and precise diagnosis of skin cancer by computer-aided diagnosis (CAD) systems. However, automatic SLS in dermoscopic images is a challenging task due to substantial differences in color and texture, artifacts (hairs, gel bubbles, ruler markers), indistinct boundaries, low contrast, and the varying sizes, positions, and shapes of lesions. In this paper, we propose an extended GrabCut image segmentation algorithm for foreground/background dermoscopic image segmentation. The method integrates octree color quantization with a modified GrabCut method using a new energy function. Extensive computer simulation on ISIC 2017 shows that the method compares favorably, in both qualitative and quantitative evaluations, with commonly used segmentation tools.
During the past two decades, Oil Spill Detection (OSD) has received widespread attention from research communities. Both the detection and the analysis of oil spills are fundamental to improving the efficiency of maritime environmental ecosystems. Most recently, thermal imaging devices have been used in oil detection and disaster management projects, since they can provide spill information day and night and can work under adverse weather conditions. Nevertheless, the quality of these images is poor: they are noisy, blurry, and of low resolution. Moreover, the thermal contrast between oil and water is often so small that it makes OSD problematic and challenging. The goal of this paper is to automatically detect and analyze oil spills on the upper sea/ocean layer, which may help in the visualization of oil spills for disaster management purposes. For comparison, quantitative and qualitative analyses were conducted against existing segmentation approaches, namely Otsu and Sauvola, using two new databases, each composed of 100 diversified images extracted from two different videos. The performance of the proposed method was also evaluated with statistical measures: boundary error (BE), probabilistic rand index (PRI), variation of information (VI), global consistency error (GCE), and structural similarity index (SSI). The obtained results show that the proposed method is more robust, accurate, and straightforward. Future research recommendations and conclusions are presented.
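Otsu's method, used here as a comparison baseline, selects the global threshold that maximizes the between-class variance of the two resulting pixel classes; a minimal histogram-based sketch:

```python
def otsu_threshold(values, levels=256):
    """Otsu's global threshold: choose t maximizing the between-class
    variance w0*w1*(mu0 - mu1)^2 of the classes {v <= t} and {v > t}."""
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    best_t, best_var = 0, -1.0
    for t in range(levels - 1):
        w0 = sum(hist[: t + 1])               # class 0 population
        w1 = total - w0                       # class 1 population
        if w0 == 0 or w1 == 0:
            continue
        mu0 = sum(i * hist[i] for i in range(t + 1)) / w0
        mu1 = sum(i * hist[i] for i in range(t + 1, levels)) / w1
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t

t = otsu_threshold([10] * 50 + [200] * 50)    # synthetic bimodal image
```

This global criterion is exactly what struggles when the oil/water contrast is small, which motivates the local (Sauvola) baseline and the proposed method.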
In this paper, inverse gradient operators are considered that allow calculating the original image from its gradients, such as gradients in the horizontal and vertical directions. Different gradient operators are considered, and image reconstruction from the gradients is presented. The presented method of image reconstruction is also applied to mean operators. Examples with images processed by different gradients are given.
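For the simplest case, the horizontal forward-difference gradient, inversion reduces to a cumulative sum anchored at the first sample; a 1-D sketch of the operator and its inverse (an illustration of the general idea, not the paper's specific operators):

```python
def forward_diff(f):
    """Forward-difference gradient g[i] = f[i+1] - f[i]."""
    return [f[i + 1] - f[i] for i in range(len(f) - 1)]

def reconstruct(f0, g):
    """Invert the forward difference: f[i] = f[0] + sum of gradients up to i."""
    f = [f0]
    for d in g:
        f.append(f[-1] + d)
    return f

row = [3.0, 5.0, 4.0, 7.0]                       # one image row
recovered = reconstruct(row[0], forward_diff(row))
```

The boundary value f[0] must be stored alongside the gradient, since differencing discards the constant offset; more elaborate gradient operators need correspondingly richer boundary data.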
In quaternion algebra, color images are processed not by separating their color components; instead, these components are presented and processed together as one unit. Within the framework of the quaternion image representation, many effective methods of processing color images can be developed, including image enhancement in the spatial and frequency domains. In this paper, we unite two approaches to processing color images by proposing a quaternion representation in quantum imaging, which includes color images in the RGB model together with the grayscale component, or brightness. The concept of a quaternion two-qubit is considered and applied to image representation in each quantum pixel. The colors at each pixel are processed as one unit in the quaternion representation. Other new models for quaternion image representation are also described. It is shown that a quaternion image, or four-component image, of N × M pixels can be represented by (r + s + 2) qubits, where N = 2^r and M = 2^s, with r, s > 1. The number of qubits for representing the image can be reduced to (r + s) when using the quaternion two-qubit concept.
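The stated qubit counts follow directly from the pixel-index encoding: r + s qubits address the N × M pixel grid and two more qubits select among the four quaternion components. A quick arithmetic check:

```python
import math

def qubits_standard(N, M):
    """(r + s + 2) qubits for an N x M quaternion image with N = 2**r,
    M = 2**s: r + s qubits index the pixel, 2 qubits the four components."""
    r, s = int(math.log2(N)), int(math.log2(M))
    assert 2 ** r == N and 2 ** s == M, "dimensions must be powers of two"
    return r + s + 2

def qubits_two_qubit_scheme(N, M):
    """(r + s) qubits under the quaternion two-qubit concept."""
    return qubits_standard(N, M) - 2
```

For a 512 × 512 four-component image (r = s = 9), the standard encoding needs 20 qubits and the quaternion two-qubit scheme needs 18.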
Unlike RGB cameras, thermal cameras perform well under very low lighting conditions and can capture information beyond the human visible spectrum. This provides many advantages for security and surveillance applications. However, performing face recognition tasks in the thermal domain is very challenging given the limited visual information embedded in thermal images and the inherent similarities among facial heat maps. Recognition across modalities, such as recognizing a face captured in the thermal domain against a ground-truth database in the visible-light domain, or vice versa, is also a challenge. In this paper, a Thermal-to-RGB Generative Adversarial Network (TR-GAN) is proposed to automatically translate face images captured in the thermal domain to their RGB counterparts, with the goal of reducing current inter-domain gaps and significantly improving cross-modal facial recognition capabilities. Experimental results on the TUFTS Face Dataset, using the VGG-Face recognition model without retraining, demonstrate that image translation with the proposed TR-GAN model almost doubles cross-modal recognition accuracy and outperforms other state-of-the-art GAN models on the same task. The generator in our network uses a UNET-like architecture with cascaded-in-cascaded blocks to reuse features from earlier convolutions, which helps generate high-quality images. To further guide the generator to synthesize images with fine details, we optimize a training loss defined as the weighted sum of perceptual, adversarial, and cycle-consistency losses. Simulation results demonstrate that the proposed model generates more realistic and more visually appealing images, with finer details and better reconstruction of intricate elements such as sunglasses and facial expressions, than similar GAN models.
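The weighted-sum training objective mentioned above can be sketched as follows; the weight values are placeholder assumptions, not the values used in the paper.

```python
def total_loss(perceptual, adversarial, cycle,
               w_p=1.0, w_a=0.1, w_c=10.0):
    """Weighted sum of the three TR-GAN training losses. The weights here
    are illustrative placeholders; the paper does not state its values."""
    return w_p * perceptual + w_a * adversarial + w_c * cycle
```

In practice the cycle-consistency weight is typically the largest of the three, since it anchors the translated image to the input and prevents mode collapse.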
According to the American Cancer Society, the average risk of a woman being diagnosed with breast cancer during her life is 13%. The World Health Organization also reports that the number of cancer cases is projected to rise to 19.3 million by 2025. Recent research indicates that physicians can diagnose cancer with only 79% accuracy, while machine learning procedures achieve 91% accuracy or more. The current challenges are early cancer detection and the efficient, accurate diagnosis of histopathology tissue samples. Several deep learning breast cancer classification models have been developed to assist medical practitioners. However, these methods are data-hungry, requiring thousands of training image samples, often coupled with data augmentation and long training hours, to achieve satisfactory results. In this paper, we propose a machine learning classification model for breast cancer detection that integrates Parameter-Free Threshold Adjacency Statistics (PFTAS) with Fibonacci-p patterns. Computer simulations on the BreakHis cancer dataset, in comparison with other machine learning and deep learning-based methods, show that the presented method (i) eliminates dependence on large training data and data augmentation, (ii) is robust to noise and background stains, and (iii) is a lightweight model that is easy to train and deploy.
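The Fibonacci p-numbers underlying Fibonacci-p patterns follow the recurrence F_p(n) = F_p(n-1) + F_p(n-p-1). The sketch below uses one common convention for the initial conditions (the first p+1 terms equal 1); the paper may adopt a different convention.

```python
def fibonacci_p(p, count):
    """Fibonacci p-numbers: the first p+1 terms are 1, then each term is
    F(n-1) + F(n-p-1). For p = 1 this reduces to the classical Fibonacci
    sequence 1, 1, 2, 3, 5, 8, ..."""
    seq = [1] * (p + 1)
    while len(seq) < count:
        seq.append(seq[-1] + seq[-(p + 1)])
    return seq[:count]
```

These sequences define the non-uniform quantization levels of Fibonacci-p patterns, growing more slowly as p increases and therefore giving finer resolution at low intensities.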
This paper reviews effective methods of image representation, namely visibility images, for enhancement of thermal and night-vision images. Image quality is estimated by quantitative enhancement measures based on the Weber-Fechner, Michelson, and other parameterized ratio and entropy-type measures. We also apply the concept of visibility images, using different types of gradient operators that allow extracting and enhancing image features. Examples of gradient visibility images, Weber-Fechner and Michelson contrast, and log and power Michelson and Weber visibility images are given. Experimental results show the effectiveness of visibility images in enhancing thermal and night-vision images. Different methods of image enhancement in the spatial and frequency domains are analyzed on thermal and night-vision images using the visibility images. Experimental results show that visibility images can be used for processing in both the spatial and frequency domains; here we mention alpha-rooting with the classical discrete Fourier transform (DFT) and the quaternion DFT, the retinex method, and a new gradient-based histogram equalization.
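The two classical contrast measures named above are simple intensity ratios; a sketch:

```python
def michelson_contrast(i_max, i_min):
    """Michelson contrast: (Imax - Imin) / (Imax + Imin), in [0, 1].
    Suited to periodic patterns where bright and dark regions are comparable."""
    return (i_max - i_min) / (i_max + i_min)

def weber_contrast(i_target, i_background):
    """Weber contrast: (I - Ib) / Ib, suited to small targets
    on a large uniform background."""
    return (i_target - i_background) / i_background
```

The parameterized log and power variants discussed in the paper apply monotone nonlinearities to these ratios before aggregating them into an enhancement measure.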
Neural networks have emerged as the most appropriate method for tackling the classification problem for hyperspectral images (HSI). Convolutional neural networks (CNNs), the current state of the art for various classification tasks, have some limitations in the context of HSI. These CNN models are very susceptible to overfitting because of 1) the lack of available training samples and 2) the large number of parameters to fine-tune. Furthermore, the learning rates used by CNNs must be small to avoid vanishing gradients, so gradient descent takes small steps to converge and slows down model training. To overcome these drawbacks, a novel quaternion-based hyperspectral image classification network (QHIC Net) is proposed in this paper. The QHIC Net can model both the local dependencies between the spectral channels of a single pixel and the global structural relationships describing the edges or shapes formed by a group of pixels, making it suitable for HSI datasets that are small and diverse. Experimental results on three HSI datasets demonstrate that the QHIC Net performs on par with traditional CNN-based methods for HSI classification with far fewer parameters.
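Quaternion networks of this kind typically replace the real-valued multiply in a layer with the Hamilton product, which is what couples the four channels of a quaternion feature and reduces the parameter count relative to a real layer of the same width. A minimal sketch of the Hamilton product itself (the standard quaternion algebra, not the QHIC Net code):

```python
def hamilton_product(p, q):
    """Hamilton product of quaternions p = a1 + b1*i + c1*j + d1*k and q likewise."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i part
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j part
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k part
    )
```

In a quaternion convolution, each filter-input multiply follows this rule, so one quaternion weight (four reals) plays the role of a 4×4 block of real weights.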
The structural analysis of buildings includes integrated data processing. During the analysis, volumes of big data are formed that describe the current state of buildings. Such data are obtained both from sensor systems and with human participation. On the basis of sensor systems, functions are implemented for analyzing structural movements and changes in geometry and for monitoring the microclimate. Human-generated data involve fuzzy concepts, such as the onset of destruction processes, the destruction of protective films, the search for anomalies, etc. In the energy audit of buildings, the analysis of heat losses is of greatest interest. Automating this function reduces the cost of the process, makes it permanent (or increases its frequency), and eliminates the human factor. The paper proposes an approach that allows the primary analysis of thermal images obtained with low-resolution devices. The primary analysis includes procedures for smoothing images, changing color spaces, zooming, and highlighting local stationary areas. The final stage is a procedure for searching for small cracks in structures; the thickness of the leakage area is set by the operator or can change uniformly. The decision-making operation is carried out by the user; the proposed approach minimizes the analysis procedure and increases productivity. A set of test images obtained by a FLIR C2 camera shows the effectiveness of detecting cracks in the walls of buildings built from various materials.
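The abstract does not give implementation details for the primary-analysis stages, but the first three (smoothing, color-space change, zooming) admit a simple numpy-only sketch; the specific choices below (box smoothing, luminance conversion, nearest-neighbor 2x zoom) are illustrative assumptions, not the authors' method:

```python
import numpy as np

def smooth(img, k=3):
    """Box smoothing: average over a k x k neighborhood with edge padding."""
    pad = k // 2
    padded = np.pad(img.astype(float), pad, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def to_gray(rgb):
    """Color-space change: RGB thermal rendering to single-channel luminance."""
    return rgb @ np.array([0.299, 0.587, 0.114])

def zoom2x(img):
    """Nearest-neighbor 2x zoom, useful for low-resolution thermal frames."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)
```

Highlighting local stationary areas would then operate on the smoothed, zoomed luminance image.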
The main goal of image steganography techniques is to maximize the embedding rate while minimizing the change to the cover image after embedding. Much work has been done on how the sender embeds the secret message in the cover image (i.e., embedding techniques), but few works focus on how senders choose cover images. One advantage of image steganography is that the cover image only acts as a carrier for the message, so the embedder (sender) has the freedom to choose, among a set of cover images, the one that results in the least detectable stego image. The way the cover image is chosen is important, and since both the cover and stego images are available to the sender, the embedding artifacts can be measured directly. Thus, we are interested in measures that can quantify such artifacts. In our work, we select the best cover image from a set of images using cover-stego based measures: (1) the number of modifications to the cover image, perhaps the most intuitive measure, since the fewer changes are made, the less detectable the resulting stego image should be; (2) the Peak Signal-to-Noise Ratio (PSNR) computed from the cover-stego image pairs, where higher PSNR values are assumed to indicate lower detectability; or (3) robustness to steganalysis techniques. For the experiments, we used a dataset of 512×512 grayscale images as cover images, with secret message sizes from 0.2 to 1.0 bits per pixel.
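The PSNR measure in item (2) follows the standard definition for 8-bit images; a minimal sketch of that formula applied to a cover-stego pair (an illustration of the usual computation, not the paper's code):

```python
import numpy as np

def psnr(cover, stego, peak=255.0):
    """Peak Signal-to-Noise Ratio between a cover image and its stego version."""
    mse = np.mean((cover.astype(float) - stego.astype(float)) ** 2)
    if mse == 0:
        return float("inf")  # identical images: no embedding distortion at all
    return 10.0 * np.log10(peak ** 2 / mse)
```

A sender following measure (2) would compute this for each candidate cover and pick the candidate whose stego image scores highest.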
Scene Labeling and Segmentation: Additional presentations
Manual inspections of the glass façades of high-rise buildings are expensive, time-consuming, and potentially life-threatening for both inspectors and pedestrians on the street. Advances in machine learning for image/video analysis and the availability of affordable unmanned aerial vehicles (UAVs) with onboard video recording and processing sensors provide opportunities for smart, safe, and automatic glass façade inspections. This paper is concerned with developing an effective solution for recognizing cracked glass panels that can be installed on board a UAV. From static 2D photographic images, the proposed solution analyzes the textural patterns of smooth glass surfaces and crack segments, the linearity of detected crack segments, the geometrical characteristics of crack curvatures, and the crack pixel patterns. It captures these discriminative features of glass cracks using the Uniform Local Binary Pattern (ULBP), histograms of linearity, and geometrical curvature descriptors with fixed-length connected-pixel configurations, and accordingly classifies images of cracked and non-cracked glass panels using a kNN classifier. Experimental results with images of different resolutions, acquired by a UAV in a real office building setting and collected through Google search, demonstrate that the proposed solution achieves promising results, with accuracy rates in excess of 80% and as high as 91%, despite the presence of reflections.
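The ULBP descriptor mentioned above thresholds each pixel's 8 neighbors against the center pixel and keeps only "uniform" codes (at most two 0/1 transitions around the circle), which is what makes it compact for texture histograms. A minimal numpy sketch of that standard definition (the neighbor ordering here is an illustrative assumption, not the authors' implementation):

```python
import numpy as np

# Clockwise 8-neighbor offsets around the center of a 3x3 patch.
OFFSETS = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]

def lbp_code(patch):
    """8-bit LBP code of a 3x3 patch: bit i is 1 if neighbor i >= center."""
    center = patch[1, 1]
    bits = [int(patch[y, x] >= center) for y, x in OFFSETS]
    return bits, sum(b << i for i, b in enumerate(bits))

def is_uniform(bits):
    """Uniform pattern: at most two 0 <-> 1 transitions in the circular code."""
    transitions = sum(bits[i] != bits[(i + 1) % 8] for i in range(8))
    return transitions <= 2
```

A flat glass region yields the all-ones uniform code, while noisy crack-edge patches tend to produce non-uniform codes, so the histogram of uniform codes separates the two textures.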
Convolutional Neural Network (CNN) based deep learning techniques are fast gaining acceptance and deployment in a variety of computer vision and image analysis applications, and are widely perceived as achieving optimal performance in detecting and classifying objects/patterns in images. Despite considerable success in various image analysis tasks, several shortcomings have been raised, including high computational complexity, model overfitting to the training data, the requirement for extremely large training image datasets, and, above all, a black-box style of decision making with no informative explanation. Understandably, the latter is a major obstacle to deployment in medical image diagnostics. Conventional machine learning approaches rely on image texture analysis to achieve high, but not optimal, performance, and their decisions can be justified quantitatively. The new paradigm of Topological Data Analysis (TDA), which emerged to deal with the growing challenges of Big Data applications, has recently been adopted to design and develop innovative image analysis schemes that automatically construct filtrations of topological shapes of image texture and use the TDA tool of persistent homology (PH) to distinguish different image classes. This work investigates the effect of CNN convolution layers on the discriminating strength of TDA-based extracted features. We present the effect of AlexNet's pre-trained convolutional-layer filters on various PH features extracted from ultrasound scan images of the human bladder for distinguishing benign masses from malignant ones. We demonstrate that the condition number of the pre-trained filters influences the discriminatory power of the PH representation of certain types of local binary pattern (LBP) texture features post convolution, in a manner that could be exploited to design a filter-pruning strategy that maintains classification accuracy while improving efficiency.
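One plausible reading of the condition number referred to above is to treat each k×k filter as a matrix and compute the ratio of its largest to smallest singular value; the abstract does not specify the exact construction, so the sketch below (on illustrative random filters) is an assumption:

```python
import numpy as np

def filter_condition_numbers(filters):
    """Condition number of each k x k convolution filter viewed as a matrix.

    A large condition number means the filter nearly annihilates some input
    directions, which can flatten the texture detail that PH features build on.
    """
    return [np.linalg.cond(f) for f in filters]

rng = np.random.default_rng(0)
# Hypothetical filter bank: four random 3x3 filters plus the identity filter.
bank = [rng.standard_normal((3, 3)) for _ in range(4)] + [np.eye(3)]
```

A pruning strategy of the kind the authors suggest could then rank filters by this quantity and drop the worst-conditioned ones.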