This PDF file contains the front matter associated with SPIE Proceedings Volume 12528, including the Title Page, Copyright information, Table of Contents, and Conference Committee information.
A method of near real-time detection and tracking of resident space objects (RSOs) using a convolutional neural network (CNN) and a linear quadratic estimator (LQE) is proposed. Advances in machine learning architectures allow low-power, low-cost embedded devices to perform complex classification tasks. To reduce the cost of tracking systems, a low-cost embedded device runs a CNN detection model for RSOs in unresolved images captured by a grayscale camera and small telescope. Detection results computed in near real time are then passed to an LQE to compute tracking updates for the telescope mount, resulting in a fully autonomous method of optical RSO detection and tracking.
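The LQE step described above can be sketched as a standard Kalman filter. The following is a minimal illustration, not the authors' implementation: a 1-D constant-velocity filter in which the CNN detector supplies noisy position measurements and the filtered state would drive the mount update. All noise values and the frame interval are assumed for the example.

```python
DT = 1.0      # frame interval (s) -- illustrative value
Q = 1e-3      # process-noise variance (assumed)
R = 0.25      # measurement-noise variance (assumed)

def kf_step(x, P, z):
    """One predict+update cycle. x = [pos, vel]; P = 2x2 covariance (nested lists)."""
    # Predict: x' = F x with F = [[1, DT], [0, 1]]
    xp = [x[0] + DT * x[1], x[1]]
    Pp = [
        [P[0][0] + DT * (P[1][0] + P[0][1]) + DT * DT * P[1][1] + Q,
         P[0][1] + DT * P[1][1]],
        [P[1][0] + DT * P[1][1], P[1][1] + Q],
    ]
    # Update with a scalar position measurement z (H = [1, 0])
    S = Pp[0][0] + R                      # innovation covariance
    K = [Pp[0][0] / S, Pp[1][0] / S]      # Kalman gain
    y = z - xp[0]                         # innovation
    xn = [xp[0] + K[0] * y, xp[1] + K[1] * y]
    Pn = [
        [(1 - K[0]) * Pp[0][0], (1 - K[0]) * Pp[0][1]],
        [Pp[1][0] - K[1] * Pp[0][0], Pp[1][1] - K[1] * Pp[0][1]],
    ]
    return xn, Pn

x, P = [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]
for z in [1.0, 2.1, 2.9, 4.2, 5.0]:      # detections drifting ~1 px/frame
    x, P = kf_step(x, P, z)
# x[1] converges toward the object's apparent rate, which would set the mount slew.
```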
Deep Neural Networks (DNNs) are adopted in numerous application areas of signal and information processing, with Convolutional Neural Networks (CNNs) being a particularly popular class of DNNs. Many machine learning (ML) frameworks have evolved for the design and training of CNN models, and similarly, a wide variety of target platforms, ranging from mobile and resource-constrained platforms to desktop and more powerful platforms, are used to deploy CNN-equipped applications. To help designers navigate the complex design spaces involved in deploying CNN models derived from ML frameworks on alternative processing platforms, retargetable methods for implementing CNN models are of increasing interest. In this paper, we present a novel software tool, called the Lightweight-dataflow-based CNN Inference Package (LCIP), for retargetable, optimized CNN inference on different hardware platforms (e.g., x86 and ARM CPUs, and GPUs). In LCIP, source code for CNN operators (convolution, pooling, etc.) derived from ML frameworks is wrapped within dataflow actors. The resulting coarse-grain dataflow models are then optimized using the retargetable LCIP runtime engine, which employs higher-level dataflow analysis and orchestration that is complementary to the intra-operator performance optimizations provided by the ML framework and the back-end development tools of the target platform. Additionally, LCIP enables heterogeneous and distributed edge inference of CNNs by offloading part of the CNN to additional devices, such as an onboard GPU or network devices. Our experimental results show that LCIP provides significant improvements in inference throughput on commonly used CNN architectures, and the improvement is consistent across desktop and resource-constrained platforms.
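The actor-wrapping idea can be illustrated in miniature. The sketch below is not LCIP's API; it only shows the general pattern of wrapping operator kernels as coarse-grain dataflow actors and firing each actor once its input tokens are available. The "conv" and "pool" bodies are placeholder callables standing in for framework-derived kernels.

```python
class Actor:
    """A coarse-grain dataflow actor wrapping one operator kernel."""
    def __init__(self, name, fn, inputs):
        self.name, self.fn, self.inputs = name, fn, inputs

def run_graph(actors, source_token):
    """Fire each actor once all of its input tokens are available."""
    tokens = {"input": source_token}
    pending = list(actors)
    while pending:
        for a in pending:
            if all(i in tokens for i in a.inputs):
                tokens[a.name] = a.fn(*[tokens[i] for i in a.inputs])
                pending.remove(a)
                break
        else:
            raise RuntimeError("graph is not schedulable")
    return tokens

# Placeholder "conv" and "pool" kernels over a flat list of values.
conv = Actor("conv", lambda x: [3 * v + 1 for v in x], ["input"])
pool = Actor("pool", lambda x: [max(x[i:i + 2]) for i in range(0, len(x), 2)], ["conv"])

# The scheduler resolves firing order from data availability, not list order.
out = run_graph([pool, conv], [1.0, 2.0, 3.0, 4.0])["pool"]
```

A real runtime would add buffering, scheduling heuristics, and device placement on top of this skeleton; the point here is only the separation between operator code and graph-level orchestration.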
Monitoring the movement and actions of humans in video in real time is an important task. We present a deep-learning-based algorithm for human action recognition for both RGB and thermal cameras. It is able to detect and track humans and recognize four basic actions (standing, walking, running, lying) in real time on a notebook with an NVIDIA GPU. To do so, it combines state-of-the-art components for object detection (Scaled-YOLOv4), optical flow (RAFT), and pose estimation (EvoSkeleton). Qualitative experiments on a set of tunnel videos show that the proposed algorithm works robustly for both RGB and thermal video.
The purpose of this paper is to aid in detecting synthesized video (specifically, video created with DeepFake) by exploring facial-feature tracking methods. Analyzing individual facial features should allow for more successful detection of DeepFake videos, according to H. Nguyen et al.'s research [22] and A. A. Maksutov's list of commonly used techniques for identifying fabricated media [17]. To detect these facial features in images, computer vision techniques such as YOLOv3 [24] can be used. Once features are detected, object-tracking methods should be explored. This paper compares the accuracy of three existing object-tracking methods: the minimum-distance approach, the Kalman Filter (KF) method, and the Sliding Innovation Filter (SIF) method. Following this comparison, the paper proposes a novel hybrid object-tracking approach, in which the benefits of the KF and SIF methods are combined to provide a time-gap-tolerant object-tracking method. Each of the models is tested on its ability to track multiple objects that follow different trajectories, and the models are compared against one another to identify the most effective manner of tracking.
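The hybrid idea can be sketched in scalar form. This is a hedged illustration, not the paper's implementation: apply a Kalman-style gain when measurements arrive on time, and fall back to a Sliding Innovation Filter gain, a saturated, bounded correction, after a measurement time gap. All parameter values are invented for the example.

```python
DELTA = 2.0     # SIF boundary-layer half-width (assumed)
MAX_GAP = 2     # frames without a measurement before switching to SIF mode (assumed)

def hybrid_update(x_pred, z, gap, kf_gain=0.6):
    """Correct predicted position x_pred with measurement z after `gap` frames."""
    innovation = z - x_pred
    if gap > MAX_GAP:
        # SIF-style gain: normalized innovation magnitude, saturated to [0, 1]
        gain = min(abs(innovation) / DELTA, 1.0)
    else:
        # KF-style constant gain (a real KF would compute this from covariances)
        gain = kf_gain
    return x_pred + gain * innovation

# Normal operation: the usual partial correction.
x1 = hybrid_update(10.0, 11.0, gap=1)   # -> 10.6
# After a long gap with a large jump: the SIF gain saturates at 1,
# snapping the estimate to the measurement and re-acquiring the track.
x2 = hybrid_update(10.0, 20.0, gap=5)   # -> 20.0
```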
An accurate and consistent survey of road surface distresses is critical for pavement rehabilitation design and management, allowing public managers to maximize the value of limited budgets for road improvements and maintenance. Manual pavement distress surveys are time-consuming, costly, and dangerous on heavily traveled highways. Automated surveys using video-recording hardware have been developed and improved over the years to solve the problems associated with manual surveys; however, reliable distress-detection software and data analysis remain difficult. With advances in smartphone technology, it is now possible to use mounted devices effectively in the field for such applications. A smartphone application was previously developed that uses the on-board accelerometer, gyroscope, and GPS sensors, along with signals derived in software from the same sensors, to sample vibrational and geolocation datasets and capture pavement distresses such as potholes when mounted in a standardized configuration in a vehicle. This study examines the possibility of using real-time video processing for pavement surface-quality detection. Video captured from the mounted camera is analyzed to estimate road conditions and correlated with sensor-data ground truth to corroborate the efficacy of the technique. The findings of this study could aid in developing more effective specialized software for pavement condition classification, assist decision makers in selecting solutions based on budget and desired survey accuracy, and help evaluate how existing devices will perform with the developed algorithm.
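The accelerometer-based ground truth described above can be sketched very simply. This is a hypothetical illustration, not the application's algorithm: flag a distress event when the vertical acceleration deviates from gravity by more than a threshold. The threshold and sample values are invented.

```python
G = 9.81          # gravitational acceleration, m/s^2
THRESHOLD = 4.0   # m/s^2 deviation treated as a distress event (assumed)

def detect_events(z_accel):
    """Return sample indices where |a_z - g| exceeds the threshold."""
    return [i for i, a in enumerate(z_accel) if abs(a - G) > THRESHOLD]

# Smooth road with one sharp jolt at sample 3 (e.g., a pothole strike).
samples = [9.7, 9.9, 10.1, 16.5, 9.8, 9.6]
events = detect_events(samples)   # -> [3]
```

In the study's setting, event indices like these would be timestamped, paired with GPS fixes, and correlated against the video-derived condition estimates.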
The Internet of Things (IoT) uses cloud-enabled data sharing to connect physical objects to sensors, processing software, and other technologies via the Internet. IoT enables a vast network of communication among these physical objects and their corresponding data. This study investigates the use of an IoT development board for real-time sensor-data communication and processing, including images from a camera, as part of a custom-made home security system designed for easy access by the elderly.
Sediment plumes are generated by both natural and human activities in benthic environments, increasing the turbidity of the water and reducing the amount of sunlight reaching benthic vegetation. Seagrasses, which are photosynthetic bioindicators of their environment, are threatened by chronic reductions in sunlight, impacting entire aquatic food chains. This research uses UAV aerial video and imagery to investigate the characteristics of sediment plumes generated by a model of anthropogenic disturbance. The extent, speed, and motion of the plumes were assessed, as these parameters may pertain to the potential impacts of plume turbidity on seagrass communities. In a case study using UAV video, the turbidity plume was observed to spread more than 250 feet over the 20 minutes of the UAV campaign. The directional speed of the plume was estimated to be between 10.4 and 10.6 ft/min. This was corroborated by the observation that plume turbidity and sediment load were greatest near the location of disturbance and diminished with distance. Further temporal studies are necessary to determine the long-term impacts, if any, of human-activity-generated sediment plumes on seagrass beds.
Defective dies on a silicon wafer form a pattern that is called a wafer map. In order to adequately train a deep learning-based automated optical inspection system to detect such defective patterns, a large number of defective patterns or wafer maps are needed. In practice, on an actual production line, defective patterns occur infrequently and are thus difficult and time-consuming to collect. A computationally efficient defective pattern generation solution is developed in this paper by using the deep learning network CycleGAN, a variant of the generative adversarial network. The public-domain WM-811K wafer dataset was used to generate or synthesize defective patterns or wafer maps. The two metrics of Fréchet inception distance and kernel inception distance were utilized to evaluate the resemblance of the generated defective images to the real defective images. The results obtained indicate that the developed defective pattern generation method produces realistic wafer maps at a computationally efficient rate of 3 synthesized images per second.
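What the Fréchet inception distance measures can be seen in one dimension. The full FID compares Gaussians fitted to Inception-network features and requires a matrix square root; the sketch below reduces this to univariate Gaussians to show the essence, distance in mean plus distance in spread. The sample values are invented stand-ins for feature activations.

```python
from math import sqrt

def frechet_distance_1d(mu1, var1, mu2, var2):
    """Squared 2-Wasserstein distance between two 1-D Gaussians."""
    return (mu1 - mu2) ** 2 + var1 + var2 - 2.0 * sqrt(var1 * var2)

def mean_var(xs):
    m = sum(xs) / len(xs)
    return m, sum((x - m) ** 2 for x in xs) / len(xs)

real = [0.0, 1.0, 2.0, 3.0]        # stand-in "real image" feature values
fake_good = [0.1, 1.1, 2.1, 3.1]   # same spread, slightly shifted mean
fake_bad = [0.0, 0.0, 6.0, 6.0]    # similar mean region, very different spread

m_r, v_r = mean_var(real)
d_good = frechet_distance_1d(m_r, v_r, *mean_var(fake_good))  # small
d_bad = frechet_distance_1d(m_r, v_r, *mean_var(fake_bad))    # much larger
```

A generator whose outputs track the real distribution in both statistics scores low, which is why FID (and the related kernel inception distance) is a reasonable proxy for resemblance.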
Neural Architecture Search (NAS) is a method of autonomously designing deep learning models to achieve top performance for tasks such as data classification and data retrieval by using defined search spaces and strategies. These strategies have demonstrated improvements in a variety of tasks over ad-hoc deep neural architectures, but have presented unique challenges related to bias in search spaces, the intensive training requirements of various search strategies, and inefficient model performance evaluation. These challenges have been a primary focus for NAS research until recently. However, artificial intelligence (AI) on the edge has emerged as a significant area of research, and producing models that achieve top performance on small devices with limited resources has become a priority. NAS research has primarily been focused on improving models by using more diverse search spaces, improving search strategies, and evaluating models faster. A limitation when applied to edge devices is that NAS has historically produced superior deep neural networks that are increasingly difficult to port to embedded devices due to memory limitations, computational bottlenecks, latency requirements, and power restrictions. In recent years, researchers have begun to consider these limitations and develop methods for porting deep neural networks to these embedded devices, but few methods have efficiently incorporated the device itself in the training process. In this paper, we compile a list of methods actively being explored and discuss their limitations. We also present our evidence in support of the use of genetic algorithms as a method for hardware-aware NAS that efficiently considers hardware, power, and latency requirements during training.
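The genetic-algorithm approach the paper advocates can be sketched in miniature. In the toy example below, a genome encodes layer widths, and fitness trades an accuracy proxy against a modeled latency penalty; both objective functions are invented stand-ins, not real training runs or hardware measurements, and the selection/mutation scheme is deliberately minimal.

```python
import random

random.seed(0)

def accuracy_proxy(genome):
    """Assumed: wider layers help accuracy, with diminishing returns."""
    return sum(w / (w + 32) for w in genome) / len(genome)

def latency_penalty(genome):
    """Assumed: latency on the target device grows with total width."""
    return sum(genome) / 1024.0

def fitness(genome):
    # Hardware-aware objective: accuracy minus a latency/power cost term.
    return accuracy_proxy(genome) - 0.5 * latency_penalty(genome)

def mutate(genome):
    g = list(genome)
    i = random.randrange(len(g))
    g[i] = max(8, min(256, g[i] + random.choice([-16, 16])))
    return g

def evolve(pop_size=20, genome_len=4, generations=30):
    pop = [[random.choice([16, 32, 64, 128, 256]) for _ in range(genome_len)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]          # elitist selection
        pop = survivors + [mutate(random.choice(survivors))
                           for _ in range(pop_size - len(survivors))]
    return max(pop, key=fitness)

best = evolve()
```

Because the device cost enters the fitness function directly, the search is steered away from architectures that would violate memory, latency, or power budgets, which is the core of the hardware-aware argument.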
Segregating waste accurately at an individual level is paramount for efficient waste management, especially considering the staggering 268 million tons of waste generated each year in the USA, a large portion of which is recyclable. To address this issue, we developed a Smart Waste Sorter (SWS), a portable device that can be placed on any waste bin. It uses deep learning models to identify whether a piece of waste is a battery, recyclable, compostable, or trash, and provides a real-time alert if the user is about to dispose of the item incorrectly. To develop the SWS image classification model, we utilized a dataset of 4,122 images, obtained from a combination of publicly available images and images collected manually from households over several months. We experimented with four models of varying sizes: VGG16, EfficientNetB1, MobileNetV2, and ResNet50, to investigate whether a smaller model could achieve comparable performance, given that our device is portable and requires a compact model that can operate on limited memory without internet connectivity. Our experiments showed that ResNet50 achieved the highest validation and test accuracy of 77.91% and 96.39% respectively over four categories, suggesting that smaller models can be effective. Our results demonstrate the potential of the SWS to improve real-time waste segregation at the individual level while considering practical constraints for implementation. The proposed solution utilizes a Raspberry Pi to detect motion, capture images, and classify them. Our solution provides an effective, practical, and low-cost method for accurately segregating waste and contributing to sustainable waste management.
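The motion-detection trigger in such a pipeline is often plain frame differencing. The sketch below is a hedged illustration of that step only (camera and classifier calls omitted, and not the SWS code itself): report motion when enough pixels change between consecutive grayscale frames. Frames here are tiny invented grids, and both thresholds are illustrative.

```python
DIFF_THRESHOLD = 30   # per-pixel intensity change counted as motion (assumed)
MIN_CHANGED = 2       # changed pixels required to report motion (assumed)

def motion_detected(prev, curr):
    """Compare two grayscale frames (lists of rows of int pixel values)."""
    changed = sum(
        1
        for row_p, row_c in zip(prev, curr)
        for p, c in zip(row_p, row_c)
        if abs(p - c) > DIFF_THRESHOLD
    )
    return changed >= MIN_CHANGED

frame_a = [[10, 10, 10], [10, 10, 10]]
frame_b = [[10, 10, 10], [10, 10, 12]]    # sensor noise only
frame_c = [[10, 90, 95], [10, 10, 10]]    # an object entered the view

still = motion_detected(frame_a, frame_b)    # -> False
moving = motion_detected(frame_a, frame_c)   # -> True
```

On a device like a Raspberry Pi, a positive result from such a check would gate the more expensive steps: capturing a full-resolution image and running the classification model.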
The COVID-19 pandemic forced governments to adopt worldwide lockdowns in order to limit the virus's spread. Wearing a face mask is reported to reduce the possibility of transmission. With growing urban populations, proper city management is more important than ever for reducing the impact of COVID-19 infection. Manually checking for masks in public places, however, would require incredibly long lineups and delays. An autonomous mask-detection system is therefore needed to assess whether someone is wearing a face mask. Three different machine learning methods are applied to a face-mask dataset to determine the likelihood that a face mask is being worn. The models were assessed using several measures, including accuracy, recall, and the ROC curve. The main objective of the study is to detect the presence of face masks using deep learning, machine learning, and image processing approaches. All three models (NB, KNN, and CNN) achieved noteworthy accuracy of more than 80%, with CNN showing the best overall performance.
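The evaluation measures named above are straightforward to compute from labels. The following is a generic illustration (the labels are invented, not the study's data; 1 = mask, 0 = no mask) of accuracy and recall for a binary mask classifier.

```python
def accuracy(y_true, y_pred):
    """Fraction of all predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred, positive=1):
    """Fraction of true positives (masked faces) that were detected."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn)

y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]

acc = accuracy(y_true, y_pred)   # 6/8 = 0.75
rec = recall(y_true, y_pred)     # 4/5 = 0.8
```

Sweeping the classifier's decision threshold and plotting the true-positive rate against the false-positive rate at each setting is what produces the ROC curve the study also reports.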
The article proposes an algorithm for controlling the process of forming a coating with an increased oxide-layer content, produced by plasma formation of surface films. An implementation of an algorithm for adaptively determining the contours of plasma-discharge boundaries during the formation of memristor-structure films is proposed. The algorithm is built on a multicriteria data-processing method serving as the boundary detector. An adaptive change of the contact mask between the plasma discharge and the surface is also proposed. Analysis of contact size and density shows how they influence the shape and rate of oxide-layer formation. When exposed to current, such a coating can form a complex curve of a function of a given shape; with subsequently applied voltage, it can be used as an activation function. Recommendations on control actions and their effects are presented. A hardware model implementation of an artificial neuron based on blocks of digital elements is presented, along with examples of solving the problem of predicting the motion of an actuating element in the control of robotic complexes using the formed neurons.
Currently, there are many options for controlling robotic devices. Human-machine interaction is a key component of the control infrastructure. The most common solutions are mobile devices or embedded touch screens, as well as next-generation virtual reality devices. In human-machine interaction, most input devices are controlled manually, which is not always convenient and is sometimes even impossible. One option is gesture control, which has become increasingly common in the last few years. This artificial cognitive "sensory perception" serves as a communication channel between a human and a machine. This article presents a two-step approach to real-time control of robotic devices. The first step is a hand recognition method based on palm detection (an SSD detector) and hand-landmark models. After palm detection, the hand-landmark model performs fine localization of the 3-D key points of the hand within the detected hand regions through regression and direct coordinate prediction. The model learns a consistent internal representation of hand posture and is robust even to partially visible hands and self-occlusions. The second step, gesture recognition, uses the hand coordinates, the distance from the camera to the hand, which fingers are raised or lowered, and other indicators to accurately determine the gesture shown. In terms of gesture-recognition accuracy, the proposed real-time system outperforms state-of-the-art methods.
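The raised-finger logic of the second step can be sketched as a rule over landmark coordinates. This is a simplified illustration, not the article's method: the landmark indices follow the common 21-point hand model, a finger is treated as raised when its tip lies above its middle (PIP) joint in image coordinates, and the resulting pattern is mapped to a named gesture. All coordinate values are invented.

```python
# (tip, pip) landmark index pairs for the index, middle, ring, little fingers
FINGERS = [(8, 6), (12, 10), (16, 14), (20, 18)]

GESTURES = {
    (True, True, True, True): "open palm",
    (False, False, False, False): "fist",
    (True, True, False, False): "peace",
}

def classify(landmarks):
    """landmarks: dict index -> (x, y); smaller y means higher in the image."""
    pattern = tuple(landmarks[tip][1] < landmarks[pip][1] for tip, pip in FINGERS)
    return GESTURES.get(pattern, "unknown")

# Invented landmark positions for a "peace" sign: index and middle tips above
# their PIP joints, ring and little tips below them (folded).
hand = {8: (0.3, 0.2), 6: (0.3, 0.5),
        12: (0.4, 0.1), 10: (0.4, 0.5),
        16: (0.5, 0.7), 14: (0.5, 0.5),
        20: (0.6, 0.7), 18: (0.6, 0.5)}

gesture = classify(hand)   # -> "peace"
```

A production system would add the thumb, use 3-D angles rather than a single-axis comparison, and incorporate the hand-to-camera distance, but the decision structure is the same.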
The article proposes an algorithm for the parallel analysis of visual data obtained by a machine-vision system: imagery recorded in the human-visible spectrum and data received from a range camera. An algorithm is proposed for forming stable features, such as elements of the human body, the head, and the pupils, and for tracking their displacement in parallel. Trend lines in element displacement are extracted and the high-frequency component is eliminated on the basis of a combined criterion. The image is preprocessed to reduce the effect of the noise component using a multicriteria objective function. As test data for evaluating effectiveness, a video stream with a resolution of 1024x768 (8-bit color, visible range), 3-D data, and expert-evaluation data are used.
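The trend-line step can be illustrated with the simplest possible smoother. This hedged sketch (a plain centered moving average; the article's combined criterion is richer) shows how the high-frequency component of a tracked feature's frame-to-frame displacement is suppressed while the slow drift is retained. The window length and signal values are invented.

```python
def moving_average(signal, window=3):
    """Centered moving average; edges use whatever neighbors are available."""
    out = []
    half = window // 2
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out

# Pupil x-displacement per frame: a slow drift plus frame-to-frame jitter.
raw = [0.0, 1.2, 0.8, 2.1, 1.9, 3.2, 2.8]
trend = moving_average(raw)   # jitter averaged out, drift preserved
```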