In this paper, we elaborate on our implementation of a semi-supervised, self-structuring learning algorithm using aerial visual and infrared (IR) images. We focus on the processed visual and IR images and their impact on our testing software package when the aerial visual and infrared data are noisy and sparse. We encountered several issues with the processed test data due to noise, invalid detections caused by shadows, and two or more detections being mistaken for a single detection (or vice versa). The target detections include vehicles, people, noise, and unidentified objects. To overcome these problems, we used our software package to extract information from detections, such as exact pixel content and orientation. We were also able to infer information, such as direction and speed, from tracks as we built them, which further helped. As a result, our algorithm is capable of generating patterns to build longer tracks from detections. The improved algorithm can also differentiate and classify target detections based on binary feature representations and attributes. We plan to extend this track generation to include learning via pattern recognition and complex object building.
A Self-structuring Data Learning Algorithm was introduced and implemented in our prior work. As the algorithm and the software package have advanced, they have been tested with both synthetic and real-world data. After encouraging synthetic-data test results, real-world data testing also shows promising outcomes while posing challenges such as object occlusion, objects merging, and objects passing under bridges. To resolve such problems, a multi-int solution is proposed. One of the key features of this solution is the similarity measure. There are different types of similarity measures; in this paper, we focus primarily on similarity measures for aerial images. The images we worked with present a unique challenge because small, distant objects in large-area images provide limited information. To deal with this difficulty, we developed 14 similarity metrics based on the Normalized Cross Correlation method, the Sum of Squared Differences, and the overlap and colors of pixels. We used object-tracking ability to evaluate the metrics. The simulation results show that each metric has advantages and disadvantages. In an attempt to improve tracking capability, we imposed metric thresholds in addition to the image similarity metrics. These thresholds were learned from labeled data with evaluation of tracking correctness. To further enhance tracking, speed similarity was incorporated on top of the two features mentioned above. Further improvement can be achieved by studying the robustness of the image similarity metrics and by using track fusion.
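The 14 metrics themselves are not reproduced here, but the two named building blocks, normalized cross-correlation and the sum of squared differences, can be sketched for a pair of image patches (the patch values below are illustrative only, not the paper's data):

```python
import numpy as np

def ssd(a, b):
    """Sum of squared differences; lower means more similar."""
    return float(np.sum((a.astype(float) - b.astype(float)) ** 2))

def ncc(a, b):
    """Normalized cross-correlation; 1.0 means identical up to gain/offset."""
    a = a.astype(float).ravel() - a.mean()
    b = b.astype(float).ravel() - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.dot(a, b) / denom) if denom > 0 else 0.0

patch1 = np.array([[10, 20], [30, 40]])
patch2 = np.array([[12, 22], [32, 42]])  # same patch with an intensity offset
print(ssd(patch1, patch2))  # → 16.0
print(ncc(patch1, patch2))  # → 1.0 (identical after mean removal)
```

NCC's insensitivity to brightness offsets and SSD's sensitivity to them illustrate why each metric has advantages and disadvantages for small, low-contrast objects.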
Internet of Things (IoT) and similar devices often have little to no security and thus can be readily exploited in any number of ways. In this work, we collect radio-frequency (RF) emissions from simple processors on several IoT devices and apply machine-learning techniques to detect modifications (corruption or malware injection) in 'known' software running on the processor. We can detect these modifications because of the correlation between RF emissions and the digital state of the device: every bit flip produces a small but potentially detectable electrical pulse. Our approach to developing the recognition algorithm is to adapt to the variability created by the input data by recognizing the sequences in which instruction blocks are executed; seemingly minor changes to input values can have a detectable effect on the measured RF side channel. We collected RF data from a variety of IoT devices with clock speeds from 16 to 96 MHz. A 1-GHz Riscure RF near-field antenna probe was placed within a millimeter of the IoT device, RF emissions were acquired, and software controls triggered data collection. A classification architecture was trained using object code partitioned into blocks to develop the truth data, and new data were then applied to the trained block classifier. This approach detects deviations in individual blocks and in block sequences as a whole, allowing a finer level of detection resolution than a binary 'Yes/No' classification. Initial testing showed greater than 90% classification accuracy for block-level modifications, and we can detect deviations from truth data with 100% accuracy.
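As a rough illustration of the block-sequence check (the block labels and comparison logic here are hypothetical, not the paper's classifier), a predicted sequence of block labels can be compared against the expected sequence to flag deviating positions:

```python
def sequence_deviations(predicted, expected):
    """Indices where the predicted block label departs from the truth data."""
    return [i for i, (p, e) in enumerate(zip(predicted, expected)) if p != e]

# Hypothetical labels: B9 stands in for an unrecognized (possibly injected) block
expected  = ["B0", "B1", "B2", "B1", "B3"]
predicted = ["B0", "B1", "B9", "B1", "B3"]
print(sequence_deviations(predicted, expected))  # → [2]
```

Localizing the mismatch to a block index, rather than returning a single yes/no verdict, is what gives the finer detection resolution described above.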
We applied machine learning to detect changes in the state of key registers in digital devices from their analog RF emissions. As digital devices operate, they emit information via analog side channels. We collected the RF side channel with a 500-MHz shielded loop probe from Riscure, placed in the near field (<1 mm) of the device under test (DuT). We investigated a number of Internet-of-Things (IoT) DuTs, including Arduino Uno and PIC24 processors. Conventional processors implement instructions as a sequence of subtasks. The first subtasks include incrementing the program counter (PC) register and fetching the next instruction from program memory into the instruction register (IR); these two subtasks occur in almost every instruction cycle. We ran programs on the DuT and collected the RF emissions. We parsed the object code of the programs to determine the state of key registers, including the PC and IR, during each instruction cycle and observed that the RF signal of each cycle is strongly correlated with the Hamming distance (HD), i.e., the number of bits changing, in the PC and IR registers. Based on this result, we developed classifiers to extract the HD of the PC and IR, as well as the stack pointer (SP). The classification results vary with the true HD, as some values are rare and have few examples in the training set. The classification accuracy exceeds 99% for the PC and the IR. Due to the relatively few HD examples in the training set for the SP, its accuracy only slightly exceeded 97%.
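The per-cycle quantity being predicted is simple to state; a minimal sketch (the 16-bit PC values are hypothetical) computes the Hamming distance between successive register values:

```python
def hamming_distance(prev, curr):
    """Number of bits that flip between two register values."""
    return bin(prev ^ curr).count("1")

# Hypothetical 16-bit program counter advancing by one word (2 bytes) per cycle
pc_values = [0x0100, 0x0102, 0x0104, 0x0106]
hds = [hamming_distance(a, b) for a, b in zip(pc_values, pc_values[1:])]
print(hds)  # → [1, 2, 1]
```

Note that even a uniformly incrementing PC yields a non-uniform HD sequence, since carries flip varying numbers of bits, which is one reason some HD values are rare in the training set.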
All digital devices leak information through unintended emissions into analog side channels. The RF side channel enables passive collection of high-bandwidth information about the digital state of a device. We collected these RF emissions with a 500-MHz Riscure probe placed in the near field of the device under test (DuT) and applied machine learning to detect which program is running on the processor, in order to identify malware intrusions. We explored whether a generalized classification infrastructure built from a training set of similar DuTs applies to a similar device from a different production batch (same model number, different serial number). We collected RF side-channel data for five programs running on 28 distinct Arduino Unos (and 28 MSP430 processors). We trained program classifiers on RF data from all but one DuT and tested the classifiers on the device withheld from the training set. The high-SNR signal provided by the Riscure probe enabled almost perfect classification when we trained and tested on the same device. Our classification accuracy remained above 99% when we generalized testing to a new DuT of the same model but a different serial number. The classifier trained on 27 of the devices was also tested for its ability to detect deviations from a baseline algorithm on the withheld device; the worst misclassification rate was a mere 0.08%.
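The leave-one-device-out protocol can be sketched with scikit-learn's `LeaveOneGroupOut` (the feature dimensions, device count, and classifier below are placeholders for illustration, not the paper's setup):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneGroupOut

rng = np.random.default_rng(0)
n_dev, per_dev = 4, 30                        # stand-ins for the 28 devices
X = rng.normal(size=(n_dev * per_dev, 8))     # placeholder RF trace features
y = rng.integers(0, 5, size=n_dev * per_dev)  # five "programs"
groups = np.repeat(np.arange(n_dev), per_dev) # device ID for each trace

accs = []
for train, test in LeaveOneGroupOut().split(X, y, groups):
    clf = LogisticRegression(max_iter=1000).fit(X[train], y[train])
    accs.append(clf.score(X[test], y[test]))  # accuracy on the held-out device
print(len(accs))  # one score per held-out device
```

Grouping folds by device, rather than splitting traces at random, is what makes the test measure generalization to an unseen serial number instead of memorization of one device's emissions.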
Field Programmable Gate Arrays (FPGAs) are increasingly vital components of electronic systems used in numerous industries. FPGAs possess well-documented logic and hardware vulnerabilities that could allow an adversary to penetrate and manipulate FPGA-based electronic infrastructure. To detect such attacks against FPGA firmware, we developed a technique that exploits the unintended RF side channel emitted from an FPGA. Our approach presumes that malicious modification to a trusted FPGA bitstream will result in changes in radio-frequency (RF) emissions, changes that our technique can detect and measure using signal processing and machine learning. The development of our RF side-channel technique was divided into three tasks: (1) determine if firmware changes can be detected using side-channel emissions, (2) determine the minimum firmware change that can be detected, and (3) extend our approach to work across multiple devices of the same type. We used the Digilent Arty development board to accomplish these tasks. We developed baseline firmware for the board and then generated additional bitstreams that incorporated quantifiable changes in logic and placement. We then collected RF side-channel emissions for each bitstream using the Riscure EM Probe Station, which uses a 1-GHz-bandwidth near-field antenna. Using our RF side-channel approach, we were able to detect the movement of a single register or lookup table element by one slice. We demonstrated the effectiveness of our technique in detecting changes across multiple FPGAs of the same type by achieving detection accuracy greater than 98%.
We use machine learning to characterize the state of digital devices based on their analog emissions. As digital devices operate, they emit internal information into a number of analog side channels. Remote sensing of these unintended signals leads to low signal-to-noise ratio (SNR) and significant clutter. We developed classifiers to determine which program is executing on a digital device based on analog radio-frequency (RF) emissions collected via a 500-MHz Riscure RF probe. A standard algorithm was developed to serve as a baseline program, and intrusions were simulated by introducing minor modifications to this program. We collected a thousand RF traces, each spanning thousands of instruction cycles, from each of these modified programs running on ten different devices. The ten devices tested are representative of Internet of Things (IoT) devices, including Arduino Unos and PIC24 processors. Our primary approach to mitigating the impact of low SNR is to extend the program execution and signal collection time. Collecting a training set with more traces than samples is not practical: even after down-sampling the raw data to thirty samples per instruction, the number of samples exceeds the number of traces by orders of magnitude. Such a training set nearly guarantees overlearning. To mitigate this, we present our Whitened Mean Classifier, a method to whiten this sparse training set and avoid overlearning. Classification accuracy exceeded 90% for the modified programs on a subset of the ten devices.
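The Whitened Mean Classifier's exact construction is not reproduced here; a generic sketch of the underlying idea (whiten the feature space with a regularized pooled covariance, then assign each trace to the nearest whitened class mean; all data below are synthetic, and the regularization term is an assumption to handle the sparse, near-singular covariance) is:

```python
import numpy as np

def fit_whitened_means(X, y, eps=1e-3):
    """Nearest-mean classifier in a whitened feature space.

    The whitening matrix comes from a regularized pooled covariance; eps
    guards against the near-singular covariance of a sparse training set.
    """
    classes = np.unique(y)
    means = {c: X[y == c].mean(axis=0) for c in classes}
    resid = np.vstack([X[y == c] - means[c] for c in classes])
    cov = resid.T @ resid / len(resid) + eps * np.eye(X.shape[1])
    w, V = np.linalg.eigh(cov)               # cov = V diag(w) V^T
    W = V @ np.diag(1.0 / np.sqrt(w)) @ V.T  # inverse square root of cov
    return {c: W @ m for c, m in means.items()}, W

def predict(x, wmeans, W):
    z = W @ x
    return min(wmeans, key=lambda c: np.linalg.norm(z - wmeans[c]))

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (20, 5)), rng.normal(2.0, 1.0, (20, 5))])
y = np.array([0] * 20 + [1] * 20)
wmeans, W = fit_whitened_means(X, y)
print(predict(np.full(5, 2.0), wmeans, W))  # → 1
```

Averaging each class down to a single whitened mean, instead of fitting per-sample weights, is one way to avoid overlearning when the number of samples dwarfs the number of traces.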
Previously, we proposed and implemented a Self-structuring Data Learning Algorithm. The realized software package and the underlying concept are still progressing. Earlier, the algorithm was tested with synthetic data and exhibited interesting results. The objectives of this paper are to test the algorithm with raw infrared and visual images and to update the algorithm as required. We first performed registration transformation and detection on the images with an existing software package. We then registered the detections using the registration transformations from both the infrared and visual images. The registered detections were delivered to the algorithm for target detection and tracking without modification. The results revealed an inability to handle very noisy infrared image features. To overcome this problem, we developed multiscale grid processing to improve detection classification in the algorithm. The updated algorithm shows much better target detection and tracking with the real-world data. More algorithm enhancements are in progress, such as incorporating pattern recognition, classification, and fusion.
Side-Channel Analysis (SCA) is an increasingly well-known method for non-invasively extracting information from unintended "side-channel" emissions given off by electronic devices. The common method for extracting side-channel information is a near-field antenna probe placed in the vicinity (i.e., within millimeters) of the target device. The antenna detects and amplifies the radio-frequency (RF) emissions given off by the device and transmits the information for analysis and testing. Side-channel attacks are best known for their utility in cryptanalysis; however, they can also be used to fingerprint devices or even determine the digital state of a system. In this work, characterization studies of a 1-GHz antenna using Riscure's RF probe station are performed. For RF-SCA, the ultimate limits of signal sensitivity and frequency response are determined by the antenna characteristics. In addition, the effective source-receiver distance (SRD), cross-talk, and spatial signal averaging at various SRDs must be characterized for signal attenuation and normalization. From our testing, it appears that the Riscure probe has a peak frequency response at about 200 MHz. The 418-MHz antenna had multiple peaks at 130 MHz, 172 MHz, 213 MHz, and 370 MHz, as well as several less significant peaks at higher frequencies. The BeeHive100C probe peaked at exactly 200 MHz but had a couple of side lobes in the 600-800 MHz range. The Pharad 30-512 MHz antenna peaked at a slightly lower 193 MHz, although some response was observed in the 600-800 MHz range, as in the other antennas. The Pharad 225-6000 MHz antenna exhibited a similar peak but less roll-off and an elevated response at higher frequencies than its predecessor.
The Internet of Things (IoT) and Internet of Everything (IoE) have driven the proliferation of processors into nearly every powered device around us, from thermostats to refrigerators to light bulbs. From a security perspective, IoT/IoE creates a new layer of signals and systems that can be exploited to access supporting network layers. Our research focuses on leveraging the analog side channels of IoT/IoE processors for defensive purposes. We apply signal-processing and machine-learning techniques to collected RF emissions to detect whether code running on the processor has been modified (i.e., corrupted or injected with malware). This paper describes our process for positioning a wide-bandwidth RF probe over the device under test (DuT). Classifiers are implemented to identify the code running on the device, and we demonstrate the ability to detect, identify, and isolate instructions based on signatures learned during initial DuT characterization. The probe is positioned at a location where support-vector machine (SVM) classifiers can accurately discriminate between instructions, rather than relying on raw power leakage. At this well-discriminated location, the signature of each instruction is extracted by applying principal component analysis (PCA) to separate its signal into components (fetch, opcode, operands, and values). These signatures are used to identify instructions in the test code. Additionally, this paper discusses applying our methodology to blocks of code/algorithms using sequence-learning algorithms. These techniques enable a significant reduction in feature dimensions, improving the speed and accuracy of instruction-level classification of low-SNR RF side channels.
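The dimensionality-reduction-then-classification stage can be sketched with scikit-learn on synthetic stand-in traces (the trace shapes, class count, and separation below are fabricated for illustration, not the paper's measurements):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(2)
# Synthetic stand-ins: 3 "instructions", 60 traces each, 200 samples per trace
X = np.vstack([rng.normal(loc=i, scale=0.5, size=(60, 200)) for i in range(3)])
y = np.repeat(np.arange(3), 60)

# Reduce feature dimensions with PCA, then discriminate with a linear SVM
clf = make_pipeline(PCA(n_components=10), SVC(kernel="linear")).fit(X, y)
print(clf.score(X, y))  # → 1.0 on this cleanly separable toy data
```

Projecting each 200-sample trace onto 10 principal components before the SVM is the kind of feature-dimension reduction the abstract credits for improved classification speed and accuracy.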
This paper details the process we went through to visualize the output of our data learning algorithm. We have been developing a hierarchical self-structuring learning algorithm based on the general principles of the LaRue model. One proposed application of this algorithm is traffic analysis, chosen because it is conceptually easy to follow and there is a significant amount of existing data and related research material with which to work. While we chose the tracking of vehicles for our initial approach, it is by no means the only target of our algorithm; flexibility is the end goal, but we still need somewhere to start. To that end, this paper details our creation of the visualization GUI for the algorithm, the features we included, and the initial results we obtained from running a few of the traffic-based scenarios we designed.
In this paper, we elaborate on the implementation of our self-structuring data learning algorithm. To recap, we are working to develop a data learning algorithm that will eventually be capable of goal-driven pattern learning and the extrapolation of more complex patterns from less complex ones. At this point we have developed a conceptual framework for the algorithm but have yet to discuss our actual implementation and the considerations and shortcuts needed to create it. We elaborate on our initial setup of the algorithm and the scenarios we used to test it in its early stages. While we want this to be a general algorithm, it is necessary to start with a simple scenario or two to provide a viable development and testing environment. To that end, our discussion is geared toward what we included in our initial implementation and why, as well as what concerns we may have. In the future, we expect to apply our algorithm to a more general approach, but to do so within a reasonable time, we needed to pick a place to start.
In this paper, we propose a hierarchical self-structuring learning algorithm based on the general principles of the Stanovich/Evans framework and the "Quest" group definition of an unexpected query. One of the main goals of our algorithm is for it to be capable of pattern learning and of extrapolating more complex patterns from less complex ones. This pattern learning, influenced by goals that are either learned or predetermined, should be able to detect and reconcile anomalous behaviors. One example of a proposed application of this algorithm is traffic analysis. We chose this example because it is conceptually easy to follow. Although we are unlikely to develop superior traffic-tracking techniques using our algorithm, a traffic-based scenario remains a good starting point, if only due to the easy availability of data and the number of other known techniques. In this scenario, the algorithm would observe and track all vehicular traffic in a particular area. After some initial time passes, it would begin detecting and learning the traffic's patterns, and eventually the patterns would stabilize. At that point, "new" patterns could be considered anomalies, flagged, and handled accordingly. This is only one particular application of our proposed algorithm. Ideally, we want to make it as general as possible, such that it can be applied to numerous different problems with varying types of sensory input and data, such as IR, RF, visual, census data, metadata, etc.
A two-stage hierarchical unsupervised learning system has been proposed for modeling complex dynamic surveillance and cyberspace systems. Using a modification of the expectation-maximization learning approach, we introduced a three-layer approach to learning concepts from input data: features, objects, and situations. Using the Bernoulli model, this approach models each situation as a collection of objects, and each object as a collection of features. Further complexity is added with the addition of clutter features and clutter objects. During the learning process, at the lowest level, only binary feature information (presence or absence) is provided. The system attempts to simultaneously determine the probabilities of the situation and the presence of corresponding objects from the detected features. The proposed approach demonstrated robust performance after a short training period. This paper discusses this hierarchical learning system in the broader context of different feedback mechanisms between layers and highlights challenges on the road to practical implementation.
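The situations-objects-features hierarchy can be made concrete with a toy example (the probability tables below are invented, and the noisy-OR link between object presence and feature detections is one plausible reading, not necessarily the paper's exact model):

```python
import numpy as np

# Toy hierarchy: 2 situations, 3 objects, 5 binary features (values invented)
P_OBJ = np.array([[0.9, 0.8, 0.1],    # situation 0 favors objects 0 and 1
                  [0.1, 0.2, 0.9]])   # situation 1 favors object 2
P_FEAT = np.array([[0.9, 0.9, 0.1, 0.1, 0.1],   # object 0's feature profile
                   [0.1, 0.9, 0.9, 0.1, 0.1],   # object 1
                   [0.1, 0.1, 0.1, 0.9, 0.9]])  # object 2

def feature_prob(situation):
    """Noisy-OR: chance each binary feature fires given the situation."""
    return 1.0 - np.prod(1.0 - P_OBJ[situation][:, None] * P_FEAT, axis=0)

def classify(feats):
    """Pick the situation maximizing the Bernoulli log-likelihood."""
    lls = []
    for s in range(len(P_OBJ)):
        p = feature_prob(s)
        lls.append(np.sum(np.where(feats, np.log(p), np.log(1.0 - p))))
    return int(np.argmax(lls))

print(classify(np.array([True, True, True, False, False])))  # → 0
```

Only the binary feature vector reaches the classifier, mirroring the abstract's point that the lowest layer sees presence/absence information alone while situation and object probabilities are inferred.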
We applied a two-stage unsupervised hierarchical learning system to model complex dynamic surveillance and cyberspace monitoring systems using a non-commercial version of the NeoAxis visualization software. The hierarchical scene learning and recognition approach is based on hierarchical expectation maximization and was linked to a 3D graphics engine for validating learning and classification results and for understanding the human-autonomous-system relationship. Scene recognition is performed by feeding synthetically generated data to a dynamic logic algorithm. The algorithm performs hierarchical recognition of the scene by first examining the features of the objects to determine which objects are present, and then determining the scene based on the objects present. This paper presents a framework within which low-level data linked to higher-level visualization can support a human operator and be evaluated in a detailed and systematic way.