Side-channel analysis covers several methods for determining the state of a device without directly interacting with it. In previous work, we collected near-field radio frequency (RF) emanations from simple programs to assess how various code operations could be differentiated at the instruction level. However, detecting operations in large blocks of instructions in more complicated programs has proven difficult due to the high dimensionality of the data. In this research, we examine methods to differentiate common operations using RF emanations. We use a series of example codes useful for two-factor authentication on an Arduino Mega. Some examples are coded with extra operations to simulate malware, such as intentionally leaking the key, performing nuisance operations, or substituting a weaker hash function. After collecting RF data, we use approximation techniques to reduce the data dimensionality and identify motifs in the time series. The motifs are correlated with the operations taking place by means of a uniquely identifiable triggering mechanism. Several exemplary motifs are then combined into templates that can be used to search for a connected series of operations. These templates are compared with an RF time series of unknown operations using a minimum distance metric. We evaluate the quality of templates available from an RF data collection and examine the usefulness of templates as features for classification.
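The template comparison described above can be illustrated with a minimal sketch. It assumes a Euclidean distance between z-normalized windows, which is one common choice of minimum distance metric and not necessarily the one used in the paper. The function names and the synthetic trace are hypothetical.

```python
import math

def znorm(x):
    """Scale a window to zero mean and unit standard deviation."""
    mean = sum(x) / len(x)
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / len(x))
    if std == 0:
        return [0.0] * len(x)
    return [(v - mean) / std for v in x]

def min_distance_match(series, template):
    """Return (offset, distance) of the window in `series` closest to `template`."""
    m = len(template)
    if m == 0 or m > len(series):
        raise ValueError("template must be non-empty and no longer than series")
    t = znorm(template)
    best_offset, best_dist = -1, math.inf
    for start in range(len(series) - m + 1):
        w = znorm(series[start:start + m])
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(w, t)))
        if dist < best_dist:
            best_offset, best_dist = start, dist
    return best_offset, best_dist

# Synthetic stand-in for an RF trace (assumption): a low-level ripple
# with one motif embedded at offset 80 at a different gain.
template = [math.sin(2 * math.pi * k / 10) for k in range(20)]
series = [0.05 * math.sin(0.7 * n) for n in range(200)]
for k, v in enumerate(template):
    series[80 + k] += 3.0 * v

offset, dist = min_distance_match(series, template)
```

Normalizing each window makes the match insensitive to gain and offset differences between collections, at the cost of amplifying noise in flat regions of the trace.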
Side-channel analysis (SCA) provides an independent, non-invasive remote monitoring solution to determine the digital state of a programmable electronic device. In our work, we have conducted near-field SCA on various devices to determine how well different programs running on those devices can be differentiated. We have tested devices ranging from the relatively simple Arduino Uno to the much more complex Samsung Galaxy S8. The antennas used for radio frequency (RF) collection have also varied, from the self-contained ~500 MHz Riscure probe to a 40 mm Triarchy loop antenna with an attached amplifier. Our study implemented various collection techniques; however, all of them relied on a trigger signal. The trigger signal was needed to initiate the data collection process and to act as a reference for sequencing the various blocks within a code execution. However, a trigger signal is not always available or even feasible to obtain from a device in remote monitoring applications. This work investigates potential methods for triggerless detection and alignment of digital code blocks in measured analog RF data. Methods for performing the detection range from boosting codes that generate easily aligned RF pulses to correlation methods for signal alignment. The varying quality of RF data generated by the devices and the amount of noise embedded in the signals from the measurement schemes negatively impact triggerless collection. We estimate our probability of success at aligning signals to exceed 90% for the devices tested.
KEYWORDS: Clocks, Detection and tracking algorithms, Statistical analysis, Analog electronics, Data modeling, Internet, Magnetism, Signal processing, Manufacturing, Machine learning
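As an illustration of correlation-based alignment without a trigger, the following sketch searches for the lag that maximizes the cross-correlation between a reference trace and a new trace. The brute-force search, the function names, and the synthetic burst are assumptions made for the example; the abstract does not describe the specific correlation method used.

```python
def best_lag(reference, trace, max_lag):
    """Return the lag (in samples) that maximizes the cross-correlation.

    A positive lag means `trace` is delayed relative to `reference`.
    """
    def centered(x):
        mean = sum(x) / len(x)
        return [v - mean for v in x]

    ref, trc = centered(reference), centered(trace)
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, r in enumerate(ref):
            j = i + lag
            if 0 <= j < len(trc):
                score += r * trc[j]
        if score > best_score:
            best, best_score = lag, score
    return best

def align(trace, lag):
    """Shift `trace` left by `lag` samples, zero-padding the vacated end."""
    n = len(trace)
    if lag >= 0:
        return trace[lag:] + [0.0] * min(lag, n)
    return [0.0] * min(-lag, n) + trace[:lag]

# Hypothetical data: a short burst, and the same burst delayed by 7 samples.
reference = [0.0] * 64
for k, v in enumerate([1.0, 3.0, -2.0, 4.0, -1.0]):
    reference[20 + k] = v
trace = [0.0] * 7 + reference[:-7]

lag = best_lag(reference, trace, max_lag=16)
aligned = align(trace, lag)
```

On measured data the correlation peak is broadened by noise, so in practice the search range and the choice of reference segment would need tuning per device.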
All digital devices leak information through unintended emissions into analog side channels. The RF side channel (RF-SC) enables passive collection of high-bandwidth information about the digital state of the device. We collected these RF emissions with a 500-MHz Riscure probe placed in the near field of the device under test (DuT) and applied machine learning to detect which program is running on the processor in order to identify malware intrusions. We explored whether a generalized algorithm classification infrastructure built from a training set of similar DuTs can be applied to a similar device from a different production batch (same model number, different serial number). We collected RF-SC data for five programs running on 28 distinct Arduino Unos (and 28 MSP430 processors). We trained program classifiers on RF data from all but one DuT and tested the classifiers on the device withheld from the training set. The high-SNR signal provided by the Riscure probe enabled almost perfect classification results when we trained and tested on the same device. Our classification results remained above 99% when we generalized testing to a new DuT of the same model but a different serial number. The classifier was trained on 27 of the devices and tested to determine its ability to detect deviations from a baseline algorithm on the withheld device. The worst misclassification rate was a mere 0.08%.
KEYWORDS: Signal to noise ratio, Binary data, Data modeling, Signal processing, Analog electronics, Error analysis, Light emitting diodes, Remote sensing, Internet, Analytical research
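The train-on-all-but-one-device evaluation can be sketched as a leave-one-device-out loop. The nearest-mean classifier below is a simple stand-in chosen for brevity, not the classifier used in the paper, and the feature vectors are synthetic placeholders for RF-derived features.

```python
def nearest_mean_fit(samples):
    """samples: list of (label, feature_vector). Returns {label: mean_vector}."""
    sums, counts = {}, {}
    for label, vec in samples:
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def nearest_mean_predict(means, vec):
    """Return the label whose class mean is closest to `vec`."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(means, key=lambda lab: sqdist(means[lab], vec))

def leave_one_device_out(dataset):
    """dataset: {device_id: [(label, feature_vector), ...]}.

    Returns {device_id: accuracy on that device when it is withheld}.
    """
    results = {}
    for held_out, test_samples in dataset.items():
        train = [s for dev, rows in dataset.items() if dev != held_out for s in rows]
        means = nearest_mean_fit(train)
        hits = sum(nearest_mean_predict(means, vec) == lab for lab, vec in test_samples)
        results[held_out] = hits / len(test_samples)
    return results

# Synthetic placeholder data: three programs, four devices, and a small
# device-specific offset standing in for unit-to-unit variation.
prototypes = {"prog_a": [1.0, 0.0, 0.0], "prog_b": [0.0, 1.0, 0.0], "prog_c": [0.0, 0.0, 1.0]}
dataset = {}
for dev in range(4):
    offset = 0.05 * dev
    dataset[f"dev{dev}"] = [
        (label, [v + offset + 0.01 * rep for v in proto])
        for label, proto in prototypes.items()
        for rep in range(3)
    ]

accuracy = leave_one_device_out(dataset)
```

Splitting by device rather than by trace keeps every trace from the withheld unit out of training, which is what makes the reported accuracy a measure of generalization across serial numbers.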
We use machine learning to characterize the state of digital devices based on their analog emissions. As digital devices operate, they emit internal information into a number of analog side channels. Remote sensing of these unintended signals leads to low signal-to-noise ratio (SNR) and significant clutter. We developed classifiers to determine which program is executing on a digital device based on analog radio-frequency (RF) emissions collected via a 500-MHz Riscure RF probe. A standard algorithm was developed to serve as a baseline program, and intrusions were simulated by introducing minor modifications to this program. We collected a thousand RF traces from each of these modified programs running on ten different devices for thousands of instruction cycles. The ten devices tested are representative of Internet of Things (IoT) devices and include Arduino Unos and PIC24 processors. Our primary approach to mitigating the impact of low SNR is to extend the program execution and signal collection time. Collecting a training set with more traces than samples is not practical. Even after down-sampling the raw data to thirty samples per instruction, the number of samples exceeds the number of traces by orders of magnitude. Such a training set nearly guarantees overlearning. To mitigate this, we present our Whitened Mean Classifier as a method to whiten this sparse training set and avoid overlearning. Classification accuracy exceeded 90% for the modified programs on a subset of the ten devices.
We applied machine learning to detect changes in the state of key registers in digital devices from their analog RF emissions. As digital devices operate, they emit information via analog side channels. We collected the RF side channel with a 500-MHz shielded loop probe from Riscure, placed in the near field (<1 mm) of the device under test (DuT). We investigated a number of Internet-of-Things (IoT) DuTs, including Arduino Uno and PIC24 processors. Conventional processors implement instructions as a sequence of subtasks. The first subtasks include incrementing the program counter (PC) register and fetching the next instruction from program memory into the instruction register (IR). These two subtasks occur in almost every instruction cycle. We ran programs on the DuT and collected the RF emissions. We parsed the object code of the programs to determine the state of key registers, including the PC and IR, during each instruction cycle and observed that the RF signal of each cycle is strongly correlated with the Hamming distance (HD), i.e., the number of bits changing, in the PC and IR registers. Based on this result, we developed classifiers to extract the HD of the PC, the IR, and the stack pointer (SP). The classification results vary with the true HD, as some values are rare and have few examples in the training set. The classification accuracy exceeds 99% for the PC and the IR. Because the training set contains relatively few HD examples for the SP, its accuracy only slightly exceeded 97%.
KEYWORDS: Operating systems, Detection and tracking algorithms, Binary data, Control systems, Signal processing, Machine learning, Algorithm development, Data acquisition
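The Hamming distance between consecutive register states, the quantity the classifiers above extract, can be computed as in this small sketch. The 16-bit register width and the example program counter values are assumptions made for illustration.

```python
def hamming_distance(a, b, width=16):
    """Number of bit positions that differ between two register values."""
    mask = (1 << width) - 1
    return bin((a ^ b) & mask).count("1")

def hd_sequence(values, width=16):
    """Hamming distance between each consecutive pair of register states."""
    return [hamming_distance(p, q, width) for p, q in zip(values, values[1:])]

# Hypothetical PC values: three sequential increments followed by a jump.
pc_trace = [0x00FE, 0x00FF, 0x0100, 0x0101, 0x0040]
print(hd_sequence(pc_trace))  # prints [1, 9, 1, 3]
```

The increment from 0x00FF to 0x0100 flips nine bits even though the value changes by one, which shows why some HD values are rare in a training set built from ordinary program flow.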
Internet of Things (IoT) and other similar devices often have little to no security and thus can be readily exploited in any number of ways. In this work, we collect radio frequency (RF) emissions from simple processors on several IoT devices and apply machine learning techniques to detect modifications (corrupted or injected via malware) in ‘known’ software running on the processor. We can detect these modifications due to the correlation between RF emissions and the digital state of the devices. Every bit flip produces a small but potentially detectable electrical pulse. Our approach to developing the recognition algorithm is to adapt to the variability created by the input data by recognizing the sequences in which instruction blocks are executed. Seemingly minor changes to input values can have a detectable effect on the measured RF side channel. We collect RF data from a variety of IoT devices with clock speeds ranging from 16 to 96 MHz. A 1-GHz Riscure RF near-field antenna probe was placed within a millimeter of the IoT device, RF emissions were acquired, and software controls triggered data collection. A classification architecture was trained using object code partitioned into blocks to develop the truth data. We then applied new data to the trained block classifier. This approach detects deviations in individual blocks and in block sequences as a whole, allowing a greater level of detection resolution than a binary ‘Yes/No’ classification. Initial testing results showed greater than 90% classification accuracy for block-level modifications, and we can detect deviations from truth data with 100% accuracy.