In the last several years, the volume and diversity of cyber attacks on U.S. commercial and government networks have increased dramatically, including malware, web attacks (e.g., drive-by downloads), zero-day exploits, and man-in-the-middle attacks (e.g., session hijacking). While many tools are available to attackers, cyber criminals increasingly rely on straightforward intrusion approaches (e.g., spear-phishing), employ vast distributed resources (botnets), and hide attack vectors via stepping-stone attacks. Detecting such activities and infrastructure represents the most difficult challenge for cyber-security professionals, because these threats are often locally invisible within isolated subnetworks. Cyber threat detection tools employed in the field today fail to cope with the volume, speed, and diversity of cyber attacks. Intrusion Detection Systems (IDS) are ineffective against novel threats, while anomaly-based methods generate a large number of false alarms and are difficult to interpret. Supervised algorithms require curated labeled datasets to train their models, and such datasets do not exist for novel attacks. Yet the biggest challenge of these systems is the requirement that all of the data be available in a single global repository. The cost of maintaining a global repository and the associated computation infrastructure becomes unsustainable as the volume of collected cyber data increases. Because threat detection solutions are deployed predominantly to analyze local traffic collected within and on the border of a single organization, these tools are unable to detect attacks that are locally invisible, such as attacks that cut across organizational boundaries. In this paper, we describe a new computational framework that will enable distributed enterprises to (a) perform local inference computations; and (b) collaborate using global messages and hybrid strategies to detect a wide range of global threats that are not locally visible.
First, we present a matrix-based algebra that generalizes a wide range of machine learning algorithms to maximize the breadth of attack phenomena that can be detected. We then derive a semi-supervised attack detection model that uses hybrid collaboration, with adaptive local and global computations at distributed repositories, to detect global events when it is not possible to move all relevant data into a centralized location. Finally, we propose a feedback model to create an active human-in-the-loop system, which integrates cyber analysts into the malicious behavior detection and pattern learning process by generating requests for annotation and result examination using a small number of representative instances of anomaly and threat detection outcomes.
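To make the idea of local computation combined with global messages concrete, the following is a minimal sketch and not the algebra or model developed in the paper. It swaps in a standard graph label propagation iteration, F ← αSF + (1 − α)Y, as a stand-in semi-supervised classifier, and assumes each site holds only the rows of the normalized adjacency matrix S and the seed labels Y for its own nodes. The function names, the toy path graph, the two-site partition, and the parameter values are all hypothetical choices made for illustration.

```python
import numpy as np

def normalize(W):
    """Symmetrically normalize an adjacency matrix: S = D^{-1/2} W D^{-1/2}."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    return W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def centralized_propagation(S, Y, alpha=0.8, iters=50):
    """Reference run that assumes all data sits in one repository."""
    F = Y.copy()
    for _ in range(iters):
        F = alpha * (S @ F) + (1 - alpha) * Y
    return F

def distributed_propagation(S, Y, sites, alpha=0.8, iters=50):
    """Each site updates only its own rows; sites exchange score messages.

    Assumption: site i stores S[idx_i] and Y[idx_i] locally. For simplicity
    every site receives all current scores each round; in practice only the
    scores of cross-site neighbors would need to be sent.
    """
    F_local = [Y[idx].copy() for idx in sites]
    for _ in range(iters):
        # Global message: current class scores, not raw traffic records.
        F_shared = np.zeros_like(Y)
        for idx, F_i in zip(sites, F_local):
            F_shared[idx] = F_i
        # Local inference: each site computes its own block of the update.
        F_local = [alpha * (S[idx] @ F_shared) + (1 - alpha) * Y[idx]
                   for idx in sites]
    F = np.zeros_like(Y)
    for idx, F_i in zip(sites, F_local):
        F[idx] = F_i
    return F

# Toy example (hypothetical): a 6-node path graph split across two sites,
# with one cross-site edge between nodes 2 and 3.
W = np.zeros((6, 6))
for i in range(5):
    W[i, i + 1] = W[i + 1, i] = 1.0
S = normalize(W)

# Column 0 = benign, column 1 = malicious; only nodes 0 and 5 are labeled.
Y = np.zeros((6, 2))
Y[0, 1] = 1.0
Y[5, 0] = 1.0

sites = [np.array([0, 1, 2]), np.array([3, 4, 5])]
F_central = centralized_propagation(S, Y)
F_dist = distributed_propagation(S, Y, sites)
labels = F_dist.argmax(axis=1)
```

In this sketch the distributed run reproduces the centralized scores up to floating-point tolerance, because each site computes exactly its own block of the same matrix update. The point of the design is that only per-node class scores cross organizational boundaries, while the underlying records stay at the site that collected them.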
Georgiy Levchuk, John Colonna-Romano, and Mohammed Eslami, "Algebra for distributed collaborative semi-supervised classification of cyber activities," Proc. SPIE 10652, Disruptive Technologies in Information Sciences, 1065210 (Presented at SPIE Defense + Security: April 18, 2018; Published: 10 May 2018); https://doi.org/10.1117/12.2305869.