Internet traffic is increasingly encrypted, so forensic analysis of packet content yields diminishing returns. Much traffic (web, email, chat, VoIP, etc.) is now protected by the cryptographic protocol Transport Layer Security (TLS). In 2014, Google encouraged wider TLS adoption by favoring HTTPS in its search (SEO) rankings [1]. As a result, by 2016, approximately 30 percent of the top page search results on Google used HTTPS (SSL/TLS) [1]. Although the largest fraction of traffic is now video (e.g., Netflix, YouTube), these communications also use TLS [2]. Traditional traffic analysis leverages port numbers, domain names, certificate fields, and the available cipher suites. TLS fingerprinting [3] has recently been used for traffic classification [4], but it is still insufficient to expose suspicious communication. In the absence of actual payload content, additional information such as inter-packet arrival times, flow direction, TCP headers, and frequencies can be leveraged to estimate the application and data protected with SSL/TLS. For example, researchers applied supervised machine learning to a set of such features (packet arrival times, lengths, etc.) and achieved 96% accuracy when predicting the 3-tuple <Operating System, Browser, Application> of various SSL/TLS applications [5]. Our novel technique applies data mining techniques to TLS record size frequencies. We then use Multinomial Naïve Bayes to classify TLS sessions by website and the K-means algorithm to cluster the TLS sessions. We achieved an accuracy of 90.5% in Multinomial Naïve Bayes classification of websites, and a V-measure of 89.9% and a Silhouette Coefficient of 54.6% in K-means clustering of TLS sessions by website.
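To make the classification step concrete, the following is a minimal sketch of Multinomial Naïve Bayes applied to per-session TLS record size counts. It is not the authors' implementation: the function names (`train_multinomial_nb`, `predict`), the Laplace smoothing constant, the handling of record sizes unseen in training, and the toy record sizes and site labels are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(sessions, labels, alpha=1.0):
    """Fit multinomial Naive Bayes on TLS record size counts.

    sessions: list of lists of record sizes (ints), one list per TLS session.
    labels:   website label for each session.
    alpha:    Laplace smoothing constant (assumed value, not from the paper).
    """
    vocab = sorted({size for sizes in sessions for size in sizes})
    sessions_per_class = Counter(labels)
    size_counts = defaultdict(Counter)
    for sizes, label in zip(sessions, labels):
        size_counts[label].update(sizes)

    model = {"log_prior": {}, "log_likelihood": {}, "log_unseen": {}}
    for label, n_sessions in sessions_per_class.items():
        total = sum(size_counts[label].values())
        # One extra smoothing bucket covers record sizes never seen in training.
        denom = total + alpha * (len(vocab) + 1)
        model["log_prior"][label] = math.log(n_sessions / len(labels))
        model["log_likelihood"][label] = {
            size: math.log((size_counts[label][size] + alpha) / denom)
            for size in vocab
        }
        model["log_unseen"][label] = math.log(alpha / denom)
    return model


def predict(model, sizes):
    """Return the label with the highest posterior log-probability."""
    best_label, best_score = None, -math.inf
    for label, log_prior in model["log_prior"].items():
        likelihood = model["log_likelihood"][label]
        unseen = model["log_unseen"][label]
        score = log_prior + sum(likelihood.get(s, unseen) for s in sizes)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical record sizes (bytes) for sessions to two made-up websites.
train_sessions = [
    [517, 1400, 1400, 1400, 310],
    [517, 1400, 1400, 298],
    [230, 64, 64, 64, 980],
    [230, 64, 64, 1010],
]
train_labels = ["site-a", "site-a", "site-b", "site-b"]

model = train_multinomial_nb(train_sessions, train_labels)
print(predict(model, [517, 1400, 1400]))  # site-a
print(predict(model, [230, 64, 64]))      # site-b
```

Each session is treated as a bag of record sizes, so the order of records is ignored and only how often each size occurs matters, which is the sense in which record size frequencies act as features here.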
Michael J. De Lucia and Chase Cotton, "Identifying and detecting applications within TLS traffic," Proc. SPIE 10630, Cyber Sensing 2018, 106300U (Presented at SPIE Defense + Security: April 18, 2018; Published: 3 May 2018); https://doi.org/10.1117/12.2305256.