3 May 2018 Identifying and detecting applications within TLS traffic
Internet traffic is increasingly encrypted, so forensic analysis of packet content yields diminishing returns. Much traffic (web, email, chat, VoIP, etc.) is now protected using the cryptographic protocol known as Transport Layer Security (TLS). In 2014, Google encouraged increased TLS usage by favoring HTTPS in its search (SEO) rankings [1]. As a result, by 2016, approximately 30 percent of first-page search results on Google used HTTPS (SSL/TLS) [1]. While the largest fraction of traffic is now video (e.g. Netflix, YouTube), these communications now use TLS as well [2]. Traditional traffic analysis leverages port numbers, domain names, certificate fields, and the available cryptographic suites. TLS fingerprinting [3] has recently been used for traffic classification [4], but it is still insufficient to expose suspicious communication. In the absence of actual payload content, additional information such as inter-packet arrival times, flow direction, TCP headers, and frequencies can be leveraged to estimate the application and data protected with SSL/TLS. For example, researchers applied supervised machine learning to a set of such features (packet arrival times, length, etc.) and achieved 96% accuracy when predicting the 3-tuple <Operating System, Browser, Application> of various SSL/TLS applications [5]. Our novel technique applies data mining techniques to TLS record size frequencies. We then use Multinomial Naïve Bayes to classify TLS sessions by website and the K-means algorithm to cluster the TLS sessions. We achieved an accuracy of 90.5% in Multinomial Naïve Bayes classification of websites, and a V-measure of 89.9% and a Silhouette Coefficient of 54.6% in K-means clustering of TLS sessions according to websites.
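As a rough illustration of the classification step described in the abstract, the following minimal sketch trains a Multinomial Naïve Bayes model on counts of TLS record sizes. It is not the authors' implementation: the bin width of 64 bytes, the Laplace smoothing value, the helper names (`record_size_counts`, `RecordSizeNB`), the site labels, and the record lengths are all hypothetical and chosen only to make the example self-contained.

```python
import math
from collections import Counter, defaultdict


def record_size_counts(record_sizes, bin_width=64):
    """Turn a session's TLS record lengths into a bag of size-bin counts.

    bin_width is an assumption for this sketch, not a value from the paper.
    """
    return Counter(size // bin_width for size in record_sizes)


class RecordSizeNB:
    """Multinomial Naive Bayes over record-size-bin counts (Laplace smoothed)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, sessions, labels):
        self.vocab = set()
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        for counts, label in zip(sessions, labels):
            self.feature_counts[label].update(counts)
            self.vocab.update(counts)
        # Total number of records observed per class.
        self.totals = {c: sum(fc.values()) for c, fc in self.feature_counts.items()}
        return self

    def predict(self, counts):
        n_sessions = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n_label in self.class_counts.items():
            # Log prior plus the sum of count-weighted log likelihoods.
            score = math.log(n_label / n_sessions)
            denom = self.totals[label] + self.alpha * vocab_size
            for size_bin, k in counts.items():
                if size_bin not in self.vocab:
                    continue  # bins never seen in training are ignored
                seen = self.feature_counts[label][size_bin]
                score += k * math.log((seen + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training sessions: (TLS record lengths in bytes, website label).
training = [
    ([120, 130, 95, 1400, 110], "site-a.example"),
    ([100, 140, 125, 90, 1350], "site-a.example"),
    ([16384, 16384, 9000, 16384, 300], "site-b.example"),
    ([16384, 8000, 16384, 16384, 16384], "site-b.example"),
]
model = RecordSizeNB().fit(
    [record_size_counts(sizes) for sizes, _ in training],
    [label for _, label in training],
)

prediction_small = model.predict(record_size_counts([115, 105, 1380, 98]))
prediction_large = model.predict(record_size_counts([16384, 16384, 8500]))
print(prediction_small, prediction_large)
```

The multinomial model fits this setting because each session is represented by how often each record size occurs, in the same way a document is represented by word counts. In practice a library implementation and real captured sessions would replace the toy data used here.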
Conference Presentation
© (2018) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Michael J. De Lucia, Chase Cotton, "Identifying and detecting applications within TLS traffic", Proc. SPIE 10630, Cyber Sensing 2018, 106300U (3 May 2018); doi: 10.1117/12.2305256; https://doi.org/10.1117/12.2305256

