Classification of HTTP traffic based on C5.0 Machine Learning Algorithm

Tomasz Bujlow, Tahir Riaz, Jens Myrup Pedersen

Research output: Contribution to book/anthology/report/conference proceeding; Article in proceeding; Research; peer-review

10 Citations (Scopus)
462 Downloads (Pure)

Abstract

Our previous work demonstrated the possibility of distinguishing several groups of traffic with accuracy of over 99%. Today, most of the traffic is generated by web browsers, which provide different kinds of services based on the HTTP protocol: web browsing, file downloads, audio and voice streaming through third-party plugins, etc. This paper suggests and evaluates two approaches to distinguish various types of HTTP traffic based on the content: distributed among volunteers' machines and centralized running in the core of the network. We also assess the accuracy of the centralized classifier for both the HTTP traffic and mixed HTTP/non-HTTP traffic. In the latter case, we achieved the accuracy of 94%. Finally, we provide graphical characteristics of different kinds of HTTP traffic.
Original language: English
Title of host publication: IEEE Symposium on Computers and Communications (ISCC), 2012
Number of pages: 6
Place of Publication: Cappadocia
Publisher: IEEE
Publication date: 1 Jul 2012
Pages: 882-887
ISBN (Print): 978-1-4673-2712-1
ISBN (Electronic): 978-1-4673-2711-4
DOIs: https://doi.org/10.1109/ISCC.2012.6249413
Publication status: Published - 1 Jul 2012
Event: The Seventeenth IEEE Symposium on Computers and Communications - Cappadocia, Turkey
Duration: 1 Jul 2012 to 4 Jul 2012

Conference

Conference: The Seventeenth IEEE Symposium on Computers and Communications
Country: Turkey
City: Cappadocia
Period: 01/07/2012 to 04/07/2012
Series: IEEE International Symposium on Computers and Communications
ISSN: 1530-1346

Fingerprint

HTTP
Learning algorithms
Learning systems
Web browsers
World Wide Web
Classifiers
Network protocols

Keywords

  • traffic classification
  • computer networks
  • HTTP traffic
  • browser traffic
  • C5.0
  • Machine Learning Algorithms (MLAs)
  • performance monitoring

Cite this

Bujlow, T., Riaz, T., & Pedersen, J. M. (2012). Classification of HTTP traffic based on C5.0 Machine Learning Algorithm. In IEEE Symposium on Computers and Communications (ISCC), 2012 (pp. 882-887). Cappadocia: IEEE. IEEE International Symposium on Computers and Communications. https://doi.org/10.1109/ISCC.2012.6249413
Bujlow, Tomasz; Riaz, Tahir; Pedersen, Jens Myrup. / Classification of HTTP traffic based on C5.0 Machine Learning Algorithm. IEEE Symposium on Computers and Communications (ISCC), 2012. Cappadocia: IEEE, 2012. pp. 882-887 (IEEE International Symposium on Computers and Communications).
@inproceedings{276fdb95701149268a4e724aabdcb02b,
title = "Classification of HTTP traffic based on C5.0 Machine Learning Algorithm",
abstract = "Our previous work demonstrated the possibility of distinguishing several groups of traffic with accuracy of over 99{\%}. Today, most of the traffic is generated by web browsers, which provide different kinds of services based on the HTTP protocol: web browsing, file downloads, audio and voice streaming through third-party plugins, etc. This paper suggests and evaluates two approaches to distinguish various types of HTTP traffic based on the content: distributed among volunteers' machines and centralized running in the core of the network. We also assess the accuracy of the centralized classifier for both the HTTP traffic and mixed HTTP/non-HTTP traffic. In the latter case, we achieved the accuracy of 94{\%}. Finally, we provide graphical characteristics of different kinds of HTTP traffic.",
keywords = "traffic classification, computer networks, HTTP traffic, browser traffic, C5.0, Machine Learning Algorithms (MLAs), performance monitoring",
author = "Bujlow, Tomasz and Riaz, Tahir and Pedersen, Jens Myrup",
year = "2012",
month = "7",
day = "1",
doi = "10.1109/ISCC.2012.6249413",
language = "English",
isbn = "978-1-4673-2712-1",
pages = "882--887",
booktitle = "IEEE Symposium on Computers and Communications (ISCC), 2012",
publisher = "IEEE",
address = "United States",

}

Bujlow, T, Riaz, T & Pedersen, JM 2012, Classification of HTTP traffic based on C5.0 Machine Learning Algorithm. in IEEE Symposium on Computers and Communications (ISCC), 2012. IEEE, Cappadocia, IEEE International Symposium on Computers and Communications, pp. 882-887, The Seventeenth IEEE Symposium on Computers and Communications, Cappadocia, Turkey, 01/07/2012. https://doi.org/10.1109/ISCC.2012.6249413

Classification of HTTP traffic based on C5.0 Machine Learning Algorithm. / Bujlow, Tomasz; Riaz, Tahir; Pedersen, Jens Myrup.

IEEE Symposium on Computers and Communications (ISCC), 2012. Cappadocia: IEEE, 2012. p. 882-887.


TY - GEN

T1 - Classification of HTTP traffic based on C5.0 Machine Learning Algorithm

AU - Bujlow, Tomasz

AU - Riaz, Tahir

AU - Pedersen, Jens Myrup

PY - 2012/7/1

Y1 - 2012/7/1

N2 - Our previous work demonstrated the possibility of distinguishing several groups of traffic with accuracy of over 99%. Today, most of the traffic is generated by web browsers, which provide different kinds of services based on the HTTP protocol: web browsing, file downloads, audio and voice streaming through third-party plugins, etc. This paper suggests and evaluates two approaches to distinguish various types of HTTP traffic based on the content: distributed among volunteers' machines and centralized running in the core of the network. We also assess the accuracy of the centralized classifier for both the HTTP traffic and mixed HTTP/non-HTTP traffic. In the latter case, we achieved the accuracy of 94%. Finally, we provide graphical characteristics of different kinds of HTTP traffic.

AB - Our previous work demonstrated the possibility of distinguishing several groups of traffic with accuracy of over 99%. Today, most of the traffic is generated by web browsers, which provide different kinds of services based on the HTTP protocol: web browsing, file downloads, audio and voice streaming through third-party plugins, etc. This paper suggests and evaluates two approaches to distinguish various types of HTTP traffic based on the content: distributed among volunteers' machines and centralized running in the core of the network. We also assess the accuracy of the centralized classifier for both the HTTP traffic and mixed HTTP/non-HTTP traffic. In the latter case, we achieved the accuracy of 94%. Finally, we provide graphical characteristics of different kinds of HTTP traffic.

KW - traffic classification

KW - computer networks

KW - HTTP traffic

KW - browser traffic

KW - C5.0

KW - Machine Learning Algorithms (MLAs)

KW - performance monitoring

UR - http://www.scopus.com/inward/record.url?scp=84866614441&partnerID=8YFLogxK

U2 - 10.1109/ISCC.2012.6249413

DO - 10.1109/ISCC.2012.6249413

M3 - Article in proceeding

SN - 978-1-4673-2712-1

SP - 882

EP - 887

BT - IEEE Symposium on Computers and Communications (ISCC), 2012

PB - IEEE

CY - Cappadocia

ER -

Bujlow T, Riaz T, Pedersen JM. Classification of HTTP traffic based on C5.0 Machine Learning Algorithm. In IEEE Symposium on Computers and Communications (ISCC), 2012. Cappadocia: IEEE. 2012. p. 882-887. (IEEE International Symposium on Computers and Communications). https://doi.org/10.1109/ISCC.2012.6249413