A method for classification of network traffic based on C5.0 Machine Learning Algorithm

Tomasz Bujlow, M. Tahir Riaz, Jens Myrup Pedersen

Research output: Contribution to book/anthology/report/conference proceeding > Article in proceeding > Research > peer-review

53 Citations (Scopus)
5006 Downloads (Pure)

Abstract

Monitoring network performance in high-speed Internet infrastructure is challenging because the requirements for a given quality level are service-dependent. Backbone QoS monitoring and analysis in multi-hop networks therefore requires knowledge of the types of applications that make up the current network traffic. To overcome the drawbacks of existing traffic classification methods, we propose using the C5.0 Machine Learning Algorithm (MLA). From statistical traffic information received from volunteers, we used the C5.0 algorithm to construct a boosted classifier that distinguished between 7 different applications in test sets of 76,632-1,622,710 unknown cases with an average accuracy of 99.3-99.9%. This high accuracy was achieved through high-quality training data collected by our system, a unique set of parameters used for both training and classification, an algorithm for recognizing flow direction, and C5.0 itself. The classified applications are Skype, FTP, torrent, web browser traffic, web radio, interactive gaming, and SSH. We performed successive trials using different sets of parameters and different training and classification options. This paper shows how we collected accurate traffic data, presents the arguments used in the classification process, introduces the C5.0 classifier and its options, and finally evaluates and compares the obtained results.
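The abstract mentions statistical traffic information and an algorithm for recognizing flow direction, but does not spell out either. The following is only a minimal sketch of how per-flow statistics could be split by direction before being handed to a classifier such as C5.0. The `Packet` record, the `flow_features` function, the attribute names, and the rule that the sender of the earliest packet is the flow initiator are all assumptions made for illustration; they are not the parameters or the direction algorithm used in the paper.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Packet:
    """One captured packet (hypothetical record layout)."""
    src: str          # "ip:port" of the sender
    dst: str          # "ip:port" of the receiver
    size: int         # packet size in bytes
    timestamp: float  # capture time in seconds


def flow_features(packets):
    """Summarise one bidirectional flow as a flat attribute dictionary.

    Assumption: the endpoint that sent the earliest packet is treated as
    the flow initiator, so its packets count as outbound and all other
    packets as inbound.
    """
    if not packets:
        raise ValueError("a flow needs at least one packet")

    # Order by capture time so the first element identifies the initiator.
    ordered = sorted(packets, key=lambda p: p.timestamp)
    initiator = ordered[0].src

    out_sizes = [p.size for p in ordered if p.src == initiator]
    in_sizes = [p.size for p in ordered if p.src != initiator]

    return {
        "initiator": initiator,
        "packets_out": len(out_sizes),
        "packets_in": len(in_sizes),
        "mean_size_out": mean(out_sizes),
        # A flow may have no reply packets at all.
        "mean_size_in": mean(in_sizes) if in_sizes else 0.0,
        "share_out": len(out_sizes) / len(ordered),
        "duration": ordered[-1].timestamp - ordered[0].timestamp,
    }
```

Each resulting dictionary would correspond to one case (one row of attributes) in a training or test set; the choice of attributes here is purely illustrative.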
Original language: English
Title of host publication: ICNC'12: 2012 International Conference on Computing, Networking and Communications (ICNC): Workshop on Computing, Networking and Communications
Number of pages: 5
Publisher: IEEE Press
Publication date: 2 Feb 2012
Pages: 237-241
ISBN (Print): 978-1-4673-0008-7
ISBN (Electronic): 978-1-4673-0723-9
DOI: https://doi.org/10.1109/ICCNC.2012.6167418
Publication status: Published - 2 Feb 2012
Event: 2012 International Conference on Computing, Networking and Communications - Maui, Hawaii, United States
Duration: 30 Jan 2012 - 2 Feb 2012

Conference

Conference: 2012 International Conference on Computing, Networking and Communications
Country: United States
City: Maui, Hawaii
Period: 30/01/2012 - 02/02/2012

Fingerprint

Learning algorithms
Learning systems
Classifiers
Web browsers
Monitoring
Network performance
Telecommunication traffic
World Wide Web
Quality of service
Internet

Keywords

  • traffic classification
  • computer networks
  • C5.0
  • Machine Learning Algorithms (MLAs)
  • performance monitoring

Cite this

Bujlow, T., Riaz, M. T., & Pedersen, J. M. (2012). A method for classification of network traffic based on C5.0 Machine Learning Algorithm. In ICNC'12: 2012 International Conference on Computing, Networking and Communications (ICNC): Workshop on Computing, Networking and Communications (pp. 237-241). IEEE Press. https://doi.org/10.1109/ICCNC.2012.6167418