Is our Ground-Truth for Traffic Classification Reliable?

Valentín Carela-Español; Tomasz Bujlow; Pere Barlet-Ros

doi:10.1007/978-3-319-04918-2_10

Department of Electronic Systems

Publication: Contribution to book/anthology/report/conference proceeding › Conference article in proceeding › Research › peer-reviewed

50 Citations (Scopus)

Abstract

The validation of the different proposals in the traffic classification literature is a controversial issue. Usually, these works base their results on a ground-truth built from private datasets and labeled by techniques of unknown reliability. This makes the validation and comparison with other solutions an extremely difficult task.

This paper aims to be a first step towards addressing the validation and trustworthiness problem of network traffic classifiers. We perform a comparison between 6 well-known DPI-based techniques, which are frequently used in the literature for ground-truth generation. In order to evaluate these tools we have carefully built a labeled dataset of more than 500 000 flows, which contains traffic from popular applications. Our results present PACE, a commercial tool, as the most reliable solution for ground-truth generation. However, among the open-source tools available, NDPI and especially Libprotoident, also achieve very high precision, while other, more frequently used tools (e.g., L7-filter) are not reliable enough and should not be used for ground-truth generation in their current form.

Original language	English
Title of host publication	Passive and Active Measurement: 15th International Conference, PAM 2014, Los Angeles, USA, March 10-11, 2014, Proceedings
Number of pages	11
Volume	8362
Publisher	Springer Science+Business Media
Publication date	11 Mar 2014
Pages	98-108
DOI	https://doi.org/10.1007/978-3-319-04918-2_10
Status	Published - 11 Mar 2014

Series name	Lecture Notes in Computer Science
ISSN	0302-9743

Citation formats

Carela-Español, V., Bujlow, T., & Barlet-Ros, P. (2014). Is our Ground-Truth for Traffic Classification Reliable? In Passive and Active Measurement: 15th International Conference, PAM 2014, Los Angeles, USA, March 10-11, 2014, Proceedings (Vol. 8362, pp. 98-108). Springer Science+Business Media. https://doi.org/10.1007/978-3-319-04918-2_10

Carela-Español, Valentín ; Bujlow, Tomasz ; Barlet-Ros, Pere. / Is our Ground-Truth for Traffic Classification Reliable? Passive and Active Measurement: 15th International Conference, PAM 2014, Los Angeles, USA, March 10-11, 2014, Proceedings. Vol. 8362 Springer Science+Business Media, 2014. pp. 98-108 (Lecture Notes in Computer Science).

@inproceedings{8fc51768657240e9a8b2f625f3af3805,

title = "Is our Ground-Truth for Traffic Classification Reliable?",

abstract = "The validation of the different proposals in the traffic classification literature is a controversial issue. Usually, these works base their results on a ground-truth built from private datasets and labeled by techniques of unknown reliability. This makes the validation and comparison with other solutions an extremely difficult task. This paper aims to be a first step towards addressing the validation and trustworthiness problem of network traffic classifiers. We perform a comparison between 6 well-known DPI-based techniques, which are frequently used in the literature for ground-truth generation. In order to evaluate these tools we have carefully built a labeled dataset of more than 500 000 flows, which contains traffic from popular applications. Our results present PACE, a commercial tool, as the most reliable solution for ground-truth generation. However, among the open-source tools available, NDPI and especially Libprotoident, also achieve very high precision, while other, more frequently used tools (e.g., L7-filter) are not reliable enough and should not be used for ground-truth generation in their current form.",

author = "Valent{\'i}n Carela-Espa{\~n}ol and Tomasz Bujlow and Pere Barlet-Ros",

year = "2014",

month = mar,

day = "11",

doi = "10.1007/978-3-319-04918-2_10",

language = "English",

volume = "8362",

series = "Lecture Notes in Computer Science",

publisher = "Springer Science+Business Media",

pages = "98--108",

booktitle = "Passive and Active Measurement",

address = "United States",

}

Carela-Español, V, Bujlow, T & Barlet-Ros, P 2014, Is our Ground-Truth for Traffic Classification Reliable? in Passive and Active Measurement: 15th International Conference, PAM 2014, Los Angeles, USA, March 10-11, 2014, Proceedings. vol. 8362, Springer Science+Business Media, Lecture Notes in Computer Science, pp. 98-108. https://doi.org/10.1007/978-3-319-04918-2_10

Is our Ground-Truth for Traffic Classification Reliable? / Carela-Español, Valentín; Bujlow, Tomasz; Barlet-Ros, Pere.
Passive and Active Measurement: 15th International Conference, PAM 2014, Los Angeles, USA, March 10-11, 2014, Proceedings. Vol. 8362 Springer Science+Business Media, 2014. pp. 98-108 (Lecture Notes in Computer Science).