Is our Ground-Truth for Traffic Classification Reliable?

Valentín Carela-Español, Tomasz Bujlow, Pere Barlet-Ros

Publication: Contribution to book/anthology/report/conference proceeding; Conference article in proceedings; Research; Peer-reviewed

50 Citations (Scopus)

Abstract

The validation of the different proposals in the traffic classification literature is a controversial issue. Usually, these works base their results on a ground-truth built from private datasets and labeled by techniques of unknown reliability. This makes the validation and comparison with other solutions an extremely difficult task.

This paper aims to be a first step towards addressing the validation and trustworthiness problem of network traffic classifiers. We compare six well-known DPI-based techniques that are frequently used in the literature for ground-truth generation. To evaluate these tools, we have carefully built a labeled dataset of more than 500 000 flows, which contains traffic from popular applications. Our results present PACE, a commercial tool, as the most reliable solution for ground-truth generation. However, among the open-source tools available, NDPI and especially Libprotoident also achieve very high precision, while other, more frequently used tools (e.g., L7-filter) are not reliable enough and should not be used for ground-truth generation in their current form.

Original language: English
Title: Passive and Active Measurement: 15th International Conference, PAM 2014, Los Angeles, USA, March 10-11, 2014, Proceedings
Number of pages: 11
Volume: 8362
Publisher: Springer Science+Business Media
Publication date: 11 Mar 2014
Pages: 98-108
Status: Published - 11 Mar 2014
Series name: Lecture Notes in Computer Science
ISSN: 0302-9743
