TY - JOUR
T1 - Diving Deep With BotLab-DS1
T2 - A Novel Ground Truth-Empowered Botnet Dataset
AU - Qasim, Muhammad
AU - Waleed, Muhammad
AU - Um, Tai Won
AU - Pahlevani, Peyman
AU - Pedersen, Jens Myrup
AU - Masood, Asif
N1 - Publisher Copyright: © 2013 IEEE.
PY - 2024
Y1 - 2024
AB - Cyberspace faces growing threats from the rapid rise in botnet attacks and their severe repercussions. AI-assisted systems offer a potent means of detecting and neutralizing such attacks. Existing research on botnet attack detection centres on dataset creation, on improving the efficacy and precision of detection methods with sophisticated machine learning models, and on behaviour-centric analysis. A review of current datasets reveals their limitations: the obsolescence of some datasets, their limited relevance to certain attack types, and a critical lack of ground truth. To address these gaps, we introduce BotLab-DS1, a ground-truth dataset featuring 5,279 real-world active botnet samples spanning 12 botnet families and 3,000 benign instances. This paper's contribution is threefold. First, we present a thorough review of existing datasets and their inherent shortcomings. Second, we describe a holistic data creation strategy and apply advanced feature engineering methods to static, behavioural, and network-centric attributes. Third, we train diverse machine learning algorithms on the BotLab-DS1 dataset for enhanced botnet detection. Our empirical findings show that the random forest algorithm trained on BotLab-DS1 attains 98.6% accuracy and 99.0% precision, while gradient boosting trails closely with 96.34% accuracy and 96.0% precision. We believe our study will open new pathways for dataset formulation and algorithmic scrutiny, enriching the research landscape and supporting the global effort to thwart botnet attacks effectively.
KW - botnet
KW - Cyberspace
KW - dataset
KW - machine learning
KW - security attacks
UR - http://www.scopus.com/inward/record.url?scp=85186066077&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2024.3367122
DO - 10.1109/ACCESS.2024.3367122
M3 - Journal article
AN - SCOPUS:85186066077
SN - 2169-3536
VL - 12
SP - 28898
EP - 28910
JO - IEEE Access
JF - IEEE Access
ER -