Analysis of Malware behavior: Type classification using machine learning

Radu-Stefan Pirscoveanu; Steven Strandlund Hansen; Thor Mark Tampus Larsen; Matija Stevanovic; Jens Myrup Pedersen; Alexandre  Czech

doi:10.1109/CyberSA.2015.7166115

Research output: Contribution to book/anthology/report/conference proceeding › Article in proceeding › Research › peer-review

91 Citations (Scopus)

Abstract

Malicious software has become a major threat to modern society, not only due to the increased complexity of the malware itself but also due to the exponential increase of new malware each day. This study tackles the problem of analyzing and classifying a large volume of malware in a scalable and automated manner. We have developed a distributed malware testing environment by extending Cuckoo Sandbox, which was used to test an extensive number of malware samples and trace their behavioral data. The extracted data was used to develop a novel type classification approach based on supervised machine learning. The proposed classification approach employs a novel combination of features and achieves a high classification rate, with a weighted average AUC value of 0.98 using a Random Forests classifier. The approach has been extensively tested on a total of 42,000 malware samples. Based on these results, it is believed that the developed system can be used to pre-filter novel from known malware in a future malware analysis system.
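The abstract reports a weighted average AUC but does not say how the weighting is done. The sketch below shows one common reading, offered only as an illustration: a one-vs-rest AUC is computed for each malware type and the per-type values are averaged with weights equal to each type's share of the samples. The function names, type labels, and scores are hypothetical and do not come from the paper.

```python
from collections import Counter


def binary_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def weighted_average_auc(y_true, class_scores):
    """Support-weighted mean of one-vs-rest AUCs (assumed definition, not the paper's).

    y_true: list of class labels, one per sample
    class_scores: dict mapping each class label to its per-sample scores
    """
    support = Counter(y_true)
    total = len(y_true)
    result = 0.0
    for cls, count in support.items():
        one_vs_rest = [1 if y == cls else 0 for y in y_true]
        result += (count / total) * binary_auc(one_vs_rest, class_scores[cls])
    return result


# Made-up labels and classifier scores for six samples and three types.
y_true = ["trojan", "trojan", "trojan", "worm", "worm", "adware"]
class_scores = {
    "trojan": [0.8, 0.6, 0.3, 0.4, 0.1, 0.2],
    "worm": [0.1, 0.3, 0.2, 0.5, 0.7, 0.3],
    "adware": [0.1, 0.1, 0.5, 0.1, 0.2, 0.5],
}
print(round(weighted_average_auc(y_true, class_scores), 4))
```

With this weighting, types that make up a larger share of the samples contribute more to the final figure, so a high weighted average can coexist with weaker performance on rare types.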

Original language	English
Title of host publication	International Conference on Cyber Situational Awareness, Data Analytics and Assessment (CyberSA), 2015
Number of pages	7
Publisher	IEEE
Publication date	Aug 2015
ISBN (Print)	9781467367974
DOIs	https://doi.org/10.1109/CyberSA.2015.7166115
Publication status	Published - Aug 2015
Event	International Conference on Cyber Situational Awareness, Data Analytics and Assessment (CyberSA), 2015 - London, United Kingdom. Duration: 8 Jun 2015 → 9 Jun 2015

Conference

Conference	International Conference on Cyber Situational Awareness, Data Analytics and Assessment (CyberSA), 2015
Country/Territory	United Kingdom
City	London
Period	08/06/2015 → 09/06/2015

Series	International Conference on Cyber Situational Awareness, Data Analytics and Assessment Proceedings. (cyberSA)

Keywords

Malware
Type-Classification
Dynamic Analysis
Scalability
Cuckoo Sandbox
Random Forests
API call
Feature Selection
Supervised Machine Learning

Cite this

Pirscoveanu, Radu-Stefan; Hansen, Steven Strandlund; Larsen, Thor Mark Tampus et al. / Analysis of Malware behavior: Type classification using machine learning. International Conference on Cyber Situational Awareness, Data Analytics and Assessment (CyberSA), 2015. IEEE, 2015. (International Conference on Cyber Situational Awareness, Data Analytics and Assessment Proceedings. (cyberSA)).

@inproceedings{3d856ddfa4364a7b850664b5526cef70,
  title = "Analysis of Malware behavior: Type classification using machine learning",
  abstract = "Malicious software has become a major threat to modern society, not only due to the increased complexity of the malware itself but also due to the exponential increase of new malware each day. This study tackles the problem of analyzing and classifying a high amount of malware in a scalable and automatized manner. We have developed a distributed malware testing environment by extending Cuckoo Sandbox that was used to test an extensive number of malware samples and trace their behavioral data. The extracted data was used for the development of a novel type classification approach based on supervised machine learning. The proposed classification approach employs a novel combination of features that achieves a high classification rate with a weighted average AUC value of 0.98 using Random Forests classifier. The approach has been extensively tested on a total of 42,000 malware samples. Based on the above results it is believed that the developed system can be used to pre-filter novel from known malware in a future malware analysis system.",
  keywords = "Malware, Type-Classification, Dynamic Analysis, Scalability, Cuckoo Sandbox, Random Forests, API call, Feature Selection, Supervised Machine Learning",
  author = "Radu-Stefan Pirscoveanu and Hansen, {Steven Strandlund} and Larsen, {Thor Mark Tampus} and Matija Stevanovic and Pedersen, {Jens Myrup} and Alexandre Czech",
  year = "2015",
  month = aug,
  doi = "10.1109/CyberSA.2015.7166115",
  language = "English",
  isbn = "9781467367974",
  series = "International Conference on Cyber Situational Awareness, Data Analytics and Assessment Proceedings. (cyberSA)",
  publisher = "IEEE",
  booktitle = "International Conference on Cyber Situational Awareness, Data Analytics and Assessment (CyberSA), 2015",
  address = "United States",
  note = "International Conference on Cyber Situational Awareness, Data Analytics and Assessment (CyberSA), 2015 ; Conference date: 08-06-2015 Through 09-06-2015",
}

Pirscoveanu, R-S, Hansen, SS, Larsen, TMT, Stevanovic, M, Pedersen, JM & Czech, A 2015, Analysis of Malware behavior: Type classification using machine learning. in International Conference on Cyber Situational Awareness, Data Analytics and Assessment (CyberSA), 2015. IEEE, International Conference on Cyber Situational Awareness, Data Analytics and Assessment Proceedings. (cyberSA), International Conference on Cyber Situational Awareness, Data Analytics and Assessment (CyberSA), 2015 , London, United Kingdom, 08/06/2015. https://doi.org/10.1109/CyberSA.2015.7166115

Analysis of Malware behavior: Type classification using machine learning. / Pirscoveanu, Radu-Stefan; Hansen, Steven Strandlund; Larsen, Thor Mark Tampus et al.
International Conference on Cyber Situational Awareness, Data Analytics and Assessment (CyberSA), 2015. IEEE, 2015. (International Conference on Cyber Situational Awareness, Data Analytics and Assessment Proceedings. (cyberSA)).

TY  - GEN
T1  - Analysis of Malware behavior
T2  - International Conference on Cyber Situational Awareness, Data Analytics and Assessment (CyberSA), 2015
AU  - Pirscoveanu, Radu-Stefan
AU  - Hansen, Steven Strandlund
AU  - Larsen, Thor Mark Tampus
AU  - Stevanovic, Matija
AU  - Pedersen, Jens Myrup
AU  - Czech, Alexandre
PY  - 2015/8
Y1  - 2015/8
N2  - Malicious software has become a major threat to modern society, not only due to the increased complexity of the malware itself but also due to the exponential increase of new malware each day. This study tackles the problem of analyzing and classifying a high amount of malware in a scalable and automatized manner. We have developed a distributed malware testing environment by extending Cuckoo Sandbox that was used to test an extensive number of malware samples and trace their behavioral data. The extracted data was used for the development of a novel type classification approach based on supervised machine learning. The proposed classification approach employs a novel combination of features that achieves a high classification rate with a weighted average AUC value of 0.98 using Random Forests classifier. The approach has been extensively tested on a total of 42,000 malware samples. Based on the above results it is believed that the developed system can be used to pre-filter novel from known malware in a future malware analysis system.
AB  - Malicious software has become a major threat to modern society, not only due to the increased complexity of the malware itself but also due to the exponential increase of new malware each day. This study tackles the problem of analyzing and classifying a high amount of malware in a scalable and automatized manner. We have developed a distributed malware testing environment by extending Cuckoo Sandbox that was used to test an extensive number of malware samples and trace their behavioral data. The extracted data was used for the development of a novel type classification approach based on supervised machine learning. The proposed classification approach employs a novel combination of features that achieves a high classification rate with a weighted average AUC value of 0.98 using Random Forests classifier. The approach has been extensively tested on a total of 42,000 malware samples. Based on the above results it is believed that the developed system can be used to pre-filter novel from known malware in a future malware analysis system.
KW  - Malware
KW  - Type-Classification
KW  - Dynamic Analysis
KW  - Scalability
KW  - Cuckoo Sandbox
KW  - Random Forests
KW  - API call
KW  - Feature Selection
KW  - Supervised Machine Learning
UR  - https://www.scopus.com/record/display.uri?eid=2-s2.0-84963755556&origin=inward&txGid=0
U2  - 10.1109/CyberSA.2015.7166115
DO  - 10.1109/CyberSA.2015.7166115
M3  - Article in proceeding
SN  - 9781467367974
T3  - International Conference on Cyber Situational Awareness, Data Analytics and Assessment Proceedings. (cyberSA)
BT  - International Conference on Cyber Situational Awareness, Data Analytics and Assessment (CyberSA), 2015
PB  - IEEE
Y2  - 8 June 2015 through 9 June 2015
ER  -

Pirscoveanu R-S, Hansen SS, Larsen TMT, Stevanovic M, Pedersen JM, Czech A. Analysis of Malware behavior: Type classification using machine learning. In International Conference on Cyber Situational Awareness, Data Analytics and Assessment (CyberSA), 2015. IEEE. 2015. (International Conference on Cyber Situational Awareness, Data Analytics and Assessment Proceedings. (cyberSA)). doi: 10.1109/CyberSA.2015.7166115
