Analysis of Malware behavior: Type classification using machine learning

Radu-Stefan Pirscoveanu, Steven Strandlund Hansen, Thor Mark Tampus Larsen, Matija Stevanovic, Jens Myrup Pedersen, Alexandre Czech

Research output: Contribution to book/anthology/report/conference proceeding › Article in proceeding › Research › peer-review

91 Citations (Scopus)

Abstract

Malicious software has become a major threat to modern society, not only because of the increased complexity of the malware itself but also because of the exponential growth in the number of new malware samples appearing each day. This study tackles the problem of analyzing and classifying a large volume of malware in a scalable and automated manner. We developed a distributed malware testing environment by extending Cuckoo Sandbox and used it to test an extensive number of malware samples and trace their behavioral data. The extracted data was used to develop a novel type classification approach based on supervised machine learning. The proposed approach employs a novel combination of features that achieves a high classification rate, with a weighted average AUC value of 0.98 using a Random Forests classifier. The approach has been extensively tested on a total of 42,000 malware samples. Based on these results, we believe that the developed system can be used to pre-filter novel from known malware in a future malware analysis system.
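The abstract reports a weighted average AUC without saying how the average is formed. The following is a minimal sketch of one common reading, assuming a one-vs-rest AUC per malware type averaged with weights equal to each type's share of the samples. The function names, class labels, and scores are hypothetical and are not taken from the paper.

```python
from collections import Counter


def binary_auc(scores, positives):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, p in zip(scores, positives) if p]
    neg = [s for s, p in zip(scores, positives) if not p]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for ps in pos:
        for ns in neg:
            if ps > ns:
                wins += 1.0
            elif ps == ns:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def weighted_average_auc(y_true, y_score, classes):
    """One-vs-rest AUC per class, weighted by class support (an assumption)."""
    support = Counter(y_true)
    total = len(y_true)
    result = 0.0
    for idx, label in enumerate(classes):
        scores = [row[idx] for row in y_score]       # score column for this class
        positives = [y == label for y in y_true]     # this class vs. all others
        result += support[label] / total * binary_auc(scores, positives)
    return result


# Hypothetical toy data: two malware types, four samples,
# one row of per-class scores per sample.
classes = ["trojan", "worm"]
y_true = ["trojan", "trojan", "worm", "worm"]
y_score = [[0.9, 0.1], [0.4, 0.6], [0.6, 0.4], [0.2, 0.8]]

print(weighted_average_auc(y_true, y_score, classes))  # 0.75
```

With imbalanced classes, the support weighting lets frequent malware types dominate the figure, so a high weighted average does not by itself show that rare types are classified well.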
Original language: English
Title of host publication: International Conference on Cyber Situational Awareness, Data Analytics and Assessment (CyberSA), 2015
Number of pages: 7
Publisher: IEEE
Publication date: Aug 2015
ISBN (Print): 9781467367974
Publication status: Published - Aug 2015
Event: International Conference on Cyber Situational Awareness, Data Analytics and Assessment (CyberSA), 2015 - London, United Kingdom
Duration: 8 Jun 2015 to 9 Jun 2015

Conference

Conference: International Conference on Cyber Situational Awareness, Data Analytics and Assessment (CyberSA), 2015
Country/Territory: United Kingdom
City: London
Period: 08/06/2015 to 09/06/2015
Series: International Conference on Cyber Situational Awareness, Data Analytics and Assessment Proceedings. (cyberSA)

Keywords

  • Malware
  • Type-Classification
  • Dynamic Analysis
  • Scalability
  • Cuckoo Sandbox
  • Random Forests
  • API call
  • Feature Selection
  • Supervised Machine Learning
