Detecting Unknown DGAs without Context Information

05/30/2022
by Arthur Drichel, et al.

New malware emerges at a rapid pace and often incorporates Domain Generation Algorithms (DGAs) to prevent the malware's connection to its command and control (C2) server from being blocked. Current state-of-the-art classifiers are able to separate benign from malicious domains (binary classification) and attribute them with high probability to the DGAs that generated them (multiclass classification). While binary classifiers can label domains of as yet unknown DGAs as malicious, multiclass classifiers can only assign domains to DGAs that are known at the time of training, limiting the ability to uncover new malware families. In this work, we perform a comprehensive study on the detection of new DGAs, which includes an evaluation of 59,690 classifiers. We examine four different approaches in 15 different configurations and propose a simple yet effective approach based on the combination of a softmax classifier and regular expressions (regexes) to detect multiple unknown DGAs with high probability. At the same time, our approach retains state-of-the-art classification performance for known DGAs. Our evaluation is based on a leave-one-group-out cross-validation with a total of 94 DGA families. By using the maximum number of known DGAs, our evaluation scenario is particularly difficult and close to the real world. All of the approaches examined are privacy-preserving, since they operate without context and exclusively on the single domain to be classified. We round off our study with a thorough discussion of class-incremental learning strategies that can adapt an existing classifier to newly discovered classes.
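The abstract does not spell out how the softmax classifier and the regexes are combined, so the following is only a minimal sketch of one plausible reading: a domain is attributed to a known family only if the softmax output is confident and the domain also matches a regex describing that family; otherwise it is flagged as unknown. The family names, the patterns, the `threshold` parameter, and the `classify` helper are all hypothetical and are not taken from the paper.

```python
import math
import re

# Hypothetical per-family patterns, for illustration only (not from the paper).
FAMILY_REGEXES = {
    "family_a": re.compile(r"[a-z]{12}\.com"),
    "family_b": re.compile(r"[0-9a-f]{16}\.(net|org)"),
}
FAMILIES = list(FAMILY_REGEXES)  # index i corresponds to logit i


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(domain, logits, threshold=0.9):
    """Return a known family name, or "unknown".

    `logits` stands in for the output of a trained multiclass model.
    The confidence threshold is an assumption of this sketch.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    family = FAMILIES[best]
    # Reject low-confidence predictions.
    if probs[best] < threshold:
        return "unknown"
    # Reject domains that do not fit the predicted family's pattern.
    if FAMILY_REGEXES[family].fullmatch(domain) is None:
        return "unknown"
    return family
```

In this sketch the classification uses only the domain string and the model output for that string, which mirrors the context-free setting described above.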


research
05/02/2023

CNS-Net: Conservative Novelty Synthesizing Network for Malware Recognition in an Open-set Scenario

We study the challenging task of malware recognition on both known and n...
research
10/04/2018

Detecting DGA domains with recurrent neural networks and side information

Modern malware typically makes use of a domain generation algorithm (DGA...
research
08/06/2020

Intercepting Hail Hydra: Real-Time Detection of Algorithmically Generated Domains

A crucial technical challenge for cybercriminals is to keep control over...
research
07/01/2020

Making Use of NXt to Nothing: The Effect of Class Imbalances on DGA Detection Classifiers

Numerous machine learning classifiers have been proposed for binary clas...
research
06/23/2021

First Step Towards EXPLAINable DGA Multiclass Classification

Numerous malware families rely on domain generation algorithms (DGAs) to...
research
09/24/2021

The More, the Better? A Study on Collaborative Machine Learning for DGA Detection

Domain generation algorithms (DGAs) prevent the connection between a bot...
research
06/19/2020

Analyzing the Real-World Applicability of DGA Classifiers

Separating benign domains from domains generated by DGAs with the help o...
