Attack Tactic Identification by Transfer Learning of Language Model

by Ling-Hsuan Lin, et al.

Cybersecurity has become a primary global concern with the rapid increase in security attacks and data breaches. Artificial intelligence promises to help humans analyze and identify attacks. However, labeling millions of packets for supervised learning is never easy. This study leverages transfer learning, which stores the knowledge gained from well-defined attack lifecycle documents and applies it to hundreds of thousands of unlabeled attacks (packets) to identify their attack tactics. We anticipate that the knowledge of an attack is well described in the documents, and that a cutting-edge transformer-based language model can embed this knowledge into a high-dimensional latent space. The information in the language model is then reused when learning the attack tactics carried by packets, which improves learning efficiency. We propose a system, PELAT, that fine-tunes a BERT model with 1,417 articles from the MITRE ATT&CK lifecycle framework to enhance its attack knowledge (including the syntax used and the semantic meanings embedded). PELAT then transfers this knowledge to perform semi-supervised learning on unlabeled packets and generate their tactic labels. Further, when a new attack packet arrives, its payload is processed by the PELAT language model with a downstream classifier to predict its tactics. In this way, we can effectively reduce the burden of manually labeling big datasets. In a one-week honeypot attack dataset (227 thousand packets per day), PELAT performs 99 recall, and F1 on testing dataset. PELAT can infer over 99 other testing datasets (while nearly 90


