Resilient and Adaptive Framework for Large Scale Android Malware Fingerprinting using Deep Learning and NLP Techniques

by ElMouatez Billah Karbab, et al.

Android malware detection is a significant problem that affects billions of users using millions of Android applications (apps) in existing markets. This paper proposes PetaDroid, a framework for accurate Android malware detection and family clustering built on static analysis. PetaDroid automatically adapts to changes in Android malware and benign apps over time and is resilient to common binary obfuscation techniques. The framework employs novel techniques built on natural language processing (NLP) and machine learning to achieve accurate, adaptive, and resilient Android malware detection and family clustering. PetaDroid identifies malware using an ensemble of convolutional neural networks (CNNs) on the proposed Inst2Vec features. The framework clusters the detected malware samples into malware family groups using sample feature digests generated by a deep neural auto-encoder. For change adaptation, PetaDroid leverages the detection confidence probability during deployment to automatically collect extension datasets and periodically uses them to build new malware detection models. In addition, PetaDroid uses code-fragment randomization during training to enhance resiliency to common obfuscation techniques. We extensively evaluated PetaDroid on multiple reference datasets. PetaDroid achieved a high detection rate (98-99%) under different evaluation settings, with high homogeneity in the produced clusters (96%). We also compared PetaDroid with the state-of-the-art solutions MaMaDroid, DroidAPIMiner, and MalDozer; PetaDroid outperforms them under all the evaluation settings.
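
The confidence-driven adaptation step mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not PetaDroid's implementation: the function name `collect_extension_dataset`, the 0.9 threshold, and the symmetric handling of confident benign predictions are all hypothetical choices made for the example.

```python
from typing import List, Tuple


def collect_extension_dataset(
    predictions: List[Tuple[str, float]],
    confidence_threshold: float = 0.9,  # assumed value, not from the paper
) -> List[Tuple[str, int]]:
    """Select confidently classified samples for a future retraining set.

    predictions: (sample_id, malware_probability) pairs produced by the
                 deployed detector.
    Returns (sample_id, label) pairs, where label 1 = malware, 0 = benign.
    """
    extension = []
    for sample_id, p_malware in predictions:
        if p_malware >= confidence_threshold:
            # Confident malware prediction: reuse it as a labeled sample.
            extension.append((sample_id, 1))
        elif p_malware <= 1.0 - confidence_threshold:
            # Confident benign prediction: reuse it as a labeled sample.
            extension.append((sample_id, 0))
        # Uncertain predictions are skipped so they do not pollute retraining.
    return extension


if __name__ == "__main__":
    preds = [("app_a", 0.97), ("app_b", 0.55), ("app_c", 0.02)]
    print(collect_extension_dataset(preds))
```

In a deployment along these lines, the collected pairs would be accumulated over time and periodically merged into the training set used to build the next detection model.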


Obfuscation-resilient Android Malware Analysis Based on Contrastive Learning

Due to its open-source nature, Android operating system has been the mai...

Android Malware Characterization using Metadata and Machine Learning Techniques

Android Malware has emerged as a consequence of the increasing popularit...

Comment on "AndrODet: An adaptive Android obfuscation detector"

We have identified a methodological problem in the empirical evaluation ...

Android Security using NLP Techniques: A Review

Android is among the most targeted platform by attackers. While attacker...

hybrid-Flacon: Hybrid Pattern Malware Detection and Categorization with Network Traffic and Program Code

Nowadays, Android is the most dominant operating system in the mobile ec...

R2-D2: ColoR-inspired Convolutional NeuRal Network (CNN)-based AndroiD Malware Detections

Machine Learning (ML) has found it particularly useful in malware detect...

Graph Neural Network-based Android Malware Classification with Jumping Knowledge

This paper presents a new Android malware detection method based on Grap...
