A Family of Droids: Analyzing Behavioral Model based Android Malware Detection via Static and Dynamic Analysis
As smartphones play an increasingly central role in our everyday lives, the number of applications (apps) designed for the mobile ecosystem has skyrocketed. These apps are designed to meet diverse user needs, e.g., banking, communication, social networking, and as such often handle sensitive information. As a result, a growing number of cybercriminals are targeting the mobile ecosystem, by designing and distributing malicious apps that steal information or cause harm to the device's owner. Aiming to counter them, a number of techniques, based on either static or dynamic analysis, have been used to model Android malware and build detection tools. However, we often lack a good understanding of what analysis method works best and in which cases. In this paper, we analyze the performance of static and dynamic analysis methods, using the same modeling approach. Specifically, we build on MaMaDroid, a state-of-the-art detection system that relies on static analysis to create a behavioral model from the sequences of abstracted API calls. To port it to dynamic analysis, we modify CHIMP, a platform recently proposed to crowdsource human inputs for app testing, in order to extract API calls' sequences from the traces produced while executing the app on a CHIMP virtual device. We call this system AuntieDroid and instantiate it by using both automated (Monkey) and user-generated inputs. We find that combining both static and dynamic analysis yields the best performance, with F-measure reaching 0.92. We also show that static analysis is at least as effective as dynamic analysis, depending on how apps are stimulated during execution, and investigate the reasons for inconsistent misclassifications across methods. Finally, we examine the prevalence of dynamic code loading, which constitutes a possible limitation of static analysis based tools.
READ FULL TEXT