Expected Exploitability: Predicting the Development of Functional Vulnerability Exploits

by Octavian Suciu, et al.

Assessing the exploitability of software vulnerabilities at the time of disclosure is difficult and error-prone, as the features that existing metrics extract via technical analysis are poor predictors of exploit development. Moreover, exploitability assessments suffer from a class bias because "not exploitable" labels could be inaccurate. To overcome these challenges, we propose a new metric, called Expected Exploitability (EE), which reflects, over time, the likelihood that functional exploits will be developed. Key to our solution is a time-varying view of exploitability, a departure from existing metrics, which allows us to learn EE using data-driven techniques from artifacts published after disclosure, such as technical write-ups, proof-of-concept exploits, and social media discussions. Our analysis reveals that prior features proposed for related exploit prediction tasks are not always beneficial for predicting functional exploits, and we design novel feature sets to capitalize on previously under-utilized artifacts. This view also allows us to investigate the effect of label biases on the classifiers. We characterize the noise-generating process for exploit prediction, showing that our problem is subject to class- and feature-dependent label noise, considered the most challenging type. By leveraging domain-specific observations, we then develop techniques to incorporate noise robustness into learning EE. On a dataset of 103,137 vulnerabilities, we show that EE increases precision from 49% to 86% over existing metrics, including two state-of-the-art exploit classifiers, and that the performance of our metric also improves over time. EE scores capture exploitation imminence by distinguishing exploits that are going to be developed in the near future.
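To make the notion of class- and feature-dependent label noise concrete, the following is a minimal illustrative sketch, not the authors' method. It assumes a toy setting in which only truly exploitable vulnerabilities can be mislabeled as "not exploitable", and in which the chance of mislabeling depends on a single hypothetical feature (`has_public_poc`). The function names, the feature, and the flip probabilities are all assumptions made for illustration.

```python
import random


def inject_label_noise(true_labels, features, flip_prob, seed=0):
    """Flip some positive labels to negative (one-sided, class-dependent noise).

    flip_prob(x) gives the probability that a true positive with feature x
    is recorded as negative, which makes the noise feature-dependent too.
    """
    rng = random.Random(seed)
    noisy = []
    for y, x in zip(true_labels, features):
        if y == 1 and rng.random() < flip_prob(x):
            noisy.append(0)  # exploit exists but no evidence was observed
        else:
            noisy.append(y)  # negatives are never flipped to positives
    return noisy


def precision(predictions, labels):
    """Fraction of predicted positives that are labeled positive."""
    hits = [label for pred, label in zip(predictions, labels) if pred == 1]
    return sum(hits) / len(hits) if hits else 0.0


# Hypothetical toy data: 1 = functional exploit exists, 0 = it does not.
true_labels = [1, 1, 1, 0, 0, 1]
has_public_poc = [1, 0, 1, 0, 1, 0]  # assumed feature, for illustration only

# Extreme assumption to keep the example deterministic: positives without
# a public proof of concept are always mislabeled, the others never are.
noisy_labels = inject_label_noise(
    true_labels, has_public_poc, flip_prob=lambda x: 0.0 if x else 1.0
)

predictions = [1, 1, 1, 0, 0, 1]  # a classifier that is right on every item
true_precision = precision(predictions, true_labels)
observed_precision = precision(predictions, noisy_labels)
```

In this toy setup, a classifier that is correct on every item appears far less precise when scored against the noisy labels, which is one way inaccurate "not exploitable" labels can distort both evaluation and training.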




Enhancing Vulnerability Prioritization: Data-Driven Exploit Predictions with Community-Driven Insights

The number of disclosed vulnerabilities has been steadily increasing ove...

Deep Learning based Vulnerability Detection: Are We There Yet?

Automated detection of software vulnerabilities is a fundamental problem...

Feature-Oriented Defect Prediction: Scenarios, Metrics, and Classifiers

Several software defect prediction techniques have been developed over t...

Classifying Web Exploits with Topic Modeling

This short empirical paper investigates how well topic modeling and data...

Competency Problems: On Finding and Removing Artifacts in Language Data

Much recent work in NLP has documented dataset artifacts, bias, and spur...

Noisy Label Learning for Security Defects

Data-driven software engineering processes, such as vulnerability predic...

Predicting litigation likelihood and time to litigation for patents

Patent lawsuits are costly and time-consuming. An ability to forecast a ...
