SeqTrans: Automatic Vulnerability Fix via Sequence to Sequence Learning

10/21/2020
by   Jianlei Chi, et al.
0

Software vulnerabilities are now reported at an unprecedented speed due to the recent development of automated vulnerability hunting tools. However, fixing vulnerabilities still mainly depends on programmers' manual efforts. Developers need to deeply understand the vulnerability and try to affect the system's functions as little as possible. In this paper, with the advancement of Neural Machine Translation (NMT) techniques, we provide a novel approach called SeqTrans to exploit historical vulnerability fixes to provide suggestions and automatically fix the source code. To capture the contextual information around the vulnerable code, we propose to leverage data flow dependencies to construct code sequences and fed them into the state-of-the-art transformer model. Attention and copy mechanisms are both exploited in SeqTrans. We evaluate SeqTrans on both single line and multiple line vulnerability fixes on a dataset containing 1,282 commits that fix 624 vulnerabilities in 205 Java projects. Results show that the accuracy of SeqTrans can achieve 77.6 In the meantime, we look deep inside the result and observe that NMT model performs very well in certain kinds of vulnerabilities like CWE-287 (Improper Authentication) and CWE-863 (Incorrect Authorization).

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset