Code Attention: Translating Code to Comments by Exploiting Domain Features
Appropriate comments of code snippets provide insight for code functionality, which are helpful for program comprehension. However, due to the great cost of authoring with the comments, many code projects do not contain adequate comments. Automatic comment generation techniques have been proposed to generate comments from pieces of code in order to alleviate the human efforts in annotating the code. Most existing approaches attempt to exploit certain correlations (usually manually given) between code and generated comments, which could be easily violated if the coding patterns change and hence the performance of comment generation declines. In this paper, we first build C2CGit, a large dataset from open projects in GitHub, which is more than 20× larger than existing datasets. Then we propose a new attention module called Code Attention to translate code to comments, which is able to utilize the domain features of code snippets, such as symbols and identifiers. We make ablation studies to determine effects of different parts in Code Attention. Experimental results demonstrate that the proposed module has better performance over existing approaches in both BLEU and METEOR.
READ FULL TEXT