Learning Programmatic Idioms for Scalable Semantic Parsing

04/19/2019
by Srinivasan Iyer, et al.

Programmers typically organize executable source code using high-level coding patterns or idiomatic structures such as nested loops, exception handlers and recursive blocks, rather than as individual code tokens. In contrast, state-of-the-art semantic parsers still map natural language instructions to source code by building the code syntax tree one node at a time. In this paper, we introduce an iterative method to extract code idioms from large source code corpora by repeatedly collapsing the most frequent depth-2 subtrees of their syntax trees, and we train semantic parsers to apply these idioms during decoding. We apply this idiom-based code generation to a recent context-dependent semantic parsing task, and improve the state of the art by 2.2% BLEU score while reducing training time by more than 50%. This improved speed enables us to scale up the model by training on an extended training set that is 5x larger, to further move up the state of the art by an additional 2.3% BLEU and 0.9% exact match.
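The iterative extraction step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the `Node` class, the fragment encoding `(parent label, child position, child label)`, the merged-label format, and the tie-breaking rule are all assumptions made for illustration. It only shows the general idea of repeatedly finding the most frequent depth-2 subtree and collapsing it into a single node.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical encoding of a depth-2 fragment:
# (parent label, child position, child label).
Fragment = Tuple[str, int, str]


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)


def count_fragments(node: Node, counts: Counter) -> None:
    """Count every parent/internal-child pair in the tree rooted at node."""
    for i, child in enumerate(node.children):
        if child.children:  # only non-leaf children form a depth-2 subtree
            counts[(node.label, i, child.label)] += 1
        count_fragments(child, counts)


def collapse(node: Node, frag: Fragment) -> None:
    """Merge each occurrence of frag into a single node (in place, post-order)."""
    parent, i, child_label = frag
    for child in node.children:
        collapse(child, frag)
    if node.label == parent and i < len(node.children):
        child = node.children[i]
        if child.label == child_label and child.children:
            node.label = f"{parent}({i}:{child_label})"
            # Splice the child's children into the parent's child list.
            node.children[i:i + 1] = child.children


def extract_idioms(trees: List[Node], max_idioms: int) -> List[Fragment]:
    """Repeatedly collapse the most frequent depth-2 fragment (mutates trees)."""
    idioms: List[Fragment] = []
    for _ in range(max_idioms):
        counts: Counter = Counter()
        for tree in trees:
            count_fragments(tree, counts)
        if not counts:
            break
        # Assumed tie-break: highest count first, then smallest fragment tuple.
        best = min(counts, key=lambda f: (-counts[f], f))
        idioms.append(best)
        for tree in trees:
            collapse(tree, best)
    return idioms


# Two toy trees that share an "if <compare>: return ..." shape.
t1 = Node("If", [Node("Compare", [Node("x"), Node("None")]),
                 Node("Return", [Node("y")])])
t2 = Node("If", [Node("Compare", [Node("a"), Node("None")]),
                 Node("Return", [Node("b")])])
idioms = extract_idioms([t1, t2], max_idioms=5)
print(idioms)
```

In this toy run each collapsed node stands for a larger code pattern, so a decoder that can emit such a node in one step needs fewer decisions per program, which is the intuition behind the reduced training time reported above.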


