Natural Language to Code Translation with Execution

04/25/2022
by Freda Shi, et al.

Generative models of code, pretrained on large corpora of programs, have shown great success in translating natural language to code (Chen et al., 2021; Austin et al., 2021; Li et al., 2022, inter alia). While these models do not explicitly incorporate program semantics (i.e., execution results) during training, they are able to generate correct solutions for many problems. However, choosing a single correct program from a generated set for each problem remains challenging. In this work, we introduce execution result–based minimum Bayes risk decoding (MBR-EXEC) for program selection and show that it improves the few-shot performance of pretrained code models on natural-language-to-code tasks. We select output programs from a generated candidate set by marginalizing over program implementations that share the same semantics. Because exact semantic equivalence is intractable to verify, we execute each program on a small number of test inputs and use the execution results to approximate it. Across datasets, execution or simulated execution significantly outperforms methods that do not involve program semantics. We find that MBR-EXEC consistently improves over all execution-unaware selection methods, suggesting it as an effective approach for natural language to code translation.
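The selection idea above can be sketched concretely. Under a 0/1 loss on execution-result equivalence, minimum Bayes risk selection reduces to picking the candidate program in the largest (approximate) semantic equivalence class, where equivalence is estimated by comparing outputs on a few test inputs. The sketch below is illustrative, not the paper's implementation: it assumes each candidate is a Python source string defining a function `f`, and the names `run_on_inputs` and `mbr_exec_select` are hypothetical.

```python
from collections import Counter

def run_on_inputs(program_src, test_inputs):
    """Execute a candidate program (assumed to define f(x)) on test inputs.

    Returns a tuple of outputs, which serves as an approximate semantic
    signature of the program: two programs with the same signature are
    treated as semantically equivalent."""
    namespace = {}
    exec(program_src, namespace)  # hypothetical convention: candidates define f
    outputs = []
    for x in test_inputs:
        try:
            outputs.append(namespace["f"](x))
        except Exception:
            outputs.append("ERROR")  # crashing programs get a distinct signature
    return tuple(outputs)

def mbr_exec_select(candidates, test_inputs):
    """Pick the candidate whose execution behavior agrees with the most others.

    This marginalizes over implementations that share the same execution
    results: a candidate's score is the size of its equivalence class."""
    signatures = [run_on_inputs(c, test_inputs) for c in candidates]
    counts = Counter(signatures)
    best = max(range(len(candidates)), key=lambda i: counts[signatures[i]])
    return candidates[best]

# Toy usage: three sampled "translations" of "return the square of x".
candidates = [
    "def f(x):\n    return x * x",   # correct
    "def f(x):\n    return x ** 2",  # correct, different surface form
    "def f(x):\n    return 2 * x",   # wrong
]
# The two correct programs agree on every input, so one of them is selected.
print(mbr_exec_select(candidates, [2, 3]))
```

Note the design choice: no reference outputs are needed at selection time, only test *inputs*, because candidates are scored by agreement with each other rather than against a gold answer.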


Related research

- LEVER: Learning to Verify Language-to-Code Generation with Execution (02/16/2023). The advent of large language models trained on code (code LLMs) has led ...
- A Program Logic for First-Order Encapsulated WebAssembly (11/08/2018). WebAssembly (Wasm) is the first new programming language in over 20 year...
- I Speak, You Verify: Toward Trustworthy Neural Program Synthesis (09/29/2022). We develop an approach for improving the trustworthiness and overall acc...
- Infusing Finetuning with Semantic Dependencies (12/10/2020). For natural language processing systems, two kinds of evidence support t...
- Fault-Aware Neural Code Rankers (06/04/2022). Large language models (LLMs) have demonstrated an impressive ability to ...
- sk_p: a neural program corrector for MOOCs (07/11/2016). We present a novel technique for automatic program correction in MOOCs, ...
- More declarative tabling in Prolog using multi-prompt delimited control (08/23/2017). Several Prolog implementations include a facility for tabling, an altern...
