Multilingual Code Co-Evolution Using Large Language Models

07/27/2023
by Jiyang Zhang, et al.

Many software projects implement APIs and algorithms in multiple programming languages. Maintaining such projects is tedious, because developers must ensure that every change (e.g., a bug fix or a new feature) is propagated promptly and without errors to the implementations in the other programming languages. In the world of ever-changing software, rule-based translation tools (i.e., transpilers) and machine learning models that translate code from one language to another provide limited value, because retranslating the entire codebase after every change is not how developers work. In this paper, we target a novel task: translating code changes from one programming language to another using large language models (LLMs). We design and implement the first LLM, dubbed Codeditor, to tackle this task. Codeditor explicitly models code changes as edit sequences and learns to correlate changes across programming languages. To evaluate Codeditor, we collect a corpus of 6,613 aligned code changes from 8 pairs of open-source software projects implementing similar functionalities in two programming languages (Java and C#). Results show that Codeditor outperforms the state-of-the-art approaches by a large margin on all commonly used automatic metrics. Our work also reveals that Codeditor is complementary to the existing generation-based models, and that combining them yields even better performance.
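
To make the idea of modeling a code change as an edit sequence concrete, here is a minimal sketch. It is not Codeditor's actual representation, which this abstract does not specify. It assumes whitespace-separated tokens and uses Python's standard difflib to derive the edits. The names edit_sequence and apply_edits, and the Java and C# snippets, are hypothetical and chosen only for illustration.

```python
import difflib


def edit_sequence(old_tokens, new_tokens):
    """Describe the change from old_tokens to new_tokens as a list of edits.

    Each edit is (operation, start, end, replacement), where start and end
    index into old_tokens. Unchanged spans are omitted, so the result holds
    only what was modified. This format is an illustrative assumption.
    """
    matcher = difflib.SequenceMatcher(a=old_tokens, b=new_tokens, autojunk=False)
    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # keep only the tokens that actually changed
        edits.append((tag, i1, i2, new_tokens[j1:j2]))
    return edits


def apply_edits(old_tokens, edits):
    """Rebuild the new token list from the old one plus an edit sequence."""
    result = list(old_tokens)
    # Apply from the last edit to the first so earlier indices stay valid.
    for _tag, start, end, replacement in reversed(edits):
        result[start:end] = replacement
    return result


# Hypothetical aligned change: the same fix made in a Java and a C# method.
JAVA_OLD = "int sum = a + b ;".split()
JAVA_NEW = "long sum = a + b ;".split()
CSHARP_OLD = "int sum = a + b ;".split()
CSHARP_NEW = "long sum = a + b ;".split()

java_edits = edit_sequence(JAVA_OLD, JAVA_NEW)
csharp_edits = edit_sequence(CSHARP_OLD, CSHARP_NEW)

if __name__ == "__main__":
    print("Java edits:", java_edits)
    print("C# edits:  ", csharp_edits)
```

In this toy case the Java and C# edit sequences coincide, which is the kind of cross-language correlation the abstract describes. Real aligned changes rarely match token for token, which is presumably why a learned model is needed rather than a direct copy of the edits.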
