BayLing: Bridging Cross-lingual Alignment and Instruction Following through Interactive Translation for Large Language Models

06/19/2023
by Shaolei Zhang, et al.

Large language models (LLMs) have demonstrated remarkable prowess in language understanding and generation. Advancing from foundation LLMs to instruction-following LLMs, instruction tuning plays a vital role in aligning LLMs with human preferences. However, existing LLMs are usually focused on English, leading to inferior performance in non-English languages. Improving performance for non-English languages typically requires collecting language-specific training data for foundation LLMs and constructing language-specific instructions for instruction tuning, both of which are labor-intensive. To minimize the human workload, we propose to transfer the capabilities of language generation and instruction following from English to other languages through an interactive translation task. We have developed BayLing, an instruction-following LLM, by utilizing LLaMA as the foundation LLM and automatically constructing interactive translation instructions for instruction tuning. Extensive assessments demonstrate that BayLing achieves performance comparable to GPT-3.5-turbo, despite utilizing a considerably smaller parameter size of only 13 billion. Experimental results on translation tasks show that BayLing achieves 95% of the translation capability of GPT-4 under automatic evaluation and 96% of the interactive translation capability of GPT-3.5-turbo under human evaluation. To estimate performance on general tasks, we created a multi-turn instruction test set called BayLing-80. The experimental results on BayLing-80 indicate that BayLing achieves 89% of the performance of GPT-3.5-turbo. BayLing also demonstrates outstanding performance on the knowledge assessments of the Chinese Gaokao and the English SAT, second only to GPT-3.5-turbo among a multitude of instruction-following LLMs. The demo, homepage, code, and models of BayLing are available.
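To make the core idea concrete: an "interactive translation instruction" can be thought of as a multi-turn exchange in which the model translates, receives feedback, and revises. The sketch below illustrates one such training record; the schema, field names, and helper function are illustrative assumptions, since the abstract does not specify BayLing's actual data format.

```python
# Minimal sketch of one automatically constructed "interactive translation"
# instruction record, in a generic multi-turn conversation format.
# NOTE: the schema, field names, and helper below are assumptions for
# illustration; the abstract does not specify BayLing's actual format.

def build_interactive_translation_sample(source, draft, feedback, revised):
    """Wrap one translate-then-refine exchange as a multi-turn
    instruction-tuning record (conversation format assumed)."""
    return {
        "conversations": [
            # Turn 1: the user requests a translation.
            {"role": "user",
             "content": f"Translate the following sentence into Chinese:\n{source}"},
            # Turn 2: the model returns a draft translation.
            {"role": "assistant", "content": draft},
            # Turn 3: the user gives interactive feedback on the draft.
            {"role": "user", "content": feedback},
            # Turn 4: the model returns the revised translation.
            {"role": "assistant", "content": revised},
        ]
    }

sample = build_interactive_translation_sample(
    source="The weather is lovely today.",
    draft="今天天气很好。",
    feedback="Please make the tone more formal.",
    revised="今日天气宜人。",
)
print(sample["conversations"][0]["content"])
```

The intuition suggested by the abstract is that each such record jointly exercises cross-lingual alignment (the translation itself) and instruction following (responding to the user's refinement request), so tuning on these exchanges transfers both capabilities to non-English languages without hand-built language-specific instruction data.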


