Adapters for Enhanced Modeling of Multilingual Knowledge and Text

by Yifan Hou et al.

Large language models appear to learn facts from the large text corpora they are trained on. Such facts are encoded implicitly in their many parameters, making it difficult to verify or manipulate what knowledge has been learned. Language models have recently been extended to multilingual language models (MLLMs), enabling knowledge to be learned across hundreds of languages. Knowledge graphs, by contrast, contain facts in an explicit triple format; they require careful and costly curation and are available only in a few high-resource languages, which restricts their research and application. To address these issues, we propose to enhance MLLMs with knowledge from multilingual knowledge graphs (MLKGs), so as to tackle language and knowledge graph tasks across many languages, including low-resource ones. Specifically, we introduce a lightweight set of adapters that augments MLLMs with cross-lingual entity alignment and facts from MLKGs for many languages. Experiments on common benchmarks show that this enhancement benefits both MLLMs and MLKGs, achieving: (1) comparable or improved performance on knowledge graph completion and entity alignment relative to baselines, especially for low-resource languages (for which knowledge graphs are unavailable); and (2) improved MLLM performance on language understanding tasks that require multilingual factual knowledge; all while maintaining performance on other general language tasks.
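To make the "lightweight adapter" idea concrete, the following is a minimal sketch of a bottleneck adapter layer of the kind commonly inserted into a frozen pretrained model (down-projection, nonlinearity, up-projection, residual connection). The dimensions, the ReLU nonlinearity, and the initialization scale are illustrative assumptions, not the authors' exact configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_adapter(hidden_dim, bottleneck_dim):
    """Return (W_down, W_up) weights for a bottleneck adapter.

    The bottleneck keeps the adapter small: only
    2 * hidden_dim * bottleneck_dim new parameters per layer,
    versus hidden_dim**2 for a full projection.
    """
    W_down = rng.normal(scale=0.02, size=(hidden_dim, bottleneck_dim))
    W_up = rng.normal(scale=0.02, size=(bottleneck_dim, hidden_dim))
    return W_down, W_up

def adapter_forward(h, W_down, W_up):
    """Residual bottleneck: h + ReLU(h @ W_down) @ W_up."""
    z = np.maximum(h @ W_down, 0.0)  # down-project + nonlinearity
    return h + z @ W_up              # up-project + residual connection

hidden_dim, bottleneck_dim = 768, 64  # illustrative sizes
W_down, W_up = make_adapter(hidden_dim, bottleneck_dim)

h = rng.normal(size=(4, hidden_dim))  # 4 token representations
out = adapter_forward(h, W_down, W_up)
print(out.shape)  # adapter preserves the hidden dimension: (4, 768)
```

Because the backbone MLLM stays frozen, one such adapter set can be trained per knowledge source (e.g. entity alignment vs. factual triples) and swapped or combined without retraining the base model's parameters.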


