Constantine Lignos

research

∙ 05/04/2023

What changes when you randomly choose BPE merge operations? Not much

We introduce three simple randomized variants of byte pair encoding (BPE...

Jonne Sälevä, et al.

research

∙ 12/19/2022

LR-Sum: Summarization for Less-Resourced Languages

This preprint describes work in progress on LR-Sum, a new permissively-l...

Chester Palen-Michel, et al.

research

∙ 06/10/2022

Borrowing or Codeswitching? Annotating for Finer-Grained Distinctions in Language Mixing

We present a new corpus of Twitter data annotated for codeswitching and ...

Elena Alvarez-Mellado, et al.

research

∙ 03/30/2022

Detecting Unassimilated Borrowings in Spanish: An Annotated Corpus and Approaches to Modeling

This work presents a new resource for borrowing identification and analy...

Elena Alvarez-Mellado, et al.

research

∙ 02/28/2022

ParaNames: A Massively Multilingual Entity Name Corpus

This preprint describes work in progress on ParaNames, a multilingual pa...

Jonne Sälevä, et al.

research

∙ 02/24/2022

Toward More Meaningful Resources for Lower-resourced Languages

In this position paper, we describe our perspective on how meaningful re...

Constantine Lignos, et al.

research

∙ 01/14/2022

Multilingual Open Text 1.0: Public Domain News in 44 Languages

We present a new multilingual corpus containing text in 44 languages, ma...

Chester Palen-Michel, et al.

research

∙ 10/29/2021

Overview of ADoBo 2021: Automatic Detection of Unassimilated Borrowings in the Spanish Press

This paper summarizes the main findings of the ADoBo 2021 shared task, p...

Elena Alvarez-Mellado, et al.

research

∙ 07/29/2021

Addressing Barriers to Reproducible Named Entity Recognition Evaluation

To address what we believe is a looming crisis of unreproducible evaluat...

Chester Palen-Michel, et al.

research

∙ 04/12/2021

Macro-Average: Rare Types Are Important Too

While traditional corpus-level evaluation metrics for machine translatio...

Thamme Gowda, et al.

research

∙ 04/01/2021

Mining Wikidata for Name Resources for African Languages

This work supports further development of language technology for the la...

Jonne Sälevä, et al.

research

∙ 03/23/2021

TMR: Evaluating NER Recall on Tough Mentions

We propose the Tough Mentions Recall (TMR) metrics to supplement traditi...

Jingxuan Tu, et al.

research

∙ 03/22/2021

MasakhaNER: Named Entity Recognition for African Languages

We take a step towards addressing the under-representation of the Africa...

David Ifeoluwa Adelani, et al.

research

∙ 03/20/2021

The Effectiveness of Morphology-aware Segmentation in Low-Resource Neural Machine Translation

This paper evaluates the performance of several modern subword segmentat...

Jonne Sälevä, et al.
