How Universal is Genre in Universal Dependencies?

12/09/2021
by Max Müller-Eberstein, et al.

This work provides the first in-depth analysis of genre in Universal Dependencies (UD). In contrast to prior work on genre identification, which uses small sets of well-defined labels in mono-/bilingual setups, UD contains 18 genres with varying degrees of specificity spread across 114 languages. As most treebanks are labeled with multiple genres but lack annotations indicating which instances belong to which genre, we propose four methods for predicting instance-level genre using weak supervision from treebank metadata. The proposed methods recover instance-level genre better than competitive baselines, as measured on a subset of UD with labeled instances, and adhere better to the global expected distribution. Our analysis sheds light on prior work using UD genre metadata for treebank selection, finding that metadata alone are a noisy signal and must be disentangled within treebanks before they can be universally applied.


