As Cool as a Cucumber: Towards a Corpus of Contemporary Similes in Serbian

05/20/2016
by Nikola Milosevic, et al.

Similes are natural language expressions used to compare unlikely things, where the comparison is not taken literally. They are often used in everyday communication and are an important part of cultural heritage. Keeping a corpus of similes up to date is challenging, as similes are constantly coined or adapted to contemporary times. In this paper we present a methodology for semi-automated collection of similes from the World Wide Web using text mining techniques. We expanded an existing corpus of traditional similes (containing 333 similes) by collecting 446 additional expressions. We also explore how crowdsourcing can be used to extract and curate new similes.

research
11/22/2018

Creating a contemporary corpus of similes in Serbian by using natural language processing

Simile is a figure of speech that compares two things through the use of...
research
08/09/2016

Neural Generation of Regular Expressions from Natural Language with Minimal Domain Knowledge

This paper explores the task of translating natural language queries int...
research
06/14/2021

Contemporary Amharic Corpus: Automatically Morpho-Syntactically Tagged Amharic Corpus

We introduced the contemporary Amharic corpus, which is automatically ta...
research
03/28/2023

Carolina: a General Corpus of Contemporary Brazilian Portuguese with Provenance, Typology and Versioning Information

This paper presents the first publicly available version of the Carolina...
research
05/01/2016

Text-mining the NeuroSynth corpus using Deep Boltzmann Machines

Large-scale automated meta-analysis of neuroimaging data has recently es...
research
02/01/2021

Gamified Crowdsourcing for Idiom Corpora Construction

Learning idiomatic expressions is seen as one of the most challenging st...
