Nine Features in a Random Forest to Learn Taxonomical Semantic Relations

03/29/2016
by Enrico Santus, et al.

ROOT9 is a supervised system for classifying hypernyms, co-hyponyms and random words, derived from the previously introduced ROOT13 (Santus et al., 2016). It relies on a Random Forest algorithm and nine unsupervised corpus-based features. We evaluate it with 10-fold cross-validation on 9,600 pairs, equally distributed among the three classes and involving several parts of speech (i.e., adjectives, nouns and verbs). When all the classes are present, ROOT9 achieves an F1 score of 90.7, measured against a vector cosine baseline. When the classification is binary, ROOT9 achieves the following results against the baseline: hypernyms vs. co-hyponyms 95.7, hypernyms vs. random 91.8. In order to compare the performance with the state of the art, we have also evaluated ROOT9 on subsets of the Weeds et al. (2014) datasets, showing that it is in fact competitive. Finally, we investigated whether the system learns the semantic relation or simply learns the prototypical hypernyms, as claimed by Levy et al. (2015). The second possibility seems the most likely, even though ROOT9 can be trained on negative examples (i.e., switched hypernyms) to drastically reduce this bias.
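The idea of training on switched hypernyms can be illustrated with a minimal sketch. It is not the authors' code: the function name `add_switched_negatives`, the label strings `"hyper"`, `"coord"` and `"random"`, and the choice to label a reversed pair as `"random"` are all assumptions made for illustration. The sketch only shows how each hypernym pair (x, y) could contribute a reversed pair (y, x) as a negative example, so that a classifier cannot succeed merely by recognising that y is a prototypical hypernym.

```python
from typing import List, Tuple

# (word1, word2, label); the label strings are hypothetical, not taken from the paper.
Pair = Tuple[str, str, str]


def add_switched_negatives(
    pairs: List[Pair], positive: str = "hyper", negative: str = "random"
) -> List[Pair]:
    """Return the input pairs plus a reversed copy of each positive pair.

    The reversed copy is labeled as a negative example. A reversed pair is
    skipped if the same word order already occurs in the data.
    """
    existing = {(w1, w2) for w1, w2, _ in pairs}
    augmented = list(pairs)
    for w1, w2, label in pairs:
        if label == positive and (w2, w1) not in existing:
            augmented.append((w2, w1, negative))
            existing.add((w2, w1))
    return augmented


# Toy data, invented for this example.
train = [
    ("dog", "animal", "hyper"),
    ("dog", "cat", "coord"),
    ("dog", "stone", "random"),
]
augmented = add_switched_negatives(train)
```

In this toy case only the hypernym pair is reversed; the co-hyponym and random pairs are left unchanged, and running the function a second time adds nothing new.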


