Delta-training: Simple Semi-Supervised Text Classification using Pretrained Word Embeddings

01/22/2019
by Hwiyeol Jo, et al.

We propose a novel and simple method for semi-supervised text classification. The method starts from the hypothesis, empirically observed across NLP tasks, that a classifier with pretrained word embeddings always outperforms the same classifier with randomly initialized word embeddings. Our method first builds two sets of classifiers as a form of model ensemble, initializing their word embeddings differently: one set randomly, the other with pretrained word embeddings. Following the self-training framework, we then focus on the unlabeled examples where the two classifiers' predictions differ. We also introduce label refinement and meta-epoch-level early stopping to improve confidence in the pseudo-labels assigned by prediction. Experiments on four classification datasets show that our method outperforms the same classifier trained on the labeled set alone. Delta-training also outperforms conventional self-training on multi-class classification, demonstrating robust performance against error accumulation.
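The abstract outlines a two-classifier self-training loop. As a rough illustration only, here is a minimal PyTorch sketch of how such a loop could look. Everything beyond the abstract is an assumption: the bag-of-embeddings classifier, the use of a single model per set rather than a full ensemble, the rule of pseudo-labeling disagreements with the pretrained-embedding classifier, and all helper and parameter names are hypothetical, not the paper's actual implementation.

```python
# Hedged sketch of a Delta-training-style loop (assumptions noted inline).
import torch
import torch.nn as nn
import torch.nn.functional as F

class BoEClassifier(nn.Module):
    """Averaged-embedding ("bag-of-embeddings") text classifier.
    This architecture is an assumption; the paper may use a different model."""
    def __init__(self, vocab_size, emb_dim, num_classes, pretrained=None):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        if pretrained is not None:
            # Initialize from pretrained word vectors (shape: vocab x dim).
            self.emb.weight.data.copy_(pretrained)
        self.fc = nn.Linear(emb_dim, num_classes)

    def forward(self, token_ids):  # token_ids: (batch, seq_len) of word indices
        return self.fc(self.emb(token_ids).mean(dim=1))

def train_epochs(model, x, y, epochs=5, lr=1e-2):
    """Plain supervised training on the current labeled pool."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        F.cross_entropy(model(x), y).backward()
        opt.step()

def delta_training(x_lab, y_lab, x_unlab, vocab_size, emb_dim,
                   num_classes, pretrained, meta_epochs=3):
    rand_clf = BoEClassifier(vocab_size, emb_dim, num_classes)
    pre_clf = BoEClassifier(vocab_size, emb_dim, num_classes, pretrained)
    for _ in range(meta_epochs):  # one meta-epoch = retrain + relabel
        train_epochs(rand_clf, x_lab, y_lab)
        train_epochs(pre_clf, x_lab, y_lab)
        with torch.no_grad():
            pred_rand = rand_clf(x_unlab).argmax(dim=1)
            pred_pre = pre_clf(x_unlab).argmax(dim=1)
        disagree = pred_rand != pred_pre  # the "delta" between the two sets
        if not disagree.any():
            break  # early stop: no informative disagreements remain
        # Hypothetical selection rule: trust the pretrained-embedding
        # classifier on disagreements; see the paper for the exact criterion.
        x_lab = torch.cat([x_lab, x_unlab[disagree]])
        y_lab = torch.cat([y_lab, pred_pre[disagree]])
        x_unlab = x_unlab[~disagree]
        if len(x_unlab) == 0:
            break
    return pre_clf
```

In this reading, the "delta" is the disagreement set between the randomly initialized and pretrained-embedding classifiers; since the pretrained variant is empirically the stronger of the two, its predictions on that set are taken as pseudo-labels. The paper's label refinement step is omitted here for brevity.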
