Optimizing small BERTs trained for German NER

04/23/2021
by Jochen Zöllner, et al.

Currently, the most widespread neural network architecture for training language models is the so-called BERT, which has led to improvements in various NLP tasks. In general, the larger the number of parameters in a BERT model, the better the results obtained in these NLP tasks. Unfortunately, memory consumption and training duration increase drastically with the size of these models. In this article, we investigate various training techniques for smaller BERT models and evaluate them on five public German NER tasks, two of which are introduced by this article. We combine different methods from other BERT variants, such as ALBERT, RoBERTa, and relative positional encoding. In addition, we propose two new fine-tuning techniques that lead to better performance: CSE-tagging and a modified form of LCRF. Furthermore, we introduce a new technique called WWA, which reduces BERT memory usage and leads to a small increase in performance.

research
03/07/2023

German BERT Model for Legal Named Entity Recognition

The use of BERT, one of the most popular language models, has led to imp...
research
03/12/2021

Comparing the Performance of NLP Toolkits and Evaluation measures in Legal Tech

Recent developments in Natural Language Processing have led to the intro...
research
03/31/2020

Give your Text Representation Models some Love: the Case for Basque

Word embeddings and pre-trained language models allow to build rich repr...
research
06/24/2023

Comparison of Pre-trained Language Models for Turkish Address Parsing

Transformer based pre-trained models such as BERT and its variants, whic...
research
11/22/2021

Finding the Winning Ticket of BERT for Binary Text Classification via Adaptive Layer Truncation before Fine-tuning

In light of the success of transferring language models into NLP tasks, ...
research
01/28/2021

BERTaú: Itaú BERT for digital customer service

In the last few years, three major topics received increased interest: d...
research
10/17/2020

HABERTOR: An Efficient and Effective Deep Hatespeech Detector

We present our HABERTOR model for detecting hatespeech in large scale us...
