On Cross-Domain Pre-Trained Language Models for Clinical Text Mining: How Do They Perform on Data-Constrained Fine-Tuning?

by Yuping Wu, et al.

Pre-trained language models (PLMs) have been deployed in many natural language processing (NLP) tasks and in various domains. Researchers have reported that pre-training a language model on rich general-domain or mixed-domain data, then fine-tuning it on the small amount of data available in a low-resource domain, yields beneficial results. In this work, we question this claim and examine whether BERT-based PLMs from the biomedical domain perform well on clinical text mining tasks after fine-tuning. We test a state-of-the-art model, Bioformer, which is pre-trained on a large amount of biomedical data from the PubMed corpus. We fine-tune its task-adapted version (BioformerApt) on a historical n2c2 clinical NLP challenge dataset and show that their performance is actually very low. We also present our own end-to-end model, TransformerCRF, which uses a Transformer as the encoder and conditional random fields (CRFs) as the decoder. We further create a variant, BioformerCRF, by adding a CRF layer on top of the Bioformer PLM. We investigate the performance of TransformerCRF, trained from scratch on a limited amount of data, and of BioformerCRF on clinical text mining tasks. Experimental evaluation shows that, in this data-constrained setting, all tested models are far from ideal at recognizing extremely low-frequency special tokens, even though they achieve relatively high accuracy on overall text tagging. Our models, including source code, will be hosted at <https://github.com/poethan/TransformerCRF>.
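The gap between overall tagging accuracy and rare-label recognition can be illustrated with a minimal sketch. This is not the paper's evaluation code: the label names (`O`, `ADE`), the toy label counts, and the helper functions `token_accuracy` and `per_label_prf` are all assumptions made for illustration. It shows how a tagger that never predicts a rare label can still score high token-level accuracy.

```python
from collections import Counter


def token_accuracy(gold, pred):
    """Fraction of tokens whose predicted label matches the gold label."""
    correct = sum(g == p for g, p in zip(gold, pred))
    return correct / len(gold)


def per_label_prf(gold, pred):
    """Per-label (precision, recall, F1) computed at token level."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p where it was wrong
            fn[g] += 1  # missed gold label g
    scores = {}
    for label in set(gold) | set(pred):
        n_pred = tp[label] + fp[label]
        n_gold = tp[label] + fn[label]
        prec = tp[label] / n_pred if n_pred else 0.0
        rec = tp[label] / n_gold if n_gold else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        scores[label] = (prec, rec, f1)
    return scores


# Hypothetical toy data: 98 majority-label tokens and 2 rare-label tokens.
gold = ["O"] * 98 + ["ADE"] * 2
# A degenerate tagger that predicts the majority label everywhere.
pred = ["O"] * 100

acc = token_accuracy(gold, pred)
scores = per_label_prf(gold, pred)

print(f"token accuracy: {acc:.2f}")
for label, (prec, rec, f1) in sorted(scores.items()):
    print(f"{label}: precision={prec:.2f} recall={rec:.2f} f1={f1:.2f}")
```

On this toy input the overall accuracy is 0.98 while precision, recall, and F1 for the rare label are all zero, which is why per-label scores are worth reporting alongside overall accuracy when some labels are extremely infrequent.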




