Pre-trained language models as knowledge bases for Automotive Complaint Analysis

12/04/2020
by V. D. Viellieber, et al.

Recently it has been shown that large pre-trained language models like BERT (Devlin et al., 2018) are able to store commonsense and factual knowledge captured in their pre-training corpora (Petroni et al., 2019). In our work we further evaluate this ability with respect to an application from industry, creating a set of probes specifically designed to reveal technical quality issues, captured as described incidents, in unstructured customer feedback from the automotive industry. After probing the out-of-the-box versions of the pre-trained models with fill-in-the-mask tasks, we dynamically provide them with more knowledge via continual pre-training on the Office of Defects Investigation (ODI) Complaints data set. In our experiments the models exhibit comparable performance on queries about domain-specific topics and on general factual knowledge of the kind probed by Petroni et al. (2019). For most of the evaluated architectures the correct token is predicted with a Precision@1 (P@1) above 60%, while P@5 and P@10 reach values well above 80% and up to 90%, respectively. These results show the potential of using language models as knowledge bases for the structured analysis of customer feedback.
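
As a rough illustration of the fill-in-the-mask probing described above, the following Python sketch queries a masked language model via the Hugging Face transformers library. The probe sentence and the choice of bert-base-uncased are illustrative assumptions, not details taken from the paper.

    from transformers import pipeline

    # Masked-language-model probe; bert-base-uncased is an assumed
    # stand-in for the architectures evaluated in the paper.
    fill_mask = pipeline("fill-mask", model="bert-base-uncased")

    # Hypothetical cloze-style probe for a technical quality issue.
    probe = "The vehicle's [MASK] failed while driving on the highway."
    for p in fill_mask(probe, top_k=10):
        print(f"{p['token_str']:>12}  {p['score']:.3f}")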
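
The continual pre-training step could look roughly as follows. This is a minimal sketch assuming the ODI complaint narratives have been exported as one text per line to a file named odi_complaints.txt; the file name, hyperparameters, and preprocessing are assumptions for illustration, not the authors' exact setup.

    from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)
    from datasets import load_dataset

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

    # One complaint narrative per line (assumed export format).
    dataset = load_dataset("text", data_files={"train": "odi_complaints.txt"})

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, max_length=128)

    tokenized = dataset["train"].map(tokenize, batched=True,
                                     remove_columns=["text"])

    # Randomly mask 15% of tokens, as in standard BERT pre-training.
    collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="bert-odi", num_train_epochs=1,
                               per_device_train_batch_size=16),
        data_collator=collator,
        train_dataset=tokenized,
    )
    trainer.train()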
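
The reported Precision@k metric counts a probe as a hit when the gold token appears among the model's top-k predictions. A small sketch of that evaluation, with hypothetical probe/answer pairs in the style of the ODI domain:

    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model="bert-base-uncased")

    # Fraction of probes whose gold answer token is ranked in the top k.
    def precision_at_k(probes, k=5):
        hits = sum(
            any(p["token_str"].strip() == gold
                for p in fill_mask(sentence, top_k=k))
            for sentence, gold in probes
        )
        return hits / len(probes)

    probes = [
        ("The driver noticed the [MASK] pedal went to the floor.", "brake"),
        ("The [MASK] bag did not deploy during the crash.", "air"),
    ]
    print(f"P@5 = {precision_at_k(probes, k=5):.2f}")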


