Joint Keyphrase Chunking and Salience Ranking with BERT
An effective keyphrase extraction system must produce self-contained, high-quality phrases that are also central to the document's topic. This paper presents BERT-JointKPE, a multi-task BERT-based model for keyphrase extraction. JointKPE employs a chunking network to identify high-quality phrases and a ranking network to learn their salience in the document. The model is trained jointly on the chunking task and the ranking task, balancing the estimation of keyphrase quality and salience. Experiments on two benchmarks demonstrate JointKPE's robust effectiveness with different BERT variants. Our analyses show that JointKPE has advantages in predicting long keyphrases and in extracting phrases that are not entities but are nonetheless meaningful. The source code of this paper can be obtained from https://github.com/thunlp/BERT-KPE
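To make the idea of joint training concrete, here is a minimal sketch of how a chunking objective and a ranking objective might be combined into one loss. The abstract does not specify the loss functions, so everything here is an assumption for illustration: the binary cross-entropy for chunking, the pairwise hinge loss for ranking, the unweighted sum, and all function names are hypothetical and not taken from the BERT-KPE code.

```python
import math

def chunking_loss(chunk_probs, labels):
    """Assumed chunking objective: binary cross-entropy over candidate phrases.

    chunk_probs[i] is the predicted probability that candidate i is a
    well-formed keyphrase; labels[i] is 1 for a gold keyphrase, else 0.
    """
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for p, y in zip(chunk_probs, labels):
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(chunk_probs)

def ranking_loss(salience_scores, labels, margin=1.0):
    """Assumed ranking objective: pairwise hinge loss.

    Every gold keyphrase should outscore every non-keyphrase by `margin`.
    """
    pos = [s for s, y in zip(salience_scores, labels) if y == 1]
    neg = [s for s, y in zip(salience_scores, labels) if y == 0]
    pairs = [(p, n) for p in pos for n in neg]
    if not pairs:
        return 0.0
    return sum(max(0.0, margin - p + n) for p, n in pairs) / len(pairs)

def joint_loss(chunk_probs, salience_scores, labels):
    """Unweighted sum of the two task losses (the weighting is an assumption)."""
    return chunking_loss(chunk_probs, labels) + ranking_loss(salience_scores, labels)

# Three candidate phrases from one document; only the first is a gold keyphrase.
labels = [1, 0, 0]
chunk_probs = [0.9, 0.1, 0.2]
salience_scores = [2.0, 0.5, 1.5]
print(joint_loss(chunk_probs, salience_scores, labels))
```

In this toy example the third candidate scores within the margin of the gold keyphrase, so the ranking term is nonzero even though the chunking predictions are mostly correct. That is the sense in which a joint objective can balance phrase quality against salience.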