Chinese Word Segmentation with Heterogeneous Graph Neural Network

01/22/2022
by   Xuemei Tang, et al.
6

In recent years, deep learning has achieved significant success in the Chinese word segmentation (CWS) task. Most of these methods improve the performance of CWS by leveraging external information, e.g., words, sub-words, syntax. However, existing approaches fail to effectively integrate the multi-level linguistic information and also ignore the structural feature of the external information. Therefore, in this paper, we proposed a framework to improve CWS, named HGNSeg. It exploits multi-level external information sufficiently with the pre-trained language model and heterogeneous graph neural network. The experimental results on six benchmark datasets (e.g., Bakeoff 2005, Bakeoff 2008) validate that our approach can effectively improve the performance of Chinese word segmentation. Importantly, in cross-domain scenarios, our method also shows a strong ability to alleviate the out-of-vocabulary (OOV) problem.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset