Dianzi Jishu Yingyong (Jan 2018)

An approach to named entity recognition towards micro-blog

  • Li Gang,
  • Huang Yongfeng

DOI
https://doi.org/10.16157/j.issn.0258-7998.179024
Journal volume & issue
Vol. 44, no. 1
pp. 118 – 120

Abstract

Named entity recognition (NER) is a fundamental technology in natural language processing (NLP). In recent years, the rapid development of social network platforms such as microblogs has posed new challenges to traditional NER techniques because of the distinctive form of microblog text. This paper proposes an improved method for microblog text based on the conditional random field (CRF) model. Because microblog posts are short and semantically ambiguous, external data resources are introduced to generate topic features and word representation features for training the model. Because microblog data is large in scale and manual annotation is costly, an active learning algorithm based on least confidence is adopted to improve training at a lower labor cost. Experiments on a Sina Weibo data set show that this method improves the F-score by 4.54% compared with traditional CRF methods.
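
The least-confidence selection step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the common formulation in which a sentence's uncertainty is 1 - P(y* | x), where P(y* | x) is the probability the model assigns to its most likely label sequence. The function name `least_confidence_select`, the `best_path_prob` callback, and the toy probabilities are all hypothetical stand-ins for a trained CRF.

```python
from typing import Callable, List, Sequence, Tuple


def least_confidence_select(
    pool: Sequence[str],
    best_path_prob: Callable[[str], float],
    batch_size: int,
) -> List[Tuple[str, float]]:
    """Return the batch_size pool items the model is least sure about.

    Uncertainty is 1 - P(y* | x), where P(y* | x) is the probability of the
    model's most likely label sequence for sentence x.
    """
    scored = [(sentence, 1.0 - best_path_prob(sentence)) for sentence in pool]
    # Highest uncertainty first; the sort is stable, so ties keep pool order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:batch_size]


# Hypothetical stand-in for a trained CRF: maps each unlabeled post to the
# probability of its best label sequence (values invented for illustration).
toy_probs = {
    "post a": 0.95,
    "post b": 0.40,
    "post c": 0.70,
    "post d": 0.55,
}

selected = least_confidence_select(list(toy_probs), toy_probs.__getitem__, 2)
print([sentence for sentence, _ in selected])  # → ['post b', 'post d']
```

In an active learning loop, the selected posts would be sent for manual annotation, added to the training set, and the model retrained before the next round of selection.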

Keywords