VLSP Shared Task: Named Entity Recognition

Huyen T M Nguyen, Quyen T Ngo, Luong X Vu, Vu M Tran, Hien T T Nguyen

Abstract


Named entities (NE) are phrases that contain the names of persons, organizations, locations, times and quantities, monetary values, percentages, etc. Named Entity Recognition (NER) is the task of recognizing named entities in documents. NER is an important subtask of Information Extraction, which has attracted researchers all over the world since the 1990s. For the Vietnamese language, although some research projects and publications on the NER task existed before 2016, no systematic comparison of the performance of NER systems had been carried out. In 2016, the organizing committee of the VLSP workshop decided to launch the first NER shared task, in order to obtain an objective evaluation of Vietnamese NER systems and to promote the development of high-quality systems. As a result, the first dataset with morpho-syntactic and NE annotations was released for benchmarking NER systems. At VLSP 2018, the NER shared task was organized for the second time, providing a larger dataset containing texts from various domains, but without morpho-syntactic annotation. These resources are available for research purposes via the VLSP website vlsp.org.vn/resources. In this paper, we describe the datasets as well as the evaluation results obtained from these two campaigns.

Keywords


CoNLL format; evaluation; named entity; named entity recognition; shared task; Vietnamese; VLSP workshop

DOI: https://doi.org/10.15625/1813-9663/34/4/13161

Journal of Computer Science and Cybernetics ISSN: 1813-9663

Published by Vietnam Academy of Science and Technology