A FORMULA TO CALCULATE PRUNING THRESHOLD FOR THE PART-OF-SPEECH TAGGING PROBLEM

Nguyen Chi Hieu

Abstract


Accurate tagging of the words in a text is an important task in natural language processing. It supports parsing, contributes to resolving polysemous words, and helps in accessing semantic information. A crucial factor in statistical POS (part-of-speech) tagging approaches is processing time. In this paper, we propose an approach to calculating the pruning threshold, which can be applied in the Viterbi algorithm of the hidden Markov model for tagging natural-language text. Experiments on 1,000,000 tagged words of the Wall Street Journal corpus showed that the proposed solution is satisfactory.
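
To illustrate where a pruning threshold enters the Viterbi algorithm, the following is a minimal sketch of threshold-based beam pruning for an HMM tagger, working in log space. It is not the formula proposed in the paper, which the abstract does not state: the threshold is passed in as a plain parameter, and the names viterbi_beam, prune, beam_width and the three-tag toy model are all assumptions made for illustration.

```python
import math

NEG_INF = float("-inf")

def prune(scores, beam_width):
    """Keep only states whose log-score is within beam_width of the best one."""
    if not scores:
        raise ValueError("no tag can emit this word under the model")
    cutoff = max(scores.values()) - beam_width
    return {s: v for s, v in scores.items() if v >= cutoff}

def viterbi_beam(words, tags, log_start, log_trans, log_emit, beam_width):
    """Viterbi decoding with threshold pruning (hypothetical helper).

    beam_width is the pruning threshold in log space; math.inf disables
    pruning and gives the exact Viterbi path.
    """
    # Initialise with the first word.
    scores = {}
    for t in tags:
        sc = log_start.get(t, NEG_INF) + log_emit[t].get(words[0], NEG_INF)
        if sc > NEG_INF:
            scores[t] = sc
    scores = prune(scores, beam_width)

    backpointers = []
    for word in words[1:]:
        new_scores, bp = {}, {}
        for t in tags:
            emit = log_emit[t].get(word, NEG_INF)
            if emit == NEG_INF:
                continue
            # Only states that survived pruning are considered as predecessors.
            best_prev, best = None, NEG_INF
            for p, p_score in scores.items():
                cand = p_score + log_trans[p].get(t, NEG_INF)
                if cand > best:
                    best_prev, best = p, cand
            if best_prev is not None:
                new_scores[t] = best + emit
                bp[t] = best_prev
        scores = prune(new_scores, beam_width)
        backpointers.append(bp)

    # Follow the back-pointers from the best final state.
    last = max(scores, key=scores.get)
    path = [last]
    for bp in reversed(backpointers):
        last = bp[last]
        path.append(last)
    return path[::-1]

def logs(d):
    """Convert a dict of probabilities to log-probabilities."""
    return {k: math.log(v) for k, v in d.items()}

# Toy model with made-up probabilities (not taken from the WSJ corpus).
TAGS = ["DT", "NN", "VB"]
LOG_START = logs({"DT": 0.8, "NN": 0.1, "VB": 0.1})
LOG_TRANS = {
    "DT": logs({"DT": 0.05, "NN": 0.9, "VB": 0.05}),
    "NN": logs({"DT": 0.1, "NN": 0.3, "VB": 0.6}),
    "VB": logs({"DT": 0.5, "NN": 0.4, "VB": 0.1}),
}
LOG_EMIT = {
    "DT": logs({"the": 1.0}),
    "NN": logs({"dog": 0.7, "barks": 0.3}),
    "VB": logs({"dog": 0.2, "barks": 0.8}),
}

if __name__ == "__main__":
    sentence = ["the", "dog", "barks"]
    print(viterbi_beam(sentence, TAGS, LOG_START, LOG_TRANS, LOG_EMIT, 2.0))
```

A smaller beam_width discards more candidate states at each word and so reduces the work in the inner loop, at the risk of pruning the state that lies on the best overall path. Choosing that value well is the problem the paper addresses.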


Keywords


Hidden Markov model, Part-of-speech tagging, Viterbi algorithm, Beam search.


DOI: https://doi.org/10.15625/2525-2518/54/3A/11959


Index: Google Scholar; Crossref; VCGate; Asean Citation Index

Published by Vietnam Academy of Science and Technology