Phrasal semantic distance for Vietnamese textual document retrieval


  • Tuyen Thi-Thanh Do, University of Information Technology, VNU-HCM
  • Dang Tuan Nguyen, University of Information Technology, VNU-HCM



Keywords: lexicon ontology, phrasal semantic analysis, semantic class, semantic distance, semantic information retrieval.


In this paper, we propose a computational semantic method for estimating the phrasal semantic distance used in our model of a Vietnamese document retrieval system. The semantic distance between two phrases is defined in terms of semantic classes and semantic relations, so that it reflects how different the two phrases are. To estimate the distance, the semantic classes of a phrase are first identified using an n-gram model. The semantic relations among these classes are then identified using a Vietnamese Lexicon Ontology, a handcrafted ontology that explicitly defines the semantic classes and their potential relations in the Vietnamese language. For evaluation, a phrasal semantic retrieval system was built and tested on a data set of 720 phrases and 30 queries. The experiments show a precision of 96.6% and a recall of 78.4%.
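
As a rough illustration of the idea, the following minimal Python sketch identifies semantic classes by longest-match n-gram lookup and compares two phrases by the classes they contain. It is a sketch under stated assumptions, not the method of the paper: the lexicon entries, the class names, the function names `identify_classes` and `semantic_distance`, and the use of Jaccard distance over class sets are all hypothetical stand-ins, and semantic relations between classes are left out entirely.

```python
# Toy lexicon mapping word n-grams (tuples of syllables) to a semantic class.
# Entries and class names are hypothetical; they are NOT taken from the
# Vietnamese Lexicon Ontology described in the abstract.
LEXICON = {
    ("sinh", "viên"): "Person",        # "student"
    ("giáo", "viên"): "Person",        # "teacher"
    ("đại", "học"): "Organization",    # "university"
    ("sách",): "Document",             # "book"
}
MAX_N = max(len(gram) for gram in LEXICON)


def identify_classes(phrase):
    """Return the semantic classes found in a phrase, in order of appearance.

    At each position the longest n-gram present in the lexicon is matched
    first; syllables that match nothing are skipped.
    """
    tokens = phrase.lower().split()
    classes = []
    i = 0
    while i < len(tokens):
        for n in range(min(MAX_N, len(tokens) - i), 0, -1):
            gram = tuple(tokens[i:i + n])
            if gram in LEXICON:
                classes.append(LEXICON[gram])
                i += n
                break
        else:
            i += 1  # no n-gram starts here; move to the next syllable
    return classes


def semantic_distance(phrase_a, phrase_b):
    """Jaccard distance between the class sets of two phrases (assumed measure).

    0.0 means the phrases share all their classes, 1.0 means they share none.
    Phrases with no recognized class are treated as maximally distant.
    """
    classes_a = set(identify_classes(phrase_a))
    classes_b = set(identify_classes(phrase_b))
    if not classes_a and not classes_b:
        return 1.0
    return 1.0 - len(classes_a & classes_b) / len(classes_a | classes_b)
```

With this toy lexicon, `semantic_distance("sinh viên", "giáo viên")` is 0.0 because both phrases map to the same class, while `semantic_distance("sinh viên", "sách")` is 1.0. A distance that also used the relations between classes, as the abstract describes, would need the ontology itself.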






How to Cite

T. T.-T. Do and D. T. Nguyen, “Phrasal semantic distance for Vietnamese textual document retrieval”, JCC, vol. 31, no. 3, p. 185, Sep. 2015.



Computer Science