French-Vietnamese statistical machine translation combining with chunk alignments

Ngoc Tan Le, Ngoc Tien Le, Dien Dinh
Authors

  • Ngoc Tan Le Industrial University of Ho Chi Minh
  • Ngoc Tien Le Industrial University of Ho Chi Minh
  • Dien Dinh University of Natural Sciences of Ho Chi Minh

DOI:

https://doi.org/10.15625/1813-9663/29/4/4343

Keywords:

Bilingual corpus, statistical machine translation, chunk alignment

Abstract

Among Statistical Machine Translation (SMT) models, phrase-based SMT is currently highly regarded. However, this model still lacks higher-level linguistic knowledge such as morphological, syntactic, and semantic information. Consequently, its results remain limited for long sentences. Using morphological information such as phrase chunking to reduce sentence length, and hence to disambiguate chunk alignment in long sentences, is therefore a promising way to improve translation quality. In this paper, we present a chunk alignment approach applied to French-Vietnamese SMT. We tested our system on a French-Vietnamese bilingual corpus of 10,000 pairs and evaluated it with translation quality metrics. The French-Vietnamese SMT model based on chunk alignment improves the BLEU score by almost 2% over the baseline model.
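For readers unfamiliar with the BLEU measure cited above, the following is a minimal sketch of a corpus-level BLEU computation: clipped n-gram precision up to 4-grams, combined by a geometric mean and multiplied by a brevity penalty. It is illustrative only and is not the evaluation script used in the paper. The function names `ngram_counts` and `corpus_bleu`, the single-reference setup, and the English toy sentences are all assumptions made for this example.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU with one reference per hypothesis (illustrative sketch)."""
    matches = [0] * max_n   # clipped n-gram matches, per order
    totals = [0] * max_n    # hypothesis n-gram counts, per order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            hyp_counts = ngram_counts(hyp, n)
            ref_counts = ngram_counts(ref, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            totals[n - 1] += sum(hyp_counts.values())
    # Without smoothing, any order with zero matches gives a score of zero.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalize hypotheses shorter than the references.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)


# Toy data (assumed for illustration, not taken from the paper's corpus).
hypothesis = "the cat sat on the mat".split()
reference = "the cat sat on a mat".split()
score = corpus_bleu([hypothesis], [reference])
print(f"BLEU = {score:.4f}")
```

Scoring the baseline output and the chunk-alignment output against the same references with a function like this is the kind of comparison the reported BLEU difference summarizes. Established toolkits additionally handle multiple references, tokenization, and smoothing, which this sketch omits.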

Published

03-12-2013

How to Cite

[1]
N. T. Le, N. T. Le, and D. Dinh, “French-Vietnamese statistical machine translation combining with chunk alignments”, JCC, vol. 29, no. 4, pp. 358–368, Dec. 2013.

Section

Computer Science