
Neural Machine Translation between Vietnamese and English: An Empirical Study

Hong-Hai Phan-Vu, Viet Trung Tran, Van Nam Nguyen, Hoang Vu Dang, Phan Thuan Do


Machine translation is shifting to an end-to-end approach based on deep neural networks. The state of the art achieves impressive results for popular language pairs such as English-French or English-Chinese. For English-Vietnamese, however, the shortage of parallel corpora and the cost of hyper-parameter search pose practical challenges to neural approaches. This paper presents our efforts to improve English-Vietnamese translation in two directions: (1) building the largest open Vietnamese-English corpus to date, and (2) running extensive experiments with the latest neural models to achieve the highest BLEU scores. Our experiments provide practical examples of effectively employing different neural machine translation models with low-resource language pairs.
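Translation quality above is reported in BLEU. As a rough illustration of what that metric measures, the following is a minimal single-reference, sentence-level BLEU sketch (uniform n-gram weights with a brevity penalty); it is not the paper's exact evaluation setup, which would typically use corpus-level BLEU via a standard tool such as multi-bleu or sacreBLEU:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU against one reference, whitespace-tokenized."""
    cand, ref = candidate.split(), reference.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_ngrams = ngrams(cand, n)
        ref_ngrams = ngrams(ref, n)
        # Clipped n-gram matches: each candidate n-gram counts at most
        # as many times as it appears in the reference.
        overlap = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
        total = max(sum(cand_ngrams.values()), 1)
        if overlap == 0:
            return 0.0  # a zero precision drives the geometric mean to zero
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(sum(log_precisions) / max_n)
```

An exact match scores 1.0, and any missing 4-gram overlap (common for short or divergent hypotheses) pulls the score sharply down, which is why low-resource pairs such as English-Vietnamese report much lower BLEU than English-French.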


NMT; neural machine translation; ConvS2S; Transformer model; LSTM; RNN






Journal of Computer Science and Cybernetics ISSN: 1813-9663

Published by Vietnam Academy of Science and Technology