A study of data augmentation and accuracy improvement in machine translation for Vietnamese sign language

Thi Bich Diep Nguyen; Trung Nghia Phung; Tat Thang Vu

doi:10.15625/1813-9663/18025

Author affiliations

Thi Bich Diep Nguyen: Graduate University of Science and Technology, Vietnam Academy of Science and Technology, 18 Hoang Quoc Viet Street, Cau Giay District, Ha Noi, Viet Nam
Trung Nghia Phung: Thai Nguyen University of Information and Communication Technology, Z115 Street, Quyet Thang Ward, Thai Nguyen City, Viet Nam
Tat Thang Vu: Institute of Information Technology, Vietnam Academy of Science and Technology, 18 Hoang Quoc Viet Street, Cau Giay District, Ha Noi, Viet Nam

Keywords:

Natural language processing, machine translation, Vietnamese sign language, data augmentation.

Abstract

Sign languages are independent languages of deaf communities. Translating from spoken languages (i.e., the Vietnamese language, VL), as well as from other sign languages, into Vietnamese sign language (VSL) is a meaningful task that breaks down communication barriers and improves the quality of life for the deaf community. In this paper, we experiment with and propose several methods for building and improving models for the VL-to-VSL translation task. We present a data augmentation method to improve the performance of our neural machine translation models. Starting from an initial dataset of 10k bilingual sentence pairs, we obtained a new dataset of 60k sentence pairs with a perplexity score no more than 1.5 times that of the original dataset. Experiments on the original dataset showed that the rule-based model achieved the highest BLEU score, 68.02, among the translation models. With the augmented dataset, however, the Transformer model performed best, with a BLEU score of 89.23, which is significantly better than the other conventional approaches.
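
The perplexity constraint mentioned in the abstract can be illustrated with a small sketch. The abstract does not say which language model was used to score the data, so the add-one-smoothed unigram model, the whitespace tokenization, the function names, and the toy sentences below are all assumptions made for illustration only; they are not the authors' implementation.

```python
import math
from collections import Counter


def unigram_perplexity(train_sentences, eval_sentences):
    """Perplexity of eval_sentences under an add-one-smoothed unigram
    model estimated from train_sentences (whitespace tokenization).
    Illustrative stand-in: the paper's actual language model is not
    specified in the abstract."""
    counts = Counter(tok for s in train_sentences for tok in s.split())
    total = sum(counts.values())
    vocab_size = len(counts) + 1  # one extra slot shared by unseen tokens
    log_prob_sum = 0.0
    n_tokens = 0
    for sentence in eval_sentences:
        for tok in sentence.split():
            p = (counts[tok] + 1) / (total + vocab_size)
            log_prob_sum += math.log(p)
            n_tokens += 1
    if n_tokens == 0:
        raise ValueError("eval_sentences contains no tokens")
    return math.exp(-log_prob_sum / n_tokens)


def within_perplexity_budget(original, augmented, max_ratio=1.5):
    """Return (ok, base_ppl, aug_ppl), where ok is True when the augmented
    data's perplexity is at most max_ratio times the original's."""
    base_ppl = unigram_perplexity(original, original)
    aug_ppl = unigram_perplexity(original, augmented)
    return aug_ppl <= max_ratio * base_ppl, base_ppl, aug_ppl


# Toy data (hypothetical placeholder tokens, not taken from the paper).
original = ["a b", "a c"]
ok_close, base, aug_close = within_perplexity_budget(original, ["a b c"])
ok_far, _, aug_far = within_perplexity_budget(original, ["x y z"])
print(ok_close, ok_far)
```

In this sketch, augmented sentences that reuse the original vocabulary stay within the 1.5x budget, while sentences made entirely of unseen tokens exceed it. A stronger language model (for example an n-gram or neural model) would slot into the same ratio check.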
