A STUDY OF DATA AUGMENTATION AND ACCURACY IMPROVEMENT IN MACHINE TRANSLATION FOR VIETNAMESE SIGN LANGUAGE
DOI: https://doi.org/10.15625/1813-9663/18025
Keywords: Natural Language Processing, Machine translation, Vietnamese Sign Language, Data Augmentation, Data Enrichment
Abstract
Sign languages are independent languages of deaf communities. Translating from spoken languages (here, Vietnamese, VL), as well as from other sign languages, into Vietnamese Sign Language (VSL) is a meaningful task that breaks down communication barriers and improves the quality of life of the deaf community. In this paper, we experiment with and propose several methods for building and improving models for the VL-to-VSL translation task. We present a data augmentation method that improves the performance of our neural machine translation models. Starting from an initial dataset of 10k bilingual sentence pairs, we obtain a new dataset of 60k sentence pairs whose perplexity score is no more than 1.5 times that of the original dataset. Experiments on the original dataset show that rule-based models achieve the highest BLEU score among the translation models, 68.02. With the augmented dataset, however, the Transformer model achieves the best performance, with a BLEU score of 89.23, which is significantly better than that of the other conventional approaches.
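The perplexity criterion mentioned in the abstract (augmented data no more than 1.5 times the perplexity of the original) can be illustrated with a small sketch. The abstract does not say which language model is used to measure perplexity, so the add-one smoothed unigram model below is an assumption chosen only for brevity, and the names `train_unigram`, `perplexity`, and `within_budget` are hypothetical.

```python
import math
from collections import Counter


def train_unigram(sentences):
    """Count tokens in whitespace-tokenized sentences."""
    counts = Counter(tok for s in sentences for tok in s.split())
    return counts, sum(counts.values())


def perplexity(sentences, counts, total):
    """Corpus perplexity under an add-one smoothed unigram model (assumed LM)."""
    vocab_size = len(counts) + 1  # one extra slot shared by all unseen tokens
    log_prob, n_tokens = 0.0, 0
    for s in sentences:
        for tok in s.split():
            p = (counts.get(tok, 0) + 1) / (total + vocab_size)
            log_prob += math.log(p)
            n_tokens += 1
    return math.exp(-log_prob / n_tokens)


def within_budget(original, augmented, max_ratio=1.5):
    """Return (ok, ratio): is the augmented perplexity within max_ratio of the original?"""
    counts, total = train_unigram(original)
    base = perplexity(original, counts, total)
    new = perplexity(augmented, counts, total)
    ratio = new / base
    return ratio <= max_ratio, ratio


if __name__ == "__main__":
    original = ["toi di hoc", "toi an com"]       # toy stand-in for the 10k corpus
    augmented = ["toi di hoc", "toi an com ngon"]  # toy stand-in for augmented data
    ok, ratio = within_budget(original, augmented)
    print(f"perplexity ratio = {ratio:.2f}, accepted = {ok}")
```

In practice a stronger language model would likely replace the unigram model; the point of the sketch is only the ratio test against the 1.5 threshold.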
License
1. We hereby assign copyright of our article (the Work) in all forms of media, whether now known or hereafter developed, to the Journal of Computer Science and Cybernetics. We understand that the Journal of Computer Science and Cybernetics will act on my/our behalf to publish, reproduce, distribute and transmit the Work.
2. This assignment of copyright to the Journal of Computer Science and Cybernetics is done so on the understanding that permission from the Journal of Computer Science and Cybernetics is not required for me/us to reproduce, republish or distribute copies of the Work in whole or in part. We will ensure that all such copies carry a notice of copyright ownership and reference to the original journal publication.
3. We warrant that the Work is our results and has not been published before in its current or a substantially similar form and is not under consideration for another publication, does not contain any unlawful statements and does not infringe any existing copyright.
4. We also warrant that we have obtained the necessary permission from the copyright holder/s to reproduce in the article any materials, including tables, diagrams or photographs, not owned by me/us.