THINH, Nguyen Van; LANG, Tran Van; THANH, Van The. RGTranCNet: Effective image captioning model using cross-attention and semantic knowledge. Vietnam Journal of Science and Technology, Hanoi, VN, 2025. DOI: 10.15625/2525-2518/22381. Available at: https://vjs.ac.vn/jst/article/view/22381. Accessed: 10 Jan. 2026.