Thinh, Nguyen Van, et al. “RGTranCNet: Effective Image Captioning Model Using Cross-Attention and Semantic Knowledge.” Vietnam Journal of Science and Technology, July 2025, doi:10.15625/2525-2518/22381.