NGUYEN VAN THINH; LANG, T. V.; VAN, V. T. T. OD-VR-Cap: Image captioning based on detecting and predicting relationships between objects. Journal of Computer Science and Cybernetics, [S. l.], v. 40, n. 4, p. 327–346, 2024. DOI: 10.15625/1813-9663/20929. Available at: https://vjs.ac.vn/index.php/jcc/article/view/20929. Accessed on: 13 Apr. 2025.