Nguyen Van Thinh, T. V. Lang, and V. T. T. Van. “OD-VR-Cap: Image Captioning Based on Detecting and Predicting Relationships Between Objects”. Journal of Computer Science and Cybernetics, vol. 40, no. 4, Dec. 2024, pp. 327-46, doi:10.15625/1813-9663/20929.