[1]

Nguyen Van Thinh, T. V. Lang, and V. T. T. Van, “OD-VR-Cap: Image captioning based on detecting and predicting relationships between objects,” J. Comput. Sci. Cybern., vol. 40, no. 4, pp. 327–346, Dec. 2024.