FusionNetX: A highly effective multimodal framework for skin cancer detection

Lam Hung Nguyen, Thang Cap, Huong Bui, Tuong Le

Authors

  • Lam Hung Nguyen, University of Information Technology, Vietnam National University Ho Chi Minh City, Quarter 34, Linh Xuan Ward, Ho Chi Minh City, Viet Nam
  • Thang Cap, University of Information Technology, Vietnam National University Ho Chi Minh City, Quarter 34, Linh Xuan Ward, Ho Chi Minh City, Viet Nam
  • Huong Bui, Faculty of Information Technology, HUTECH University, 475A Dien Bien Phu, Thanh My Tay Ward, Ho Chi Minh City, Viet Nam
  • Tuong Le, Faculty of Information Technology, HUTECH University, 475A Dien Bien Phu, Thanh My Tay Ward, Ho Chi Minh City, Viet Nam

DOI:

https://doi.org/10.15625/1813-9663/22005

Keywords:

Skin cancer detection, ISIC 2024 dataset, CNN and transformer integration, multimodal model.

Abstract

Early detection of skin cancer significantly improves patient outcomes by allowing for timely intervention. This study introduces FusionNetX, a robust multimodal framework that combines image data and metadata for skin cancer detection on the ISIC 2024 dataset. Our approach integrates convolutional neural networks (CNNs) and Transformer-based models to extract features from single-lesion images cropped from 3D Total Body Photographs (3D-TBP). These images, which resemble close-up smartphone photos, are combined with metadata analyzed by tree-based classifiers to improve diagnostic accuracy. To address the extreme class imbalance in the dataset, we employ advanced sampling techniques and use stratified group cross-validation to ensure the model generalizes well across diverse patient groups. FusionNetX demonstrates competitive performance, achieving a partial area under the Receiver Operating Characteristic curve (pAUC) of 0.18380 in cross-validation and securing the top rank on the private test set with a private score of 0.17295. This top-ranking performance highlights the model’s ability to maintain a high true positive rate (TPR ≥ 80%) while outperforming all other teams on the private leaderboard. These results underscore the effectiveness of our multimodal approach and offer a promising solution for improving early skin cancer detection and patient outcomes.
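For context on the reported scores, the sketch below shows one common way to compute the partial AUC above an 80% true positive rate on the [0, 0.2] scale used here, via a label-flipping trick and scikit-learn's max_fpr option. It is a minimal illustration under these assumptions, not the authors' evaluation code; the function name pauc_above_tpr and the toy data are hypothetical.

```python
import numpy as np
from sklearn.metrics import roc_auc_score


def pauc_above_tpr(y_true, y_score, min_tpr=0.80):
    """Partial AUC restricted to the ROC region where TPR >= min_tpr.

    Values lie on the same [0, 1 - min_tpr] scale as the pAUCs reported
    above (0.20 is a perfect score when min_tpr = 0.80).
    """
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)

    # Flipping the labels and negating the scores turns "pAUC above a TPR
    # floor" into "pAUC below an FPR ceiling", which scikit-learn supports
    # directly through the max_fpr argument.
    flipped_true = 1 - y_true
    flipped_score = -y_score
    max_fpr = 1.0 - min_tpr

    # roc_auc_score with max_fpr returns the McClish-standardized value in
    # [0.5, 1.0]; undo that standardization to recover the raw partial area.
    standardized = roc_auc_score(flipped_true, flipped_score, max_fpr=max_fpr)
    min_area = 0.5 * max_fpr ** 2   # partial area of a chance-level classifier
    max_area = max_fpr              # partial area of a perfect classifier
    return min_area + (standardized - 0.5) / 0.5 * (max_area - min_area)


# Toy usage: informative scores approach 0.20, chance-level scores sit near 0.02.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    labels = rng.integers(0, 2, size=1000)
    scores = labels + 0.1 * rng.normal(size=1000)
    print(round(pauc_above_tpr(labels, scores), 4))
```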

Published

12-11-2025

How to Cite

[1] L. H. Nguyen, T. Cap, H. Bui, and T. Le, “FusionNetX: A highly effective multimodal framework for skin cancer detection”, J. Comput. Sci. Cybern., Nov. 2025.

Section

Articles
