Machine learning for survival outcome in head and neck squamous cell carcinoma : a multicenter validation study

dc.contributor.author: Alabi, Rasheed Omobolaji
dc.contributor.author: Guntinas-Lichius, Orlando
dc.contributor.author: Elmusrati, Mohammed
dc.contributor.author: Almangush, Alhadi
dc.contributor.author: Tiblom Ehrsson, Ylva
dc.contributor.author: Laurell, Göran
dc.contributor.author: Mäkitie, Antti A.
dc.contributor.department: Digital Economy
dc.contributor.faculty: fi=Tekniikan ja innovaatiojohtamisen yksikkö|en=School of Technology and Innovations|
dc.contributor.organization: fi=Vaasan yliopisto|en=University of Vaasa|
dc.date.accessioned: 2026-01-07T11:55:37Z
dc.date.issued: 2025-11-29
dc.description.abstract: Most head and neck squamous cell carcinoma (HNSCC) cases are diagnosed late, with an increased risk of recurrence and distant metastasis. In recent years, there has been a surge in the development of prognostic and predictive machine learning (ML) models for personalized treatment planning. However, only a small number of these have been externally validated. This study aimed to build a prognostic ML model that combines clinicopathological parameters and treatment-related factors as integrative inputs, using data from the Surveillance, Epidemiology, and End Results (SEER, United States) program. We further validated the developed model for estimating the overall survival (OS) of patients with HNSCC, using multicenter data from the Thuringian Cancer Registry (Germany) and data from a multicenter prospective observational study obtained from the Uppsala University Hospital (Sweden). Additionally, we explored the complementary prognostic potential of the input parameters for predicting OS using permutation feature importance (PFI). A total of 40,164 patients with HNSCC were included from the SEER database, and the model was validated with 3,950 cases obtained from the Thuringian Cancer Registry and 323 cases recruited from three university hospitals in Sweden. The voting ensemble ML algorithm gave an area under the receiver operating characteristic curve (AUC) of 0.76 and an accuracy of 70.0%. Independent external validation of the developed model with data from the Thuringian Cancer Registry and the Uppsala University Hospital gave AUCs of 0.68 and 0.76, respectively, with decreased accuracy in both cohorts. The PFI analysis of the base model showed that age at diagnosis, T stage, tumor site, marital status, and surgical treatment were the most important parameters for the model's ability to predict OS. External independent geographic validation is important for performance reproducibility and model generalization before the model is recommended for further clinical evaluation. Such validation may not necessarily increase accuracy, but it can reveal how the model performs outside the development data. A generalizable ML model can lead to individualized risk-based therapeutic decision-making. While independent validation may be possible during model development, data privacy and security-related issues may prevent making it a prerequisite in the ML model development pipeline.
dc.description.notification: © The Author(s) 2025. This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
dc.description.reviewstatus: fi=vertaisarvioitu|en=peerReviewed|
dc.format.content: fi=kokoteksti|en=fulltext|
dc.format.extent: 10
dc.identifier.uri: https://osuva.uwasa.fi/handle/11111/19577
dc.identifier.urn: URN:NBN:fi-fe202601071621
dc.language.iso: eng
dc.publisher: Springer
dc.relation.doi: 10.1038/s41598-025-29295-6
dc.relation.funder: Sigrid Jusélius Foundation
dc.relation.funder: Finska Läkaresällskapet
dc.relation.funder: Finnish State Research Funding
dc.relation.funder: Swedish Cancer Society
dc.relation.funder: Helsinki University Library
dc.relation.grantnumber: 2015/363
dc.relation.grantnumber: 2018/502
dc.relation.grantnumber: 21 1419 Pj
dc.relation.grantnumber: 24 3394 Pj
dc.relation.ispartofjournal: Scientific Reports
dc.relation.issn: 2045-2322
dc.relation.url: https://doi.org/10.1038/s41598-025-29295-6
dc.relation.volume: 16
dc.rights: CC BY-NC-ND 4.0
dc.subject: Machine learning; Head and Neck Squamous Cell Carcinoma (HNSCC); Overall survival; External validation; Validation study
dc.subject.discipline: fi=Tietoliikennetekniikka|en=Telecommunications Engineering|
dc.title: Machine learning for survival outcome in head and neck squamous cell carcinoma : a multicenter validation study
dc.type.okm: fi=A1 Alkuperäisartikkeli tieteellisessä aikakauslehdessä|en=A1 Peer-reviewed original journal article|sv=A1 Originalartikel i en vetenskaplig tidskrift|
dc.type.publication: article
dc.type.version: publishedVersion
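
Note on permutation feature importance (PFI): the abstract reports PFI rankings but this record does not include the study's code, so the following is only a minimal sketch of the general technique, not the authors' implementation. It assumes synthetic data, three hypothetical features (`age`, `t_stage`, `noise`), and a fixed linear scoring function standing in for the voting ensemble. PFI is measured here as the mean drop in AUC after shuffling one feature column at a time.

```python
import random


def auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def permutation_importance(predict, rows, labels, n_repeats=20, seed=0):
    """Return the baseline AUC and, per feature, the mean AUC drop after shuffling it."""
    rng = random.Random(seed)
    baseline = auc(labels, [predict(r) for r in rows])
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)  # break the link between feature j and the outcome
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - auc(labels, [predict(r) for r in shuffled]))
        importances.append(sum(drops) / n_repeats)
    return baseline, importances


# Synthetic cohort (assumption): the outcome depends on age and T stage, not on noise.
data_rng = random.Random(42)
rows, labels = [], []
for _ in range(200):
    age = data_rng.uniform(30, 90)
    t_stage = data_rng.randint(1, 4)
    noise = data_rng.random()
    risk = 0.05 * age + 0.5 * t_stage + data_rng.gauss(0, 0.5)
    rows.append([age, t_stage, noise])
    labels.append(1 if risk > 4.25 else 0)


def predict(row):
    """Hypothetical stand-in model: a fixed linear risk score that ignores the noise feature."""
    return 0.05 * row[0] + 0.5 * row[1]


baseline, importances = permutation_importance(predict, rows, labels)
for name, value in zip(["age", "t_stage", "noise"], importances):
    print(f"{name}: mean AUC drop = {value:.3f}")
```

Because the stand-in model never reads the `noise` column, shuffling it leaves the AUC unchanged, which is the behavior PFI is meant to expose for uninformative inputs.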

Files

Name:
Osuva_Alabi_Guntinas-Lichius_Elmusrati_Almangush_TiblomEhrsson_Laurell_Mäkitie_2025.pdf
Size:
1.61 MB
Format:
Adobe Portable Document Format