Using Permutation-Based Feature Importance for Improved Machine Learning Model Performance at Reduced Costs

dc.contributor.author: Khan, Adam
dc.contributor.author: Ali, Asad
dc.contributor.author: Khan, Jahangir
dc.contributor.author: Ullah, Fasee
dc.contributor.author: Faheem, Muhammad
dc.contributor.orcid: https://orcid.org/0000-0003-4628-4486
dc.date.accessioned: 2026-02-03T12:08:00Z
dc.date.issued: 2025
dc.description.abstract: In Software Quality Assurance (SQA), predicting defect-prone software modules is essential for ensuring software reliability and consistency. This task is commonly performed with Machine Learning (ML) techniques, but improving model performance typically incurs significant computational cost. High computational costs and uncertain payoffs make most software engineering researchers reluctant to optimize ML models. This creates a need for techniques that approach the performance of tuned hyperparameter settings while retaining the computational efficiency of default settings. To address this, we employed five ML models (Decision Tree, Ranger, Random Forest, Support Vector Machine, and k-Nearest Neighbors) and optimized their hyperparameters using random search. Our experiments covered six diverse Software Fault Prediction (SFP) datasets, encompassing various software features, application domains, and defect patterns, to evaluate the approach's generalizability and effectiveness. We then used Permutation Feature Importance (PFI), a model-agnostic method, to identify the ten features most critical to model accuracy and efficiency. The models were retrained on these selected features with default settings (no hyperparameter tuning) to determine whether similar performance could be achieved at low computational cost. The results show an average accuracy improvement of 77.39% and a 92.02% reduction in computational cost. The most notable case attained a 99.25% accuracy improvement and a 96.77% cost reduction. These results show that PFI-based feature selection can deliver high performance at a fraction of the computational cost, offering software engineers an efficient way to optimize ML models. [en]
dc.description.notification: © 2025 The Authors. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
dc.description.reviewstatus: peerReviewed
dc.format.pagerange: 36421-36435
dc.identifier.uri: https://osuva.uwasa.fi/handle/11111/19737
dc.identifier.urn: URN:NBN:fi-fe2026020310977
dc.language.iso: en
dc.publisher: IEEE
dc.relation.doi: https://doi.org/10.1109/ACCESS.2025.3544625
dc.relation.ispartofjournal: IEEE Access
dc.relation.issn: 2169-3536
dc.relation.url: https://doi.org/10.1109/ACCESS.2025.3544625
dc.relation.url: https://urn.fi/URN:NBN:fi-fe2026020310977
dc.relation.volume: 13
dc.rights: https://creativecommons.org/licenses/by/4.0/
dc.source.identifier: WOS:001435462200028
dc.source.identifier: 2-s2.0-85218874952
dc.source.identifier: 71e02d93-8810-499e-944a-b7cba1ba8253
dc.source.metadata: SoleCRIS
dc.subject: Computational modeling
dc.subject: Feature extraction
dc.subject: Accuracy
dc.subject: Computational efficiency
dc.subject: Predictive models
dc.subject: Optimization
dc.subject: Support vector machines
dc.subject: Random forests
dc.subject: Radio frequency
dc.subject: Decision trees
dc.subject: Model-agnostic techniques
dc.subject: permutation feature importance (PFI)
dc.subject: software fault prediction (SFP)
dc.subject: predictive accuracy
dc.subject: machine learning (ML)
dc.subject: computational cost
dc.subject: default settings
dc.subject: hyperparameter
dc.subject.discipline: Information Technology tech
dc.title: Using Permutation-Based Feature Importance for Improved Machine Learning Model Performance at Reduced Costs
dc.type.okm: A1 Journal article (peer-reviewed)
dc.type.publication: article
dc.type.version: publishedVersion
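
Illustrative sketch (not part of the record): the abstract describes ranking features by permutation importance and retraining a default-settings model on the top-ranked ones. The Python below is a minimal, standard-library-only sketch of that idea under stated assumptions. The synthetic data, the NearestCentroid stand-in model, and every function name are hypothetical and are not taken from the paper; the hyperparameter search step and the cost measurement are omitted, and k=3 is used here where the paper selects ten features.

```python
import random


def make_data(n_rows, n_informative, n_noise, seed=0):
    """Synthetic binary data (assumption): only the first columns carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_rows):
        label = rng.randint(0, 1)
        informative = [rng.gauss(3.0 * label, 1.0) for _ in range(n_informative)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        X.append(informative + noise)
        y.append(label)
    return X, y


class NearestCentroid:
    """Tiny stand-in classifier; any fitted model with predict() would do."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict(self, X):
        def dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [min(self.centroids, key=lambda c: dist(x, self.centroids[c]))
                for x in X]


def accuracy(model, X, y):
    preds = model.predict(X)
    return sum(p == t for p, t in zip(preds, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=5, seed=0):
    """Mean drop in accuracy when a single column is shuffled, per feature."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the label
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


def top_k_features(importances, k):
    """Indices of the k features with the largest importance."""
    order = sorted(range(len(importances)), key=lambda j: importances[j],
                   reverse=True)
    return order[:k]


def select(X, cols):
    return [[row[j] for j in cols] for row in X]


X, y = make_data(n_rows=600, n_informative=3, n_noise=7, seed=1)
X_train, y_train, X_test, y_test = X[:400], y[:400], X[400:], y[400:]

full_model = NearestCentroid().fit(X_train, y_train)
importances = permutation_importance(full_model, X_test, y_test)
top = top_k_features(importances, k=3)

# Retrain the same default model on the selected columns only.
reduced_model = NearestCentroid().fit(select(X_train, top), y_train)
full_acc = accuracy(full_model, X_test, y_test)
reduced_acc = accuracy(reduced_model, select(X_test, top), y_test)

print("selected features:", sorted(top))
print("accuracy, all features:", full_acc)
print("accuracy, selected features:", reduced_acc)
```

Importance is measured on held-out rows so that the ranking reflects generalization rather than memorized training data. Because the procedure only calls predict(), it applies unchanged to any of the model families named in the abstract.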

Files

Name: nbnfi-fe2026020310977.pdf
Size: 1.95 MB
Format: Adobe Portable Document Format