Using Permutation-Based Feature Importance for Improved Machine Learning Model Performance at Reduced Costs

Khan, Adam; Ali, Asad; Khan, Jahangir; Ullah, Fasee; Faheem, Muhammad

dc.contributor.author	Khan, Adam
dc.contributor.author	Ali, Asad
dc.contributor.author	Khan, Jahangir
dc.contributor.author	Ullah, Fasee
dc.contributor.author	Faheem, Muhammad
dc.contributor.orcid	https://orcid.org/0000-0003-4628-4486
dc.date.accessioned	2026-02-03T12:08:00Z
dc.date.issued	2025
dc.description.abstract	In Software Quality Assurance (SQA), predicting defect-prone software modules is essential for ensuring software reliability and consistency. This task is commonly achieved through Machine Learning (ML) techniques, but improving model performance typically incurs significant computational costs. These high computational costs and uncertain payoffs make most software engineering researchers reluctant to optimize ML models. This creates a need for novel techniques that can achieve the near-optimal performance of tuned hyperparameter settings while maintaining the computational efficiency of default settings. To address this, we employed five ML models (Decision Tree, Ranger, Random Forest, Support Vector Machine, and k-Nearest Neighbors) and optimized their hyperparameters using the random search technique. Our experiments covered six diverse Software Fault Prediction (SFP) datasets, encompassing various software features, application domains, and defect patterns, to evaluate the approach’s generalizability and effectiveness. Moreover, the model-agnostic Permutation Feature Importance (PFI) method was employed to identify the top ten features most critical for model accuracy and efficiency. These selected features were used to retrain the ML models without hyperparameter tuning (default settings) to determine whether similar performance could be achieved at low computational cost. The results show an average accuracy improvement of 77.39% and a 92.02% reduction in computational cost. The most notable case attained a 99.25% accuracy improvement and a 96.77% cost reduction. These results show that PFI-based feature selection can deliver high performance at a fraction of the computational cost, offering an efficient way for software engineers to optimize ML models.	en
dc.description.notification	© 2025 The Authors. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
dc.description.reviewstatus	fi=vertaisarvioitu\|en=peerReviewed\|
dc.format.pagerange	36421-36435
dc.identifier.uri	https://osuva.uwasa.fi/handle/11111/19737
dc.identifier.urn	URN:NBN:fi-fe2026020310977
dc.language.iso	en
dc.publisher	IEEE
dc.relation.doi	https://doi.org/10.1109/ACCESS.2025.3544625
dc.relation.ispartofjournal	IEEE access
dc.relation.issn	2169-3536
dc.relation.url	https://doi.org/10.1109/ACCESS.2025.3544625
dc.relation.url	https://urn.fi/URN:NBN:fi-fe2026020310977
dc.relation.volume	13
dc.rights	https://creativecommons.org/licenses/by/4.0/
dc.source.identifier	WOS:001435462200028
dc.source.identifier	2-s2.0-85218874952
dc.source.identifier	71e02d93-8810-499e-944a-b7cba1ba8253
dc.source.metadata	SoleCRIS
dc.subject	Computational modeling
dc.subject	Feature extraction
dc.subject	Accuracy
dc.subject	Computational efficiency
dc.subject	Predictive models
dc.subject	Optimization
dc.subject	Support vector machines
dc.subject	Random forests
dc.subject	Radio frequency
dc.subject	Decision trees
dc.subject	Model-agnostic techniques
dc.subject	permutation feature importance (PFI)
dc.subject	software fault prediction (SFP)
dc.subject	predictive accuracy
dc.subject	machine learning (ML)
dc.subject	computational cost
dc.subject	default settings
dc.subject	hyperparameter
dc.subject.discipline	fi=Tietotekniikka tekn\|en=Information Technology tech\|
dc.title	Using Permutation-Based Feature Importance for Improved Machine Learning Model Performance at Reduced Costs
dc.type.okm	fi=A1 Alkuperäisartikkeli tieteellisessä aikakauslehdessä (vertaisarvioitu)\|en=A1 Journal article (peer-reviewed)\|
dc.type.publication	article
dc.type.version	publishedVersion

Files

Name: nbnfi-fe2026020310977.pdf
Size: 1.95 MB
Format: Adobe Portable Document Format

Collections

Articles