Abstract: The deployment of machine learning models in critical decision-making requires reliable explanations that remain stable under varying data conditions. While SHapley Additive exPlanations (SHAP) provides theoretically grounded feature importance rankings, the stability of these explanations when models encounter corrupted or degraded data remains poorly understood. This study investigates the robustness of SHAP feature importance rankings under controlled data corruption scenarios across three classification algorithms and datasets of varying complexity. The methodology employs optimally regularized Logistic Regression, Random Forest, and XGBoost models trained on medical, financial, and text classification datasets. Controlled corruption mechanisms combining 5% random sample removal and Gaussian noise injection, with standard deviation equal to 0.1 times each feature's standard deviation, simulate realistic data quality degradation. Stability metrics including Spearman correlation, Kendall tau, and top-k feature overlap quantify ranking preservation. Results demonstrate that properly regularized models maintain substantial SHAP stability, with Spearman correlations exceeding 0.89 across all configurations. Random Forest exhibits superior stability, with near-perfect correlation (0.999) on structured data and correlations above 0.95 across all scenarios. The findings establish that appropriate regularization and model selection enable reliable SHAP explanations even under moderate data corruption, providing practical guidelines for deploying interpretable machine learning in production environments where data quality cannot be guaranteed.

Keywords: SHAP, explainable AI, feature importance, model interpretability, data corruption, robustness analysis.
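The corruption procedure and stability metrics described in the abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' code: the function and parameter names (`corrupt`, `stability_metrics`, `drop_frac`, `noise_scale`) are hypothetical, and Kendall tau is computed in its simple tau-a form assuming no tied importance values.

```python
import numpy as np

def corrupt(X, rng, drop_frac=0.05, noise_scale=0.1):
    """Simulate the corruption described in the abstract: drop a random 5%
    of rows, then add Gaussian noise with sigma = 0.1 * each feature's
    standard deviation. (Illustrative sketch, not the authors' code.)"""
    n = X.shape[0]
    keep = rng.choice(n, size=int(round(n * (1 - drop_frac))), replace=False)
    Xc = X[keep].astype(float).copy()
    Xc += rng.normal(0.0, noise_scale * X.std(axis=0), size=Xc.shape)
    return Xc

def _ranks(v):
    # Rank positions (0 = smallest); assumes no ties for this sketch.
    return np.argsort(np.argsort(v)).astype(float)

def stability_metrics(imp_a, imp_b, k=5):
    """Compare two feature-importance vectors (e.g. mean |SHAP| values from
    clean vs. corrupted data) with the three metrics named in the abstract:
    Spearman correlation, Kendall tau, and top-k feature overlap."""
    # Spearman rho = Pearson correlation of the rank vectors.
    rho = float(np.corrcoef(_ranks(imp_a), _ranks(imp_b))[0, 1])
    # Kendall tau-a = (concordant - discordant) / total pairs, no ties assumed.
    n = len(imp_a)
    conc = sum(
        np.sign(imp_a[i] - imp_a[j]) == np.sign(imp_b[i] - imp_b[j])
        for i in range(n) for j in range(i + 1, n)
    )
    total = n * (n - 1) // 2
    tau = (2 * conc - total) / total
    # Fraction of the top-k features (by importance) shared by both rankings.
    top = lambda v: set(np.argsort(v)[::-1][:k])
    overlap = len(top(imp_a) & top(imp_b)) / k
    return rho, tau, overlap

# Usage sketch: identical rankings score perfectly on all three metrics.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))
Xc = corrupt(X, rng)          # 95 rows remain after 5% removal
imp = np.arange(10.0)         # stand-in importance vector
rho, tau, overlap = stability_metrics(imp, imp, k=5)
```

In the study itself, `imp_a` and `imp_b` would be SHAP importance vectors computed on the clean and corrupted datasets respectively; correlations near 1.0 indicate that the ranking survives the corruption.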


DOI: 10.17148/IARJSET.2025.12810

How to Cite:

[1] Stow, May and Stewart, Ashley Ajumoke, "Empirical Analysis of SHAP Stability Under Data Corruption Across Datasets and Model Architectures," International Advanced Research Journal in Science, Engineering and Technology (IARJSET), DOI: 10.17148/IARJSET.2025.12810
