Abstract: Customer churn prediction in Business-to-Business (B2B) Software-as-a-Service (SaaS) environments poses analytical challenges that differ fundamentally from those in consumer-facing settings. Subscription-based enterprise software companies face heightened churn risk at contract renewal boundaries, where buying decisions involve multiple stakeholders and extensive switching-cost evaluations. This study systematically compares five machine learning classifiers: Logistic Regression, Decision Trees, Random Forests, Gradient Boosting (XGBoost), and Multi-Layer Perceptron Neural Networks. Each is applied to behavioral data from B2B SaaS enterprise clients to predict customer churn. The preprocessing pipeline comprises median imputation for missing values, Interquartile Range (IQR)-based Winsorization for outlier treatment, Min-Max normalization, and the Synthetic Minority Over-sampling Technique (SMOTE) for class imbalance mitigation. Feature selection and dimensionality reduction use chi-square statistical testing and Random Forest importance scoring. Evaluation with stratified 10-fold cross-validation shows that ensemble methods, particularly Gradient Boosting, consistently achieve the best classification performance, attaining an AUC-ROC of 0.934, Precision of 0.843, Recall of 0.962, and F1-Score of 0.899 on the importance-selected five-feature subset. Feature importance analysis identifies CustomerCount and Products as the primary churn drivers, together accounting for over 75% of cumulative predictive importance, which suggests that operational dependency breadth and platform integration depth are the main determinants of enterprise client retention. The statistical significance of performance differences is confirmed via the Friedman test and Nemenyi post-hoc analysis. The findings provide actionable guidance for customer success teams, enabling data-driven prioritization of retention interventions and proactive risk mitigation in enterprise SaaS environments.
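As a rough illustration of the first three preprocessing steps named in the abstract (median imputation, IQR-based Winsorization, Min-Max normalization), the following minimal sketch processes a single numeric column using only the Python standard library. It is not the authors' implementation: the fence multiplier `k=1.5`, the choice to clip outliers to the IQR fences, the inclusive quartile method, and the sample column are all assumptions made for illustration. SMOTE and feature selection are omitted here.

```python
from statistics import median, quantiles


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def winsorize_iqr(values, k=1.5):
    """Clip values to the fences [Q1 - k*IQR, Q3 + k*IQR].

    k=1.5 is an assumed multiplier; the abstract does not state one.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


def min_max(values):
    """Rescale values linearly to the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def preprocess_column(values):
    """Apply imputation, Winsorization, and scaling in that order."""
    return min_max(winsorize_iqr(impute_median(values)))


# Hypothetical column with one missing value and one extreme outlier.
column = [3, 5, None, 4, 6, 250, 5]
scaled = preprocess_column(column)
```

In a cross-validated setting, statistics such as the median, the IQR fences, and the min/max would normally be estimated on the training folds only and then applied to the held-out fold, and any oversampling would likewise be restricted to the training folds to avoid leakage.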
Keywords: customer churn prediction, B2B SaaS, machine learning, Random Forest, Gradient Boosting, XGBoost, feature selection, enterprise software, SMOTE, imbalanced classification, predictive analytics.
DOI: 10.17148/IARJSET.2026.13468
[1] Pratham Mehta, Mrs. S. Niveditha, "B2B SaaS Customer Churn Prediction: A Machine Learning Approach to Identifying At-Risk Enterprise Clients," International Advanced Research Journal in Science, Engineering and Technology (IARJSET), DOI: 10.17148/IARJSET.2026.13468