Abstract: Customer churn prediction allows organizations to identify customers who are likely to stop using their services, which helps reduce revenue loss and improve customer retention. This work presents a machine learning-based customer churn prediction system that combines resampling methods with Natural Language Processing (NLP) techniques. To handle class imbalance, Random Oversampling, Random Undersampling, SMOTE (Synthetic Minority Over-sampling Technique), and ADASYN (Adaptive Synthetic Sampling) are applied. In addition, BERT (Bidirectional Encoder Representations from Transformers) is used to extract contextual features from customer feedback and other text data. The resulting features are then used as input to classification models such as Random Forest, XGBoost, and Logistic Regression. Experimental results, evaluated using accuracy, precision, recall, F1-score, and ROC-AUC, show improved predictive performance. The combination of resampling techniques and BERT-based feature extraction is effective in improving customer churn prediction.
Keywords: Customer Churn, Machine Learning, Random Forest, XGBoost, Logistic Regression, BERT, SMOTE, ADASYN, Class Imbalance, NLP.
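The abstract does not give implementation details for the resampling step, so the following is only a minimal sketch of the SMOTE idea it names: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. The function name smote_oversample, the parameters k and seed, and the toy feature columns are assumptions made for illustration, not the authors' code. A production system would more likely use a maintained implementation such as the imbalanced-learn library, and would resample the training split only.

```python
import math
import random


def smote_oversample(X, y, minority_label=1, k=3, seed=0):
    """Balance a binary dataset by adding SMOTE-style synthetic minority rows.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)
    minority = [row for row, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    n_new = n_majority - len(minority)
    if n_new <= 0 or len(minority) < 2:
        # Already balanced, or too few minority points to interpolate between.
        return list(X), list(y)

    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of the base point,
        # excluding the base point itself.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])

    return list(X) + synthetic, list(y) + [minority_label] * n_new


# Toy data with assumed features [monthly_charge, tenure_months]; 1 = churned.
X = [[20.0, 40], [25.0, 36], [30.0, 48], [22.0, 60], [28.0, 52], [24.0, 44],
     [80.0, 2], [90.0, 4]]
y = [0, 0, 0, 0, 0, 0, 1, 1]

X_res, y_res = smote_oversample(X, y)
print(y_res.count(0), y_res.count(1))
```

On this toy input the two churned rows are supplemented with four interpolated rows, so both classes end up with six samples. The original rows are left unchanged and the synthetic rows are appended at the end.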
DOI: 10.17148/IARJSET.2026.13340
[1] Manoj A, Dr. K. Thenmozhi, "CUSTOMER CHURN PREDICTION USING MACHINE LEARNING," International Advanced Research Journal in Science, Engineering and Technology (IARJSET), DOI: 10.17148/IARJSET.2026.13340