Bank Customer Churn Prediction

This project predicts customer churn for a bank using machine learning. The dataset describes the bank's customers through features covering their transactions, demographics, and account activity. The objective is to build and tune models that accurately predict whether a customer will churn.

Highlights

Large-scale dataset: 355,190 records × 116 features
Extensive feature selection using correlation, SHAP, and LIME
Trained Logistic Regression & SVM with hyperparameter tuning (GridSearchCV)
Deployed with Flask + Gunicorn + Streamlit UI for real-time predictions

Introduction

Customer churn is a critical issue for banks, as retaining existing customers is often more cost-effective than acquiring new ones. This project leverages machine learning to predict which customers are likely to churn based on their historical data and behavior patterns.

Dataset

Records: 355,190
Features: 116
Target variable: TARGET → 1 (churned), 0 (retained)
Data includes: Demographics, product usage, account activity, and more

Data Preprocessing

Handled missing values and duplicates
One-hot encoded categorical variables
Normalized numerical columns
Split into training and test sets
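
The four steps above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `preprocess`, the median and "missing" imputation choices, the 80/20 stratified split, and the use of `StandardScaler` for normalization are all assumptions, since the source does not say how each step was carried out.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def preprocess(df, target="TARGET", test_size=0.2, seed=42):
    """Illustrative cleaning, encoding, splitting, and scaling of a churn table."""
    # Remove exact duplicate rows.
    df = df.drop_duplicates()
    y = df[target]
    X = df.drop(columns=[target])

    num_cols = list(X.select_dtypes(include="number").columns)
    cat_cols = [c for c in X.columns if c not in num_cols]

    # Work in floats so scaled values can be written back cleanly.
    X = X.astype({c: float for c in num_cols})

    if cat_cols:
        # Assumed choice: treat a missing category as its own level, then one-hot encode.
        X = X.fillna({c: "missing" for c in cat_cols})
        X = pd.get_dummies(X, columns=cat_cols, dtype=float)

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed
    )
    X_train, X_test = X_train.copy(), X_test.copy()

    # Assumed choice: median imputation, with medians taken from the training split only.
    medians = X_train[num_cols].median()
    X_train = X_train.fillna(medians)
    X_test = X_test.fillna(medians)

    # Fit the scaler on the training split only, then apply it to both splits.
    scaler = StandardScaler()
    X_train[num_cols] = scaler.fit_transform(X_train[num_cols])
    X_test[num_cols] = scaler.transform(X_test[num_cols])
    return X_train, X_test, y_train, y_test
```

Computing the medians and scaling statistics from the training split alone keeps information from the test set out of the fitted transforms.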

Feature Engineering

The most predictive features were identified using correlation analysis, SHAP, and LIME. The top features selected for the models were:

REST_AVG_CUR
LDEAL_ACT_DAYS_PCT_AAVG
REST_DYNAMIC_IL_3M
CR_PROD_CNT_IL_5
CR_PROD_CNT_TOVR_4
REST_DYNAMIC_CUR_1M
CR_PROD_CNT_TOVR_5
CR_PROD_CNT_PIL_4
TURNOVER_DYNAMIC_IL_3M
TURNOVER_DYNAMIC_IL_1M
APP_MARITAL_STATUS_Civil Union
CR_PROD_CNT_CC_9
PACK_109
CR_PROD_CNT_VCU_3
CR_PROD_CNT_TOVR_6
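
As a rough illustration of the correlation part of this selection, the sketch below ranks numeric features by the absolute value of their Pearson correlation with the target. The helper name `rank_by_target_correlation` is hypothetical, and the default `top_k=15` only mirrors the length of the list above; the source does not describe the exact ranking procedure or how the SHAP and LIME results were combined with it.

```python
import pandas as pd


def rank_by_target_correlation(df, target="TARGET", top_k=15):
    """Return the top_k numeric features by absolute correlation with the target."""
    numeric = df.select_dtypes(include="number")
    # Pearson correlation of every numeric feature with the target column.
    corr = numeric.drop(columns=[target]).corrwith(numeric[target])
    # Rank by magnitude: a strong negative correlation is as useful as a positive one.
    return corr.abs().sort_values(ascending=False).head(top_k)
```

Correlation only captures linear, one-feature-at-a-time relationships, which is one reason to complement it with model-based methods such as SHAP and LIME.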

Model Training and Tuning

Trained the following models:

Logistic Regression
Support Vector Machine (SVM)

Used GridSearchCV for hyperparameter tuning
Evaluated with:

Accuracy
Precision
Recall
F1-score
ROC-AUC
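
A minimal sketch of this tune-then-evaluate loop is shown below for Logistic Regression. It runs on synthetic data from `make_classification` so that it is self-contained; the parameter grid, the `roc_auc` scoring choice, and the 5-fold cross-validation are assumptions, not the settings used in the project.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic, imbalanced stand-in for the churn data (not the real dataset).
X, y = make_classification(
    n_samples=600, n_features=15, weights=[0.8, 0.2], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Assumed grid over the inverse regularization strength C.
param_grid = {"C": [0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(
    LogisticRegression(max_iter=1000), param_grid, scoring="roc_auc", cv=5
)
search.fit(X_train, y_train)

# Evaluate the refitted best model on the held-out test split.
best = search.best_estimator_
y_pred = best.predict(X_test)
y_score = best.predict_proba(X_test)[:, 1]

metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred, zero_division=0),
    "recall": recall_score(y_test, y_pred, zero_division=0),
    "f1": f1_score(y_test, y_pred, zero_division=0),
    "roc_auc": roc_auc_score(y_test, y_score),
}
```

The same pattern applies to an SVM by swapping in `sklearn.svm.SVC` with a grid over its own parameters; ROC-AUC can then be computed from `decision_function` scores instead of probabilities.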

Interpretability

Model interpretability was analyzed using LIME (Local Interpretable Model-agnostic Explanations), which explains an individual prediction by approximating the model locally around that input. The explanations showed which features most strongly drove each prediction.
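
To make "approximating the model locally" concrete, the sketch below hand-rolls the core idea: perturb one input, query the model on the perturbed points, weight them by proximity, and fit a weighted linear surrogate. It deliberately does not use the `lime` package and is not the project's code; the function name `explain_locally`, the Gaussian perturbations, the exponential kernel, its default width, and the Ridge surrogate are all illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression, Ridge


def explain_locally(predict_proba, x, scale, n_samples=2000, kernel_width=None, seed=0):
    """Fit a proximity-weighted linear surrogate around x; return its coefficients."""
    rng = np.random.default_rng(seed)
    # Sample perturbed points around x, using per-feature noise scales.
    noise = rng.normal(size=(n_samples, x.size))
    Z = x + noise * scale
    # Ask the black-box model for the positive-class probability at each point.
    target = predict_proba(Z)[:, 1]
    # Weight samples by closeness to x (distance measured in scaled units).
    dist = np.linalg.norm(noise, axis=1)
    if kernel_width is None:
        kernel_width = 0.75 * np.sqrt(x.size)  # assumed default width
    weights = np.exp(-(dist ** 2) / kernel_width ** 2)
    # The surrogate's coefficients act as local feature importances.
    surrogate = Ridge(alpha=1.0).fit(noise, target, sample_weight=weights)
    return surrogate.coef_


# Demo on synthetic data with a stand-in classifier.
X, y = make_classification(n_samples=500, n_features=6, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)
local_importance = explain_locally(model.predict_proba, X[0], X.std(axis=0))
```

A positive coefficient means that increasing the feature near this input raises the predicted churn probability, and a negative one lowers it. The explanation is valid only in the neighborhood of the explained row.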

Deployment

Backend: Flask app running with Gunicorn
Frontend: Streamlit UI for real-time predictions
Input form for customer details → instant churn prediction
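
The backend half of this setup might look like the sketch below. Everything specific in it is an assumption: the `/predict` route, the JSON payload shape, the three-feature input schema (borrowed from the feature list for illustration), and `load_model`, which fits a toy classifier as a stand-in for loading the trained model from disk.

```python
import numpy as np
from flask import Flask, jsonify, request
from sklearn.linear_model import LogisticRegression

# Hypothetical input schema: a small subset of the selected features.
FEATURES = ["REST_AVG_CUR", "REST_DYNAMIC_CUR_1M", "TURNOVER_DYNAMIC_IL_1M"]


def load_model():
    """Stand-in for loading the trained model; fits a toy classifier instead."""
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, len(FEATURES)))
    y = (X[:, 0] + X[:, 1] < 0).astype(int)
    return LogisticRegression().fit(X, y)


app = Flask(__name__)
model = load_model()


@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json(silent=True) or {}
    # Reject requests that do not supply every expected feature.
    missing = [name for name in FEATURES if name not in payload]
    if missing:
        return jsonify(error=f"missing features: {missing}"), 400
    row = np.array([[float(payload[name]) for name in FEATURES]])
    proba = float(model.predict_proba(row)[0, 1])
    # Assumed decision threshold of 0.5.
    return jsonify(churn_probability=proba, churn=int(proba >= 0.5))
```

Gunicorn serves such an app with a command of the form `gunicorn <module>:app`, where `<module>` is whatever file defines `app`. A Streamlit front end would then post the form values to this endpoint and display the returned probability.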

Usage

To run the project locally:

Clone the repository.
Install the required dependencies.
Run the Flask app using Gunicorn.
Access the Streamlit interface to input customer data and view predictions.

Results

Model                 Accuracy   Precision   Recall   F1 Score   ROC-AUC
Logistic Regression   ~78%       0.76        0.72     0.74       0.76
SVM                   ~80%       0.77        0.79     0.78       0.80

The Support Vector Machine model was selected for deployment because of its higher recall (0.79 vs. 0.72) and ROC-AUC (0.80 vs. 0.76). Higher recall means fewer churning customers are missed, that is, fewer false negatives.
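
As a quick consistency check on the table, the F1 score is the harmonic mean of precision and recall, so it can be recomputed from the two reported columns. The helper name `f1_from_pr` is illustrative.

```python
def f1_from_pr(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) taken from the results table.
reported = {
    "Logistic Regression": (0.76, 0.72, 0.74),
    "SVM": (0.77, 0.79, 0.78),
}

for name, (precision, recall, f1) in reported.items():
    computed = f1_from_pr(precision, recall)
    print(f"{name}: computed F1 = {computed:.2f}, reported F1 = {f1:.2f}")
# Logistic Regression: computed F1 = 0.74, reported F1 = 0.74
# SVM: computed F1 = 0.78, reported F1 = 0.78
```

Both reported F1 values agree with the reported precision and recall to two decimal places.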

Conclusion

This project demonstrates the effectiveness of machine learning in predicting customer churn. By understanding the key features contributing to churn, banks can develop targeted strategies to retain customers and reduce churn rates.
