ChurnGuard: Predicting Customer Attrition
A Comparative Study of SVM and Logistic Regression
This project predicts customer churn using a dataset obtained from Kaggle, with the goal of building a predictive model that classifies each customer as 'churn' or 'not churn'.

Project Breakdown
The project workflow covers data preprocessing, a train-test split, training of Support Vector Machine (SVM) models, training of regularized logistic regression, and comparison of the models using several evaluation metrics.
Feature selection and hyperparameter tuning are applied to improve model performance.
Accuracy, Precision, Recall, F1-score, Confusion Matrices, and ROC Curves are used to determine which model provides the best classification performance.
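The workflow above can be sketched as follows. This is a minimal illustration, not the project's actual code: synthetic data from make_classification stands in for the Kaggle churn dataset, and the preprocessing (scaling only) and model settings are assumptions.

```python
# Sketch of the split -> scale -> train -> evaluate workflow.
# Synthetic data stands in for the Kaggle churn dataset (an assumption).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, confusion_matrix, roc_auc_score)

# Imbalanced synthetic stand-in: ~80% 'not churn', ~20% 'churn'
X, y = make_classification(n_samples=1000, n_features=10, weights=[0.8],
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# Scale features -- SVMs are sensitive to feature magnitudes
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

model = SVC(kernel="poly", probability=True, random_state=42).fit(X_train, y_train)
y_pred = model.predict(X_test)
y_prob = model.predict_proba(X_test)[:, 1]

# The metrics listed in the text
print("Accuracy :", accuracy_score(y_test, y_pred))
print("Precision:", precision_score(y_test, y_pred))
print("Recall   :", recall_score(y_test, y_pred))
print("F1-score :", f1_score(y_test, y_pred))
print("AUC      :", roc_auc_score(y_test, y_prob))
print("Confusion matrix:\n", confusion_matrix(y_test, y_pred))
```

Because churn data is typically imbalanced, recall and AUC are more informative here than raw accuracy.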
Build Churn Prediction Model
Build an effective churn prediction model that accurately identifies customers likely to leave.
Evaluate SVM Performance
Compare different SVM kernels (Linear, Polynomial, RBF) to determine the best-performing model.
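A kernel comparison of this kind might look like the following sketch, again on synthetic stand-in data; the ranking on the real Kaggle dataset is what the project reports, not what this toy example produces.

```python
# Compare linear, polynomial, and RBF SVM kernels by test-set AUC.
# Synthetic data stands in for the real churn dataset (an assumption).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=600, n_features=10, weights=[0.8],
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

aucs = {}
for kernel in ["linear", "poly", "rbf"]:
    clf = SVC(kernel=kernel, random_state=0).fit(X_train, y_train)
    # decision_function gives a continuous score suitable for ROC/AUC
    aucs[kernel] = roc_auc_score(y_test, clf.decision_function(X_test))

for kernel, auc in sorted(aucs.items(), key=lambda kv: -kv[1]):
    print(f"{kernel:>6} kernel AUC: {auc:.3f}")
```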
Compare Against Logistic Regression
Investigate whether regularized logistic regression can outperform SVM models in churn prediction.
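A sketch of the logistic regression baseline, sweeping the inverse regularization strength C (the values and synthetic data are illustrative assumptions, not the project's actual grid):

```python
# L2-regularized logistic regression at several regularization strengths.
# Synthetic data stands in for the real churn dataset (an assumption).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=600, n_features=10, weights=[0.8],
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

results = {}
for C in [0.01, 0.1, 1.0, 10.0]:
    # Smaller C means stronger L2 regularization
    lr = LogisticRegression(penalty="l2", C=C, max_iter=1000).fit(X_train, y_train)
    results[C] = roc_auc_score(y_test, lr.predict_proba(X_test)[:, 1])

for C, auc in results.items():
    print(f"C={C:<5} AUC={auc:.3f}")
```

Unlike the SVM, the fitted coefficients of logistic regression can be inspected directly, which is the interpretability advantage the findings refer to.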
Key Findings
Polynomial Kernel SVM is the best-performing model for churn prediction. Logistic Regression provides a simpler, more interpretable alternative with comparable performance.
Polynomial Kernel SVM Outperformed Other Models
Achieved the highest recall and the highest AUC (0.83) of the models compared, making it the best choice for predicting customer churn.
Linear Kernel SVM Struggled to Capture Complex Relationships
AUC was only 0.69, indicating that a linear decision boundary is insufficient for this problem.
Hyperparameter Tuning Significantly Improved SVM Performance
GridSearchCV optimization of C, degree (Polynomial), and gamma (RBF) led to better classification accuracy.
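The tuning step described above can be sketched with GridSearchCV; the parameter ranges below are illustrative assumptions, and synthetic data again replaces the real dataset. Recall is used as the selection metric, matching the emphasis in the findings.

```python
# GridSearchCV over C, degree (polynomial kernel), and gamma (RBF kernel).
# Grid values and synthetic data are assumptions for illustration.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=10, weights=[0.8],
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# One sub-grid per kernel, so each kernel only sees its own parameters
param_grid = [
    {"kernel": ["poly"], "C": [0.1, 1, 10], "degree": [2, 3, 4]},
    {"kernel": ["rbf"], "C": [0.1, 1, 10], "gamma": ["scale", 0.1, 1.0]},
]
grid = GridSearchCV(SVC(random_state=0), param_grid, scoring="recall", cv=5)
grid.fit(X_train, y_train)

print("Best params   :", grid.best_params_)
print("Best CV recall:", round(grid.best_score_, 3))
print("Test recall   :", round(grid.score(X_test, y_test), 3))
```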
