ChurnGuard: Predicting Customer Attrition
A Comparative Study of SVM and Logistic Regression
This project predicts customer churn using a dataset obtained from Kaggle, with the goal of building a predictive model that classifies each customer as 'churn' or 'not churn'.

Project Breakdown
The project workflow covers data preprocessing, a train-test split, training of Support Vector Machine (SVM) models, training of regularized logistic regression, and comparison of the models using several evaluation metrics.
Feature selection and hyperparameter tuning are applied to improve model performance.
Accuracy, Precision, Recall, F1-score, Confusion Matrices, and ROC Curves are used to determine which model provides the best classification performance.
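The workflow above can be sketched as follows. This is a minimal illustration, not the project's actual code: synthetic data from make_classification stands in for the Kaggle churn dataset, and the preprocessing (scaling only) and model settings are assumptions.

```python
# Sketch of the split -> scale -> train -> evaluate workflow.
# Synthetic data stands in for the Kaggle churn dataset (an assumption).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, confusion_matrix, roc_auc_score)

# Imbalanced synthetic stand-in: ~80% 'not churn', ~20% 'churn'
X, y = make_classification(n_samples=1000, n_features=10, weights=[0.8],
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# Scale features -- SVMs are sensitive to feature magnitudes
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

model = SVC(kernel="poly", probability=True, random_state=42).fit(X_train, y_train)
y_pred = model.predict(X_test)
y_prob = model.predict_proba(X_test)[:, 1]

# The metrics listed in the text
print("Accuracy :", accuracy_score(y_test, y_pred))
print("Precision:", precision_score(y_test, y_pred))
print("Recall   :", recall_score(y_test, y_pred))
print("F1-score :", f1_score(y_test, y_pred))
print("AUC      :", roc_auc_score(y_test, y_prob))
print("Confusion matrix:\n", confusion_matrix(y_test, y_pred))
```

Because churn data is typically imbalanced, recall and AUC are more informative here than raw accuracy.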
Build Churn Prediction Model
Build an effective churn prediction model that accurately identifies customers likely to leave.
Evaluate SVM Performance
Compare different SVM kernels (Linear, Polynomial, RBF) to determine the best-performing model.
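A kernel comparison of this kind might look like the following sketch, again on synthetic stand-in data; the ranking on the real Kaggle dataset is what the project reports, not what this toy example produces.

```python
# Compare linear, polynomial, and RBF SVM kernels by test-set AUC.
# Synthetic data stands in for the real churn dataset (an assumption).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=600, n_features=10, weights=[0.8],
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

aucs = {}
for kernel in ["linear", "poly", "rbf"]:
    clf = SVC(kernel=kernel, random_state=0).fit(X_train, y_train)
    # decision_function gives a continuous score suitable for ROC/AUC
    aucs[kernel] = roc_auc_score(y_test, clf.decision_function(X_test))

for kernel, auc in sorted(aucs.items(), key=lambda kv: -kv[1]):
    print(f"{kernel:>6} kernel AUC: {auc:.3f}")
```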
Compare Against Logistic Regression
Investigate whether regularized logistic regression can outperform SVM models in churn prediction.
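A sketch of the logistic regression baseline, sweeping the inverse regularization strength C (the values and synthetic data are illustrative assumptions, not the project's actual grid):

```python
# L2-regularized logistic regression at several regularization strengths.
# Synthetic data stands in for the real churn dataset (an assumption).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=600, n_features=10, weights=[0.8],
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

results = {}
for C in [0.01, 0.1, 1.0, 10.0]:
    # Smaller C means stronger L2 regularization
    lr = LogisticRegression(penalty="l2", C=C, max_iter=1000).fit(X_train, y_train)
    results[C] = roc_auc_score(y_test, lr.predict_proba(X_test)[:, 1])

for C, auc in results.items():
    print(f"C={C:<5} AUC={auc:.3f}")
```

Unlike the SVM, the fitted coefficients of logistic regression can be inspected directly, which is the interpretability advantage the findings refer to.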
Key Findings
Polynomial Kernel SVM is the best-performing model for churn prediction. Logistic Regression provides a simpler, more interpretable alternative with comparable performance.
Polynomial Kernel SVM Outperformed Other Models
Achieved the highest recall and the highest AUC (0.83) of the models compared, making it the best choice for predicting customer churn.
Linear Kernel SVM Struggled to Capture Complex Relationships
AUC was only 0.69, indicating that a linear decision boundary is insufficient for this problem.
Hyperparameter Tuning Significantly Improved SVM Performance
GridSearchCV optimization of C, degree (Polynomial), and gamma (RBF) led to better classification accuracy.
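The tuning step described above can be sketched with GridSearchCV; the parameter ranges below are illustrative assumptions, and synthetic data again replaces the real dataset. Recall is used as the selection metric, matching the emphasis in the findings.

```python
# GridSearchCV over C, degree (polynomial kernel), and gamma (RBF kernel).
# Grid values and synthetic data are assumptions for illustration.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=400, n_features=10, weights=[0.8],
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# One sub-grid per kernel, so each kernel only sees its own parameters
param_grid = [
    {"kernel": ["poly"], "C": [0.1, 1, 10], "degree": [2, 3, 4]},
    {"kernel": ["rbf"], "C": [0.1, 1, 10], "gamma": ["scale", 0.1, 1.0]},
]
grid = GridSearchCV(SVC(random_state=0), param_grid, scoring="recall", cv=5)
grid.fit(X_train, y_train)

print("Best params   :", grid.best_params_)
print("Best CV recall:", round(grid.best_score_, 3))
print("Test recall   :", round(grid.score(X_test, y_test), 3))
```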
