The main goal of the project is to identify potential reasons behind customer attrition at the bank and to predict which customers are likely to leave, so that the bank can be creative and proactive in its engagement with those customers.
As a data objective, we would also like to identify the machine learning algorithm with the best accuracy and the factors that algorithm treats as most important.
Performed EDA and identified indicators and red flags that can serve as signals to highlight customers who might churn.
Preprocessed the data, including feature transformation, feature scaling, and addressing the imbalanced dataset.
Ran linear models (Logistic Regression, SVM) and non-linear models (Decision Trees, Random Forest, CatBoost) to select the model with the best accuracy and recall scores.
Optimized the best models and tuned their hyperparameters using Grid Search CV to further improve model performance.
Built a client-facing dashboard in Power BI to track the most important features and metrics from the modeling and EDA conclusions.
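
The preprocessing, model comparison, and tuning steps above can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the project's actual code: the data is a synthetic imbalanced set from `make_classification` standing in for the bank data, `class_weight="balanced"` is only one assumed way of addressing the imbalance, and the model list, parameter grid, and variable names are all hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in for the churn data (assumption: class 1 = churned).
X, y = make_classification(
    n_samples=1000, n_features=10, n_informative=5,
    weights=[0.85, 0.15], random_state=42,
)
# Stratify so the train and test sets keep the same class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# One linear and one non-linear candidate; the linear model gets feature scaling.
# class_weight="balanced" reweights classes to counter the imbalance.
candidates = {
    "logistic_regression": make_pipeline(
        StandardScaler(),
        LogisticRegression(class_weight="balanced", max_iter=1000),
    ),
    "random_forest": RandomForestClassifier(
        n_estimators=50, class_weight="balanced", random_state=42
    ),
}

# Compare candidates on both accuracy and recall for the churn class.
scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    scores[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "recall": recall_score(y_test, pred),
    }

# Tune one candidate with a small, illustrative grid, optimizing recall.
param_grid = {"max_depth": [4, 8, None], "min_samples_leaf": [1, 5]}
search = GridSearchCV(
    RandomForestClassifier(n_estimators=50, class_weight="balanced", random_state=42),
    param_grid, scoring="recall", cv=3,
)
search.fit(X_train, y_train)

# Feature importances of the tuned model, e.g. as input for a dashboard.
importances = search.best_estimator_.feature_importances_

for name, s in scores.items():
    print(f"{name}: accuracy={s['accuracy']:.3f} recall={s['recall']:.3f}")
print("best params:", search.best_params_)
```

Reporting recall alongside accuracy matters here because, on an imbalanced churn dataset, a model can reach high accuracy while missing most of the customers who actually leave.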