Data Science
Duration:
10.2022 - 01.2023
Client:
Anonymous client in the telecommunication industry
Technologies:
Python, pandas, XGBoost, scikit-learn, MLflow, FastAPI, Gradio, Docker, Kubernetes
Situation
A client in the telecommunications sector was experiencing significant customer churn, leading to a substantial loss of market share.
Task
The client required a reliable predictive system to identify customers at risk of churning so that it could offer them targeted special deals and retain them. The client also requested a high degree of interpretability so that the causes of churn could be identified accurately.
Action
I conducted an exploratory data analysis to identify anomalies in the dataset. Subsequently, I developed a comprehensive system to classify customers into "low risk" and "high risk" categories. This system included:
Customer data import via spreadsheet format or SQL database.
Data cleaning to ensure accuracy and consistency.
Training ensembles of decision trees.
Cross-validation to ensure model robustness.
Performance evaluation and ongoing monitoring using MLflow.
Model serving via FastAPI to facilitate integration with the client's existing infrastructure.
Appealing visualization of customer churn likelihoods, the most important input features, and the underlying ML models via Gradio.
Identification of causal relationships between input data and churn likelihood.
Deployment in the client's Kubernetes environment for scalability.
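The training, cross-validation, and risk-classification steps listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the client system: the data is synthetic, the feature names and the 0.5 risk threshold are hypothetical, and scikit-learn's GradientBoostingClassifier stands in for the tree ensembles (the project's technology list also names XGBoost).

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for the cleaned customer table.
# Feature names are hypothetical and chosen only for readability.
feature_names = ["tenure_months", "monthly_charges", "support_calls", "data_usage_gb"]
X_raw, y = make_classification(
    n_samples=500, n_features=4, n_informative=3, n_redundant=1, random_state=0
)
X = pd.DataFrame(X_raw, columns=feature_names)

# An ensemble of shallow decision trees (gradient boosting).
model = GradientBoostingClassifier(n_estimators=100, max_depth=3, random_state=0)

# 5-fold stratified cross-validation to estimate out-of-sample ROC AUC.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
auc_scores = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
print("ROC AUC per fold:", np.round(auc_scores, 3))

# Fit on all data, then bucket customers by predicted churn probability.
model.fit(X, y)
churn_probability = model.predict_proba(X)[:, 1]
RISK_THRESHOLD = 0.5  # assumed cut-off; in practice tuned to the campaign budget
risk_label = np.where(churn_probability >= RISK_THRESHOLD, "high risk", "low risk")

# Impurity-based feature importances as a first, coarse interpretability signal.
importances = pd.Series(model.feature_importances_, index=feature_names)
print(importances.sort_values(ascending=False))
```

Impurity-based importances only rank features by how much the trees rely on them; they do not by themselves establish causal relationships, which need a separate analysis.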
Result
The client could automatically identify at-risk customers and approach them through a targeted email campaign. This proactive approach significantly reduced customer churn and helped the client maintain and ultimately secure its market share.