Data Science
Duration:
10.2022 - 01.2023
Client:
Anonymous client in the telecommunication industry
Technologies:
Python, Pandas, XGBoost, scikit-learn, MLflow, FastAPI, Gradio, Docker, Kubernetes
Situation
A client in the telecommunications sector was experiencing significant customer churn, leading to a substantial loss of market share.
Task
The client required a reliable predictive system to identify customers at risk of churning, so that it could offer them targeted special deals and retain them. The client also requested a high degree of interpretability so that the causes of churn could be identified accurately.
Action
I conducted an exploratory data analysis to identify anomalies in the dataset. Subsequently, I developed a comprehensive system to classify customers into "low risk" and "high risk" categories. This system included:
- Customer data import from spreadsheets or an SQL database.
- Data cleaning to ensure accuracy and consistency.
- Training ensembles of decision trees.
- Cross-validation to ensure model robustness.
- Performance evaluation and ongoing monitoring using MLflow.
- Model serving via FastAPI to facilitate integration with the client's existing infrastructure.
- Clear visualization via Gradio of customers' churn likelihoods, the most important input features, and the underlying ML models.
- Identification of causal relationships between input data and churn likelihood.
- Deployment in the client's Kubernetes environment for scalability.
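As a rough illustration of the training, cross-validation, and risk-classification steps listed above, here is a minimal sketch using scikit-learn on synthetic data. It does not show the client's data or the production configuration: the generated dataset, the choice of RandomForestClassifier as the tree ensemble, the 0.5 threshold, and all variable names are assumptions made for this example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict

# Synthetic stand-in for the confidential customer table (assumption):
# 500 customers, 8 numeric features, roughly 20% churners.
X, y = make_classification(
    n_samples=500,
    n_features=8,
    n_informative=4,
    weights=[0.8, 0.2],
    random_state=0,
)

# An ensemble of decision trees; the model type and settings are illustrative.
model = RandomForestClassifier(n_estimators=50, random_state=0)

# Out-of-fold churn probabilities from 5-fold stratified cross-validation,
# so every customer is scored by a model that never saw them in training.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
churn_proba = cross_val_predict(model, X, y, cv=cv, method="predict_proba")[:, 1]
auc = roc_auc_score(y, churn_proba)

# Hypothetical cut-off for splitting customers into the two risk categories.
RISK_THRESHOLD = 0.5
risk_labels = np.where(churn_proba >= RISK_THRESHOLD, "high risk", "low risk")

# Impurity-based feature importances from a fit on all data. These rank
# features by predictive association only; they are not causal estimates.
model.fit(X, y)
top_features = np.argsort(model.feature_importances_)[::-1][:3]

print(f"cross-validated ROC AUC: {auc:.3f}")
print(f"customers flagged high risk: {(risk_labels == 'high risk').sum()}")
print(f"top feature indices: {top_features.tolist()}")
```

In practice the threshold would be tuned to the cost of a retention offer versus the cost of a lost customer, and the cross-validated metrics would be logged to a tracking tool such as MLflow.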
Result
The client could automatically identify at-risk customers and approach them through a targeted email campaign. This proactive approach significantly reduced customer churn and helped the client maintain and eventually secure its market share.