
Real-Time Customer Churn Prediction in Telecom Companies

This project was developed to predict customer churn at a telecom company in real time, providing metrics that let the company act more quickly to prevent customer losses and retain satisfied customers.

States.png
Churn.png

Using the Python programming language together with Apache Spark (a cluster computing platform), large amounts of data were collected in order to build a supervised machine learning predictive model on historical data.

With the data provided, I developed data loading; feature engineering, cleaning, and transformation; in-memory processing for speed; exploratory analyses highlighting opportunities for the company; machine learning; and a series of other practices in Spark.

After exploring all the information provided by the data, two predictive classification models were trained and their accuracy was analyzed on test data. Once the best machine learning model was selected, new and unseen data was fed into it to test its effectiveness, with the results presented in a confusion matrix.
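
To illustrate how a confusion matrix summarizes churn predictions, here is a minimal sketch in plain Python. It is not the project's code, which ran on Spark. The function names and the sample labels below are hypothetical, and the sketch assumes a binary label where 1 means the customer churned.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count (actual, predicted) pairs for a binary churn label (1 = churned)."""
    counts = Counter(zip(actual, predicted))
    return {
        "tp": counts[(1, 1)],  # churned, predicted churn
        "fn": counts[(1, 0)],  # churned, predicted stay
        "fp": counts[(0, 1)],  # stayed, predicted churn
        "tn": counts[(0, 0)],  # stayed, predicted stay
    }


def accuracy(cm):
    """Share of all predictions that were correct."""
    total = sum(cm.values())
    return (cm["tp"] + cm["tn"]) / total if total else 0.0


# Hypothetical labels for eight customers (illustrative only, not project data).
actual = [1, 0, 0, 1, 0, 1, 0, 0]
predicted = [1, 0, 1, 1, 0, 0, 0, 0]

cm = confusion_matrix(actual, predicted)
print(cm)
print(f"accuracy = {accuracy(cm):.2f}")
```

Reading the four cells separately matters for churn: a false negative is a customer who leaves without warning, which accuracy alone can hide when churners are a small share of the data.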

Another distinguishing feature of this project is the presentation of Spark jobs running in real time through the PySpark shell, a powerful tool for interactive analysis of data behavior.

Download Full Project
