
Predicting Santander Bank Customers’ Satisfaction in Real Time

This project forecasts Santander Bank customers' satisfaction in real time, giving the company metrics to act more quickly to prevent customer losses and to retain satisfied customers.

Santander Customer Satisfaction Dashboard

Using the Python programming language together with Apache Spark (a cluster computing platform), I collected large amounts of data to build a supervised machine learning predictive model on historical data.

With the data provided, I carried out data loading, feature engineering, cleaning and transformation, in-memory processing for speed, exploratory analyses that highlighted opportunities for the company, machine learning, and a series of other practices in Spark.

After selecting the best machine learning model, I simulated a real environment in which batches of previously unseen data arrived over a whole working week (Monday to Friday). The results are presented in Power BI with interactive charts that show unsatisfied customers and point decision makers to opportunities for action.
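The weekly simulation can be illustrated with a minimal sketch in plain Python. This is not the project's Spark code: `score_customer` is a hypothetical stand-in for the trained model, the `complaints` feature and the toy records are invented for illustration, and the CSV layout is only one possible shape for a file that a dashboard tool such as Power BI could read.

```python
import csv
import io

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]


def score_customer(features):
    """Hypothetical stand-in for the trained model: 1 = unsatisfied, 0 = satisfied."""
    # Assumption: a simple threshold on an invented feature replaces the real model.
    return 1 if features["complaints"] >= 2 else 0


def summarize_batch(day, batch):
    """Score one day's batch and return the figures a dashboard would plot."""
    unsatisfied = sum(score_customer(row) for row in batch)
    total = len(batch)
    rate = round(unsatisfied / total, 3) if total else 0.0
    return {
        "day": day,
        "customers": total,
        "unsatisfied": unsatisfied,
        "unsatisfied_rate": rate,
    }


def run_week(batches_by_day):
    """Process Monday to Friday in order; a day without data yields an empty summary."""
    return [summarize_batch(day, batches_by_day.get(day, [])) for day in WEEKDAYS]


def to_csv(rows):
    """Serialize the daily summaries as CSV text for a reporting tool to import."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["day", "customers", "unsatisfied", "unsatisfied_rate"],
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Toy batches (invented values), standing in for previously unseen daily data.
sample_week = {
    "Monday": [{"complaints": 0}, {"complaints": 3}],
    "Tuesday": [{"complaints": 2}, {"complaints": 2}, {"complaints": 1}],
}

weekly_summary = run_week(sample_week)
print(to_csv(weekly_summary))
```

Keeping the per-day summary separate from the scoring function means the threshold rule could be swapped for a real model's prediction call without changing the reporting step.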

A further differentiator of this project is the presentation of Spark jobs being executed in real time through PySparkShell, a powerful tool for interactive analysis of data behavior.

Download Full Project
