The objective of this project is to analyze the data collected by the company in order to optimize the mailing campaigns it runs for its customers. The project also helps Arvato identify who is likely to become a customer by clustering the general population and comparing it with existing customers.
This project consisted of three main steps, followed by a write-up:
- Data exploration and cleaning.
- Clustering, to compare the characteristics of the general population with those of existing customers.
- Customer conversion through supervised learning, to predict how likely a typical person in the population is to become a customer.
- A Medium report explaining the project. You can find the Medium article here.
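The clustering step can be illustrated with a minimal sketch. It is not the notebook's actual code: the synthetic data, the column names (`feat_0` ...), the helper `compare_segments`, and the choices of median imputation, 2 principal components and 3 clusters are all assumptions made for illustration. The idea is to fit scaling, PCA and k-means on the general population, assign customers to the same clusters, and compare the share of each group per cluster.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def compare_segments(population, customers, n_components=2, n_clusters=3, seed=0):
    """Hypothetical helper: cluster the population, then compare cluster shares."""
    pipeline = make_pipeline(
        SimpleImputer(strategy="median"),  # fill missing values
        StandardScaler(),                  # put features on a common scale
        PCA(n_components=n_components, random_state=seed),
        KMeans(n_clusters=n_clusters, n_init=10, random_state=seed),
    )
    # Fit on the population only, then assign customers to the same clusters.
    pop_labels = pipeline.fit_predict(population)
    cust_labels = pipeline.predict(customers)

    result = pd.DataFrame({
        "population": np.bincount(pop_labels, minlength=n_clusters) / len(pop_labels),
        "customers": np.bincount(cust_labels, minlength=n_clusters) / len(cust_labels),
    })
    # ratio > 1 means customers are over-represented in that cluster.
    result["ratio"] = result["customers"] / result["population"]
    return result.sort_values("ratio", ascending=False)


# Synthetic stand-in data (NOT the Arvato files): three well-separated groups.
rng = np.random.default_rng(0)
columns = [f"feat_{i}" for i in range(4)]
centers = np.array([[0, 0, 0, 0], [6, 6, 6, 6], [-6, 6, -6, 6]], dtype=float)

population = pd.DataFrame(
    np.vstack([c + rng.normal(size=(200, 4)) for c in centers]), columns=columns
)
# Customers come mostly from the second group and never from the third.
customers = pd.DataFrame(
    np.vstack([
        centers[1] + rng.normal(size=(150, 4)),
        centers[0] + rng.normal(size=(50, 4)),
    ]),
    columns=columns,
)
population.iloc[::25, 0] = np.nan  # simulate some missing values

shares = compare_segments(population, customers)
print(shares)
```

Clusters with a ratio well above 1 describe the part of the population that looks most like existing customers, which is the group a mailing campaign would plausibly target first.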
The data was provided by Udacity and Arvato Financial Solutions. It consists of 4 data files and 2 description files; the description files contain information about the features:
- Customer Segmentation
  - General population demographics
  - Customer demographics
- Customer Acquisition
  - Training data
  - Test data
- Description files

The workflow is organized as follows:
- Data exploration
  - Assessing data
  - Cleaning data
- Customer segmentation
  - Dimension reduction (PCA)
  - Clustering
- Customer conversion predictions
  - Predicting client responses
  - Training the model
  - Making predictions

The project requires:
- Python 3.8+
- Libraries: NumPy, pandas, scikit-learn, Matplotlib, seaborn
The process has been detailed in the Arvato Project Workbook.ipynb file. Feel free to consult it directly.
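
For readers who want a quick picture of the customer conversion step before opening the notebook, here is a minimal sketch under stated assumptions. It does not reproduce the notebook: the data is synthetic, the column names are hypothetical, and logistic regression with ROC AUC as the metric is an illustrative choice, not necessarily the model or metric used in the project.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the training data (NOT the Arvato files).
rng = np.random.default_rng(42)
n_rows = 2000
X = pd.DataFrame(
    rng.normal(size=(n_rows, 5)), columns=[f"feat_{i}" for i in range(5)]
)
# Hypothetical binary response driven by two of the features.
logit = -3.0 + 2.0 * X["feat_0"] + 1.5 * X["feat_1"]
y = (rng.random(n_rows) < 1.0 / (1.0 + np.exp(-logit))).astype(int)
X.iloc[::20, 2] = np.nan  # simulate some missing values

# Hold out a stratified validation set so both classes appear in each split.
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

model = make_pipeline(
    SimpleImputer(strategy="median"),  # fill missing values
    StandardScaler(),                  # put features on a common scale
    # class_weight="balanced" compensates for responders being the minority.
    LogisticRegression(class_weight="balanced", max_iter=1000),
)
model.fit(X_train, y_train)

# Probability of responding, one value per person in the validation set.
proba = model.predict_proba(X_valid)[:, 1]
auc = roc_auc_score(y_valid, proba)
print(f"Validation ROC AUC: {auc:.3f}")
```

Ranking people by `proba` gives an ordering of who to contact first; a rank-based metric such as ROC AUC is a reasonable fit when responders are rare, because plain accuracy would reward predicting "no response" for everyone.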