ML Task Round
This project demonstrates data analysis and machine learning techniques on a cleaned dataset. The work is divided into two main tasks:
- K-Means Clustering and Statistical Analysis
- Multi-Layer Perceptron (MLP) Training and Imputation
analysis.py: The main Python script containing all the code for the project.
Task1_Visualization.pdf: Contains the visualizations for Task 1 (K-Means Clustering and statistical analysis).
Task2_Visualization.pdf: Contains the visualizations for Task 2 (MLP training loss and actual vs predicted scatterplot).
- Perform clustering on numeric features of the dataset using K-Means.
- Evaluate clusters using the Silhouette Score.
- Visualize data distributions and clustering results.
- Handle missing values using KNN imputation.
- Train a Multi-Layer Perceptron (MLP) regressor to predict target values.
- Evaluate model performance using RMSE (Root Mean Squared Error).
- Visualize the training loss curve and prediction results.
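The steps listed above can be sketched end to end as follows. This is a minimal illustration, not the code in analysis.py: the synthetic data, the column names (`f1` to `f4`, `target`), the number of clusters, the 5% missing rate, and the MLP layer sizes are all assumptions chosen only to make the example run.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.impute import KNNImputer
from sklearn.metrics import mean_squared_error, silhouette_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the cleaned dataset (hypothetical columns).
rng = np.random.default_rng(0)
features = ["f1", "f2", "f3", "f4"]
df = pd.DataFrame(rng.normal(size=(300, 4)), columns=features)
df["target"] = 2.0 * df["f1"] - df["f2"] + rng.normal(scale=0.1, size=300)

# Task 1: K-Means on scaled numeric features, scored with the Silhouette Score.
X_scaled = StandardScaler().fit_transform(df[features])
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)  # k=3 is an assumption
labels = kmeans.fit_predict(X_scaled)
sil = silhouette_score(X_scaled, labels)
print(f"Silhouette Score: {sil:.3f}")

# Task 2: blank out about 5% of the feature values to simulate missing data.
X_missing = df[features].mask(rng.random((300, 4)) < 0.05)
X_train, X_test, y_train, y_test = train_test_split(
    X_missing, df["target"], test_size=0.2, random_state=0
)

# Fit the KNN imputer on the training split only, then apply it to both splits.
imputer = KNNImputer(n_neighbors=5)
X_train_imp = imputer.fit_transform(X_train)
X_test_imp = imputer.transform(X_test)

# Scale the features and train the MLP regressor.
scaler = StandardScaler().fit(X_train_imp)
mlp = MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=1000, random_state=0)
mlp.fit(scaler.transform(X_train_imp), y_train)

# Evaluate with RMSE on the held-out split.
y_pred = mlp.predict(scaler.transform(X_test_imp))
rmse = float(np.sqrt(mean_squared_error(y_test, y_pred)))
print(f"RMSE: {rmse:.3f}")

# mlp.loss_curve_ holds the per-iteration training loss and can be plotted
# with matplotlib; y_test against y_pred gives the actual vs predicted scatterplot.
```

Fitting the imputer and scaler on the training split only keeps information from the held-out rows out of the model, which is one reasonable design choice for this kind of pipeline.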
Ensure you have Python 3 installed on your system. On a Mac with an M1 chip, use a Python build that matches your architecture (arm64) so the libraries below install as native packages.
Install the required Python libraries using the command below:
pip install pandas numpy scikit-learn matplotlib
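To confirm the installation, a short check like the one below can be run. It is a sketch rather than part of the project; note that scikit-learn is imported under the name `sklearn`, which differs from its pip package name.

```python
import importlib

# Import names for the four required libraries.
modules = ["pandas", "numpy", "sklearn", "matplotlib"]

versions = {}
for name in modules:
    try:
        versions[name] = importlib.import_module(name).__version__
    except ImportError:
        versions[name] = None  # library is not installed in this environment

for name, version in versions.items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any library reported as NOT INSTALLED can be installed by rerunning the pip command above.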