This API, built with FastAPI, automates the training of machine learning classifiers, specifically Support Vector Classification (SVC) and XGBoost. It provides a structured interface for dataset ingestion, hyperparameter configuration, and performance evaluation, which makes it easier to experiment with different models and tuning strategies.
- Dynamic dataset ingestion via API endpoints
- SVC and XGBoost classifiers
- Optional hyperparameter tuning with Optuna
- Configurable parameters, including test size and random state
- Structured logging for monitoring and debugging
Ensure Python (>=3.8) is installed. Clone the repository and install dependencies:
pip install -r requirements.txt
To deploy the API locally, execute:
uvicorn main:app --host 0.0.0.0 --port 8000
POST /train/svc
- data_file (file, required): CSV file containing feature vectors and labels
- label_column (string, required): Name of the target variable column
- test_column (string, optional): Name of a predefined test set column
- test_size (float, default: 0.2): Proportion of the dataset held out for validation
- random_state (int, default: 42): Seed value for reproducibility
- use_optuna (bool, default: False): Enable hyperparameter optimization
- hyperparams (string, optional): JSON-formatted string specifying hyperparameters
- n_trials (int, default: 10): Number of trials for Optuna tuning
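
The parameters above can be sent from a small Python client. The following is a minimal sketch, not part of the project itself. It assumes the endpoint accepts a multipart form upload (the usual FastAPI pattern when a file is sent alongside other fields), that the server runs at `http://localhost:8000`, and that the third-party `requests` package is installed. The helper names `build_form_fields` and `train_svc` are hypothetical.

```python
import json


def build_form_fields(label_column, test_column=None, test_size=0.2,
                      random_state=42, use_optuna=False,
                      hyperparams=None, n_trials=10):
    """Build the non-file form fields for a training request.

    Values are sent as strings because multipart form fields are text.
    """
    fields = {
        "label_column": label_column,
        "test_size": str(test_size),
        "random_state": str(random_state),
        "use_optuna": "true" if use_optuna else "false",
        "n_trials": str(n_trials),
    }
    if test_column is not None:
        fields["test_column"] = test_column
    if hyperparams is not None:
        # The API expects hyperparameters as a JSON-formatted string.
        fields["hyperparams"] = json.dumps(hyperparams)
    return fields


def train_svc(csv_path, label_column, base_url="http://localhost:8000", **options):
    """POST a CSV file to /train/svc and return the parsed JSON response.

    Assumption: the server accepts multipart form data at this URL.
    """
    import requests  # third-party; imported here so the helper above stays stdlib-only

    with open(csv_path, "rb") as csv_file:
        response = requests.post(
            f"{base_url}/train/svc",
            data=build_form_fields(label_column, **options),
            files={"data_file": csv_file},
            timeout=600,  # training can take a while; adjust as needed
        )
    response.raise_for_status()
    return response.json()
```

With the server running, a call such as `train_svc("data.csv", "target", use_optuna=True, n_trials=20)` would return the metrics dictionary. The file name and column name here are placeholders, and which hyperparameter keys the server accepts in `hyperparams` depends on its implementation.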
{
"accuracy": 0.92,
"f1_score": 0.88,
"confusion_matrix": [[50, 5], [4, 41]]
}
POST /train/xgboost
{
"accuracy": 0.94,
"f1_score": 0.90,
"confusion_matrix": [[52, 3], [2, 43]]
}
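
A response of this shape can be post-processed on the client side. The sketch below parses the sample body and derives per-class recall from the confusion matrix. It assumes rows are true labels and columns are predicted labels (the scikit-learn convention); the API does not state the orientation, so verify it before relying on these numbers. The function name `per_class_recall` is hypothetical.

```python
import json


def per_class_recall(confusion_matrix):
    """Return the recall of each class.

    Assumption: row i holds the samples whose true label is class i,
    and column j holds the samples predicted as class j.
    """
    recalls = []
    for i, row in enumerate(confusion_matrix):
        total = sum(row)
        # Guard against classes with no samples in the evaluation split.
        recalls.append(row[i] / total if total else 0.0)
    return recalls


# Sample response body copied from the example above.
response_body = (
    '{"accuracy": 0.94, "f1_score": 0.90, '
    '"confusion_matrix": [[52, 3], [2, 43]]}'
)
result = json.loads(response_body)
recalls = per_class_recall(result["confusion_matrix"])

print(f"accuracy={result['accuracy']:.2f}, f1_score={result['f1_score']:.2f}")
for class_index, recall in enumerate(recalls):
    print(f"class {class_index}: recall={recall:.3f}")
```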
- The API uses structured logging to track significant events and errors.
- GPU availability checking is implemented but currently disabled.
- Standardize dataset storage, retrieval, and deletion
- Implement early stopping based on performance metrics
- Refine evaluation metric selection for improved generalization
This project is distributed under the BSD License.