FastAPI-based Machine Learning Model Training API

Overview

This API, built with FastAPI, automates the training of two machine learning classifiers: Support Vector Classification (SVC) and XGBoost. It provides a structured interface for dataset ingestion, hyperparameter configuration, and performance evaluation, which makes it easier to experiment with different models and tuning strategies.

Features

Dynamic dataset ingestion via API endpoints
Implementation of SVC and XGBoost classifiers
Optional hyperparameter tuning using Optuna
Configurable parameters, including test size and random state
Logging for monitoring and debugging
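
To illustrate what the test size and random state parameters control, here is a minimal standard-library sketch of a seeded hold-out split. It is not the API's implementation (the README does not say how the split is performed); the `split_rows` helper and its rounding rule are assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_rows(
    rows: Sequence[T], test_size: float = 0.2, random_state: int = 42
) -> Tuple[List[T], List[T]]:
    """Shuffle row indices with a fixed seed and hold out a fraction for testing.

    Hypothetical helper: the defaults mirror the API's documented defaults,
    but the splitting logic itself is an assumption.
    """
    if not 0.0 < test_size < 1.0:
        raise ValueError("test_size must be strictly between 0 and 1")
    if len(rows) < 2:
        raise ValueError("need at least two rows to split")
    indices = list(range(len(rows)))
    # A dedicated Random instance keeps the shuffle independent of global state.
    random.Random(random_state).shuffle(indices)
    n_test = max(1, round(len(rows) * test_size))
    test_idx = set(indices[:n_test])
    train = [row for i, row in enumerate(rows) if i not in test_idx]
    test = [row for i, row in enumerate(rows) if i in test_idx]
    return train, test


rows = list(range(10))
train, test = split_rows(rows, test_size=0.2, random_state=42)
print(len(train), len(test))
```

Calling the function again with the same seed returns the same partition, which is the reproducibility that a fixed random state is meant to give.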

Installation

Ensure Python (>=3.8) is installed. Clone the repository and install dependencies:

pip install -r requirements.txt

Running the API

To deploy the API locally, execute:

uvicorn main:app --host 0.0.0.0 --port 8000

API Endpoints

Train Support Vector Classifier (SVC)

POST /train/svc

Request Parameters (multipart/form-data):

data_file (file, required): CSV file containing feature vectors and labels
label_column (string, required): Name of the target variable column
test_column (string, optional): Name of a predefined test set column
test_size (float, default: 0.2): Proportion of the dataset held out for testing
random_state (int, default: 42): Seed value for reproducibility
use_optuna (bool, default: False): Enable hyperparameter optimization
hyperparams (string, optional): JSON-formatted string specifying hyperparameters
n_trials (int, default: 10): Number of trials for Optuna tuning
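
The parameters above can be assembled on the client side as plain form fields. The sketch below uses only the standard library. The `build_form_fields` helper, the column name `target`, and the hyperparameter names `C` and `kernel` are assumptions for illustration; the README does not list which hyperparameters the endpoint accepts.

```python
import json
from typing import Any, Dict, Optional


def build_form_fields(
    label_column: str,
    test_column: Optional[str] = None,
    test_size: float = 0.2,
    random_state: int = 42,
    use_optuna: bool = False,
    hyperparams: Optional[Dict[str, Any]] = None,
    n_trials: int = 10,
) -> Dict[str, str]:
    """Return the non-file form fields as strings (hypothetical helper)."""
    fields = {
        "label_column": label_column,
        "test_size": str(test_size),
        "random_state": str(random_state),
        "use_optuna": "true" if use_optuna else "false",
        "n_trials": str(n_trials),
    }
    # Optional fields are left out entirely when not set.
    if test_column is not None:
        fields["test_column"] = test_column
    if hyperparams is not None:
        # The endpoint documents hyperparams as a JSON-formatted string.
        fields["hyperparams"] = json.dumps(hyperparams)
    return fields


fields = build_form_fields("target", hyperparams={"C": 1.0, "kernel": "rbf"})
print(fields)
```

With the third-party requests library, these fields could then be sent along with the CSV file, for example requests.post("http://localhost:8000/train/svc", data=fields, files={"data_file": open("train.csv", "rb")}), assuming the server is running locally on port 8000 and train.csv is your dataset.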

Sample Response (illustrative values):

{
  "accuracy": 0.92,
  "f1_score": 0.88,
  "confusion_matrix": [[50, 5], [4, 41]]
}

Train XGBoost Classifier

POST /train/xgboost

Request Parameters (identical to SVC endpoint)

Sample Response (illustrative values):

{
  "accuracy": 0.94,
  "f1_score": 0.90,
  "confusion_matrix": [[52, 3], [2, 43]]
}
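
To show how the fields of a response relate to each other, the following sketch derives accuracy and a per-class F1 score from a confusion matrix. It assumes rows are true labels and columns are predicted labels (the scikit-learn convention); the README does not state the convention or how f1_score is averaged, so values returned by the API may be computed differently. The matrix used here is made up.

```python
from typing import List


def accuracy_from_confusion(matrix: List[List[int]]) -> float:
    """Accuracy is the diagonal (correct predictions) over all predictions."""
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def f1_for_class(matrix: List[List[int]], k: int) -> float:
    """F1 for class k, assuming rows are true labels and columns predictions."""
    tp = matrix[k][k]
    fp = sum(matrix[i][k] for i in range(len(matrix))) - tp  # predicted k, was not k
    fn = sum(matrix[k]) - tp  # was k, predicted something else
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


toy = [[3, 1], [1, 5]]  # made-up matrix: 8 of 10 predictions are correct
print(accuracy_from_confusion(toy))
print(round(f1_for_class(toy, 1), 3))
```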

Logging & GPU Utilization

The API uses structured logging to record significant events and errors.
The code includes a GPU availability check, but it is currently disabled.
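
As an illustration of structured logging, the sketch below emits one JSON object per log record using Python's standard logging module. The README does not describe the project's actual logging setup, so the `JsonFormatter` class, the `make_logger` helper, and the logger name `trainer` are assumptions.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # Include the traceback text when an exception is being logged.
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def make_logger(name: str, stream) -> logging.Logger:
    """Build a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate output via the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    return logger


buffer = io.StringIO()
log = make_logger("trainer", buffer)
log.info("training started: model=%s", "svc")
print(buffer.getvalue().strip())
```

Writing one JSON object per line keeps the logs easy to filter by level or logger name with standard tooling.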

Future Enhancements

Standardize dataset storage, retrieval, and deletion
Implement early stopping based on performance metrics
Refine evaluation metric selection for improved generalization

License

This project is distributed under the BSD License.
