This repository shows how to build a machine learning pipeline for a vision model (TensorFlow) from 🤗 Transformers using the TensorFlow ecosystem. In particular, we use TensorFlow Extended (TFX), which internally involves TensorFlow Data Validation (TFDV), Transform (TFT), Model Analysis (TFMA), and Serving (TF Serving) in addition to TensorFlow itself.
NOTE: This is a follow-up project to "Deploying Vision Models (TensorFlow) from 🤗 Transformers", which shows how to deploy a ViT model locally, on Kubernetes, and on the fully managed service Vertex AI.
We show how to build an ML pipeline with TFX step by step:
- As the first step, we show how to build an ML pipeline with the most basic components: `ExampleGen`, `Trainer`, and `Pusher`. These components are responsible for ingesting the raw dataset into the ML pipeline, training a TensorFlow model, and deploying the trained model.
- As the second step, we show how to extend the ML pipeline from the first step by adding more components: `SchemaGen`, `StatisticsGen`, and `Transform`. These components are responsible for analyzing the structure of the dataset, analyzing the statistical traits of its features, and pre-processing the data.
- As the third step, we show how to extend the ML pipeline from the second step by adding more components: `Resolver` and `Evaluator`. These components are responsible for importing existing artifacts (such as a previously trained model) and comparing the performance of two models (one from the `Resolver` and one from the current pipeline run).
- As the fourth step, we show how to extend the ML pipeline from the third step by adding one more component, `Tuner`. This component runs a set of experiments with different sets of hyperparameters for fewer epochs. The best hyperparameter combination found is passed to the `Trainer`, which then trains the model for longer with that combination as the starting point.
- In this optional step, we show how to use custom TFX components for the 🤗 Hub. In particular, we use `HFModelPusher` to push the currently trained model to the 🤗 Model Hub and `HFSpacePusher` to automatically deploy a Gradio application to the 🤗 Space Hub.
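
The progression above can be summarized as a growing dependency graph. The sketch below is not code from this repository and does not use TFX. It is a plain-Python illustration (Python 3.9+, standard library only) of which components exist at each step and one valid order in which they could run. The stage labels (`basic`, `intermediate`, `advanced_1`, `advanced_2`), the `UPSTREAM` table, and both helper functions are hypothetical names introduced for illustration. The dependency edges are an assumption based on how standard TFX components are typically wired, and the optional 🤗 Hub components are left out.

```python
from graphlib import TopologicalSorter

# Components introduced at each step of the list above.
# The stage labels are hypothetical and chosen only for this sketch.
STAGES = {
    "basic": ["ExampleGen", "Trainer", "Pusher"],
    "intermediate": ["SchemaGen", "StatisticsGen", "Transform"],
    "advanced_1": ["Resolver", "Evaluator"],
    "advanced_2": ["Tuner"],
}

# ASSUMPTION: upstream dependencies, following the usual wiring of
# standard TFX components (not taken from this repository's code).
UPSTREAM = {
    "ExampleGen": [],
    "StatisticsGen": ["ExampleGen"],
    "SchemaGen": ["StatisticsGen"],
    "Transform": ["ExampleGen", "SchemaGen"],
    "Tuner": ["Transform"],
    "Trainer": ["ExampleGen", "Transform", "Tuner"],
    "Resolver": [],
    "Evaluator": ["ExampleGen", "Trainer", "Resolver"],
    "Pusher": ["Trainer", "Evaluator"],
}


def components_up_to(stage):
    """Return the set of components present once `stage` has been added."""
    present = set()
    for name, added in STAGES.items():
        present.update(added)
        if name == stage:
            return present
    raise KeyError(f"unknown stage: {stage}")


def execution_order(stage):
    """Return one valid execution order for the pipeline at `stage`."""
    present = components_up_to(stage)
    # Keep only edges whose upstream component exists at this stage.
    graph = {
        component: [u for u in UPSTREAM[component] if u in present]
        for component in sorted(present)
    }
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    for stage in STAGES:
        print(stage, "->", execution_order(stage))
```

For the first step the graph is a simple chain, so the only valid order is `ExampleGen`, `Trainer`, `Pusher`. Later steps admit several valid orders; the sketch returns one of them.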
We are thankful to the ML Developer Programs team at Google for providing GCP support.