datavistics/encoder-analysis

Introduction

This repository supports a blog post that helps users estimate costs for large-scale classification, embedding, or vision embedding tasks. It provides benchmarking tools for different GPU types, batch sizes, and inference methods, using michaelfeil/infinity and Hugging Face Inference Endpoints.

I considered a variety of factors:

  • GPU type
  • Infinity image type
  • Batch size
  • Number of virtual users (VUs)
  • Model architecture

Installation

I used Python. To set up the environment:

  1. git clone https://github.com/datavistics/encoder-analysis.git
  2. cd encoder-analysis
  3. pip install -r requirements.txt
  4. pip install jupyterlab
  5. Install k6 following the instructions for your platform

Getting Started

Make sure you have the ability to deploy an Inference Endpoint.
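
As a quick sanity check, you can confirm that a valid Hugging Face token is available before deploying anything. This is a minimal sketch using huggingface_hub, not code from the repository:

```python
# Minimal sketch: confirm Hugging Face credentials before deploying an
# Inference Endpoint. Requires the huggingface_hub package.
from huggingface_hub import whoami

user = whoami()  # raises if no valid token is configured
print(f"Authenticated as: {user['name']}")
```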

  1. Run jupyter lab
  2. Choose your task [classification, embedding, vision-embedding]
  3. Run <task>-optimization.ipynb to get the best configuration
  4. Run <task>-analysis.ipynb to visualize the results
  5. Alternatively, run <task>-analysis-gradio.ipynb for more interactive results

Project Structure

  • There are notebooks at the top level for convenience. It's probably cleaner to put them in ./notebooks, but that makes path handling annoying, so I opted for user convenience over aesthetics
    • *-optimization.ipynb - used to generate and run the experiments
    • *-analysis.ipynb - show the analysis in a clean, notebook-centric way
    • *-analysis-gradio.ipynb - show the analysis in an interactive, Gradio-centric way
  • src - I abstracted a fair amount of code here, while keeping the important details in the notebooks
  • templates - the k6 jinja templates used to generate each experiment (see the rendering sketch after this list)
  • data, generated, and results - store non-version-controlled project files
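
To illustrate how those k6 jinja templates might be turned into concrete experiment scripts, here is a minimal rendering sketch with jinja2. The template path and variable names are hypothetical placeholders, not the repository's actual ones:

```python
# Hypothetical sketch: render a k6 jinja template into a runnable script.
# The template path and variables below are illustrative assumptions.
from pathlib import Path
from jinja2 import Template

template = Template(Path("templates/k6-classification.js.j2").read_text())
script = template.render(
    endpoint_url="https://my-endpoint.endpoints.huggingface.cloud",  # assumed
    vus=448,
    batch_size=64,
)
Path("generated/k6-classification.js").write_text(script)
```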

How does it work?

Each of the *-optimization.ipynb notebooks follows this structure:

```mermaid
flowchart TD;
    subgraph Benchmarking Server
        A[k6 Load Testing]
        D[Instance Config]
    end

    subgraph Inference Endpoint
        C[Container Running Infinity]
        E[Next Inference Endpoint]
    end

    D -->|Defines Test Parameters| A
    D -->|Deploys Inference Endpoint| E
    A -->|Sends Test Data| C
    C -->|Processes and Returns| A
```
  1. Define the benchmarking parameters (GPU, batch size, VUs, etc.)
  2. Deploy the inference server (Infinity on Hugging Face Inference Endpoints)
  3. Run k6 performance tests to evaluate speed, cost, and efficiency
  4. Store and visualize the results for optimization
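
In code, one iteration of that loop looks roughly like the sketch below. It uses huggingface_hub's create_inference_endpoint and shells out to the k6 CLI; the model, instance settings, Infinity image details, and file paths are illustrative assumptions rather than the notebooks' exact values:

```python
# Rough sketch of the optimization loop: deploy Infinity on an Inference
# Endpoint, run a k6 test against it, record results, and tear down.
# All names, values, and the custom image config here are assumptions.
import json
import subprocess
from huggingface_hub import create_inference_endpoint

for batch_size in [16, 32, 64, 128]:
    endpoint = create_inference_endpoint(
        f"bench-bs{batch_size}",
        repository="lxyuan/distilbert-base-multilingual-cased-sentiments-student",
        framework="pytorch",
        task="text-classification",
        accelerator="gpu",
        vendor="aws",
        region="us-east-1",
        instance_size="x1",
        instance_type="nvidia-l4",
        type="protected",
        custom_image={  # assumed Infinity image configuration
            "url": "michaelf34/infinity:latest",
            "port": 7997,
            "env": {"BATCH_SIZE": str(batch_size)},
        },
    )
    endpoint.wait()  # block until the endpoint is up

    # Run the generated k6 script; -e exposes __ENV.ENDPOINT_URL to it.
    subprocess.run(
        ["k6", "run", "-e", f"ENDPOINT_URL={endpoint.url}",
         "--summary-export", "summary.json", "generated/bench.js"],
        check=True,
    )
    with open("summary.json") as f:
        results = json.load(f)  # store/aggregate per configuration

    endpoint.delete()  # tear down so you stop paying for the GPU
```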

Results

Do check out these notebooks in nbviewer; I put a lot of effort into making them interactive. Unfortunately, they look better in light mode because of the tables, but follow your heart.

Classification

For lxyuan/distilbert-base-multilingual-cased-sentiments-student on a dataset like tyqiangz/multilingual-sentiments (using the text column), we can run 1 billion classifications for only $253.82.

| GPU       | Image   | Batch Size | VUs | Min Cost |
|-----------|---------|------------|-----|----------|
| nvidia-l4 | default | 64         | 448 | $253.82  |

[classification-results.png — interactive version available in nbviewer]
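
The headline number is simple arithmetic: the GPU-hours needed for one billion requests times the hourly instance rate. A worked sketch, where the hourly price and throughput are illustrative assumptions rather than measured values from this repo:

```python
# Cost-per-billion sketch. Both constants are assumptions for illustration.
HOURLY_RATE_USD = 0.80      # assumed nvidia-l4 rate on Inference Endpoints
THROUGHPUT_PER_SEC = 875.0  # assumed sustained classifications per second

hours_needed = 1e9 / (THROUGHPUT_PER_SEC * 3600)
cost_per_billion = hours_needed * HOURLY_RATE_USD
print(f"{hours_needed:.1f} GPU-hours -> ${cost_per_billion:,.2f}")
# roughly 317.5 GPU-hours -> $253.97, in the ballpark of the table above
```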

Embedding

For Alibaba-NLP/gte-modernbert-base on a dataset like sentence-transformers/trivia-qa-triplet (using the positive column), we can generate 1 billion embeddings for only $409.44.

| GPU       | Batch Size | VUs | Min Cost |
|-----------|------------|-----|----------|
| nvidia-l4 | 256        | 32  | $409.44  |

[embedding-results.png — interactive version available in nbviewer]

Vision Embedding

For vidore/colqwen2-v1.0-merged on a dataset like openbmb/RLAIF-V-Dataset (using the image column), we can generate 1 billion ColBERT-style embeddings (late interaction) on images for $44,496.51.

| GPU       | Batch Size | VUs | Min Cost   |
|-----------|------------|-----|------------|
| nvidia-l4 | 4          | 4   | $44,496.51 |

[vision-embedding-results.png — interactive version available in nbviewer]

