arthurrasmusson
Principal AI Engineer at Weka.
Previously: Model Efficiency, Cohere.
Co-Founder, arccompute.io. Co-Creator, GPU Virtual Machine (Linux-GVM.org) & LibVF.IO.
Popular repositories
vLLM-Intercept (Public)
Intercept OpenAI's FastAPI and re-route requests to a local vLLM server. (Future TRT-LLM support planned.)
Python