We benchmarked runai-model-streamer against CoreWeave's Tensorizer and the Safetensors library, and published the results here
A notable advantage of runai-model-streamer is that its concurrency level is not limited by the size of the model's largest tensor, unlike Tensorizer. The streamer's storage layer is independent of tensor sizes and can be configured to run at whatever concurrency level is optimal for the storage type, which is essential for saturating the storage's read bandwidth. This is why our benchmarks show such high performance when reading from distributed storage such as S3, where the streamer was 7.6× faster than Tensorizer.
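As a rough sketch of what tuning that concurrency looks like in practice (the `runai_model_streamer` Python API and the `RUNAI_STREAMER_CONCURRENCY` environment variable are taken from the project's README; treat the exact names as assumptions):

```python
import os

# Assumed knob: number of concurrent storage readers, independent of tensor
# sizes. Raise it for high-latency backends like S3 to saturate read bandwidth.
os.environ["RUNAI_STREAMER_CONCURRENCY"] = "16"

from runai_model_streamer import SafetensorsStreamer  # assumed package/API

with SafetensorsStreamer() as streamer:
    streamer.stream_file("model.safetensors")
    # Tensors are yielded as their underlying chunks finish reading,
    # so CPU-side work overlaps with storage I/O.
    for name, tensor in streamer.get_tensors():
        print(name, tensor.shape)
```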
Thank you for pointing out the vLLM integration; in fact, there is an open pull request for that in the vLLM repository (covering both local and cloud storage settings).
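Once that pull request lands, usage could mirror vLLM's existing Tensorizer path; here is a hypothetical sketch (the `runai_streamer` load-format name and the example model ID are assumptions, not the merged API):

```python
from vllm import LLM

# Hypothetical: select the streamer as the weight loader, analogous to
# load_format="tensorizer", which vLLM already supports.
llm = LLM(
    model="facebook/opt-125m",      # example model; could be a local or S3 path
    load_format="runai_streamer",   # assumed name, pending the open PR
)
outputs = llm.generate("Hello, world!")
print(outputs[0].outputs[0].text)
```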
Also, for better distribution, you might want to integrate with vLLM, similar to how Tensorizer is supported there.