
ONNXRuntime taking up too much memory #483

Open
AlterHoodie opened this issue Feb 20, 2025 · 5 comments
AlterHoodie commented Feb 20, 2025

ONNXRuntime takes up too much memory (it seems to accumulate, because I believe it's not freeing unused memory) when trying to embed large collections of data.
Am I missing something, or is this a problem with the runtime itself?
I am trying to embed about 10,000 documents (average size: 3,000 characters) using the JinaAI ColBERT model (a late-interaction model).
GPU: Tesla T4, 16 GB VRAM
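One way to confirm that memory is accumulating across documents is to watch the process's resident memory grow as the loop runs. A minimal sketch follows; `embed` here is a placeholder for the real ColBERT call (it just fabricates one 128-dim vector per token), and `resource.getrusage` reports peak RSS on Unix:

```python
import resource

def peak_rss_kib() -> int:
    """Peak resident memory of this process (KiB on Linux, bytes on macOS)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

def embed(doc: str) -> list[list[float]]:
    # Placeholder for the real ColBERT call: one 128-dim vector per token.
    return [[0.0] * 128 for _ in doc.split()]

before = peak_rss_kib()
for doc in ["lorem ipsum " * 250] * 100:  # ~3,000-char documents
    _ = embed(doc)
after = peak_rss_kib()
print(f"peak RSS grew by {after - before} KiB over 100 docs")
```

With the real embedder swapped in, a steadily climbing delta (rather than a plateau after warm-up) points at a leak rather than normal working-set growth.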


joein commented Feb 20, 2025

What's the batch size you're using?
Are you keeping the embeddings in memory, or are you uploading them somewhere else / writing them to disk?
ColBERT embeddings are quite large, since ColBERT produces a 128-dim embedding per token.
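The point about per-token embeddings can be made concrete with rough arithmetic. Assuming ~4 characters per token (an English-text rule of thumb, not from the thread) and float32 storage, the footprint of the reported workload is:

```python
docs = 10_000
chars_per_doc = 3_000
chars_per_token = 4          # rough average for English text; an assumption
dims = 128                   # ColBERT per-token dimensionality
bytes_per_value = 4          # float32

tokens_per_doc = chars_per_doc // chars_per_token          # 750 tokens
bytes_per_doc = tokens_per_doc * dims * bytes_per_value    # 384,000 bytes
total_gib = docs * bytes_per_doc / 2**30
print(f"{bytes_per_doc / 1024:.0f} KiB per doc, {total_gib:.1f} GiB total")
# → 375 KiB per doc, 3.6 GiB total
```

So even the legitimate output is a few gigabytes; anything the runtime fails to release on top of that will quickly exhaust a 16 GB card.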

AlterHoodie (Author) commented

I'm dumping them into pickle files every 1,000 docs; the batch size is just 1.
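That setup can be sketched as follows (`embed` is again a placeholder for the real ColBERT call, and the chunked-pickle helper is hypothetical). Note that `del` and `gc.collect()` only release the Python-side objects; they cannot reclaim memory held inside ONNX Runtime itself, which is why this pattern still leaks if the runtime doesn't free its buffers:

```python
import gc
import pickle
import tempfile
from pathlib import Path

def embed(doc: str) -> list[list[float]]:
    # Placeholder for the real ColBERT call: one 128-dim vector per token.
    return [[0.0] * 128 for _ in doc.split()]

def embed_in_chunks(docs: list[str], out_dir: Path, chunk_size: int = 1000) -> int:
    """Embed docs, writing one pickle per chunk and dropping each chunk's
    embeddings from Python memory before starting the next one."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = 0
    for start in range(0, len(docs), chunk_size):
        chunk = [embed(d) for d in docs[start:start + chunk_size]]
        with open(out_dir / f"embeddings_{start:06d}.pkl", "wb") as f:
            pickle.dump(chunk, f)
        written += 1
        del chunk      # releases Python objects only...
        gc.collect()   # ...not buffers held internally by onnxruntime
    return written

out = Path(tempfile.mkdtemp())
n_files = embed_in_chunks(["lorem ipsum"] * 5, out, chunk_size=2)
print(f"wrote {n_files} pickle files")
```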


joein commented Feb 20, 2025

As of now, I've been able to reproduce the issue, and it does indeed look like a problem with onnxruntime not freeing up the space.
However, we may need more time to investigate it. Thank you for pointing it out.

AlterHoodie (Author) commented

Hey, any updates on this?


joein commented Mar 11, 2025

Yeah, we're working on a fix #493
