What?
DragonflyDB supports vector embedding search via the FT.SEARCH command. However, it currently only supports search over vectors whose weights are of the float32 datatype (4 bytes per weight).
Please prioritise supporting embedding search using 1-bit weights. It would also be nice to support intermediate levels like 2-bit and 4-bit embeddings, but 1-bit is what I'm most interested in.
It is not mandatory for you to provide any logic for how to generate or quantise the embeddings. Search is what is most important.
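For context, sign-quantising a float32 embedding down to 1-bit can be done client-side in a few lines; this is a minimal sketch (not a proposed API), just to show that Dragonfly would only need to store and search the packed bytes:

```python
def quantise_1bit(vec):
    """Sign-quantise a float vector to packed bytes (1 bit per dimension)."""
    bits = 0
    for i, w in enumerate(vec):
        if w > 0:          # positive weight -> bit set, else bit clear
            bits |= 1 << i
    return bits.to_bytes((len(vec) + 7) // 8, "little")

# Dimensions 0 and 3 are positive, so bits 0 and 3 are set: 0b1001 = 0x09.
packed = quantise_1bit([0.3, -1.2, 0.0, 0.7])
print(packed)  # b'\t'
```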
It is okay if your implementation's search time is 50 milliseconds worse than the state of the art. It will still be useful if it is as easy to use as the rest of DragonflyDB. Ease of use is an important factor that some libraries fail on.
Why?
In practice, quantising embeddings drastically reduces RAM requirements without significant loss in accuracy, so many devs are quantising their embeddings instead.
The accuracy loss is not significant; there is empirical data you can google on this, and I've also observed it on my own dataset.
Hosting 32-bit embeddings versus 1-bit embeddings can mean the difference between paying $10,000/month and $300/month in hosting costs. Devs working on independent projects or in small startups will obviously pick the latter.
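The savings behind those cost numbers are simple arithmetic: 1-bit weights are 32x smaller than float32 (raw vector storage only; the index structure adds some overhead on top). For illustration, with 10 million 768-dimensional embeddings:

```python
# Back-of-envelope RAM for raw vector storage (index overhead excluded).
# n and dim are illustrative; 768 is a common embedding dimension.
n, dim = 10_000_000, 768
float32_bytes = n * dim * 4      # 4 bytes per weight
onebit_bytes = n * (dim // 8)    # 1 bit per weight, packed into bytes
print(float32_bytes / 1e9, "GB vs", onebit_bytes / 1e9, "GB")  # 30.72 GB vs 0.96 GB
```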
This is one of the highest revenue-generating use cases of the AI revolution. Perplexity recently raised $500 million at an $8 billion valuation largely to build more embedding search. Pinecone is the most popular hosted solution for embedding search; it was last valued at $750 million.
You can use Perplexity or Claude or ChatGPT yourself to confirm that RAG and embedding search add significant real value to users' lives; they are not just buzzwords.
Whichever library is able to support devs for this use case is very likely to increase in popularity.
Alternatives
I have been using Pinecone for smaller datasets, as it is the fastest way to host embeddings today, IMO. However, it is expensive for large datasets, and it does not support 1-bit embeddings yet.
The FAISS library by Facebook is currently one of the easier-to-use open-source libraries for anyone who wants 1-bit embedding search.
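To make concrete what the feature amounts to: search over 1-bit vectors reduces to Hamming distance, i.e. XOR followed by popcount (this is the same core operation FAISS exposes through its binary indexes such as IndexBinaryFlat). A stdlib-only brute-force sketch, assuming vectors are already packed into bytes:

```python
def hamming(a: bytes, b: bytes) -> int:
    """Hamming distance between two packed 1-bit vectors: XOR then popcount."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

def knn(query: bytes, corpus: list, k: int) -> list:
    """Indices of the k corpus vectors nearest to query by Hamming distance."""
    return sorted(range(len(corpus)), key=lambda i: hamming(query, corpus[i]))[:k]

# Three toy 16-bit vectors; the query matches corpus[0] exactly.
corpus = [b"\x0f\x00", b"\xff\x00", b"\x00\xff"]
print(knn(b"\x0f\x00", corpus, 2))  # [0, 1]
```

A real implementation would use SIMD popcount and an ANN index rather than a linear scan, but the distance function itself is this simple, which is part of why 1-bit search is fast.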
ANN Benchmarks tries to benchmark existing embedding-search libraries. Not all of them are easy to use, though, and not all of them support quantisation.