This is a simple vector database. Not for use in production.
Originally this was a basic vector database that used k-means to reduce the number of vector lookups. It has since become more of a playground for other algorithms used in vector databases as well.
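To illustrate the k-means idea mentioned above, here is a minimal pure-Python sketch. It is not this project's code or API: `KMeansIndex`, `n_probe`, and the helper functions are hypothetical names chosen for the example. Vectors are grouped around centroids at build time, and a query is compared only against the vectors in the closest cluster(s) instead of against every stored vector.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centroids, v):
    """Index of the centroid closest to v (first one wins on ties)."""
    return min(range(len(centroids)), key=lambda i: squared_distance(centroids[i], v))


def kmeans(vectors, k, iterations=10, seed=0):
    """Plain Lloyd's k-means; returns k centroids."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            clusters[nearest(centroids, v)].append(v)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


class KMeansIndex:
    """Hypothetical index: one bucket of vector ids per centroid."""

    def __init__(self, vectors, k):
        self.vectors = vectors
        self.centroids = kmeans(vectors, k)
        self.buckets = [[] for _ in range(k)]
        for idx, v in enumerate(vectors):
            self.buckets[nearest(self.centroids, v)].append(idx)

    def search(self, query, top_k=1, n_probe=1):
        """Return ids of the top_k closest vectors found in the n_probe nearest buckets."""
        order = sorted(
            range(len(self.centroids)),
            key=lambda i: squared_distance(self.centroids[i], query),
        )
        candidates = [idx for c in order[:n_probe] for idx in self.buckets[c]]
        candidates.sort(key=lambda idx: squared_distance(self.vectors[idx], query))
        return candidates[:top_k]


data = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
index = KMeansIndex(data, k=2)
print(index.search([10.2, 10.1], top_k=1, n_probe=1))
```

Probing a single bucket is fast but can miss the true nearest neighbor when it sits just across a cluster boundary. Raising `n_probe` trades speed for recall, and probing every bucket is equivalent to a brute-force scan.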
See examples for sample code.
Pinecone has some great articles (such as "vector indexes"), and some of James Briggs's videos on the subject (such as "Faiss - Introduction to Similarity Search") made things much easier to grasp.
These are also good resources:
- Algorithms Powering our Vector Database
- Everything You Need to Know about Vector Index Basics
- Nearest Neighbor Indexes: What Are IVFFlat Indexes in Pgvector and How Do They Work
- Vector Search Explained
- Vector Database Basics: HNSW
- How we built a web-scale vector database
- Building a high recall vector database serving 1 billion embeddings from a single machine, along with the rest of that series of articles.
There are many other algorithms; see ann-benchmarks for comparisons.
apt install llvm-dev libclang-dev clang
pip install maturin
cd vec-db && make install