Vector Database

AI & MACHINE LEARNING

Quick Definition

A vector database stores embeddings (high-dimensional numerical vectors) and supports fast nearest-neighbor search. Where a traditional database asks "find rows where X equals Y", a vector database asks "find vectors closest to this one". The closeness measure is usually cosine similarity or Euclidean distance. Common implementations: Pinecone, Weaviate, Qdrant, Milvus, Chroma, and pgvector (a Postgres extension).
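At toy scale, the core operation is easy to see in plain code. Here is a minimal sketch of cosine-similarity nearest-neighbor search over a few hand-made 3-dimensional vectors (the names and values are illustrative; real embeddings have hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of the norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def nearest(query, vectors, k=2):
    # Brute-force scan: score every stored vector, return the k closest ids.
    scored = sorted(vectors.items(),
                    key=lambda kv: cosine_similarity(query, kv[1]),
                    reverse=True)
    return [name for name, _ in scored[:k]]

# Hypothetical 3-d "embeddings" for three documents.
vectors = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.1, 0.9, 0.3],
}

print(nearest([0.85, 0.15, 0.05], vectors, k=2))  # → ['cat', 'dog']
```

This brute-force scan is exact but O(n) per query, which is exactly what the ANN indexes described below are designed to avoid.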

How it works

Vector databases use approximate nearest neighbor (ANN) algorithms to make similarity search fast at scale. The two dominant families are HNSW (Hierarchical Navigable Small World), which builds a multi-layer proximity graph that is traversed greedily from coarse to fine, and IVF-PQ (Inverted File with Product Quantization), which clusters vectors and compresses them so that only a few clusters need to be scanned per query. Both trade a small amount of recall for an orders-of-magnitude speedup over brute-force search.
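The inverted-file idea can be sketched in a few lines. This toy version skips product quantization and uses two hand-picked centroids instead of learned (k-means) cluster centers; at query time it probes only the `nprobe` nearest clusters rather than scanning every vector:

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical pre-computed cluster centers (a real index learns these).
centroids = [[0.0, 0.0], [10.0, 10.0]]
vectors = {"a": [0.1, 0.2], "b": [0.3, -0.1],
           "c": [9.8, 10.1], "d": [10.2, 9.9]}

# "Index build": assign each vector to its nearest centroid, producing
# inverted lists mapping centroid index -> vector ids.
inverted_lists = {i: [] for i in range(len(centroids))}
for vid, vec in vectors.items():
    closest = min(range(len(centroids)),
                  key=lambda i: euclidean(vec, centroids[i]))
    inverted_lists[closest].append(vid)

def ivf_search(query, nprobe=1):
    # Probe only the nprobe closest clusters instead of the whole corpus.
    probe = sorted(range(len(centroids)),
                   key=lambda i: euclidean(query, centroids[i]))[:nprobe]
    candidates = [vid for i in probe for vid in inverted_lists[i]]
    return min(candidates, key=lambda vid: euclidean(query, vectors[vid]))

print(ivf_search([9.9, 10.0]))  # → c
```

The approximation (and the recall loss) comes from `nprobe`: if the true nearest neighbor lives in a cluster that was not probed, it is simply missed. Raising `nprobe` recovers recall at the cost of speed.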

Most vector databases also support metadata filters (find vectors close to X, but only where category=articles), hybrid search (combining vector and keyword scores), and namespace isolation (for multi-tenancy). The choice between hosted (Pinecone) and self-hosted (Qdrant, pgvector) options usually comes down to scale and operational preference.
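Metadata filtering amounts to restricting the candidate set before (or while) ranking by similarity. A minimal sketch, assuming a toy corpus where each entry stores a payload alongside its vector (ids, field names, and values are all illustrative):

```python
import math

# Toy corpus: each entry pairs a vector with metadata, mirroring how most
# vector databases store payloads alongside embeddings.
corpus = [
    {"id": 1, "vec": [0.9, 0.1], "category": "articles"},
    {"id": 2, "vec": [0.8, 0.2], "category": "videos"},
    {"id": 3, "vec": [0.2, 0.9], "category": "articles"},
]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a))
                  * math.sqrt(sum(y * y for y in b)))

def filtered_search(query, where, k=1):
    # Pre-filter on metadata, then rank the survivors by similarity.
    hits = [e for e in corpus
            if all(e.get(field) == value for field, value in where.items())]
    hits.sort(key=lambda e: cosine(query, e["vec"]), reverse=True)
    return [e["id"] for e in hits[:k]]

print(filtered_search([0.85, 0.15], {"category": "articles"}))  # → [1]
```

Real engines are smarter than this pre-filter-then-scan approach: filtering inside an ANN index without wrecking recall is one of the harder engineering problems in this space.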

Why it matters

Vector databases are the storage layer of modern AI applications. Without them, RAG, semantic search, recommendation engines, and agent memory systems become impractical at scale. As applications generate more vectors per query (multi-vector retrieval, fine-grained chunking), demand for these systems is growing fast.

Where you'll see this on TerminalFeed

TerminalFeed itself does not currently use a vector database, but the Free APIs in 2026 article covers patterns for grounding agents with real-time data, which is the use case vector databases were built for.