Vector Stores for RAG (Postgres + pgvector)¶
A vector store is where you keep:
- your chunk text
- its metadata (source, section, timestamps, tags)
- its embedding vector
Then you run nearest-neighbor search to find the most relevant chunks for a user query.
This guide is pgvector-first: Postgres + the pgvector extension.
Why pgvector as the default¶
pgvector is a strong default for many teams because:
- You likely already run Postgres.
- You get metadata filters (JSONB) and SQL joins for free.
- You can do full-text search + vector search in one system.
- Fewer moving parts compared to running a separate vector database.
Trade-offs:
- At very large scale, a specialized vector DB can be simpler to operate for pure ANN workloads.
- You must be intentional about indexes and query patterns.
Enable pgvector¶
In Postgres:
CREATE EXTENSION IF NOT EXISTS vector;
Reference schema (documents + chunks)¶
This schema keeps documents separate from chunks; you can also store only a chunks table if you don't need document-level metadata.
Note
The vector(1536) dimension below assumes an embedding model that returns 1536-d vectors (commonly used for text-embedding-3-small). If you switch embedding models, update the dimension.
CREATE TABLE IF NOT EXISTS documents (
id UUID PRIMARY KEY,
source TEXT NOT NULL, -- file path or URL
title TEXT,
metadata JSONB NOT NULL DEFAULT '{}'::jsonb,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE TABLE IF NOT EXISTS chunks (
id BIGSERIAL PRIMARY KEY,
doc_id UUID NOT NULL REFERENCES documents(id) ON DELETE CASCADE,
chunk_index INT NOT NULL,
section_path TEXT,
content TEXT NOT NULL,
metadata JSONB NOT NULL DEFAULT '{}'::jsonb,
embedding vector(1536) NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
tsv tsvector GENERATED ALWAYS AS (to_tsvector('english', content)) STORED
);
CREATE INDEX IF NOT EXISTS chunks_doc_id_idx ON chunks (doc_id);
CREATE INDEX IF NOT EXISTS chunks_metadata_gin_idx ON chunks USING GIN (metadata);
CREATE INDEX IF NOT EXISTS chunks_tsv_gin_idx ON chunks USING GIN (tsv);
Similarity search queries¶
Top‑K dense retrieval (cosine distance)¶
pgvector provides distance operators for use in ORDER BY. The most common are:
- <=> cosine distance
- <-> L2 (Euclidean) distance
- <#> negative inner product
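As a rough mental model (not pgvector's implementation), the three operators correspond to these plain-Python formulas. Note that <=> is 1 minus cosine similarity, and <#> negates the inner product so that smaller always means "closer":

```python
import math

def l2_distance(a, b):
    # pgvector's <-> operator: Euclidean distance
    return math.dist(a, b)

def neg_inner_product(a, b):
    # pgvector's <#> operator: inner product, negated so smaller = closer
    return -sum(x * y for x, y in zip(a, b))

def cosine_distance(a, b):
    # pgvector's <=> operator: 1 - cosine similarity
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm
```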
Cosine distance example:
SELECT
  chunks.id,
  chunks.doc_id,
  documents.source,
  chunks.section_path,
  chunks.content,
  chunks.embedding <=> $1::vector AS distance
FROM chunks
JOIN documents ON documents.id = chunks.doc_id
ORDER BY chunks.embedding <=> $1::vector
LIMIT $2;
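When binding $1 from application code, the embedding is typically passed as a pgvector text literal and cast with ::vector (driver adapters such as the pgvector-python package can also register a native type). A minimal sketch, assuming the embedding is a plain list of floats:

```python
def to_vector_literal(embedding):
    # pgvector accepts text input like '[0.1,0.2,0.3]', cast via ::vector
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"

# Usage with a DB-API cursor (cur is a hypothetical open cursor):
# cur.execute(query_sql, (to_vector_literal(query_embedding), 8))
```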
Add metadata filters¶
SELECT
chunks.id,
documents.source,
chunks.content
FROM chunks
JOIN documents ON documents.id = chunks.doc_id
WHERE chunks.metadata @> '{"product":"enterprise"}'::jsonb
ORDER BY chunks.embedding <=> $1::vector
LIMIT 8;
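The @> operator tests JSONB containment: a row matches when the filter object is contained in the chunk's metadata. A simplified sketch of those semantics for flat and nested objects (Postgres also defines containment for arrays, which this omits):

```python
def jsonb_contains(doc, filt):
    # Simplified @> semantics: every key/value in filt must appear in doc;
    # nested objects are checked recursively.
    if not isinstance(doc, dict) or not isinstance(filt, dict):
        return doc == filt
    for key, value in filt.items():
        if key not in doc:
            return False
        if isinstance(value, dict):
            if not jsonb_contains(doc[key], value):
                return False
        elif doc[key] != value:
            return False
    return True
```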
Indexing for speed (HNSW vs IVFFlat)¶
For small datasets, sequential scans can be fine.
For larger datasets, add an ANN index:
HNSW (recommended default)¶
CREATE INDEX IF NOT EXISTS chunks_embedding_hnsw_idx
ON chunks
USING hnsw (embedding vector_cosine_ops);
IVFFlat (works well with tuning)¶
CREATE INDEX IF NOT EXISTS chunks_embedding_ivfflat_idx
ON chunks
USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
Notes:
- Build the IVFFlat index after the table has data (the lists are trained from existing rows), run ANALYZE afterwards, and expect recall to depend on tuning lists and the query-time ivfflat.probes setting.
- HNSW builds more slowly and uses more memory, but usually gives good recall and query speed without heavy tuning.
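For choosing lists, the pgvector docs suggest a starting point of rows / 1000 for up to about 1M rows and sqrt(rows) beyond that. A small helper sketch:

```python
import math

def ivfflat_lists(n_rows):
    # Starting-point heuristic from the pgvector docs:
    # rows / 1000 up to 1M rows, sqrt(rows) beyond that.
    if n_rows <= 1_000_000:
        return max(1, n_rows // 1000)
    return int(math.sqrt(n_rows))
```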
Hybrid search teaser (dense + keyword)¶
Dense search can miss exact keywords (IDs, error codes). Postgres makes it easy to combine:
- dense: pgvector
- sparse: full-text search (tsvector)
See: SQL for RAG
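One common way to merge the two result sets is reciprocal rank fusion (RRF), which needs only the ranked ID lists from each query. A minimal sketch (k = 60 is the conventional default constant, not something this guide prescribes):

```python
def rrf(rankings, k=60):
    # rankings: list of ranked id lists, e.g. [dense_ids, keyword_ids].
    # Each id scores 1 / (k + rank) per list it appears in; sum and sort.
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```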
Optional: framework integrations¶
LangChain
The PGVector wrapper in langchain_postgres can manage schema + inserts.
LlamaIndex
PGVectorStore provides a vector store abstraction on top of pgvector.
Next steps¶
- Improve retrieval beyond top‑K: Retrieval Strategies (Beyond Top‑K)
- Build an end-to-end system: Build a Complete RAG with pgvector