Vector Stores for RAG (Postgres + pgvector)¶
A vector store is where you keep:
- your chunk text
- its metadata (source, section, timestamps, tags)
- its embedding vector
Then you run nearest-neighbor search to find the most relevant chunks for a user query.
This guide is pgvector-first: Postgres + the pgvector extension.
Why pgvector as the default¶
pgvector is a strong default for many teams because:
- You likely already run Postgres.
- You get metadata filters (JSONB) and SQL joins for free.
- You can do full-text search + vector search in one system.
- Fewer moving parts compared to running a separate vector database.
Trade-offs:
- At very large scale, a specialized vector DB can be simpler to operate for pure ANN workloads.
- You must be intentional about indexes and query patterns.
Enable pgvector¶
In Postgres:
CREATE EXTENSION IF NOT EXISTS vector;
Reference schema (documents + chunks)¶
This schema keeps documents separate from chunks; you can also store only a chunks table if you don't need document-level metadata.
Note
The vector(1536) dimension below assumes an embedding model that returns 1536-d vectors (commonly used for text-embedding-3-small). If you switch embedding models, update the dimension.
CREATE TABLE IF NOT EXISTS documents (
id UUID PRIMARY KEY,
source TEXT NOT NULL, -- file path or URL
title TEXT,
metadata JSONB NOT NULL DEFAULT '{}'::jsonb,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE TABLE IF NOT EXISTS chunks (
id BIGSERIAL PRIMARY KEY,
doc_id UUID NOT NULL REFERENCES documents(id) ON DELETE CASCADE,
chunk_index INT NOT NULL,
section_path TEXT,
content TEXT NOT NULL,
metadata JSONB NOT NULL DEFAULT '{}'::jsonb,
embedding vector(1536) NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
tsv tsvector GENERATED ALWAYS AS (to_tsvector('english', content)) STORED
);
CREATE INDEX IF NOT EXISTS chunks_doc_id_idx ON chunks (doc_id);
CREATE INDEX IF NOT EXISTS chunks_metadata_gin_idx ON chunks USING GIN (metadata);
CREATE INDEX IF NOT EXISTS chunks_tsv_gin_idx ON chunks USING GIN (tsv);
Similarity search queries¶
Top‑K dense retrieval (cosine distance)¶
pgvector provides distance operators for use in ORDER BY. The most common are:
- <=> cosine distance
- <-> L2 (Euclidean) distance
- <#> negative inner product
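As a rough mental model (not pgvector's implementation), the three operators correspond to these plain-Python formulas. Note that <=> is 1 minus cosine similarity, and <#> negates the inner product so that smaller always means "closer":

```python
import math

def l2_distance(a, b):
    # pgvector's <-> operator: Euclidean distance
    return math.dist(a, b)

def neg_inner_product(a, b):
    # pgvector's <#> operator: inner product, negated so smaller = closer
    return -sum(x * y for x, y in zip(a, b))

def cosine_distance(a, b):
    # pgvector's <=> operator: 1 - cosine similarity
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm
```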
Cosine distance example:
SELECT
  chunks.id,
  chunks.doc_id,
  documents.source,
  chunks.section_path,
  chunks.content,
  chunks.embedding <=> $1::vector AS distance
FROM chunks
JOIN documents ON documents.id = chunks.doc_id
ORDER BY chunks.embedding <=> $1::vector
LIMIT $2;
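When binding $1 from application code, the embedding is typically passed as a pgvector text literal and cast with ::vector (driver adapters such as the pgvector-python package can also register a native type). A minimal sketch, assuming the embedding is a plain list of floats:

```python
def to_vector_literal(embedding):
    # pgvector accepts text input like '[0.1,0.2,0.3]', cast via ::vector
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"

# Usage with a DB-API cursor (cur is a hypothetical open cursor):
# cur.execute(query_sql, (to_vector_literal(query_embedding), 8))
```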
Add metadata filters¶
SELECT
chunks.id,
documents.source,
chunks.content
FROM chunks
JOIN documents ON documents.id = chunks.doc_id
WHERE chunks.metadata @> '{"product":"enterprise"}'::jsonb
ORDER BY chunks.embedding <=> $1::vector
LIMIT 8;
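The @> operator tests JSONB containment: a row matches when the filter object is contained in the chunk's metadata. A simplified sketch of those semantics for flat and nested objects (Postgres also defines containment for arrays, which this omits):

```python
def jsonb_contains(doc, filt):
    # Simplified @> semantics: every key/value in filt must appear in doc;
    # nested objects are checked recursively.
    if not isinstance(doc, dict) or not isinstance(filt, dict):
        return doc == filt
    for key, value in filt.items():
        if key not in doc:
            return False
        if isinstance(value, dict):
            if not jsonb_contains(doc[key], value):
                return False
        elif doc[key] != value:
            return False
    return True
```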
Indexing for speed (HNSW vs IVFFlat)¶
For small datasets, sequential scans can be fine.
For larger datasets, add an ANN index:
HNSW (recommended default)¶
CREATE INDEX IF NOT EXISTS chunks_embedding_hnsw_idx
ON chunks
USING hnsw (embedding vector_cosine_ops);
IVFFlat (works well with tuning)¶
CREATE INDEX IF NOT EXISTS chunks_embedding_ivfflat_idx
ON chunks
USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
Notes:
- Build the IVFFlat index after the table has data (the lists are trained from existing rows), run ANALYZE afterwards, and expect recall to depend on tuning lists and the query-time ivfflat.probes setting.
- HNSW builds more slowly and uses more memory, but usually gives good recall and query speed without heavy tuning.
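For choosing lists, the pgvector docs suggest a starting point of rows / 1000 for up to about 1M rows and sqrt(rows) beyond that. A small helper sketch:

```python
import math

def ivfflat_lists(n_rows):
    # Starting-point heuristic from the pgvector docs:
    # rows / 1000 up to 1M rows, sqrt(rows) beyond that.
    if n_rows <= 1_000_000:
        return max(1, n_rows // 1000)
    return int(math.sqrt(n_rows))
```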
Hybrid search teaser (dense + keyword)¶
Dense search can miss exact keywords (IDs, error codes). Postgres makes it easy to combine:
- dense: pgvector
- sparse: full-text search (tsvector)
See: SQL for RAG
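One common way to merge the two result sets is reciprocal rank fusion (RRF), which needs only the ranked ID lists from each query. A minimal sketch (k = 60 is the conventional default constant, not something this guide prescribes):

```python
def rrf(rankings, k=60):
    # rankings: list of ranked id lists, e.g. [dense_ids, keyword_ids].
    # Each id scores 1 / (k + rank) per list it appears in; sum and sort.
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```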
Optional: framework integrations¶
LangChain
The PGVector wrapper in langchain_postgres can manage schema + inserts.
LlamaIndex
PGVectorStore provides a vector store abstraction on top of pgvector.
Next steps¶
- Improve retrieval beyond top‑K: Retrieval Strategies (Beyond Top‑K)
- Build an end-to-end system: Build a Complete RAG with pgvector