pgvector Tutorial 2026: Vector Similarity Search in PostgreSQL
Add semantic search to your PostgreSQL database for RAG without extra infrastructure
pgvector Tutorial 2026: Vector Similarity Search in PostgreSQL
Add semantic search to your PostgreSQL database for RAG without extra infrastructure
Complete pgvector tutorial: installation, HNSW indexes, cosine similarity operators, hybrid BM25+vector search, and a full RAG pipeline in Python.
pgvector Tutorial 2026: Vector Search in PostgreSQL for RAG
pgvector adds vector similarity search to PostgreSQL, enabling RAG without a separate database.
Installation
sql
CREATE EXTENSION vector;
bash
Docker with pgvector
docker run -d -e POSTGRES_PASSWORD=pass -p 5432:5432 pgvector/pgvector:pg16
Schema
sql
CREATE TABLE documents (
id BIGSERIAL PRIMARY KEY,
content TEXT NOT NULL,
metadata JSONB DEFAULT '{}',
embedding vector(1536),
created_at TIMESTAMPTZ DEFAULT NOW()
);-- HNSW index (best performance)
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
Python Integration
python
import psycopg2
from openai import OpenAI
import jsonopenai = OpenAI()
conn = psycopg2.connect('postgresql://postgres:pass@localhost:5432/mydb')
cur = conn.cursor()
def embed(text: str) -> list:
return openai.embeddings.create(model='text-embedding-3-small', input=text).data[0].embedding
def insert(content: str, metadata: dict = None):
cur.execute(
'INSERT INTO documents (content, metadata, embedding) VALUES (%s, %s, %s) RETURNING id',
(content, json.dumps(metadata or {}), embed(content))
)
doc_id = cur.fetchone()[0]
conn.commit()
return doc_id
def search(query: str, top_k: int = 5):
q = embed(query)
cur.execute(
'SELECT content, 1 - (embedding <=> %s::vector) AS score FROM documents '
'ORDER BY embedding <=> %s::vector LIMIT %s',
(q, q, top_k)
)
return [{'content': r[0], 'score': float(r[1])} for r in cur.fetchall()]
Hybrid Search: BM25 + Vector
python
def hybrid_search(query: str, top_k: int = 5, vec_weight: float = 0.7):
q = embed(query)
bm25_w = 1 - vec_weight
cur.execute(
'SELECT content, %s * ts_rank(to_tsvector(\'english\', content), plainto_tsquery(\'english\', %s)) '
'+ %s * (1 - (embedding <=> %s::vector)) AS score '
'FROM documents '
"WHERE to_tsvector('english', content) @@ plainto_tsquery('english', %s) "
'ORDER BY score DESC LIMIT %s',
(bm25_w, query, vec_weight, q, query, top_k)
)
return [{'content': r[0], 'score': float(r[1])} for r in cur.fetchall()]
Full RAG Pipeline
python
def answer(question: str) -> str:
docs = hybrid_search(question, top_k=5)
context = '\n\n'.join(d['content'] for d in docs)
r = openai.chat.completions.create(
model='gpt-4o-mini',
messages=[
{'role': 'system', 'content': f'Answer using context:\n\n{context}'},
{'role': 'user', 'content': question}
]
)
return r.choices[0].message.contentinsert('PostgreSQL is a powerful relational database', {'source': 'docs'})
insert('pgvector adds vector similarity search to PostgreSQL', {'source': 'docs'})
print(answer('What is pgvector?'))
Performance Tuning
sql
SET hnsw.ef_search = 100; -- Higher recall, slower (default: 40)
EXPLAIN ANALYZE SELECT * FROM documents ORDER BY embedding <=> '...'::vector LIMIT 10;
Conclusion
pgvector is the pragmatic choice for PostgreSQL users. Hybrid BM25 + vector search covers most RAG needs with zero extra infrastructure.
相关工具
相关教程
Automatically classify, summarize, and draft replies to emails using AI
Build voice AI applications with natural-sounding TTS and custom voice cloning
Transcribe audio files, meetings, and real-time speech with Whisper