Hybrid Retrieval
Retrieval Architecture
OpenTier uses Reciprocal Rank Fusion (RRF) combining two complementary search strategies:
| Strategy | Index | Strength | Weakness |
|---|---|---|---|
| Vector search | HNSW (cosine distance) | Semantic similarity, handles synonyms | Misses exact keywords, opaque |
| Keyword search | GIN (tsvector, ranked with ts_rank_cd) | Exact terms, proper nouns, code | Fails on paraphrasing, semantic gap |
RRF fuses the two ranked lists into a single score computed from each result's rank position rather than its raw score, so the two scoring scales never have to be compared directly.
hybrid_search() SQL Function
Defined in migration 20260115000006:
CREATE OR REPLACE FUNCTION hybrid_search(
query_embedding vector(384), -- encoded query
query_text text, -- raw query text for keyword search
p_user_id text, -- restrict to user's documents + global
p_limit integer DEFAULT 20,
vector_weight float DEFAULT 0.7,
keyword_weight float DEFAULT 0.3
) RETURNS TABLE (
chunk_id uuid,
document_id uuid,
content text,
similarity_score float,
rank integer
)
Algorithm:
-- Vector search branch
WITH vector_results AS (
SELECT dc.id, dc.document_id, dc.content,
1 - (dc.embedding <=> query_embedding) AS sim,
ROW_NUMBER() OVER (ORDER BY dc.embedding <=> query_embedding) AS vrank
FROM document_chunks dc
JOIN documents d ON dc.document_id = d.id
WHERE d.user_id = p_user_id OR d.is_global = true
ORDER BY dc.embedding <=> query_embedding
LIMIT p_limit * 2
),
-- Keyword search branch
keyword_results AS (
SELECT dc.id, dc.document_id, dc.content,
ts_rank_cd(to_tsvector('english', dc.content),
plainto_tsquery('english', query_text)) AS krank_score,
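ROW_NUMBER() OVER (ORDER BY ts_rank_cd(to_tsvector('english', dc.content),
plainto_tsquery('english', query_text)) DESC) AS krank -- SELECT aliases are not visible inside OVER (...)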
FROM document_chunks dc
JOIN documents d ON dc.document_id = d.id
WHERE (d.user_id = p_user_id OR d.is_global = true) -- parentheses needed: AND binds tighter than OR
AND to_tsvector('english', dc.content) @@ plainto_tsquery('english', query_text)
ORDER BY krank_score DESC
LIMIT p_limit * 2
),
-- Reciprocal Rank Fusion
rrf AS (
SELECT
COALESCE(v.id, k.id) AS chunk_id,
COALESCE(v.document_id, k.document_id) AS document_id,
COALESCE(v.content, k.content) AS content,
COALESCE(vector_weight / (60 + v.vrank), 0)
+ COALESCE(keyword_weight / (60 + k.krank), 0) AS rrf_score
FROM vector_results v
FULL OUTER JOIN keyword_results k ON v.id = k.id
)
SELECT chunk_id, document_id, content, rrf_score AS similarity_score,
(ROW_NUMBER() OVER (ORDER BY rrf_score DESC))::integer AS rank -- ROW_NUMBER() returns bigint; rank is declared integer
FROM rrf
ORDER BY rrf_score DESC
LIMIT p_limit
RRF formula: score = vector_weight / (60 + vector_rank) + keyword_weight / (60 + keyword_rank)
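The constant 60 is the conventional RRF smoothing parameter. It flattens the score curve near the top of each list (rank 1 contributes 1/61 and rank 2 contributes 1/62 before weighting), so a single top-ranked hit in one list cannot dominate the fused score.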
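To make the formula concrete, here is a minimal Python sketch of the same weighted fusion applied to two small ranked lists. It is an illustration only, not the service's code: the function names `weighted_rrf` and `top_chunks` and the chunk ids are hypothetical, and ranks are taken as 1-based to match ROW_NUMBER() in the SQL.

```python
from typing import Dict, List

RRF_K = 60  # smoothing constant from the formula above


def weighted_rrf(
    vector_ranked: List[str],
    keyword_ranked: List[str],
    vector_weight: float = 0.7,
    keyword_weight: float = 0.3,
    k: int = RRF_K,
) -> Dict[str, float]:
    """Fuse two ranked lists of chunk ids (best first) into RRF scores."""
    scores: Dict[str, float] = {}
    # A chunk missing from one list simply gets no contribution from it,
    # which mirrors COALESCE(..., 0) over the FULL OUTER JOIN.
    for rank, chunk_id in enumerate(vector_ranked, start=1):
        scores[chunk_id] = scores.get(chunk_id, 0.0) + vector_weight / (k + rank)
    for rank, chunk_id in enumerate(keyword_ranked, start=1):
        scores[chunk_id] = scores.get(chunk_id, 0.0) + keyword_weight / (k + rank)
    return scores


def top_chunks(scores: Dict[str, float], limit: int) -> List[str]:
    """Return chunk ids ordered by descending fused score."""
    return sorted(scores, key=lambda cid: scores[cid], reverse=True)[:limit]


# Hypothetical inputs: "a" and "c" appear in both lists, "b" and "d" in one.
fused = weighted_rrf(["a", "b", "c"], ["c", "a", "d"])
ranking = top_chunks(fused, limit=4)
```

In this example the chunks found by both branches ("a" and "c") end up ahead of "b", even though "b" is ranked second by vector search, because each branch adds its own contribution to the score.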
Pipeline: HybridSearchEngine
Index Specifications
Vector index (HNSW):
CREATE INDEX ON document_chunks USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
- m = 16: number of bi-directional links per node (higher = more accurate, more memory)
- ef_construction = 64: search width during index build (higher = more accurate index)
- Search operator: <=> (cosine distance; 0 = identical, 2 = maximally different)
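The distance range quoted above follows from the definition of cosine distance as 1 minus cosine similarity. The sketch below computes it in plain Python for illustration; `cosine_distance` is a hypothetical helper, not part of the database extension, and raising an error for zero-length vectors is a choice made for this sketch.

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Cosine distance is undefined for a zero vector.
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


same = cosine_distance([1.0, 0.0], [2.0, 0.0])       # same direction
orthogonal = cosine_distance([1.0, 0.0], [0.0, 3.0])  # unrelated
opposite = cosine_distance([1.0, 0.0], [-1.0, 0.0])   # opposite direction

# The query reports similarity as 1 - distance.
similarity = 1.0 - same
```

Only direction matters, not magnitude, which is why the first pair has distance 0 despite the different lengths.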
Full-text search index (GIN):
CREATE INDEX ON document_chunks USING gin(to_tsvector('english', content));
- 'english' dictionary: stemming, stopword removal
- ts_rank_cd: rank by cover density (proximity of matched terms)
- plainto_tsquery: converts raw user query to tsquery without requiring special syntax
Access Control in Search
The WHERE d.user_id = p_user_id OR d.is_global = true clause enforces document-level access control at the database layer:
- A user can only retrieve chunks from documents they own
- Global documents (is_global = true, set by admins) are available to all users
- There is no role-based document access beyond this binary global/private distinction
- Access control is enforced in SQL rather than in application code, making it harder to bypass accidentally
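The visibility rule can be modelled as a small predicate. The Python sketch below is only an illustration of the logic; the `Document` type and function names are hypothetical. It also shows why the ownership/global check needs parentheses whenever it is combined with another condition: AND binds tighter than OR, in SQL as in Python.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    user_id: str     # owner of the document
    is_global: bool  # True when the document is shared with all users


def is_visible(doc: Document, requester_id: str) -> bool:
    """Mirror of: d.user_id = p_user_id OR d.is_global = true."""
    return doc.user_id == requester_id or doc.is_global


def visible_and_matching(doc: Document, requester_id: str, matches_query: bool) -> bool:
    """Access check grouped first, then the extra filter applied to both cases."""
    return (doc.user_id == requester_id or doc.is_global) and matches_query


def unparenthesized(doc: Document, requester_id: str, matches_query: bool) -> bool:
    """Without parentheses this parses as: owner OR (global AND matches)."""
    return doc.user_id == requester_id or doc.is_global and matches_query


own = Document(user_id="alice", is_global=False)
private_other = Document(user_id="bob", is_global=False)
shared = Document(user_id="bob", is_global=True)
```

With the unparenthesized form, a document owned by the requester passes even when the extra filter fails, so the filter silently applies to global documents only.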
QueryPipeline Weights
Default weights (configurable):
- vector_weight = 0.7 (70% semantic)
- keyword_weight = 0.3 (30% keyword)
These weights are currently service-level configuration, not per-request settings. ChatConfig.context_limit can be set per request, but the proto does not expose vector_weight directly.
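Since the weights live in service configuration, the service is the natural place to pass them to hybrid_search(). The following is a rough sketch of how such a call could be assembled in Python, under several assumptions: `SearchWeights` and `build_hybrid_search_call` are hypothetical names, the `%s` placeholders assume a driver with that parameter style (psycopg, for example), and the embedding is sent as a bracketed text literal cast to vector.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

EMBEDDING_DIM = 384  # matches vector(384) in the function signature


@dataclass(frozen=True)
class SearchWeights:
    """Service-level weights (hypothetical config type); defaults as documented."""
    vector_weight: float = 0.7
    keyword_weight: float = 0.3


HYBRID_SEARCH_SQL = (
    "SELECT chunk_id, document_id, content, similarity_score, rank "
    "FROM hybrid_search(%s::vector, %s, %s, %s, %s, %s)"
)


def build_hybrid_search_call(
    embedding: Sequence[float],
    query_text: str,
    user_id: str,
    limit: int = 20,
    weights: SearchWeights = SearchWeights(),
) -> Tuple[str, tuple]:
    """Return the SQL text and positional parameters for one search."""
    if len(embedding) != EMBEDDING_DIM:
        raise ValueError(f"expected {EMBEDDING_DIM} dimensions, got {len(embedding)}")
    # Text form of a vector literal: "[0.1,0.2,...]"
    vector_literal = "[" + ",".join(repr(float(x)) for x in embedding) + "]"
    params = (
        vector_literal,
        query_text,
        user_id,
        limit,
        weights.vector_weight,
        weights.keyword_weight,
    )
    return HYBRID_SEARCH_SQL, params


sql, params = build_hybrid_search_call([0.0] * EMBEDDING_DIM, "reset password", "user-123")
```

Passing every value as a bound parameter, rather than formatting it into the SQL string, keeps user-supplied query text out of the statement itself.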