Search, Retrieval & Re-ranking

Go beyond simple vector search — hybrid search and re-ranking for better results

Two-Stage Retrieval

Query -> [Stage 1: Candidate Retrieval] -> Top 50 -> [Stage 2: Re-ranking] -> Top 5
              Fast, cheap                           Slower, more accurate

Stage 1: Hybrid Search

Vector search (semantic): Finds meaning, handles synonyms

BM25 (keyword): Matches exact terms, good for codes/names

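Combine both: score = alpha * bm25 + (1 - alpha) * vector, where alpha in [0, 1] sets how much weight keyword matching gets. Normalize both scores to a common range first, because raw BM25 scores are unbounded while cosine similarities stay within [-1, 1].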
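The weighted formula above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the helper names (`min_max`, `hybrid_scores`), the choice of min-max normalization, and the raw scores are all assumptions made for the example.

```python
def min_max(scores):
    """Rescale a {doc_id: score} dict to [0, 1]; constant inputs map to 0.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_scores(bm25, vector, alpha=0.5):
    """Apply score = alpha * bm25 + (1 - alpha) * vector on normalized scores.

    Assumption: a document missing from one retriever contributes 0 on that side.
    """
    bm25_n, vec_n = min_max(bm25), min_max(vector)
    docs = set(bm25_n) | set(vec_n)
    fused = {
        doc: alpha * bm25_n.get(doc, 0.0) + (1 - alpha) * vec_n.get(doc, 0.0)
        for doc in docs
    }
    # Highest fused score first.
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical raw scores from the two retrievers (made-up numbers).
bm25_raw = {"doc_a": 12.0, "doc_b": 4.0, "doc_c": 0.0}
vector_raw = {"doc_a": 0.20, "doc_b": 0.80, "doc_d": 0.50}

ranked = hybrid_scores(bm25_raw, vector_raw, alpha=0.5)
for doc_id, score in ranked:
    print(f"{doc_id}: {score:.3f}")
```

Raising `alpha` toward 1 favors exact keyword matches (useful for codes and names); lowering it toward 0 favors semantic similarity.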

Stage 2: Re-ranking

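A cross-encoder model reads each (query, document) pair together and outputs a relevance score directly, which is usually more accurate than comparing precomputed embeddings. Because the model runs once per pair, it is too slow for a whole corpus, so it is applied only to the Stage 1 candidates.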

Popular re-rankers: Cohere Rerank 3.5, BGE-reranker-v2
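The re-ranking step itself is small: score every candidate against the query, sort, and keep the best few. The sketch below shows that shape with a pluggable scoring function. The `rerank` helper is hypothetical, and `toy_overlap_score` is only a word-overlap placeholder standing in for a real cross-encoder, which would be called in its place.

```python
from typing import Callable, List, Tuple


def rerank(
    query: str,
    candidates: List[str],
    score_pair: Callable[[str, str], float],
    top_k: int = 5,
) -> List[Tuple[str, float]]:
    """Score each (query, document) pair and return the top_k by score."""
    scored = [(doc, score_pair(query, doc)) for doc in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def toy_overlap_score(query: str, doc: str) -> float:
    """Placeholder scorer (NOT a cross-encoder): fraction of query words in doc."""
    q_words = set(query.lower().split())
    d_words = set(doc.lower().split())
    return len(q_words & d_words) / len(q_words) if q_words else 0.0


# Hypothetical Stage 1 candidates.
candidates = [
    "how to reset a router",
    "reset your account password in settings",
    "password strength guidelines",
]

top = rerank("reset account password", candidates, toy_overlap_score, top_k=2)
for doc, score in top:
    print(f"{score:.2f}  {doc}")
```

Keeping the scorer as a parameter means the same function works whether the scores come from a hosted re-ranking API or a local model.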

Putting It All Together

Query -> Embed -> Vector Search + BM25 -> Hybrid Fusion -> Top 50 -> Re-ranker -> Top 5 -> LLM
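To see how the stages connect, here is a toy end-to-end sketch of the pipeline above. Everything in it is a stand-in chosen so the example runs without external services: `keyword_scores` is a simple term count rather than real BM25, `embed` is a bag-of-words counter rather than an embedding model, `toy_rerank_score` is a word-overlap placeholder for a cross-encoder, and the documents are made up.

```python
import math
from collections import Counter

# Hypothetical mini-corpus.
DOCS = {
    "d1": "refund policy for damaged items",
    "d2": "shipping times and refund options",
    "d3": "how to track an order",
    "d4": "items damaged in shipping qualify for a full refund",
}


def tokenize(text):
    return text.lower().split()


def keyword_scores(query, docs):
    """Toy keyword stage: count query-term occurrences (stand-in for BM25)."""
    q = set(tokenize(query))
    return {d: float(sum(1 for t in tokenize(text) if t in q)) for d, text in docs.items()}


def embed(text):
    """Toy 'embedding': bag-of-words counts (stand-in for an embedding model)."""
    return Counter(tokenize(text))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)  # Counter returns 0 for missing keys
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def vector_scores(query, docs):
    q_vec = embed(query)
    return {d: cosine(q_vec, embed(text)) for d, text in docs.items()}


def normalize(scores):
    """Divide by the max so both score sets share a 0..1 range (scores are >= 0 here)."""
    top = max(scores.values(), default=0.0)
    return {d: (s / top if top else 0.0) for d, s in scores.items()}


def toy_rerank_score(query, doc):
    """Placeholder for a cross-encoder: fraction of query words found in doc."""
    q = set(tokenize(query))
    return len(q & set(tokenize(doc))) / len(q) if q else 0.0


def retrieve(query, docs, rerank_score, alpha=0.5, n_candidates=50, n_final=5):
    # Stage 1: hybrid fusion of keyword and vector scores.
    kw = normalize(keyword_scores(query, docs))
    vec = normalize(vector_scores(query, docs))
    fused = {d: alpha * kw[d] + (1 - alpha) * vec[d] for d in docs}
    candidates = sorted(fused, key=fused.get, reverse=True)[:n_candidates]
    # Stage 2: re-rank only the candidates, then keep the final few.
    reranked = sorted(candidates, key=lambda d: rerank_score(query, docs[d]), reverse=True)
    return reranked[:n_final]


result = retrieve("refund damaged items", DOCS, toy_rerank_score, n_candidates=3, n_final=2)
context = "\n".join(DOCS[d] for d in result)  # text that would be passed to the LLM
print(result)
```

In a real system, each toy function would be swapped for the corresponding component while `retrieve` keeps the same two-stage structure.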

Your Turn!

Design a retrieval strategy:

python

apps = {
    'Legal document search': {'exact': True, 'semantic': True, 'rerank': True},
    'E-commerce product search': {'exact': True, 'semantic': True, 'rerank': False},
    'Chatbot memory': {'exact': False, 'semantic': True, 'rerank': False},
}
