Context Retrieval

Context retrieval is the process of finding and assembling the most relevant pieces of information from a knowledge base to ground an AI agent's responses in factual, up-to-date data. It is the backbone of Retrieval-Augmented Generation (RAG) and ensures that generated outputs are accurate and verifiable rather than hallucinated.

Workflow

  1. Embed the Query: Convert the user's natural-language query into a dense vector representation using an embedding model (e.g., OpenAI text-embedding-3-small, Cohere embed-v3, or an open-source model like bge-large). The embedding captures the semantic meaning of the query so it can be compared against stored documents.

  2. Search the Vector Store: Send the query embedding to a vector database (Pinecone, Weaviate, Qdrant, Chroma, etc.) and perform an approximate nearest-neighbor (ANN) search. Request the top-k candidate chunks, typically k = 10–20 to give the reranker enough material to work with.

  3. Rerank the Results: Pass the candidate chunks through a cross-encoder reranker (e.g., Cohere Rerank, bge-reranker-large, or a ColBERT model). The reranker scores each chunk against the original query with full attention, producing much more accurate relevance scores than cosine similarity alone. Keep the top-n results (typically n = 3–5).

  4. Assemble the Context Window: Concatenate the selected chunks into a single context block, ordered by relevance score descending. Prepend source metadata (file path, URL, page number) to each chunk so the agent can cite its sources. Ensure the total token count fits the model's budget for the context section of the prompt.

  5. Generate the Response: Feed the assembled context into the LLM prompt alongside the original query and a system instruction that tells the model to answer only from the provided context. This grounds the response in retrieved facts and reduces hallucination.

  6. Validate and Cite: After generation, verify that the answer references information actually present in the retrieved chunks. Attach inline citations or a references section so the user can trace each claim back to a source document.
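The steps above can be sketched end to end in plain Python. The embedding and reranking functions below are deliberately toy stand-ins (character-frequency vectors and token overlap) so the example runs without any external model, vector database, or API; every function and source name here is illustrative, not a specific library's interface.

```python
import math

# End-to-end sketch of the retrieval workflow. embed() and the rerank
# scorer are toy stand-ins; in practice they would be calls to a real
# embedding model, a vector database, and a cross-encoder reranker.

def embed(text: str) -> list[float]:
    # Step 1 (stub): normalized character-frequency vector.
    vocab = "abcdefghijklmnopqrstuvwxyz"
    counts = [text.lower().count(ch) for ch in vocab]
    norm = math.sqrt(sum(c * c for c in counts)) or 1.0
    return [c / norm for c in counts]

def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

def ann_search(query_vec, store, k=10):
    # Step 2: exact nearest-neighbor here; a vector DB would use an ANN index.
    return sorted(store, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)[:k]

def rerank(query, candidates, n=3):
    # Step 3 (stub cross-encoder): score each chunk against the full query.
    q_tokens = set(query.lower().split())
    def score(doc):
        return len(q_tokens & set(doc["text"].lower().split()))
    return sorted(candidates, key=score, reverse=True)[:n]

def assemble_context(chunks):
    # Step 4: prepend source metadata so the model can cite each chunk.
    return "\n\n".join(f"[source: {c['source']}]\n{c['text']}" for c in chunks)

def build_prompt(context, query):
    # Step 5: instruct the model to answer only from the retrieved context.
    system = ("Answer only from the provided context. "
              "If the context does not contain the answer, say so.")
    return f"{system}\n\nContext:\n{context}\n\nQuestion: {query}"

# Toy corpus (source names are illustrative).
docs = [
    {"text": "Rerankers score query chunk pairs with full attention.", "source": "rerank.md"},
    {"text": "Vector stores support approximate nearest neighbor search.", "source": "ann.md"},
    {"text": "Bananas are rich in potassium.", "source": "fruit.md"},
]
for d in docs:
    d["vec"] = embed(d["text"])

query = "how do rerankers score chunks against a query"
candidates = ann_search(embed(query), docs, k=3)
top = rerank(query, candidates, n=2)
context = assemble_context(top)
prompt = build_prompt(context, query)
```

The final prompt would then be sent to the LLM; step 6 (validation and citation) operates on the model's answer, which this sketch does not simulate.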
