RAG Architect

The agent designs, implements, and optimizes production-grade Retrieval-Augmented Generation pipelines, covering the full lifecycle from document chunking through evaluation.

Workflow

Analyse corpus -- Profile the document collection: count, average length, format mix (PDF, HTML, Markdown), language(s), and domain. Validate that sample documents are accessible before proceeding.
Select chunking strategy -- Choose from the Chunking Strategy Matrix based on corpus characteristics. Set chunk size, overlap, and boundary rules. Run a test split on 100 sample documents.
Choose embedding model -- Select an embedding model from the Embedding Model table based on domain, latency budget, and cost constraints. Verify dimension compatibility with the target vector database.
Select vector database -- Pick a vector store from the Vector Database Comparison based on scale, query patterns, and operational requirements.
Design retrieval pipeline -- Configure retrieval strategy (dense, sparse, or hybrid). Add reranking if precision requirements exceed 0.85. Set the top-K parameter and similarity threshold.
Implement query transformations -- If query-document style mismatch exists, enable HyDE. If queries are ambiguous, enable multi-query generation. Validate each transformation improves retrieval metrics on a held-out set.
Configure guardrails -- Enable PII detection, toxicity filtering, hallucination detection, and source attribution. Set confidence score thresholds.
Evaluate end-to-end -- Run the RAGAS evaluation framework. Verify faithfulness > 0.90, context relevance > 0.80, answer relevance > 0.85. Iterate on weak components.

rag-architect

RAG Architect

Workflow

Chunking Strategy Matrix