rag-architect

Summary

Production-grade RAG system design covering chunking, embeddings, vector stores, hybrid search, reranking, and retrieval evaluation.

  • Guides a five-step core workflow with a validation checkpoint after each step: requirements analysis, vector store design, chunking strategy, retrieval pipeline configuration, and quality evaluation
  • Supports multiple vector databases (Pinecone, Weaviate, Chroma, pgvector, Qdrant) with schema design, indexing, and sharding strategies
  • Implements hybrid search combining dense vector retrieval with BM25 keyword search, plus reranking via Cohere for top-k result refinement
  • Includes an evaluation framework using RAGAS metrics (context precision, context recall, faithfulness, answer relevancy) to validate retrieval quality before LLM integration
  • Provides reference guides for embedding model selection, semantic chunking, query expansion, and multi-tenant filtering with deduplication via deterministic IDs
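
The hybrid search bullet above does not say how the dense and BM25 result lists are merged. One common option is reciprocal rank fusion (RRF), sketched below as an illustration rather than as this skill's own method. The function name, the sample document IDs, and the constant k=60 are assumptions made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60, top_n=5):
    """Fuse several ranked lists of document IDs into one list.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default, not a value
    taken from this skill.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by ID for determinism.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in fused[:top_n]]


# Hypothetical results from a dense (vector) retriever and a BM25 retriever.
dense_hits = ["doc_a", "doc_b", "doc_c"]
bm25_hits = ["doc_b", "doc_d", "doc_a"]

fused_ids = reciprocal_rank_fusion([dense_hits, bm25_hits], top_n=3)
print(fused_ids)
```

Documents that rank well in both lists rise to the top, and the fused top-n list is what would then be handed to a reranker for final ordering. RRF only needs ranks, so the dense and BM25 scores do not have to be normalized against each other.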
SKILL.md

RAG Architect

Core Workflow

  1. Requirements Analysis — Identify retrieval needs, latency constraints, accuracy requirements, and scale
  2. Vector Store Design — Select database, schema design, indexing strategy, sharding approach
  3. Chunking Strategy — Document splitting, overlap, semantic boundaries, metadata enrichment
  4. Retrieval Pipeline — Embedding selection, query transformation, hybrid search, reranking
  5. Evaluation & Iteration — Metrics tracking, retrieval debugging, continuous optimization
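
Step 3 can be illustrated with a minimal sketch that splits text into overlapping windows and attaches metadata plus a deterministic ID to each chunk. It uses fixed-size character windows as a simplification and does not attempt semantic boundary detection. The function names, record layout, and the tenant and source values are hypothetical.

```python
import hashlib
import string


def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character windows that overlap."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


def chunk_id(tenant_id, source, chunk):
    """Deterministic ID: the same tenant, source, and text always hash the same."""
    key = "\x1f".join([tenant_id, source, chunk])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def build_records(tenant_id, source, text, chunk_size=200, overlap=50):
    """Return one record per unique chunk, with metadata for filtering."""
    records = {}
    for position, chunk in enumerate(chunk_text(text, chunk_size, overlap)):
        cid = chunk_id(tenant_id, source, chunk)
        # Identical content maps to the same key, so it is stored only once.
        records.setdefault(cid, {
            "id": cid,
            "text": chunk,
            "metadata": {"tenant_id": tenant_id, "source": source, "position": position},
        })
    return list(records.values())


doc = string.ascii_lowercase + string.ascii_uppercase  # 52 characters of sample text
records = build_records("tenant-1", "guide.md", doc, chunk_size=20, overlap=5)
print(len(records), records[0]["id"][:12])
```

Because the ID is derived from the content, ingesting the same document twice produces the same IDs, so an upsert into the vector store overwrites rather than duplicates. Including the tenant in the hash keeps identical text from two tenants in separate records.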

For each step, validate before moving on (see checkpoints below).
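
As an illustration of validating a step before moving on, the sketch below scores a retriever against a small labeled query set and gates on a threshold. It uses plain ID-based precision and recall at k as a simple stand-in, not the RAGAS metrics themselves. The query set, the document IDs, and the MIN_RECALL threshold are all hypothetical.

```python
def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved IDs that are labeled relevant."""
    top_k = retrieved[:k]
    if not top_k:
        return 0.0
    hits = sum(1 for doc_id in top_k if doc_id in relevant)
    return hits / len(top_k)


def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant IDs that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def evaluate(results, gold, k=3):
    """Average both metrics over every query in the labeled set."""
    precisions = [precision_at_k(results[q], gold[q], k) for q in gold]
    recalls = [recall_at_k(results[q], gold[q], k) for q in gold]
    return {
        "precision": sum(precisions) / len(precisions),
        "recall": sum(recalls) / len(recalls),
    }


# Hypothetical labeled queries (relevant IDs) and retriever output (ranked IDs).
gold = {"q1": {"d1", "d2"}, "q2": {"d5"}}
results = {"q1": ["d1", "d3", "d2"], "q2": ["d4", "d5", "d6"]}

scores = evaluate(results, gold, k=3)

# Hypothetical checkpoint threshold; tune it to the accuracy requirements.
MIN_RECALL = 0.8
checkpoint_passed = scores["recall"] >= MIN_RECALL
print(scores, checkpoint_passed)
```

A gate like this can run after any change to chunking, embeddings, or fusion, so a regression in retrieval is caught before it is masked by the generation step.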

Reference Guide

Load detailed guidance based on context:

| Topic | Reference | Load When |
|-------|-----------|-----------|
| Vector Databases | references/vector-databases.md | Comparing Pinecone, Weaviate, Chroma, pgvector, Qdrant |

More from jeffallan/claude-skills

Installs: 2.2K
GitHub Stars: 9.0K
First Seen: Jan 21, 2026