rag-architect
Production-grade RAG system design covering chunking, embeddings, vector stores, hybrid search, reranking, and retrieval evaluation.
- Guides five core workflow steps: requirements analysis, vector store design, chunking strategy, retrieval pipeline configuration, and quality evaluation with checkpoints
- Supports multiple vector databases (Pinecone, Weaviate, Chroma, pgvector, Qdrant) with schema design, indexing, and sharding strategies
- Implements hybrid search combining dense vector retrieval with BM25 keyword search, plus reranking via Cohere for top-k result refinement
- Includes an evaluation framework using RAGAS metrics (context precision, context recall, faithfulness, answer relevancy) to validate retrieval quality before LLM integration
- Provides reference guides for embedding model selection, semantic chunking, query expansion, and multi-tenant filtering with deduplication via deterministic IDs
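The hybrid search bullet above does not say how the dense and BM25 result lists are merged. One common option is reciprocal rank fusion (RRF), sketched here as an assumption rather than as the skill's own method. The function name, the document IDs, and the two hit lists are hypothetical; `k=60` is a conventional default, not a tuned value.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by ID for stable output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists; in practice these would come from the
# vector store and a BM25 index respectively.
dense_hits = ["doc_a", "doc_b", "doc_c"]
bm25_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([dense_hits, bm25_hits])
print(fused)
```

Documents that appear in both lists rise to the top because their reciprocal ranks add up. A reranker would then typically be applied to the first few fused results only.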
RAG Architect
Core Workflow
- Requirements Analysis — Identify retrieval needs, latency constraints, accuracy requirements, and scale
- Vector Store Design — Select database, schema design, indexing strategy, sharding approach
- Chunking Strategy — Document splitting, overlap, semantic boundaries, metadata enrichment
- Retrieval Pipeline — Embedding selection, query transformation, hybrid search, reranking
- Evaluation & Iteration — Metrics tracking, retrieval debugging, continuous optimization
For each step, validate before moving on (see checkpoints below).
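As an illustration of the chunking step, here is a minimal sketch of fixed-size splitting with overlap, combined with deterministic chunk IDs so that re-ingesting the same document produces the same keys. It assumes character-based windows and a SHA-256 hash of source name plus chunk text; the function names, the record fields, and the sizes are hypothetical, and semantic-boundary splitting is not shown.

```python
import hashlib


def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character windows that overlap."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


def chunk_id(source, chunk):
    """Derive a stable ID from the source name and the chunk text."""
    digest = hashlib.sha256(f"{source}\n{chunk}".encode("utf-8"))
    return digest.hexdigest()[:16]


def build_records(source, text, chunk_size=200, overlap=50):
    """Map deterministic IDs to chunk records; identical chunks collapse."""
    records = {}
    for position, chunk in enumerate(chunk_text(text, chunk_size, overlap)):
        records.setdefault(
            chunk_id(source, chunk),
            {"source": source, "position": position, "text": chunk},
        )
    return records


text = "abcdefghijklmnopqrstuvwxyz"
chunks = chunk_text(text, chunk_size=10, overlap=4)
first_run = build_records("guide.md", text, chunk_size=10, overlap=4)
second_run = build_records("guide.md", text, chunk_size=10, overlap=4)
print(chunks)
print(sorted(first_run) == sorted(second_run))
```

Because the ID depends only on the source name and the chunk text, an upsert keyed on it would overwrite an existing entry instead of creating a duplicate. Changing the chunk size or overlap changes the chunk text and therefore the IDs, so a re-chunked corpus would need a full re-index.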
Reference Guide
Load detailed guidance based on context:
| Topic | Reference | Load When |
|---|---|---|
| Vector Databases | references/vector-databases.md | Comparing Pinecone, Weaviate, Chroma, pgvector, Qdrant |