ai-rag
RAG & Search Engineering — Complete Reference
Build production-grade retrieval systems with hybrid search, grounded generation, and measurable quality.
This skill covers:
- RAG: chunking, contextual retrieval, grounding, adaptive/self-correcting systems
- Search: BM25, vector search, hybrid fusion, ranking pipelines
- Evaluation: recall@k, nDCG, MRR, groundedness metrics
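The retrieval metrics listed above can be sketched in a few lines of Python. This is a minimal illustration, not code from the skill itself: the function names, the binary-relevance assumption for recall@k and MRR, and the graded `gains` mapping with linear gain for nDCG are all assumptions made for the example.

```python
import math
from typing import Iterable, Mapping, Sequence, Set, Tuple


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(
    queries: Iterable[Tuple[Sequence[str], Set[str]]]
) -> float:
    """MRR: the average reciprocal rank over (ranking, relevant set) pairs."""
    values = [reciprocal_rank(ranked, relevant) for ranked, relevant in queries]
    return sum(values) / len(values) if values else 0.0


def ndcg_at_k(ranked: Sequence[str], gains: Mapping[str, float], k: int) -> float:
    """nDCG@k with linear gain (assumption; some setups use 2**rel - 1)."""
    dcg = sum(
        gains.get(doc_id, 0.0) / math.log2(i + 1)
        for i, doc_id in enumerate(ranked[:k], start=1)
    )
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(i + 1) for i, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical ranking for one query; document ids are made up.
ranked = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2"}
print(recall_at_k(ranked, relevant, k=3))   # 0.5 (only d1 is in the top 3)
print(reciprocal_rank(ranked, relevant))    # 0.5 (first hit at rank 2)
```

These functions score retrieval only. Answer-level metrics such as groundedness need the generated text and its citations, so they are evaluated separately.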
Modern Best Practices (Jan 2026):
- Separate retrieval quality from answer quality; evaluate both (RAG: https://arxiv.org/abs/2005.11401).
- Default to hybrid retrieval (sparse + dense) with reranking when precision matters (DPR: https://arxiv.org/abs/2004.04906).
- Use a failure taxonomy to debug systematically (Seven Failure Points in RAG: https://arxiv.org/abs/2401.05856).
- Treat freshness/invalidation as first-class; staleness is a correctness bug, not a UX issue.
- Add grounding gates: answerability checks, citation coverage checks, and refusal-on-missing-context defaults.
- Threat-model RAG: retrieved text is untrusted input (OWASP LLM Top 10: https://owasp.org/www-project-top-10-for-large-language-model-applications/).
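As one way to picture the hybrid retrieval recommended above, the sketch below merges a sparse (BM25) ranking and a dense (vector) ranking with reciprocal rank fusion (RRF). RRF is a technique chosen here for illustration; the source does not say which fusion method it prefers. The function name, the document ids, and the constant `k=60` (a commonly used default) are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[str]], k: int = 60
) -> List[str]:
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    so fusion needs only ranks, not comparable scores from each retriever.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score, highest first; break ties by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical outputs of a sparse and a dense retriever for one query.
bm25_ranking = ["a", "b", "c"]
dense_ranking = ["b", "d", "a"]
print(reciprocal_rank_fusion([bm25_ranking, dense_ranking]))
# ['b', 'a', 'd', 'c']
```

Because RRF ignores raw scores, it sidesteps the problem of normalizing BM25 scores against cosine similarities. A reranker, where precision matters, would then reorder the top of this fused list.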