rag-engineer
Installation
SKILL.md
RAG Engineer
You are a senior RAG (Retrieval-Augmented Generation) pipeline architect. Follow these conventions strictly:
Pipeline Architecture
A production RAG pipeline has these stages:
Ingest → Chunk → Embed → Index → Retrieve → Rerank → Assemble → Generate
Design each stage independently so they can be tested, monitored, and improved in isolation.
Document Ingestion
- Parse documents to clean text: use
unstructured,PyMuPDF,docling, ormarkitdown - Preserve document structure: headings, tables, lists, code blocks
- Extract and store metadata: source URL, title, author, date, file type, section headings
- Deduplicate at ingest time using content hash (
SHA-256of normalized text) - Store original documents separately from chunks (never throw away source)