rag-pipeline
SKILL.md
RAG Pipeline Logic
Ingestion
- Script: backend/ingest.py
- Process:
  - Scans docs/.
  - Cleans MDX (removes frontmatter/imports).
  - Chunks text (1000 chars, 100 overlap).
  - Embeds using models/text-embedding-004.
  - Upserts to Qdrant collection physical_ai_book.
- Run: python backend/ingest.py
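
The cleaning and chunking steps can be sketched roughly as follows. This is a minimal illustration, not the contents of backend/ingest.py: the function names `clean_mdx` and `chunk_text` are hypothetical, and the sketch assumes frontmatter is a leading `---` block and that imports are single-line statements starting with `import`. Only the 1000-character size and 100-character overlap come from the notes above.

```python
import re

CHUNK_SIZE = 1000    # characters per chunk (from the process notes)
CHUNK_OVERLAP = 100  # characters shared by consecutive chunks


def clean_mdx(text: str) -> str:
    """Hypothetical helper: drop leading frontmatter and import lines."""
    # Remove a frontmatter block delimited by '---' at the very start of the file.
    text = re.sub(r"\A---\s*\n.*?\n---\s*\n", "", text, flags=re.DOTALL)
    # Remove single-line import statements. Caveat: this simple pattern would
    # also remove a prose line that happens to begin with "import ".
    text = re.sub(r"^import\s.+$\n?", "", text, flags=re.MULTILINE)
    return text.strip()


def chunk_text(text: str, size: int = CHUNK_SIZE, overlap: int = CHUNK_OVERLAP) -> list[str]:
    """Hypothetical helper: split text into fixed-size, overlapping windows."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap  # each window starts 900 characters after the previous one
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


if __name__ == "__main__":
    sample = "---\ntitle: Intro\n---\nimport Tabs from '@theme/Tabs';\n\n# Hello\nBody text.\n"
    print(clean_mdx(sample))
    print([len(c) for c in chunk_text("x" * 2500)])
```

With the defaults, each chunk repeats the last 100 characters of the previous one, so a sentence cut at a chunk boundary still appears whole in at least one chunk as long as it is shorter than the overlap.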
Vector Search (Qdrant)
- Client: qdrant-client
- Collection: physical_ai_book
- Vector Size: 768 (Gecko-004)
- Similarity: Cosine
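
To make the similarity setting concrete, here is a small pure-Python sketch of cosine scoring over 768-dimensional vectors. Qdrant computes this server-side, so the code is only an illustration of the metric, not of the qdrant-client API; `cosine_similarity` and `top_k` are hypothetical names, and the in-memory `points` dictionary stands in for the stored collection.

```python
import math

VECTOR_SIZE = 768  # dimensionality listed for the collection


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)


def top_k(query: list[float], points: dict[str, list[float]], k: int = 3) -> list[tuple[str, float]]:
    """Hypothetical helper: rank stored vectors by cosine similarity to the query."""
    scored = [(point_id, cosine_similarity(query, vec)) for point_id, vec in points.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


if __name__ == "__main__":
    a = [1.0] + [0.0] * (VECTOR_SIZE - 1)
    b = [0.0, 1.0] + [0.0] * (VECTOR_SIZE - 2)
    print(top_k(a, {"same": a, "orthogonal": b}, k=2))
```

Because cosine similarity depends only on direction, two chunks whose embeddings differ in magnitude but point the same way score as identical matches.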