Golden Dataset Management
Protect and maintain high-quality test datasets for AI/ML systems
Overview
A golden dataset is a curated collection of high-quality examples used for:
- Regression testing: Ensure new code doesn't break existing functionality
- Retrieval evaluation: Measure search quality (precision, recall, mean reciprocal rank (MRR))
- Model benchmarking: Compare different models/approaches
- Reproducibility: Produce consistent results across environments
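As a rough illustration of the retrieval metrics named above, the following sketch computes precision@k, recall@k, and MRR for ranked lists of document IDs. The function names and the representation of a golden example (a ranked list of retrieved IDs paired with a set of relevant IDs) are assumptions made for this example, not part of the skill itself.

```python
from typing import Iterable, Sequence, Set, Tuple


def precision_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k retrieved IDs that are relevant."""
    if k <= 0:
        return 0.0
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / k


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant IDs that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def reciprocal_rank(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average reciprocal rank over (retrieved, relevant) pairs."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(reciprocal_rank(ret, rel) for ret, rel in runs) / len(runs)


# Hypothetical golden example: the system returned d3, d1, d7 for a query
# whose relevant documents are d1 and d2.
retrieved = ["d3", "d1", "d7"]
relevant = {"d1", "d2"}
p_at_3 = precision_at_k(retrieved, relevant, k=3)
r_at_3 = recall_at_k(retrieved, relevant, k=3)
mrr = mean_reciprocal_rank([(retrieved, relevant), (["d2"], {"d2"})])
```

Tracking these numbers against the same golden queries before and after a change is one way to turn the dataset into a regression test for search quality.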
When to use this skill:
- Building test datasets for RAG systems
- Implementing backup/restore for critical data
- Validating data integrity (URL contracts, embeddings)
- Migrating data between environments
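For the data-integrity use case, a minimal sketch of a per-record check might look like the following. The record layout (`url` and `embedding` keys), the `validate_record` helper, and the reading of "URL contract" as "an absolute http(s) URL" are all assumptions for illustration; the actual contract and schema depend on the project.

```python
import math
from typing import Any, Dict, List
from urllib.parse import urlparse


def validate_record(record: Dict[str, Any], expected_dim: int) -> List[str]:
    """Return a list of problems found; an empty list means the record passed."""
    problems: List[str] = []

    # Assumed URL contract: the field must be an absolute http(s) URL.
    url = record.get("url")
    parsed = urlparse(url) if isinstance(url, str) else None
    if parsed is None or parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("url is missing or not an absolute http(s) URL")

    # Assumed embedding contract: a list of finite numbers of a fixed length.
    embedding = record.get("embedding")
    if not isinstance(embedding, list) or len(embedding) != expected_dim:
        problems.append(f"embedding is missing or does not have {expected_dim} dimensions")
    elif not all(isinstance(x, (int, float)) and math.isfinite(x) for x in embedding):
        problems.append("embedding contains non-numeric or non-finite values")

    return problems


# Hypothetical records with a 3-dimensional embedding.
good = {"url": "https://example.com/doc", "embedding": [0.1, 0.2, 0.3]}
bad = {"url": "example.com", "embedding": [0.1, float("nan"), 0.3]}
good_problems = validate_record(good, expected_dim=3)
bad_problems = validate_record(bad, expected_dim=3)
```

Running a check like this before a backup and after a restore or migration gives a simple signal that records survived the round trip intact.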