generate-rag-dataset
Pass
Audited by Gen Agent Trust Hub on Mar 24, 2026
Risk Level: SAFE
Full Analysis
- [DATA_EXPOSURE]: The skill instructions involve reading the local codebase, including document files, database schemas, and vector store configurations. This access is necessary for the skill's primary function of generating a representative evaluation dataset from the user's specific knowledge base and is constrained to local analysis.
- [INDIRECT_PROMPT_INJECTION]: The skill describes a workflow that ingests external document content (PDFs, markdown, text) and database schemas to generate question-answer pairs. While this establishes an ingestion surface for untrusted data, the operation is local and the output is a structured CSV file for evaluation purposes.
  - Ingestion points: SKILL.md instructions to read local documents (PDF, markdown, text), database schemas, and vector store configs.
  - Boundary markers: None specified; the agent is guided to categorize questions (factual, multi-hop, etc.) based on the source content.
  - Capability inventory: Reading local files for knowledge base analysis; writing local CSV/Python DataFrame files for dataset export.
  - Sanitization: No explicit sanitization or filtering of document content is mentioned in the instructions.
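
For illustration only, the two gaps noted above (no boundary markers, no sanitization) could be narrowed along the following lines. This is a minimal sketch, not part of the audited skill: the marker strings and the `sanitize` and `wrap_untrusted` helpers are hypothetical names chosen for this example. Delimiting ingested content this way reduces, but does not eliminate, the chance that text inside a document is read as instructions.

```python
import re

# Hypothetical marker strings; the audited skill does not define any.
BEGIN_MARK = "<<<UNTRUSTED_DOCUMENT>>>"
END_MARK = "<<<END_UNTRUSTED_DOCUMENT>>>"

# Control characters other than tab, newline, and carriage return.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def sanitize(text: str) -> str:
    """Strip control characters and neutralize marker look-alikes."""
    text = _CONTROL_CHARS.sub("", text)
    # Replace any embedded marker so a document cannot close the boundary early.
    for mark in (BEGIN_MARK, END_MARK):
        text = text.replace(mark, "[removed]")
    return text


def wrap_untrusted(text: str) -> str:
    """Delimit document content so it can be treated as data, not instructions."""
    return f"{BEGIN_MARK}\n{sanitize(text)}\n{END_MARK}"
```

A caller would pass each ingested document through `wrap_untrusted` before placing it in the question-generation prompt, and state in that prompt that anything between the markers is source material only.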
Audit Metadata