generate-rag-dataset

Pass

Audited by Gen Agent Trust Hub on Mar 24, 2026

Risk Level: SAFE
Full Analysis
  • [DATA_EXPOSURE]: The skill instructions involve reading the local codebase, including document files, database schemas, and vector store configurations. This access is necessary for the skill's primary function of generating a representative evaluation dataset from the user's specific knowledge base and is constrained to local analysis.\n- [INDIRECT_PROMPT_INJECTION]: The skill describes a workflow that ingests external document content (PDFs, markdown, text) and database schemas to generate question-answer pairs. While this establishes an ingestion surface for untrusted data, the operation is local and the output is a structured CSV file for evaluation purposes.\n
  • Ingestion points: SKILL.md instructions to read local documents (PDF, markdown, text), database schemas, and vector store configs.\n
  • Boundary markers: None specified; the agent is guided to categorize questions (factual, multi-hop, etc.) based on the source content.\n
  • Capability inventory: Reading local files for knowledge base analysis; writing local CSV/Python DataFrame files for dataset export.\n
  • Sanitization: No explicit sanitization or filtering of document content is mentioned in the instructions.
Audit Metadata
Risk Level
SAFE
Analyzed
Mar 24, 2026, 07:56 AM