agentflow-evals

Pass

Audited by Gen Agent Trust Hub on May 1, 2026

Risk Level: SAFE
Capability tags: COMMAND_EXECUTION, EXTERNAL_DOWNLOADS, REMOTE_CODE_EXECUTION, PROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill provides instructions for executing the agentflow eval CLI to manage evaluation suites, run trials, and generate reports.
  • [EXTERNAL_DOWNLOADS]: The documentation references setup scripts (npm run setup:eval-repos, npm run setup:realworld-evals) that clone external GitHub repositories to serve as test fixtures for workflow evaluations.
  • [REMOTE_CODE_EXECUTION]: The framework supports custom_script criteria, which execute suite-specific scripts to perform objective validations of workflow outputs and artifacts.
  • [PROMPT_INJECTION]: The skill is designed to analyze external, untrusted data from GitHub issues and repositories, which introduces a surface for indirect prompt injection.
  • Ingestion points: External data enters the context through cloned repositories and pinned GitHub issue metadata used in the eval scenarios (references/operations-and-dogfood.md).
  • Boundary markers: The skill guidelines recommend keeping oracle metadata and upstream PR patches hidden from the agent's context to maintain evaluation integrity (references/eval-patterns.md).
  • Capability inventory: The system executes Agentflow graphs that can perform tool calls and file system operations, and it runs suite-provided custom scripts (references/grading-and-reporting.md).
  • Sanitization: The provided documentation does not specify sanitization or filtering protocols for the ingested repository content.
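The custom_script criteria flagged above are suite-supplied validators that inspect workflow outputs and artifacts. As a rough illustration of that shape, here is a minimal sketch of such a validator; the invocation convention (output directory as argv[1], exit code 0 for pass), the report.json artifact name, and the "completed" status field are all hypothetical assumptions, not details taken from the skill's documentation:

```python
"""Minimal sketch of a suite-specific custom_script validator.

Assumptions (not from the skill): the eval harness passes the
workflow's output directory as argv[1] and treats exit code 0 as
pass, nonzero as fail.
"""
import json
import sys
from pathlib import Path


def validate(output_dir: Path) -> list[str]:
    """Return a list of failure messages; an empty list means pass."""
    failures = []
    report = output_dir / "report.json"  # hypothetical expected artifact
    if not report.is_file():
        failures.append(f"missing artifact: {report}")
        return failures
    data = json.loads(report.read_text())
    if data.get("status") != "completed":
        failures.append(f"unexpected status: {data.get('status')!r}")
    return failures


if __name__ == "__main__" and len(sys.argv) > 1:
    problems = validate(Path(sys.argv[1]))
    for msg in problems:
        print(msg, file=sys.stderr)
    sys.exit(1 if problems else 0)
```

Because such scripts run with the agent's full file system access, this is the surface the REMOTE_CODE_EXECUTION tag refers to: the script's contents come from the suite, not from the skill itself.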
Audit Metadata
Risk Level: SAFE
Analyzed: May 1, 2026, 06:53 PM