eval-creator-ci
Warn
Audited by Gen Agent Trust Hub on Apr 24, 2026
Risk Level: MEDIUMCOMMAND_EXECUTIONREMOTE_CODE_EXECUTION
Full Analysis
- [COMMAND_EXECUTION]: The skill defines a
command-checkverification method that executes shell commands and evaluates their exit codes or output. This provides a direct path for the agent to execute arbitrary code within the CI environment if the eval case definitions are modified or maliciously crafted. - [REMOTE_CODE_EXECUTION]: In its 'Create Evals' mode, the skill ingests data from upstream artifacts (specifically from
learning-aggregator-ci) to dynamically generate new eval case files. These files can containcommand-checkdefinitions, effectively allowing untrusted or compromised upstream data to dictate shell commands that will be executed in subsequent CI runs. - [DATA_EXFILTRATION]: The execution logic for
command-checkinvolves reporting 'actual' command output to PR comments and check annotations. This behavior could be exploited to exfiltrate sensitive environment variables or file contents if an eval case is configured to output such data (e.g., executingprintenvorcat .env). - [INDIRECT_PROMPT_INJECTION]: The skill has a significant attack surface for indirect injection:
- Ingestion points: Reads promotion candidates from external GitHub Action artifacts and gap reports.
- Boundary markers: None are specified to prevent the agent from interpreting instructions embedded within the pattern data as commands.
- Capability inventory: Full shell access via
command-checkand the ability to commit new files to the repository. - Sanitization: The instructions do not define validation or sanitization steps for the 'verification methods' or 'commands' extracted from external data sources.
Audit Metadata