ai-research-reproduction
Pass
Audited by Gen Agent Trust Hub on May 18, 2026
Risk Level: SAFE (COMMAND_EXECUTION, PROMPT_INJECTION)
Full Analysis
- [COMMAND_EXECUTION]: The orchestration script `scripts/orchestrate_repro.py` uses `subprocess.run` to execute shell commands extracted from third-party README files.
  - Evidence: The `maybe_run_command` function in `scripts/orchestrate_repro.py` invokes `subprocess.run` on command strings parsed from untrusted external documentation.
- [PROMPT_INJECTION]: The skill treats arbitrary documentation from external repositories as a trusted source of executable instructions, leading to a risk of indirect prompt injection.
  - Ingestion points: Content from the target repository's `README.md` is ingested and parsed by external scripts called from `scripts/orchestrate_repro.py`.
  - Boundary markers: The skill does not implement delimiters or 'ignore' instructions when processing commands extracted from the README, meaning malicious instructions in the text could influence agent behavior.
  - Capability inventory: The skill has access to the shell and can execute arbitrary commands, create files, and run Python scripts, as seen in `scripts/orchestrate_repro.py`.
  - Sanitization: While the script uses `shlex.split` for argument parsing and a heuristic `command_score` to prioritize commands, it lacks a whitelist or verification mechanism to ensure commands are safe before execution.
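
The sanitization gap noted above could be narrowed with a verification step that runs before any command reaches `subprocess.run`. The following is a minimal sketch under stated assumptions, not the skill's actual code: `ALLOWED_EXECUTABLES`, `vet_command`, and `run_if_allowed` are hypothetical names introduced here for illustration, and the allowlist contents are placeholders.

```python
import shlex
import subprocess
from typing import List, Optional

# Hypothetical allowlist (an assumption, not taken from the audited skill).
ALLOWED_EXECUTABLES = {"python", "python3", "pip", "pytest"}

# Characters that only have meaning to a shell. Commands are never run with
# shell=True in this sketch, so their presence is treated as suspicious.
SHELL_METACHARACTERS = set(";&|`$<>\n")


def vet_command(command: str) -> Optional[List[str]]:
    """Return the parsed argv if the command passes the checks, else None."""
    if any(ch in SHELL_METACHARACTERS for ch in command):
        return None
    try:
        argv = shlex.split(command)
    except ValueError:
        # shlex.split raises ValueError on unbalanced quotes.
        return None
    if not argv or argv[0] not in ALLOWED_EXECUTABLES:
        return None
    return argv


def run_if_allowed(command: str, timeout: float = 60.0):
    """Run a vetted command without a shell; return None if it is rejected."""
    argv = vet_command(command)
    if argv is None:
        return None
    # Passing a list (shell=False is the default) avoids shell interpretation.
    return subprocess.run(
        argv, capture_output=True, text=True, timeout=timeout, check=False
    )
```

An executable allowlist is only a partial control: an allowed interpreter such as `python` can still run arbitrary code supplied in its arguments, so a check like this would likely need to be paired with sandboxing or explicit user confirmation.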
Audit Metadata