ai-research-reproduction
Warn
Audited by Gen Agent Trust Hub on Apr 14, 2026
Risk Level: MEDIUM (COMMAND_EXECUTION, REMOTE_CODE_EXECUTION)
Full Analysis
- [COMMAND_EXECUTION]: The script scripts/orchestrate_repro.py contains several uses of subprocess.run. It executes both internal helper scripts and, more critically, commands extracted from the documentation of the repository being analyzed.
- [REMOTE_CODE_EXECUTION]: The orchestration logic extracts "documented commands" from a repository's README.md (processed by extract_commands.py) and executes them in maybe_run_command and maybe_run_training. This design allows content from an external, untrusted repository to dictate arbitrary code execution on the user's system. Although the skill enforces a 'minimal trustworthy target' policy, it remains a high-risk capability if used on malicious repositories.
- [PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection. A malicious actor could craft a README.md with dangerous commands (e.g., system modification or data exfiltration) that are prioritized by the skill's heuristic scoring mechanism (command_score in scripts/orchestrate_repro.py) and then proposed for execution. The current implementation uses shlex.split for argument parsing but does not sanitize the command strings themselves against malicious intent.
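
The gap noted in the last finding can be illustrated with a minimal sketch. It is not the skill's code: vet_command, ALLOWED_EXECUTABLES, and SHELL_METACHARACTERS are hypothetical names introduced here, and the allowlist contents are an assumption. The sketch shows that shlex.split only tokenizes a string, so a destructive command parses as cleanly as a benign one, and that one possible guard is to check the resulting tokens before anything reaches subprocess.run.

```python
import shlex
from typing import List, Optional

# Hypothetical allowlist (an assumption for illustration; the audited
# skill is not known to define one).
ALLOWED_EXECUTABLES = {"python", "python3", "pip", "pytest"}

# Characters that only carry meaning for a shell. Their presence in a
# documented command suggests chaining, piping, or substitution.
SHELL_METACHARACTERS = set(";|&<>`$")


def vet_command(command: str) -> Optional[List[str]]:
    """Return the parsed argv if the command passes the checks, else None."""
    try:
        argv = shlex.split(command)  # tokenizes only; performs no safety check
    except ValueError:
        return None  # e.g. unbalanced quotes
    if not argv:
        return None  # empty command
    if argv[0] not in ALLOWED_EXECUTABLES:
        return None  # executable is not on the allowlist
    if any(ch in SHELL_METACHARACTERS for token in argv for ch in token):
        return None  # shell operators have no place in a plain argv
    return argv


# shlex.split happily parses a destructive command:
tokens = shlex.split("rm -rf ~/data")  # ['rm', '-rf', '~/data']

# The vetting step rejects it, and accepts a plain training invocation:
rejected = vet_command("rm -rf ~/data")               # None
accepted = vet_command("python train.py --epochs 1")  # list of four tokens
```

A check like this narrows the attack surface but does not remove it: an allowlisted interpreter such as python can still run arbitrary code from the untrusted repository, so sandboxing or explicit user confirmation would still be needed.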
Audit Metadata