ai-research-reproduction

Warn

Audited by Gen Agent Trust Hub on May 18, 2026

Risk Level: MEDIUM
Categories: REMOTE_CODE_EXECUTION, COMMAND_EXECUTION, PROMPT_INJECTION
Full Analysis
  • [REMOTE_CODE_EXECUTION]: The script scripts/orchestrate_repro.py extracts shell commands directly from a target repository's README file and executes them using subprocess.run. This provides a mechanism for executing untrusted code provided by external repository authors.
  • [COMMAND_EXECUTION]: The orchestration logic in scripts/orchestrate_repro.py dynamically builds and executes shell commands for environment setup and research reproduction. It relies on shlex.split for basic parsing but does not validate the security of the commands extracted from external sources.
  • [PROMPT_INJECTION]: The skill is susceptible to Indirect Prompt Injection. Because it is designed to interpret and act upon instructions found in an external repository's README to define its reproduction workflow, an attacker could craft a malicious README to induce the agent into performing harmful actions or exfiltrating data.
  • [PROMPT_INJECTION]: Mandatory Evidence Chain for Indirect Prompt Injection Findings:
      • Ingestion points: Ingests untrusted data from README.md, repository file structures, and configuration files (YAML, JSON, TOML, Python) parsed in scripts/orchestrate_repro.py.
      • Boundary markers: Lacks explicit boundary markers or instructions to ignore embedded commands when processing ingested content.
      • Capability inventory: Includes capabilities for file system modification, local script execution, and arbitrary shell command execution via the subprocess module.
      • Sanitization: Does not perform semantic validation or sanitization of the commands extracted from documentation before execution.
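
One possible mitigation for the findings above is to vet each command extracted from a README against an allowlist before it reaches subprocess.run. The following is a minimal sketch under stated assumptions, not the skill's actual code: the function names extract_commands, vet_command, and run_vetted, the allowlist contents, and the set of rejected flags are all hypothetical, and the README parsing is deliberately simplistic.

```python
import shlex
import subprocess

# Hypothetical policy for illustration; not taken from the audited skill.
ALLOWED_EXECUTABLES = {"python", "python3", "pip", "conda"}
SHELL_METACHARACTERS = set(";&|<>`$\n")  # chaining, pipes, redirection, substitution
FORBIDDEN_FLAGS = {"-c"}  # e.g. `python -c "<code>"` runs inline code
FENCE = "`" * 3  # Markdown code-fence delimiter


def extract_commands(readme_text):
    """Collect non-empty, non-comment lines found inside fenced code blocks."""
    commands, in_fence = [], False
    for line in readme_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(FENCE):
            in_fence = not in_fence
            continue
        if in_fence and stripped and not stripped.startswith("#"):
            # Drop a leading "$ " shell prompt if the README includes one.
            commands.append(stripped.lstrip("$ ").strip())
    return commands


def vet_command(command):
    """Return an argv list if the command passes the policy, else None."""
    if any(ch in SHELL_METACHARACTERS for ch in command):
        return None
    try:
        argv = shlex.split(command)
    except ValueError:  # e.g. unbalanced quotes
        return None
    if not argv or argv[0] not in ALLOWED_EXECUTABLES:
        return None
    if any(arg in FORBIDDEN_FLAGS for arg in argv[1:]):
        return None
    return argv


def run_vetted(command, cwd, timeout_s=600):
    """Run a command only if it passes vet_command; never goes through a shell."""
    argv = vet_command(command)
    if argv is None:
        raise PermissionError(f"Command rejected by policy: {command!r}")
    return subprocess.run(argv, shell=False, check=True, cwd=cwd, timeout=timeout_s)
```

Even with such a filter, a command like python train.py still executes code supplied by the repository author. An allowlist narrows the attack surface but does not replace running the reproduction in an isolated environment, such as a container without access to credentials.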
Audit Metadata
Risk Level: MEDIUM
Analyzed: May 18, 2026, 10:23 AM