explore-run

Warn

Audited by Gen Agent Trust Hub on May 10, 2026

Risk Level: MEDIUM (REMOTE_CODE_EXECUTION, COMMAND_EXECUTION)
Full Analysis
  • [REMOTE_CODE_EXECUTION]: The script scripts/write_outputs.py loads a module dynamically: it computes a path to a shared directory and executes the file found there through importlib. Static analysis of the skill's own files therefore does not cover the logic and dependencies that actually run.
  • File: scripts/write_outputs.py
  • Evidence: module_path = Path(__file__).resolve().parents[3] / "shared" / "scripts" / "write_explore_bundle.py" followed by spec.loader.exec_module(module).
  • [COMMAND_EXECUTION]: The skill reads experiment specifications from JSON files and extracts a base_command field that is intended for downstream execution. This data-to-code flow is a command injection surface if the specification comes from an untrusted source.
  • File: scripts/plan_variants.py
  • Ingestion points: scripts/plan_variants.py reads a JSON specification file provided via the --spec-json argument.
  • Boundary markers: No delimiters or isolation warnings are used in the prompt or code to separate the command strings from the execution context.
  • Capability inventory: SKILL.md specifies that the exploratory plans generated by this skill are handed off to execution tools such as run-train or minimal-run-and-audit.
  • Sanitization: No validation or sanitization is performed on the base_command field before it is incorporated into the execution plan.
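The missing sanitization step in the COMMAND_EXECUTION finding could look roughly like the sketch below. It is not taken from the skill: the function names, the executable allowlist, and the assumption that base_command is a top-level string in the spec are all hypothetical and would need to be adapted to the real spec format.

```python
import json
import shlex

# Hypothetical allowlist of executables a spec may request (an assumption,
# not something the audited skill defines).
ALLOWED_EXECUTABLES = {"python", "python3"}

# Characters that only have meaning to a shell. Rejecting them prevents
# chaining or redirection if a downstream tool runs the command via a shell.
SHELL_METACHARACTERS = set(";&|`$<>\n")


def validate_base_command(base_command):
    """Return base_command as an argv list, or raise ValueError."""
    if not isinstance(base_command, str) or not base_command.strip():
        raise ValueError("base_command must be a non-empty string")
    if SHELL_METACHARACTERS & set(base_command):
        raise ValueError("base_command contains shell metacharacters")
    # shlex.split raises ValueError itself on unbalanced quotes.
    argv = shlex.split(base_command)
    if argv[0] not in ALLOWED_EXECUTABLES:
        raise ValueError(f"executable not allowed: {argv[0]!r}")
    return argv


def load_spec(spec_text):
    """Parse a JSON spec (assumed layout) and validate its base_command."""
    spec = json.loads(spec_text)
    spec["base_command"] = validate_base_command(spec.get("base_command"))
    return spec
```

Returning an argv list instead of the original string lets downstream tools pass it to a process launcher without a shell, so the command is never re-parsed after validation.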
Audit Metadata
Risk Level: MEDIUM
Analyzed: May 10, 2026, 12:37 PM