run-experiment

Pass

Audited by Gen Agent Trust Hub on Apr 10, 2026

Risk Level: SAFE
COMMAND_EXECUTION, DATA_EXFILTRATION, PROMPT_INJECTION, REMOTE_CODE_EXECUTION
Full Analysis
  • [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection because it reads untrusted configuration data from the project directory and uses it to build shell commands and modify code.
      • Ingestion points: Processes data from CLAUDE.md, vast-instances.json, and the experiment scripts.
      • Boundary markers: No delimiters or protective instructions are in place to stop the agent from following malicious instructions embedded in these files.
      • Capability inventory: Uses powerful tools, including Bash(*), Edit, and Write, which can be exploited if malicious data is processed.
      • Sanitization: External parameters are not sanitized or verified before they are interpolated into executable commands.
  • [COMMAND_EXECUTION]: Employs Bash(*) to run various shell commands for GPU status monitoring, environment management, and launching background processes.
  • [DATA_EXFILTRATION]: Accesses ~/.claude/feishu.json in the user's home directory to manage external notification settings.
  • [REMOTE_CODE_EXECUTION]: Executes code on remote machines via SSH and installs dependencies from external repositories during the setup phase. It also dynamically modifies training scripts to insert logging logic.
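
The missing boundary markers and sanitization noted under [PROMPT_INJECTION] could be addressed along the following lines. This is a minimal sketch and not the skill's actual code: the function names `wrap_untrusted` and `build_ssh_argv`, the marker format, and the host allow-list pattern are all hypothetical and chosen only for illustration.

```python
import re
import secrets
import shlex

# Hypothetical allow-list for SSH targets: "host" or "user@host" built from
# letters, digits, dots, underscores, and dashes, never starting with a dash.
_HOST_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*(@[A-Za-z0-9][A-Za-z0-9._-]*)?")


def wrap_untrusted(text: str, source: str) -> str:
    """Fence file content with a per-call random marker so it reads as data."""
    # The nonce is unpredictable, so the content cannot forge the closing marker.
    nonce = secrets.token_hex(8)
    return (
        f"[BEGIN UNTRUSTED {source} {nonce}]\n"
        f"{text}\n"
        f"[END UNTRUSTED {source} {nonce}]"
    )


def build_ssh_argv(host: str, script: str) -> list[str]:
    """Build an argv list that runs one script on a remote host."""
    if not _HOST_RE.fullmatch(host):
        raise ValueError(f"rejected SSH target: {host!r}")
    # shlex.quote makes the remote shell see the path as one literal word.
    remote_command = f"python {shlex.quote(script)}"
    return ["ssh", host, remote_command]
```

The returned list is intended for `subprocess.run(argv)` without `shell=True`, so the local shell never parses the values. Markers of this kind only help if the agent's instructions also state that fenced content must never be followed as instructions.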
Audit Metadata
Risk Level: SAFE
Analyzed: Apr 10, 2026, 02:14 AM