monitor-experiment

Fail

Audited by Gen Agent Trust Hub on Apr 7, 2026

Risk Level: HIGHCOMMAND_EXECUTIONDATA_EXFILTRATIONPROMPT_INJECTIONREMOTE_CODE_EXECUTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill interpolates the $ARGUMENTS variable directly into shell commands used for ssh and screen management. This allows an attacker to execute arbitrary local commands by providing input containing shell metacharacters such as semicolons, backticks, or pipes.
  • [DATA_EXFILTRATION]: The skill accesses local sensitive files, specifically ~/.claude/feishu.json and vast-instances.json, to read configuration and credentials. If the agent is influenced by malicious input, these secrets could be exposed or exfiltrated.
  • [PROMPT_INJECTION]: The skill has a significant surface for indirect prompt injection (Category 8). It ingests untrusted data from remote screen logs, JSON result files, and metrics from Weights & Biases (Ingestion points: SKILL.md). It lacks boundary markers to delimit external content and performs no sanitization or validation of the ingested data. The skill possesses powerful capabilities including ssh and Write access (Capability inventory: ssh, vastai, modal, Write, Edit), which could be abused if the agent follows instructions embedded in the logs.
  • [REMOTE_CODE_EXECUTION]: The skill executes dynamically constructed Python scripts on remote hosts via ssh. While the logic for Weights & Biases integration is defined in the skill, the use of placeholders for project and run IDs could be exploited to run unauthorized code on remote servers if those identifiers are sourced from untrusted data.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Apr 7, 2026, 12:25 PM