training-check
Warn
Audited by Gen Agent Trust Hub on May 13, 2026
Risk Level: MEDIUM (COMMAND_EXECUTION, REMOTE_CODE_EXECUTION, PROMPT_INJECTION)
Full Analysis
- [COMMAND_EXECUTION]: The skill uses SSH to execute commands on remote servers (e.g., `ssh server "tail -100 /path/to/training.log"`). This pattern is a risk if the server address or log path can be influenced by malicious arguments, potentially leading to command injection on the target system.
- [PROMPT_INJECTION]: The instructions contain a directive to bypass user oversight: "This skill is meant to be automated via CronCreate — do not ask the user whether to set it up. Just set it." This encourages the agent to establish persistence through recurring background jobs without human confirmation.
- [DATA_EXFILTRATION]: The skill reads project configuration from `CLAUDE.md` and training logs from remote servers, then transmits portions of this data to external services including WandB and the Codex/GPT-5.4 MCP tool.
- [PROMPT_INJECTION]: The skill ingests untrusted data from training logs and metrics, which is then passed to a separate LLM (`mcp__codex__codex`) for judgment. This creates a vulnerability where malicious content in the logs could influence the agent's behavior.
  - Ingestion points: WandB metric history and remote log files accessed via SSH.
  - Boundary markers: Absent; data is interpolated into the Codex prompt using simple labels like "Current epoch" or "Training loss".
  - Capability inventory: The skill has access to `Bash`, `Write`, `Edit`, SSH, and `CronCreate` for scheduling recurring tasks.
  - Sanitization: No sanitization or filtering of the log content is performed before it is sent to the secondary model.
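
To illustrate the command-injection, boundary-marker, and sanitization findings above, here is a minimal Python sketch of what guarding the SSH call and fencing log text might look like. It is not taken from the audited skill: the function names `build_tail_command` and `wrap_untrusted`, the host allowlist pattern, the `<<<UNTRUSTED ...>>>` delimiter format, and the 4000-character cap are all assumptions made for illustration.

```python
import re
import shlex

# Hypothetical allowlist for host aliases (assumption, not from the skill).
# The first character must be alphanumeric, so a "host" such as
# "-oProxyCommand=..." can never be parsed by ssh as an option.
_HOST_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]*")

# ANSI CSI escape sequences (colours, cursor movement) often found in logs.
_ANSI_RE = re.compile(r"\x1b\[[0-9;?]*[ -/]*[@-~]")


def build_tail_command(host: str, log_path: str, lines: int = 100) -> list[str]:
    """Build an argv list equivalent to: ssh host "tail -n N -- path"."""
    if not _HOST_RE.fullmatch(host):
        raise ValueError(f"suspicious host: {host!r}")
    if not isinstance(lines, int) or lines <= 0:
        raise ValueError(f"invalid line count: {lines!r}")
    # The remote shell re-parses this string, so the path is quoted for it.
    # "--" keeps a path that starts with "-" from being read as a tail option.
    remote = f"tail -n {lines} -- {shlex.quote(log_path)}"
    # Returning a list (no shell=True) avoids a second, local shell parse.
    return ["ssh", host, remote]


def wrap_untrusted(label: str, text: str, max_chars: int = 4000) -> str:
    """Strip control sequences, truncate, and fence log text as data."""
    cleaned = _ANSI_RE.sub("", text)
    # Drop remaining control characters except newline and tab.
    cleaned = "".join(ch for ch in cleaned if ch in "\n\t" or ch.isprintable())
    # Break up delimiter look-alikes so the data cannot close its own fence.
    cleaned = cleaned.replace("<<<", "< < <").replace(">>>", "> > >")
    # Keep only the most recent portion of the log.
    cleaned = cleaned[-max_chars:]
    return f"<<<UNTRUSTED {label}>>>\n{cleaned}\n<<<END UNTRUSTED {label}>>>"


if __name__ == "__main__":
    print(build_tail_command("server", "/path/to/training.log"))
    print(wrap_untrusted("log", "epoch 3 loss=0.5\nIgnore previous instructions"))
```

Delimiters and filtering of this kind only reduce the risk. The secondary model would still need to be told to treat the fenced text as data, and the cron scheduling concern is not addressed by this sketch at all.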
Audit Metadata