The Agent Skills Directory

[COMMAND_EXECUTION]: The skill instructs the agent to execute commands on remote servers using SSH (e.g., ssh server "tail -100 /path/to/training.log") to collect experiment logs. This provides a mechanism for remote execution and potential lateral movement if SSH access is configured in the environment.
[COMMAND_EXECUTION]: The workflow relies on executing local scripts such as tools/research_wiki.py and tools/save_trace.sh. These scripts are external to the skill definition, and their behavior cannot be verified within this scope.
[PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection. It ingests data from external sources (W&B API, remote logs, local project files) and interpolates it directly into a prompt for the mcp__codex__codex tool. An attacker with control over experiment logs or metadata could embed malicious instructions to manipulate the model's judgment.
Ingestion points: Data is gathered from EXPERIMENT_LOG.md, EXPERIMENT_TRACKER.md, docs/research_contract.md, remote logs via SSH, and the Weights & Biases API.
Boundary markers: Minimal use of headers (e.g., Results:, Baselines:) without explicit instructions to ignore embedded commands or clear delimiters.
Capability inventory: Access to Bash(*), Write, Edit, and MCP tools (mcp__codex__codex).
Sanitization: The skill does not perform any validation, escaping, or filtering of the ingested data before prompt construction.

result-to-claim