The Agent Skills Directory

[COMMAND_EXECUTION]: The script scripts/common.py defines run_shell_command which uses subprocess.run with shell=True to execute commands. This utility is used by scripts/run_task.py and scripts/common.py to interact with the system environment and run task benchmarks.
[REMOTE_CODE_EXECUTION]: The scripts/run_task.py script executes a user-provided command (the task_command) multiple times during evaluation trials. This framework is explicitly designed to facilitate the execution of code generated or modified by the agent itself.
[PROMPT_INJECTION]: The skill architecture creates an attack surface for indirect prompt injection. 1. Ingestion points: The meta-agent reads archive.jsonl and memory.jsonl containing previously generated code and performance history. 2. Boundary markers: No delimiters or explicit instructions to ignore embedded commands are present when loading variants. 3. Capability inventory: run_task.py and common.py facilitate subprocess calls for shell command execution. 4. Sanitization: No validation or filtering of generated content is performed before it is re-analyzed or executed.

hyperagent