hyperagent

Warn

Audited by Gen Agent Trust Hub on Apr 18, 2026

Risk Level: MEDIUMCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The script scripts/common.py defines run_shell_command which uses subprocess.run with shell=True to execute commands. This utility is used by scripts/run_task.py and scripts/common.py to interact with the system environment and run task benchmarks.
  • [REMOTE_CODE_EXECUTION]: The scripts/run_task.py script executes a user-provided command (the task_command) multiple times during evaluation trials. This framework is explicitly designed to facilitate the execution of code generated or modified by the agent itself.
  • [PROMPT_INJECTION]: The skill architecture creates an attack surface for indirect prompt injection. 1. Ingestion points: The meta-agent reads archive.jsonl and memory.jsonl containing previously generated code and performance history. 2. Boundary markers: No delimiters or explicit instructions to ignore embedded commands are present when loading variants. 3. Capability inventory: run_task.py and common.py facilitate subprocess calls for shell command execution. 4. Sanitization: No validation or filtering of generated content is performed before it is re-analyzed or executed.
Audit Metadata
Risk Level
MEDIUM
Analyzed
Apr 18, 2026, 03:40 AM