harness-security-bench

Warn

Audited by Gen Agent Trust Hub on Jun 25, 2026

Risk Level: MEDIUM
Categories: EXTERNAL_DOWNLOADS, REMOTE_CODE_EXECUTION, COMMAND_EXECUTION, PROMPT_INJECTION
Full Analysis
  • [EXTERNAL_DOWNLOADS]: The skill fetches the @metaharness/darwin package from the NPM registry at runtime.
  • [REMOTE_CODE_EXECUTION]: Uses npx -y to install and execute the remote package automatically, without a confirmation prompt. The code that runs is neither bundled with the skill nor hosted on a verified or trusted platform.
  • [COMMAND_EXECUTION]: Invokes shell commands via Bash to run the benchmarking tool. User-provided arguments such as --population and --cycles are passed directly to the command line.
  • [PROMPT_INJECTION]: The skill exposes an indirect prompt injection surface: it parses a markdown report produced by the external metaharness-darwin tool and converts it into structured JSON for the agent.
      • Ingestion points: The markdown report output by the metaharness-darwin execution (SKILL.md).
      • Boundary markers: No explicit delimiters are placed around the parsed report, and no instructions tell the agent to ignore commands embedded in it.
      • Capability inventory: The skill has access to the Bash tool, so injected content could trigger follow-on actions.
      • Sanitization: No sanitization or validation of the tool's output is mentioned prior to JSON conversion.
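The COMMAND_EXECUTION and PROMPT_INJECTION findings above could be mitigated along the following lines. This is a minimal sketch, not the skill's actual implementation: the function names, the numeric bounds, the marker strings, and the exact npx invocation are all assumptions made for illustration. It validates --population and --cycles as bounded integers, builds an argument list suitable for running without a shell, and wraps the tool's report in explicit boundary markers before it reaches the agent.

```python
import re

# Hypothetical upper bounds; the audit does not state valid ranges.
MAX_POPULATION = 1000
MAX_CYCLES = 1000

# Hypothetical boundary markers for untrusted tool output.
BEGIN_MARK = "<<<UNTRUSTED_REPORT_BEGIN>>>"
END_MARK = "<<<UNTRUSTED_REPORT_END>>>"


def parse_bounded_int(value: str, name: str, upper: int) -> int:
    """Accept only plain ASCII digits that fall within [1, upper]."""
    if not re.fullmatch(r"[0-9]{1,9}", value):
        raise ValueError(f"{name} must be a positive integer, got {value!r}")
    number = int(value)
    if not 1 <= number <= upper:
        raise ValueError(f"{name} must be between 1 and {upper}, got {number}")
    return number


def build_argv(population: str, cycles: str) -> list[str]:
    """Build an argument list for subprocess.run(argv), i.e. without shell=True.

    The command shape is an assumption based on the audit text; pinning an
    exact package version would further narrow the download risk.
    """
    pop = parse_bounded_int(population, "--population", MAX_POPULATION)
    cyc = parse_bounded_int(cycles, "--cycles", MAX_CYCLES)
    return [
        "npx", "-y", "@metaharness/darwin",
        "--population", str(pop),
        "--cycles", str(cyc),
    ]


def wrap_untrusted(report: str) -> str:
    """Strip control characters and fence the report with boundary markers."""
    # Keep newlines and tabs; drop every other non-printable character.
    cleaned = "".join(
        ch for ch in report if ch in "\n\t" or ch.isprintable()
    )
    # Prevent the report from forging or closing the markers itself.
    for mark in (BEGIN_MARK, END_MARK):
        cleaned = cleaned.replace(mark, "[marker removed]")
    return f"{BEGIN_MARK}\n{cleaned}\n{END_MARK}"
```

Passing the list from build_argv to a process launcher that does not invoke a shell means a value such as "20; rm -rf /" is rejected during validation rather than interpreted by Bash. The markers only help if the agent's instructions also say to treat everything between them as data.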
Audit Metadata
  Risk Level: MEDIUM
  Analyzed: Jun 25, 2026, 05:35 AM