harness-security-bench
Warn
Audited by Gen Agent Trust Hub on Jun 25, 2026
Risk Level: MEDIUMEXTERNAL_DOWNLOADSREMOTE_CODE_EXECUTIONCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
- [EXTERNAL_DOWNLOADS]: The skill fetches the
@metaharness/darwinpackage from the NPM registry at runtime. - [REMOTE_CODE_EXECUTION]: Uses
npx -yto automatically install and execute the remote package. This behavior facilitates the execution of external code that is not bundled with the skill or hosted on a verified/trusted platform. - [COMMAND_EXECUTION]: Invokes shell commands via Bash to run the benchmarking tool. It accepts user-provided arguments such as
--populationand--cycleswhich are directly passed to the command line. - [PROMPT_INJECTION]: The skill implements an indirect prompt injection surface by parsing a markdown report from the external
metaharness-darwintool to produce structured JSON for the agent. - Ingestion points: Parsing the markdown report output from the
metaharness-darwinexecution (SKILL.md). - Boundary markers: No explicit delimiters or instructions are used to ignore embedded commands within the parsed report.
- Capability inventory: The skill has access to the Bash tool, allowing potential follow-on actions based on injected content.
- Sanitization: No sanitization or validation of the tool's output is mentioned prior to JSON conversion.
Audit Metadata