bare-eval

Warn

Audited by Gen Agent Trust Hub on May 4, 2026

Risk Level: MEDIUMCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill provides numerous shell command templates in SKILL.md and references/invocation-patterns.md that use string interpolation of variables such as $prompt, $grading_prompt, and $output_text. For instance, the pattern claude -p "$prompt" --bare creates a vulnerability where shell metacharacters in the input variables could result in arbitrary command execution within the host environment.
  • [PROMPT_INJECTION]: The skill facilitates the processing of untrusted data (like LLM outputs being graded or external prompts being classified) and interpolates them into agent instructions.
  • Ingestion points: Data enters the context via variables like $prompt, $output_text, and $assertions_json as shown in the invocation patterns.
  • Boundary markers: The templates use simple textual headers (e.g., OUTPUT:, ASSERTION:) to separate untrusted data from instructions, which does not provide strong protection against instruction override or bypass.
  • Capability inventory: The prompts are executed using the claude CLI, which has extensive capabilities for file access and subprocess execution.
  • Sanitization: There is no evidence of sanitization, escaping, or validation of the untrusted data before it is incorporated into the CLI arguments or prompt text.
  • [EXTERNAL_DOWNLOADS]: The skill's documentation in references/troubleshooting.md suggests updating the claude-code package via npm. This refers to an official package from a well-known service provider.
Audit Metadata
Risk Level
MEDIUM
Analyzed
May 4, 2026, 05:09 PM