audit-skill-by-derailment

Pass

Audited by Gen Agent Trust Hub on May 19, 2026

Risk Level: SAFECOMMAND_EXECUTIONEXTERNAL_DOWNLOADSPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The script scripts/launch-derailment.sh uses bash -lc "$agent_cmd" to execute a shell command provided via a script argument. This is the intended mechanism to launch subagents for skill testing.
  • [EXTERNAL_DOWNLOADS]: In SKILL.md, the workflow uses the GitHub CLI (gh api) to download remote SKILL.md and reference files from user-specified repositories. These downloads are performed from a well-known service (GitHub) into local temporary directories.
  • [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection because it interpolates unvalidated user-provided task descriptions directly into subagent prompts.
  • Ingestion points: The {TASK_IN_PLAIN_LANGUAGE} variable in SKILL.md (Step 3).
  • Boundary markers: None present; user input is placed directly into the prompt without delimiters or instructions to ignore embedded commands.
  • Capability inventory: The subagent environment has access to file system tools (ls, cat) and potentially the same shell/network capabilities as the parent agent.
  • Sanitization: No input sanitization or validation is applied to the task text before it is processed by the subagent.
  • [DATA_EXFILTRATION]: The skill accesses local files within the ~/.claude/skills/ directory to read skill definitions for auditing. This is consistent with its stated purpose of testing local skills.
Audit Metadata
Risk Level
SAFE
Analyzed
May 19, 2026, 03:52 PM
Security Audit — agent-trust-hub — audit-skill-by-derailment