The Agent Skills Directory

[PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection within its verification loop. The REVIEWER_PROMPT template defined in SKILL.md interpolates potentially untrusted content (the {output} from a previous agent) directly into the instructions for secondary review agents.\n
Ingestion points: The {output} variable, containing content generated by an agent that may have processed external inputs, is interpolated into the prompt used by Reviewer B and Reviewer C in Phase 2.\n
Boundary markers: The prompt lacks robust boundary markers or delimiters (such as XML tags or randomly generated separators) to isolate the content under review from the reviewer's instructions. It relies on simple Markdown headers (## Output Under Review), which can be bypassed if the output content includes similar headers or conflicting instructions.\n
Capability inventory: The framework utilizes the Agent tool (Claude Code) and fix_agent.execute, which could be exploited if a reviewer agent is manipulated into providing a false positive verdict or executing malicious instructions embedded in the reviewed output.\n
Sanitization: No sanitization, escaping, or validation of the content is performed before it is presented to the review sub-agents.

santa-method