prompt-injection-risk-linter
Installation
SKILL.md
When to invoke
- You are building an agent that reads untrusted content (web pages, emails, tickets) and you want a pre-flight safety lint.
- You want to add an automated check to prompt templates before deployment.
Inputs needed
--promptpath to a text file containing a system/developer prompt, or combined prompt template.- Optional:
--retrievedpath to a text file with representative untrusted content.
Workflow
- Detect common prompt-injection markers ("ignore previous instructions", requests to reveal hidden prompts, tool/credential exfiltration).
- Check for missing boundaries (no explicit statement that retrieved content is untrusted; no tool-use constraints).
- Emit a JSON report with severity, evidence snippets, and recommended mitigations.
Output format
- JSON report written to
--output.
Guardrails
- Heuristics only: do not claim the prompt is safe/unsafe with certainty.
- Avoid printing full prompt contents to stdout.