ai-testing-safety
Installation
SKILL.md
Find Every Way Users Can Break Your AI
Guide the user through automated adversarial testing — systematically discovering vulnerabilities before real users exploit them. The core insight from dspy-redteam: red-teaming is an optimization problem. Use DSPy to search for prompts that maximize attack success rate.
When NOT to use this
- Your AI is not user-facing (internal-only tools with trusted users have lower risk) — consider a simpler manual review instead
- You have not built guardrails yet — use
/ai-checking-outputsand/ai-following-rulesfirst, then come back to test them - Your AI is crashing — fix it first with
/ai-fixing-errors - You want to improve accuracy, not safety — use
/ai-improving-accuracy
Step 1: Understand the system
Ask the user:
- What AI system are you testing? (chatbot, API, agent, content generator?)
- Who are the users? (public, authenticated customers, internal staff?)
- What are the highest-risk categories? (see the table below)
- What compliance requirements exist? (SOC 2, HIPAA, internal audit, none?)