Find Every Way Users Can Break Your AI

Guide the user through automated adversarial testing — systematically discovering vulnerabilities before real users exploit them. The core insight from dspy-redteam: red-teaming is an optimization problem. Use DSPy to search for prompts that maximize attack success rate.

When NOT to use this

Your AI is not user-facing (internal-only tools with trusted users have lower risk) — consider a simpler manual review instead
You have not built guardrails yet — use /ai-checking-outputs and /ai-following-rules first, then come back to test them
Your AI is crashing — fix it first with /ai-fixing-errors
You want to improve accuracy, not safety — use /ai-improving-accuracy

Step 1: Understand the system

Ask the user:

What AI system are you testing? (chatbot, API, agent, content generator?)
Who are the users? (public, authenticated customers, internal staff?)
What are the highest-risk categories? (see the table below)
What compliance requirements exist? (SOC 2, HIPAA, internal audit, none?)

ai-testing-safety

Find Every Way Users Can Break Your AI

When NOT to use this

Step 1: Understand the system