Prompt Injection — AI/LLM Security Audit

Audit applications that use AI features, LLM integrations, or AI agents for prompt injection, privilege escalation, and authorization bypass vulnerabilities.

Cross-references: threat-modeling for design-time AI risk modeling on new AI features (before this skill applies); owasp-audit for the XSS / output-rendering patterns that overlap when LLM output reaches the browser (sanitize on render, JSON-LD breakout); api-audit for the API surface that LLM tools and MCP servers expose; ai-risk-management for the broader governance frame this skill sits within — prompt injection is the security slice of AI risk; AI RMF covers the rest (fairness, robustness, transparency, drift, lifecycle).

Background

Prompt injection is the #1 vulnerability in LLM-integrated applications (OWASP Top 10 for LLMs, LLM01). It occurs when untrusted input influences the instructions an LLM follows, causing it to ignore its system prompt, leak secrets, or take unauthorized actions.

Three attack classes:

Direct injection: Attacker provides malicious input directly to the LLM (e.g., chat input, form field processed by AI)
Indirect injection: Attacker plants malicious instructions in data the LLM will later consume (e.g., web pages, emails, documents, database records, tool outputs, RAG chunks)
Cross-privilege injection: Lower-privileged user plants injection in shared data that a higher-privileged user's AI session consumes, escalating privileges through the AI layer

prompt-injection

Prompt Injection — AI/LLM Security Audit

Background

Methodology

Step 1: Map the AI Attack Surface