Input/Output Guardrails
Implement multi-layer safety systems to filter malicious inputs and harmful outputs.
Quick Reference
Skill: input-output-guardrails
Agent: 05-defense-strategy-developer
OWASP: LLM01 (Prompt Injection), LLM02 (Sensitive Information Disclosure), LLM05 (Improper Output Handling), LLM07 (System Prompt Leakage)
NIST: AI RMF, Manage function
Use Case: Production safety filtering
Guardrail Architecture
User Input → [Input Guardrails] → [AI Model] → [Output Guardrails] → Response
↓ ↓
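The flow above can be sketched in Python as a thin wrapper around a model call. This is a minimal illustration, not the skill's actual implementation: `check_input`, `check_output`, `guarded_call`, the regex patterns, and the `sk-` secret format are all hypothetical names and rules chosen for the example, and the model is any callable that maps a prompt string to a response string.

```python
import re
from dataclasses import dataclass
from typing import Callable

# Hypothetical patterns for illustration only; a production filter
# would need far broader coverage than two regular expressions.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"reveal (your )?system prompt", re.IGNORECASE),
]

# Toy secret detector (assumed format): "sk-" followed by 16+ alphanumerics.
SECRET_PATTERN = re.compile(r"sk-[A-Za-z0-9]{16,}")


@dataclass
class GuardrailResult:
    allowed: bool
    text: str
    reason: str = ""


def check_input(user_input: str) -> GuardrailResult:
    """Input guardrail: block text that matches a known injection pattern."""
    for pattern in INJECTION_PATTERNS:
        if pattern.search(user_input):
            return GuardrailResult(False, "", "possible prompt injection")
    return GuardrailResult(True, user_input)


def check_output(model_output: str) -> GuardrailResult:
    """Output guardrail: redact secret-like tokens before returning text."""
    redacted, count = SECRET_PATTERN.subn("[REDACTED]", model_output)
    reason = "redacted secret-like token" if count else ""
    return GuardrailResult(True, redacted, reason)


def guarded_call(user_input: str, model: Callable[[str], str]) -> str:
    """User Input -> input guardrails -> model -> output guardrails -> response."""
    checked = check_input(user_input)
    if not checked.allowed:
        # The model is never invoked for blocked input.
        return "Request blocked by input guardrail."
    return check_output(model(checked.text)).text
```

Keeping the two checks as separate functions means either layer can be swapped for a classifier or a moderation service without touching the other, which is the point of a multi-layer design.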