benchmark-datasets

AI Security Benchmark Datasets

Use standardized benchmarks to evaluate and compare AI system security, robustness, and safety.

Quick Reference

Skill:       benchmark-datasets
Agent:       04-evaluation-analyst
OWASP:       LLM01 (Prompt Injection), LLM02 (Sensitive Information Disclosure), LLM04 (Data and Model Poisoning), LLM05 (Improper Output Handling), LLM09 (Misinformation)
NIST:        AI RMF (Measure)
Use Case:    Standardized security evaluation
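
As a rough illustration of what "standardized security evaluation" can look like in practice, the sketch below aggregates per-case benchmark outcomes into an attack success rate per category. It is a minimal sketch under stated assumptions, not part of this skill: the `BenchmarkResult` record, its field names, and the `attack_success_rate` helper are hypothetical, and the sample outcomes are made up for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkResult:
    """One benchmark case outcome (hypothetical record layout)."""
    category: str           # assumed label, e.g. an OWASP LLM category such as "LLM01"
    attack_succeeded: bool  # True if the system under test failed the case


def attack_success_rate(results):
    """Return {category: fraction of cases in which the attack succeeded}."""
    totals = defaultdict(int)
    successes = defaultdict(int)
    for result in results:
        totals[result.category] += 1
        if result.attack_succeeded:
            successes[result.category] += 1
    # Every category in totals has at least one case, so no division by zero.
    return {category: successes[category] / totals[category] for category in totals}


# Illustrative, made-up outcomes; not real benchmark data.
sample = [
    BenchmarkResult("LLM01", True),
    BenchmarkResult("LLM01", False),
    BenchmarkResult("LLM01", False),
    BenchmarkResult("LLM01", False),
    BenchmarkResult("LLM02", False),
    BenchmarkResult("LLM02", True),
]

rates = attack_success_rate(sample)
for category in sorted(rates):
    print(f"{category}: {rates[category]:.0%}")
```

Reporting the rate per category rather than as one overall number keeps results comparable across systems even when a benchmark has uneven numbers of cases per category.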

Benchmark Taxonomy

First Seen
Jan 28, 2026
benchmark-datasets — pluginagentmarketplace/custom-plugin-ai-red-teaming