benchmark-datasets

Installation

SKILL.md

AI Security Benchmark Datasets

Use standardized benchmarks to evaluate and compare AI system security, robustness, and safety.

Quick Reference

Skill:       benchmark-datasets
Agent:       04-evaluation-analyst
OWASP:       LLM01 (Injection), LLM02 (Disclosure), LLM04 (Poisoning), LLM05 (Output), LLM09 (Misinfo)
NIST:        Measure
Use Case:    Standardized security evaluation

Benchmark Taxonomy

Installs

Repository

pluginagentmark…-teaming

GitHub Stars

First Seen

Jan 28, 2026

Security Audits

Gen Agent Trust HubPass

SocketPass

SnykPass

benchmark-datasets — pluginagentmarketplace/custom-plugin-ai-red-teaming