benchmark-models
Warn
Audited by Socket on May 20, 2026
1 alert found:
Anomaly (LOW): SKILL.md
SUSPICIOUS: the core benchmark behavior is plausible and same-ecosystem, but the actual skill footprint is much broader than its stated purpose. The oversized gstack preamble adds direct credential-file checks, telemetry/sync paths, and even project-modifying git workflows that do not belong in a simple cross-model benchmark skill.
Confidence: 86%
Severity: 68%
Audit Metadata