skill-testing
Fail
Audited by Socket on Mar 6, 2026
1 alert found:
Obfuscated File (HIGH)
templates/eval.sh
The script is a benign tool for automated evaluation of agent skills using remote LLMs. No direct malicious code, backdoors, or hardcoded secrets were found. The dominant security risk is privacy/exfiltration: SKILL_CONTENT and scenario data are sent to third-party LLM providers (via claude/opencode CLIs). Mitigations: avoid running with sensitive skill files, run CLIs from trusted installs, consider local-only models or on-premise LLMs, and redact secrets before including files in SKILL_CONTENT. The script's tmp-file deletion guard is a positive safety measure.
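The redaction mitigation above can be sketched as a small pre-processing step, run on a skill file before its text is placed in SKILL_CONTENT. This is a minimal sketch, not part of the audited script: the function name `redact_secrets`, the `[REDACTED]` placeholder, and the two patterns are assumptions chosen for illustration, and they catch only obvious `key=value` credentials and bearer tokens.

```python
import re

# Illustrative patterns only (assumed, not taken from templates/eval.sh).
# Each pattern has three groups: (label)(separator)(secret value).
SECRET_PATTERNS = [
    # key=value or key: value where the key name suggests a credential
    re.compile(
        r"(?i)\b([A-Z0-9_]*(?:api[_-]?key|secret|token|password)[A-Z0-9_]*)"
        r"([ \t]*[:=][ \t]*)(\S+)"
    ),
    # HTTP-style bearer tokens, e.g. "Authorization: Bearer abc.def"
    re.compile(r"(?i)\b(Bearer)([ \t]+)(\S+)"),
]


def redact_secrets(text: str) -> str:
    """Replace likely secret values with a placeholder, keeping the label."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub(r"\1\2[REDACTED]", text)
    return text


if __name__ == "__main__":
    sample = "name: demo\nAPI_KEY=sk-12345\npassword: hunter2\n"
    print(redact_secrets(sample))
```

Pattern-based redaction is best-effort: it will miss secrets that do not follow these shapes, so it complements rather than replaces the other mitigations, such as keeping sensitive skill files out of the evaluation or using a local model.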
Confidence: 98%
Audit Metadata