skillgrade-graders

Pass

Audited by Gen Agent Trust Hub on Mar 18, 2026

Risk Level: SAFE
Full Analysis
  • [COMMAND_EXECUTION]: The skill instructs the user on how to author and execute bash or TypeScript scripts (e.g., bash graders/check.sh) to perform deterministic scoring. This is the core intended functionality of the evaluation framework.
  • [CREDENTIALS_UNSAFE]: The error handling section mentions the need to set GEMINI_API_KEY or ANTHROPIC_API_KEY in the environment to resolve LLM-related failures. There are no hardcoded secrets or evidence of credential exfiltration.
  • [SAFE]: The skill uses structured JSON schemas for outputs and provides validation procedures, which are standard practices for development and evaluation tasks.
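For readers unfamiliar with the pattern these findings describe, the following is a minimal sketch of a deterministic grader that emits structured JSON. It is illustrative only: the `GraderResult` shape, the function names `gradeOutput` and `hasLlmApiKey`, and the phrase-matching check are all assumptions made for this example and are not taken from the audited skill, whose actual schema is not reproduced in this report. Only the two environment variable names come from the findings.

```typescript
// Hypothetical result shape; the skill's real JSON schema is not shown in this audit.
interface GraderResult {
  score: number;   // fraction of checks passed, from 0 to 1
  details: string; // human-readable summary of what failed, if anything
}

// Deterministic scoring: the same output and phrase list always yield the same result.
function gradeOutput(output: string, requiredPhrases: string[]): GraderResult {
  const missing = requiredPhrases.filter((phrase) => !output.includes(phrase));
  const passed = requiredPhrases.length - missing.length;
  // An empty check list is treated as a full pass (an assumption of this sketch).
  const score = requiredPhrases.length === 0 ? 1 : passed / requiredPhrases.length;
  const details =
    missing.length === 0 ? "all checks passed" : `missing: ${missing.join(", ")}`;
  return { score, details };
}

// Returns true when at least one of the API keys named in the findings is set.
// The environment is passed in explicitly so no secret is ever hardcoded.
function hasLlmApiKey(env: Record<string, string | undefined>): boolean {
  return Boolean(env["GEMINI_API_KEY"] || env["ANTHROPIC_API_KEY"]);
}

// Example run: serialize the result as JSON so a caller can validate it.
const result = gradeOutput("build ok, tests ok", ["build ok", "tests ok", "lint ok"]);
console.log(JSON.stringify(result));
```

Passing the environment in as an argument, instead of reading it inside the function, keeps the key check testable and matches the finding that credentials are supplied through the environment and not embedded in the skill.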
Audit Metadata
Risk Level: SAFE
Analyzed: Mar 18, 2026, 03:21 AM