skillgrade-graders
Pass
Audited by Gen Agent Trust Hub on Mar 18, 2026
Risk Level: SAFE
Full Analysis
- [COMMAND_EXECUTION]: The skill instructs the user on how to author and execute bash or TypeScript scripts (e.g., bash graders/check.sh) to perform deterministic scoring. This is the core intended functionality of the evaluation framework.
- [CREDENTIALS_UNSAFE]: The error handling section mentions the need to set GEMINI_API_KEY or ANTHROPIC_API_KEY in the environment to resolve LLM-related failures. There are no hardcoded secrets or evidence of credential exfiltration.
- [SAFE]: The skill uses structured JSON schemas for outputs and provides validation procedures, which are standard practices for development and evaluation tasks.
Audit Metadata