sf-eval
Pass
Audited by Gen Agent Trust Hub on Mar 28, 2026
Risk Level: SAFECOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]: The skill executes a local bash script
evals/checks/static-checks.shwhen a user requests a static check on a file. This is an intended function for code quality verification and is restricted to the local environment.\n- [PROMPT_INJECTION]: The instructions for 'Baseline Generation' guide the AI to simulate a generic model by intentionally omitting platform-specific security patterns such asWITH USER_MODE. This is a documented benchmarking technique used to quantify the value of skill-provided context and does not target the agent's safety guardrails.\n- [DATA_EXPOSURE]: The skill reads local benchmark definitions, rubrics, and judge prompts from the project'sevals/directory to facilitate the comparison and scoring process. These operations are confined to the local repository context.
Audit Metadata