llm-evaluation
Pass
Audited by Gen Agent Trust Hub on May 29, 2026
Risk Level: SAFE
Full Analysis
- [SAFE]: The skill uses established open-source libraries and official APIs from well-known providers to perform model evaluations. The code follows standard practices for calculating automated metrics, conducting A/B tests, and tracking performance regressions.
- [PROMPT_INJECTION]: The skill implements LLM-as-judge patterns which are inherently susceptible to indirect prompt injection where the data being evaluated attempts to influence the evaluator.
- Ingestion points: Model responses (e.g., in
llm_judge_quality) are interpolated directly into prompts processed by an evaluator LLM. - Boundary markers: Absent; the prompts use simple labels like 'Response:' without specialized delimiters or instructions to ignore nested commands.
- Capability inventory: The judging functions return structured data and do not have access to dangerous system capabilities, file system writes, or unauthorized network operations.
- Sanitization: None; input is passed directly to the model. This is standard behavior for evaluation tools and the risk is restricted to the accuracy of the resulting metrics.
Audit Metadata