advanced-evaluation
Pass
Audited by Gen Agent Trust Hub on Apr 29, 2026
Risk Level: SAFE
Full Analysis
- [SAFE]: The skill is entirely instructional and provides boilerplate code for evaluation metrics and bias mitigation strategies. All provided scripts and references align with the stated purpose and follow established industry best practices for model evaluation.
- [PROMPT_INJECTION]: The skill defines a surface for indirect prompt injection, as it is designed to ingest and evaluate untrusted LLM outputs. This is a characteristic of the intended use case (LLM-as-a-judge) rather than a malicious pattern. The risk is low because the skill does not request or utilize any dangerous tools, such as network access or shell execution, that could be abused if an injection were successful.
- Ingestion points: Untrusted data enters the agent context via variables like
response,response_a, andresponse_bin the evaluation scripts and templates (e.g.,scripts/evaluation_example.pyandreferences/full-guide.md). - Boundary markers: The provided prompt templates use clear markdown headers (e.g.,
## Response to Evaluate) to isolate third-party data from the evaluation instructions. - Capability inventory: No dangerous operations, such as subprocess calls, file-system writes, or network requests, are present in the provided scripts or references.
- Sanitization: The skill uses basic input validation for non-empty strings but does not perform content-based sanitization, which is typical for text evaluation templates.
Audit Metadata