Agent Evaluation Framework Builder
Pass
Audited by Gen Agent Trust Hub on Mar 29, 2026
Risk Level: SAFEPROMPT_INJECTION
Full Analysis
- [PROMPT_INJECTION]: The skill describes an LLM-as-judge evaluation pattern that is vulnerable to indirect prompt injection. This occurs when the response being evaluated contains instructions that manipulate the judge model's scoring behavior.
- Ingestion points: The
actual_responsevariable in theJUDGE_PROMPTtemplate withinSKILL.mdis the point where untrusted data enters the judge's context. - Boundary markers: The provided template lacks boundary markers (such as XML tags or triple backticks) to separate the instruction block from the variable data, increasing the risk that the model will follow instructions contained within the response.
- Capability inventory: The judge model has the capability to generate scores and reasoning which directly impact the evaluation metrics and CI/CD pass/fail status.
- Sanitization: No sanitization or input validation is performed on the
actual_responsecontent before it is interpolated into the judge prompt.
Audit Metadata