verification-loops

Pass

Audited by Gen Agent Trust Hub on Apr 12, 2026

Risk Level: SAFEPROMPT_INJECTION
Full Analysis
  • [PROMPT_INJECTION]: The LLMJudgeGrader implementation in SKILL.md presents a surface for indirect prompt injection by interpolating untrusted agent outputs directly into an evaluation prompt without delimiters.\n
  • Ingestion points: The output parameter in the evaluate method of LLMJudgeGrader.\n
  • Boundary markers: Absent; the agent output is concatenated directly into the JUDGE_PROMPT string template.\n
  • Capability inventory: The result of this grader determines the control flow of the agent pipeline (acceptance or rejection of generated content).\n
  • Sanitization: None; the skill provides a conceptual implementation without input escaping or validation.
Audit Metadata
Risk Level
SAFE
Analyzed
Apr 12, 2026, 07:03 PM
Security Audit — agent-trust-hub — verification-loops