agent-evaluation

Pass

Audited by Gen Agent Trust Hub on May 19, 2026

Risk Level: SAFECOMMAND_EXECUTIONDATA_EXFILTRATIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill utilizes shell commands and local scripts to aggregate session data for evaluation.
  • Evidence: Documented use of cat, jq, and npx tsx eval.ts to process session JSON files and execution traces.
  • [DATA_EXFILTRATION]: Execution traces and task data are transmitted to external LLM providers (OpenAI, Azure) for scoring purposes. This is a standard operation for this type of tool but involves external data sharing.
  • Evidence: curl commands targeting api.openai.com and Azure endpoint configurations in agent-eval.yaml.
  • [PROMPT_INJECTION]: The skill processes untrusted agent outputs and execution traces within an evaluation template, creating a surface for indirect prompt injection.
  • Ingestion points: TASK, TRACE, and OUTPUT variables extracted from .reflection/session_*.json and shell environment.
  • Boundary markers: None present in the evaluation prompt template to distinguish untrusted content.
  • Capability inventory: Network access via curl and file access via cat.
  • Sanitization: No sanitization or validation of the agent-generated trace content is implemented before it is sent to the LLM judge.
Audit Metadata
Risk Level
SAFE
Analyzed
May 19, 2026, 05:13 AM