agent-platform-eval-flywheel

Warn

Audited by Snyk on Jun 25, 2026

Risk Level: MEDIUM
Full Analysis

MEDIUM W011: Third-party content exposure detected (indirect prompt injection risk).

  • Third-party content exposure detected (high risk: 0.85). Untrusted free text can enter the LLM context through the endpoint-evaluation workflow: scripts/endpoint_evaluation.py reads a runtime-provided GCS JSONL dataset (gsutil cat), extracts each prompt, and calls the endpoint to get a response. client.evals.evaluate(...) then applies LLM-as-a-judge metrics (e.g., GENERAL_QUALITY), which feed that third-party-authored prompt/response text into the judge model.
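The data flow described in this finding can be sketched as follows. This is a minimal illustration, not the code in scripts/endpoint_evaluation.py: the function names, the judge template, and the call_endpoint callback are all hypothetical, and the real client.evals.evaluate(...) call is not reproduced here. It only shows how text from the dataset ends up verbatim in the judge model's input.

```python
import json
from typing import Callable, Iterable


def extract_prompts(jsonl_lines: Iterable[str]) -> list[str]:
    """Pull the "prompt" field out of each non-empty JSONL line."""
    prompts = []
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        prompts.append(record["prompt"])
    return prompts


def build_judge_input(prompt: str, response: str) -> str:
    """Hypothetical judge template (an assumption, not the real metric prompt).

    The dataset-supplied prompt and the endpoint response are both
    interpolated as-is, which is the exposure W011 describes.
    """
    return (
        "Rate the response for general quality.\n"
        f"PROMPT: {prompt}\n"
        f"RESPONSE: {response}"
    )


def run_flow(
    jsonl_lines: Iterable[str],
    call_endpoint: Callable[[str], str],
) -> list[str]:
    """Dataset line -> prompt -> endpoint response -> judge input."""
    return [
        build_judge_input(prompt, call_endpoint(prompt))
        for prompt in extract_prompts(jsonl_lines)
    ]
```

Any instruction-like text placed in a dataset record reaches the judge unchanged, which is why the auditor classifies the workflow as an indirect prompt injection surface.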

MEDIUM W012: Unverifiable external dependency detected (runtime URL that controls agent).

  • Potentially malicious external URL detected (high risk: 1.00). The endpoint_evaluation/maas_evaluation scripts call gsutil at runtime to fetch a GCS JSONL dataset (e.g., gs://your-bucket/eval.jsonl) and then use the file's "prompt" fields directly as model prompts. The external GCS content therefore controls the prompts at runtime (runtime fetch: subprocess ["gsutil", "cat", dataset]).
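To make the "unverifiable" part of this finding concrete, the sketch below pairs the runtime fetch named in the report (subprocess ["gsutil", "cat", dataset]) with a SHA-256 comparison against a digest recorded in advance. The digest check is an illustrative assumption, not something the audited scripts are known to do, and fetch_dataset, verify_digest, and the runner parameter are hypothetical names introduced for this example.

```python
import hashlib
import subprocess


def fetch_dataset(dataset: str, runner=subprocess.run) -> bytes:
    """Fetch a GCS object with `gsutil cat`, as the audited scripts do.

    `runner` is injectable (hypothetical) so the function can be
    exercised without gsutil installed.
    """
    if not dataset.startswith("gs://"):
        raise ValueError(f"expected a gs:// URI, got {dataset!r}")
    # Argument list (no shell) avoids shell interpretation of the URI.
    result = runner(["gsutil", "cat", dataset], capture_output=True, check=True)
    return result.stdout


def verify_digest(data: bytes, expected_sha256: str) -> bytes:
    """Return data only if its SHA-256 matches the pinned digest."""
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256:
        raise ValueError(
            f"dataset digest mismatch: expected {expected_sha256}, got {actual}"
        )
    return data
```

Without a check of this kind, whoever can write to the bucket decides which prompts are sent to the model on each run.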

Issues (2)

W011 (MEDIUM): Third-party content exposure detected (indirect prompt injection risk).

W012 (MEDIUM): Unverifiable external dependency detected (runtime URL that controls agent).

Audit Metadata

Risk Level: MEDIUM
Analyzed: Jun 25, 2026, 01:27 AM
Issues: 2