benchmark-models
Warn
Audited by Snyk on May 20, 2026
Risk Level: MEDIUM
Full Analysis
MEDIUM W011: Third-party content exposure detected (indirect prompt injection risk).
- Third-party content exposure detected (high risk: 0.80). The workflow explicitly runs external model providers (Claude, GPT via Codex CLI, Gemini) in Step 4 and streams/interprets their outputs in Step 5 (and optionally an Anthropic judge in Step 3), so the agent ingests untrusted third‑party model responses as part of its decision-making.
MEDIUM W012: Unverifiable external dependency detected (runtime URL that controls agent).
- Potentially malicious external URL detected (high risk: 0.80). The skill performs a runtime git fetch/merge from the remote "origin" for the gstack home (i.e., the repo URL configured in $GSTACK_HOME/.git/config, fetched via
git fetch origin), which will pull remote repo changes that can modify skill prompts/binaries and thus directly control agent behavior.
Issues (2)
W011
MEDIUMThird-party content exposure detected (indirect prompt injection risk).
W012
MEDIUMUnverifiable external dependency detected (runtime URL that controls agent).
Audit Metadata