hebrew-llm-eval-suite
Warn
Audited by Snyk on May 15, 2026
Risk Level: MEDIUM
Full Analysis
MEDIUM W011: Third-party content exposure detected (indirect prompt injection risk).
- Third-party content exposure detected (high risk: 1.00). The runner explicitly loads public, user-contributed datasets from HuggingFace (see load_benchmark in scripts/run_eval.py which calls datasets.load_dataset with HuggingFace IDs and SKILL.md Step 3 / references/benchmark-catalog.md that list HF dataset URLs), so untrusted third‑party content is ingested and directly used to drive prompts, scoring, and model-selection decisions.
MEDIUM W012: Unverifiable external dependency detected (runtime URL that controls agent).
- Potentially malicious external URL detected (high risk: 0.90). The runner calls datasets.load_dataset with HuggingFace dataset IDs (e.g. https://huggingface.co/datasets/pig4431/HeQ_v1, https://huggingface.co/datasets/HebArabNlpProject/HebrewSentiment, https://huggingface.co/datasets/HebArabNlpProject/HebNLI) which are fetched at runtime and their content is injected verbatim into the model prompts, so external content directly controls prompts and is a required dependency.
Issues (2)
W011
MEDIUMThird-party content exposure detected (indirect prompt injection risk).
W012
MEDIUMUnverifiable external dependency detected (runtime URL that controls agent).
Audit Metadata