fine-tuning-with-trl
Warn
Audited by Snyk on Apr 21, 2026
Risk Level: MEDIUM
Full Analysis
MEDIUM W011: Third-party content exposure detected (indirect prompt injection risk).
- Third-party content exposure detected (high risk: 0.90). The SKILL.md explicitly loads public Hugging Face datasets and repos, e.g. load_dataset("trl-lib/Capybara"), load_dataset("trl-lib/ultrafeedback_binarized"), and the CLI example --dataset_name argilla/Capybara-Preferences. These are open, user-contributed third-party datasets that the skill ingests as training inputs, so their contents can materially influence model behavior and downstream actions.
MEDIUM W012: Unverifiable external dependency detected (runtime URL that controls agent).
- Potentially malicious external URL detected (high risk: 0.80). The skill references datasets and models by Hub identifier at runtime. For example, load_dataset("trl-lib/Capybara") maps to https://huggingface.co/datasets/trl-lib/Capybara; other identifiers include trl-lib/ultrafeedback_binarized, trl-internal-testing/descriptiveness-sentiment-trl-style, argilla/Capybara-Preferences, and trl-lib/tldr, plus model checkpoints such as Qwen/Qwen2.5-0.5B. These resources are fetched during runtime, and the datasets directly supply the prompt/completion data used to train the model and shape its behavior, which makes them a required external dependency that controls the training prompts.
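To make the W012 finding concrete, the sketch below lists the identifiers quoted in the findings and resolves each to the Hub page it would be fetched from. The helper names `hub_url` and `external_dependencies` are hypothetical and exist only for this illustration; the URL layout (datasets under `/datasets/<id>`, models under `/<id>`) follows the mapping the finding itself cites for trl-lib/Capybara. Nothing here contacts the network.

```python
HUB_BASE = "https://huggingface.co"

# Identifiers quoted in the audit findings.
DATASET_IDS = [
    "trl-lib/Capybara",
    "trl-lib/ultrafeedback_binarized",
    "trl-internal-testing/descriptiveness-sentiment-trl-style",
    "argilla/Capybara-Preferences",
    "trl-lib/tldr",
]
MODEL_IDS = ["Qwen/Qwen2.5-0.5B"]


def hub_url(repo_id: str, repo_type: str = "model") -> str:
    """Return the Hub page URL for a repo identifier (hypothetical helper)."""
    if repo_type == "dataset":
        return f"{HUB_BASE}/datasets/{repo_id}"
    if repo_type == "model":
        return f"{HUB_BASE}/{repo_id}"
    raise ValueError(f"unsupported repo_type: {repo_type!r}")


def external_dependencies() -> list[str]:
    """List every external URL implied by the identifiers above."""
    urls = [hub_url(d, "dataset") for d in DATASET_IDS]
    urls += [hub_url(m, "model") for m in MODEL_IDS]
    return urls


if __name__ == "__main__":
    for url in external_dependencies():
        print(url)
```

Each printed URL is a location whose contents can change between runs unless the caller pins a specific revision; `datasets.load_dataset` accepts a `revision` argument for that purpose, though the audit does not say whether the skill uses it.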
Issues (2)
W011
MEDIUM: Third-party content exposure detected (indirect prompt injection risk).
W012
MEDIUM: Unverifiable external dependency detected (runtime URL that controls agent).
Audit Metadata