harbor-adapter-creator

Fail

Audited by Gen Agent Trust Hub on Apr 7, 2026

Risk Level: HIGHREMOTE_CODE_EXECUTIONEXTERNAL_DOWNLOADSCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [REMOTE_CODE_EXECUTION]: The documentation includes a template for test.sh that installs the uv package manager by fetching an installation script from the official astral.sh domain and piping it to a shell. While this domain is the official source for this well-known tool, the pattern involves executing code fetched directly from the internet.\n- [EXTERNAL_DOWNLOADS]: The skill facilitates downloading benchmark datasets and assets from official domains including OpenAI's public storage, Hugging Face, and GitHub repositories as part of the data loading and task generation process.\n- [COMMAND_EXECUTION]: The core functionality involves generating and executing Python and shell scripts (adapter.py, run_adapter.py, test.sh, solve.sh) and dynamically constructing Dockerfiles for benchmark environments.\n- [PROMPT_INJECTION]: The skill framework presents a surface for indirect prompt injection (Category 8).\n
  • Ingestion points: Benchmark data is ingested from external CSV and JSONL files or remote repositories (e.g., SimpleQA and GAIA datasets).\n
  • Boundary markers: The provided templates for instruction.md and test.sh do not include delimiters to isolate untrusted benchmark data or instructions to ignore potentially malicious embedded commands.\n
  • Capability inventory: The generated tasks involve the creation and execution of shell scripts and instructions that incorporate external data from benchmark instances.\n
  • Sanitization: No automated sanitization is implemented in the adapter templates; the documentation explicitly notes that unescaped characters in benchmark data can disrupt generated scripts.
Recommendations
  • HIGH: Downloads and executes remote code from: https://astral.sh/uv/0.9.7/install.sh - DO NOT USE without thorough review
Audit Metadata
Risk Level
HIGH
Analyzed
Apr 7, 2026, 09:58 AM