plotting-agent

Fail

Audited by Gen Agent Trust Hub on Apr 14, 2026

Risk Level: HIGHREMOTE_CODE_EXECUTIONEXTERNAL_DOWNLOADSCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [REMOTE_CODE_EXECUTION]: The skill facilitates the execution of code from an unverified external source. The script 'scripts/paperbanana_render.py' dynamically adds 'PAPERBANANA_PATH' to the Python system path and imports multiple modules from it. This path is intended to contain a clone of a third-party repository (https://github.com/dwzhu-pku/PaperBanana). \n- [EXTERNAL_DOWNLOADS]: Documentation in 'references/paperbanana-cookbook.md' directs the agent or user to download the PaperBanana toolkit from GitHub and install its dependencies. This repository is owned by an individual user and is not associated with a trusted organization or verified service. \n- [COMMAND_EXECUTION]: The skill requires the host agent to generate Python code for matplotlib based on user-supplied data and then execute that code in the local environment. This creates a risk where malicious input could influence the generated code to perform unauthorized actions. \n- [PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection by processing untrusted user data. \n
  • Ingestion points: Data enters the agent context through 'workspace/inputs/idea.md' and 'workspace/inputs/experimental_log.md'. \n
  • Boundary markers: No explicit delimiters or instructions to ignore embedded commands are present in the figure spec or caption prompts. \n
  • Capability inventory: The skill can execute Python/matplotlib code via the Bash tool and performs dynamic module loading in 'scripts/paperbanana_render.py'. \n
  • Sanitization: There is no evidence of input validation or sanitization before external content is used in prompts.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Apr 14, 2026, 02:01 PM