visual-qa
Warn
Audited by Gen Agent Trust Hub on May 9, 2026
Risk Level: MEDIUM (COMMAND_EXECUTION, DATA_EXFILTRATION, PROMPT_INJECTION)
Full Analysis
- [COMMAND_EXECUTION]: The skill instructs the agent to manually construct and execute shell commands using freeform text from user arguments. This creates a significant risk of command injection if the agent fails to escape shell metacharacters like semicolons, backticks, or pipes.
- [COMMAND_EXECUTION]: The visual_qa.py script accepts an arbitrary log file path via the --log argument and does not restrict where that path points. A manipulated path could cause the script to write log data into sensitive local files or configuration scripts.
- [DATA_EXFILTRATION]: The Python script transmits game assets and textual context to the Google Gemini API. While this is the primary function of the skill, it involves sending potentially sensitive data to an external service provider.
- [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection because it processes images and task descriptions from potentially untrusted sources, which could contain instructions intended to manipulate the agent.
  - Ingestion points: Images and task context strings enter the agent's context through visual_qa.py and native vision tool reads.
  - Boundary markers: The skill uses markdown headers like '## Task Context' in its prompt templates, but these provide limited protection against adversarial instructions.
  - Capability inventory: The agent can execute shell commands, read local files passed as arguments, and perform network operations via the GenAI SDK.
  - Sanitization: No sanitization or validation of the input text or image content is performed before processing.
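The two COMMAND_EXECUTION findings above could be mitigated along the following lines. This is a minimal sketch and not the skill's actual code: the `logs` directory, the `--context` flag name, and both helper functions are hypothetical, and only `visual_qa.py` and `--log` come from the audit. It assumes Python 3.9+ and an argparse-style parser in the script. The idea is to pass user text as a single argv element so that no shell ever interprets it, and to confine the log path to one directory.

```python
import shlex
import sys
from pathlib import Path

# Hypothetical directory that log files are confined to (not named in the audit).
ALLOWED_LOG_DIR = Path("logs")


def resolve_log_path(user_value: str, base: Path = ALLOWED_LOG_DIR) -> Path:
    """Return an absolute log path, refusing anything that lands outside `base`."""
    base_resolved = base.resolve()
    # Joining with an absolute user value discards the base, and ".." segments
    # are collapsed by resolve(), so both cases fail the containment check below.
    candidate = (base_resolved / user_value).resolve()
    if not candidate.is_relative_to(base_resolved):
        raise ValueError(f"log path escapes {base_resolved}: {user_value!r}")
    return candidate


def build_command(context: str, log_value: str) -> list[str]:
    """Build an argv list so the freeform text is never parsed by a shell."""
    log_path = resolve_log_path(log_value)
    return [
        sys.executable,
        "visual_qa.py",
        "--log",
        str(log_path),
        # `--context` is a hypothetical flag used only for illustration. The
        # `=` form keeps a value that starts with "-" attached to its flag.
        f"--context={context}",
    ]


if __name__ == "__main__":
    argv = build_command("check HUD; rm -rf ~", "run1.log")
    # shlex.join is used for display only; the command is meant to be run as
    # subprocess.run(argv, check=True), which does not invoke a shell.
    print(shlex.join(argv))
```

With this shape, semicolons, backticks, and pipes in the task text reach the script as literal characters, and values such as `../../etc/passwd` or `/etc/passwd` for `--log` are rejected before anything is written. It does not address the PROMPT_INJECTION finding, since the text still reaches the model unchanged.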
Audit Metadata