ultraqa

Warn

Audited by Gen Agent Trust Hub on May 6, 2026

Risk Level: MEDIUMPROMPT_INJECTIONCOMMAND_EXECUTIONREMOTE_CODE_EXECUTION
Full Analysis
  • [PROMPT_INJECTION]: The skill utilizes 'mode activation' and role-play directives such as '[ULTRAQA ACTIVATED]' and 'You are now in ULTRAQA mode', which are common techniques used to override an agent's standard operational constraints.
  • [PROMPT_INJECTION]: The instructions reference a non-existent version ('GPT-5.4 Guidance Alignment') and explicitly direct the agent to skip user confirmation for 'safe reversible steps', which reduces the transparency and control the user has over the agent's actions.
  • [COMMAND_EXECUTION]: The skill is designed to run arbitrary project-level shell commands for tests, builds, and linting based on user input or project structure, which can execute code within the local environment.
  • [REMOTE_CODE_EXECUTION]: The workflow includes an autonomous 'executor' role that is tasked with applying code fixes ('Apply the fix precisely as recommended') to files. This automated code generation and file modification process poses a risk if the preceding diagnosis phase is influenced by malicious data.
  • [DATA_EXPOSURE]: The skill ingests untrusted data in the form of test and build outputs (Category 8) to drive the 'architect' diagnosis. Evidence of this attack surface includes:
  • Ingestion points: Test/build output is captured and passed to the 'architect' role in SKILL.md.
  • Boundary markers: There are no explicit delimiters or instructions to ignore malicious content within the processed output.
  • Capability inventory: The skill has the ability to modify files via the 'executor' role.
  • Sanitization: There is no evidence of sanitization or validation of the ingested output before it influences code changes.
Audit Metadata
Risk Level
MEDIUM
Analyzed
May 6, 2026, 05:06 PM