solve-workflow
Pass
Audited by Gen Agent Trust Hub on May 16, 2026
Risk Level: SAFEPROMPT_INJECTIONCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONDATA_EXFILTRATION
Full Analysis
- [PROMPT_INJECTION]: The skill defines an 'Automatic Mode' (🤖 自动) that instructs the agent to bypass user confirmation checkpoints across several critical phases, including plan selection and execution. This level of autonomy significantly reduces the user's ability to review or intercept harmful actions proposed by the model.
- [COMMAND_EXECUTION]: The workflow authorizes the use of shell commands (
Bash) during the implementation and verification phases (Stages 5 and 6). These tools allow the agent to modify the system environment or execute arbitrary code with the user's local privileges. - [REMOTE_CODE_EXECUTION]: The 'Environment Capability Exploration' section directs the agent to dynamically discover and invoke other skills or agents available in the environment by matching keywords such as 'debug', 'execute', or 'review'. This pattern facilitates the execution of secondary tools that may not have been explicitly vetted by the user.
- [DATA_EXFILTRATION]: The skill enables network access via
WebSearchin Stage 1.2. While intended for technical research, this capability provides an egress point that could be exploited to transmit information to external servers. - [DATA_EXFILTRATION]: The skill is vulnerable to Indirect Prompt Injection due to its data processing pipeline. Evidence chain:
- Ingestion points: The agent reads codebase content and files via
Read,Grep, andSemanticSearchtools in Stage 1.2 (SKILL.md). - Boundary markers: The instructions lack explicit delimiters or warnings to ignore embedded instructions within the files being analyzed.
- Capability inventory: The agent possesses powerful capabilities including
Edit,Write, andBashexecution in Stage 5 (SKILL.md). - Sanitization: There are no verification or sanitization steps to ensure that the code or plans suggested in Stages 2-4 are not influenced by malicious instructions embedded in the data ingested during Stage 1.
Audit Metadata