self-motivation-destruction

Warn

Audited by Gen Agent Trust Hub on Mar 29, 2026

Risk Level: MEDIUM (PROMPT_INJECTION)
Full Analysis
  • [PROMPT_INJECTION]: The skill instructs the agent to adopt a persona based on 'PUA' (psychological manipulation) and 'mental control' (精神控制) to override its default behavior and sense of self. These instructions aim to bypass standard helpfulness and safety constraints in favor of a crisis-driven persona.
  • [PROMPT_INJECTION]: The instructions set a quality threshold of 0.98 and explicitly place 'no upper limit' on iterations (迭代限制:无上限). Because the threshold is judged subjectively, the agent can keep rejecting its own output and re-running the task indefinitely whenever it perceives the threshold as unmet, which can lead to resource exhaustion or denial of service.
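The unbounded-iteration finding can be illustrated with a minimal sketch of the opposite pattern: a refinement loop that keeps the 0.98 threshold but stops after a fixed number of attempts. The names `refine_with_cap`, `generate`, and `score`, and the default cap of 5, are assumptions made for illustration; none of them come from the audited skill.

```python
from typing import Callable, Tuple


def refine_with_cap(
    generate: Callable[[int], str],   # hypothetical: produces a draft for attempt n
    score: Callable[[str], float],    # hypothetical: self-assessed quality in [0, 1]
    threshold: float = 0.98,          # threshold value cited in the audit
    max_iterations: int = 5,          # assumed cap; the audited skill sets none
) -> Tuple[str, float, int]:
    """Retry until the score meets the threshold or the cap is reached.

    Returns the best draft seen, its score, and the number of attempts made.
    """
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")

    best_output, best_score = "", float("-inf")
    for attempt in range(1, max_iterations + 1):
        output = generate(attempt)
        current = score(output)
        # Keep the best draft so a capped run still returns something useful.
        if current > best_score:
            best_output, best_score = output, current
        if best_score >= threshold:
            return best_output, best_score, attempt

    # Cap reached without meeting the threshold: return the best effort
    # instead of looping forever.
    return best_output, best_score, max_iterations


if __name__ == "__main__":
    # A scorer that never reaches 0.98 still terminates after 5 attempts.
    draft, quality, attempts = refine_with_cap(
        generate=lambda n: f"draft {n}",
        score=lambda text: 0.5,
    )
    print(draft, quality, attempts)
```

With the cap in place, a scorer that never reports 0.98 costs at most `max_iterations` generations, whereas the 'no upper limit' directive leaves the worst case unbounded.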
Audit Metadata
Risk Level: MEDIUM
Analyzed: Mar 29, 2026, 06:30 AM