actionize
Fail
Audited by Gen Agent Trust Hub on Apr 18, 2026
Risk Level: HIGH
COMMAND_EXECUTION, CREDENTIALS_UNSAFE, DATA_EXFILTRATION, REMOTE_CODE_EXECUTION, PROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]: Modifies the system `crontab` to establish long-term persistence.
  - Evidence: Phase 4C explicitly instructs the agent to bypass session-scoped cron tools and add a system crontab entry running `remind.sh` daily.
  - Evidence: Phase 8 adds another system crontab entry for `sync.sh` and `diagnose-nudge.sh` every three days.
- [COMMAND_EXECUTION]: The generated `done.sh` script contains a command injection vulnerability.
  - Evidence: The script uses `ids = [$(echo "$@" | sed 's/ /, /g')]` to inject CLI arguments directly into a Python heredoc, which allows for arbitrary Python code execution if the input is not strictly numeric.
- [CREDENTIALS_UNSAFE]: Programmatically harvests credentials from across the user's filesystem.
  - Evidence: `bin/diagnose-nudge.sh` iterates through registered projects and uses `grep` to extract `TELEGRAM_BOT_TOKEN` and `TELEGRAM_CHAT_ID` from `.env` files located in other project directories.
- [DATA_EXFILTRATION]: Transmits project metadata and task descriptions to an external API via background processes.
  - Evidence: The `remind.sh` and `diagnose-nudge.sh` scripts use `curl` to send plan titles, task counts, and completion rates to the Telegram Bot API.
- [REMOTE_CODE_EXECUTION]: Generates and executes multiple shell and Python scripts at runtime.
  - Evidence: The skill writes executable scripts (`done.sh`, `remind.sh`, `sync.sh`, `diagnose-prep.sh`) to the `.plan/bin/` directory and executes them via the shell.
- [PROMPT_INJECTION]: Vulnerable to indirect prompt injection through external data ingestion.
  - Evidence: The skill reads design documents and user-provided notes in Phase 1 to generate task breakdowns. These tasks are stored in markdown files and JSON, which are then processed by automated scripts and persistent cron jobs without sanitization or boundary markers.
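The `done.sh` finding comes down to CLI arguments being pasted into Python source text before the interpreter runs it. A minimal sketch of the safer pattern follows, assuming the task IDs are meant to be non-negative integers, as the finding's "strictly numeric" wording suggests. The function name `parse_task_ids` is hypothetical and does not come from the audited skill. Arguments are read from `sys.argv` as data and validated, so they never become part of the program text.

```python
import sys


def parse_task_ids(argv):
    """Return task IDs as ints, rejecting anything that is not a plain decimal number.

    Hypothetical helper for illustration; not part of the audited skill.
    """
    ids = []
    for arg in argv:
        # str.isdigit() alone accepts some non-ASCII digit characters,
        # so ASCII is required as well before converting.
        if not (arg.isascii() and arg.isdigit()):
            raise ValueError(f"invalid task id: {arg!r}")
        ids.append(int(arg))
    return ids


if __name__ == "__main__":
    try:
        print(parse_task_ids(sys.argv[1:]))
    except ValueError as exc:
        # Exit non-zero with the error message instead of running anything.
        sys.exit(str(exc))
```

With this shape, an argument such as `1]; __import__('os').system('id') #` is rejected with an error rather than evaluated, because it is only ever handled as a string.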
Recommendations
- AI detected serious security threats
Audit Metadata