self-improvement

Pass

Audited by Gen Agent Trust Hub on Apr 21, 2026

Risk Level: SAFE (PROMPT_INJECTION, DATA_EXFILTRATION, COMMAND_EXECUTION)
Full Analysis
  • [PROMPT_INJECTION]: The skill implements an indirect prompt injection surface by instructing the AI to ingest and store untrusted data from user feedback and command outputs in local files (.learnings/). This data is then used to influence the agent's future behavior during 'review' phases.
      • Ingestion Points: User corrections ('Actually, it should be...'), command error messages, and feature requests are captured directly from the session context (identified in SKILL.md).
      • Boundary Markers: The skill uses Markdown templates for structure but lacks explicit sanitization or instructions to ignore malicious commands embedded within user-provided feedback.
      • Capability Inventory: The agent is granted capabilities to write to the file system, execute provided shell scripts (extract-skill.sh), and modify permanent project context files like CLAUDE.md and AGENTS.md (identified in SKILL.md).
      • Sanitization: No sanitization or validation of the ingested external content is performed before it is written to the project's learning repository.
  • [DATA_EXFILTRATION]: The skill's instructions for error logging (ERRORS.md) encourage the AI to record the 'Actual error message or output' and the 'Input or parameters used.' This practice risks exposing sensitive data, such as API keys, session tokens, or environment variables, if they appear in command execution results or user-supplied parameters (identified in SKILL.md).
  • [COMMAND_EXECUTION]: The skill includes several bash scripts (scripts/activator.sh, scripts/error-detector.sh, scripts/extract-skill.sh) intended to be used as platform hooks or automation tools. These scripts execute shell commands to output context, read environment variables (CLAUDE_TOOL_OUTPUT), and generate new file/directory structures on the local file system.
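The missing sanitization and boundary markers noted in the findings above could be addressed along the following lines. This is a minimal sketch, not part of the audited skill: the function names (`redact_secrets`, `wrap_untrusted`, `prepare_log_entry`), the marker strings, and the secret patterns are all hypothetical, and the patterns are deliberately simple and will both miss some secrets and over-redact some harmless text.

```python
import re

# Hypothetical patterns (assumption): a real deployment would need a
# broader, tested set of credential formats.
SECRET_PATTERNS = [
    # NAME=value or NAME: value where the name suggests a credential
    re.compile(
        r"(?i)\b([A-Z0-9_]*(?:API_?KEY|TOKEN|SECRET|PASSWORD)[A-Z0-9_]*)"
        r"(\s*[=:]\s*)(\S+)"
    ),
    # HTTP bearer tokens
    re.compile(r"(?i)\b(Bearer)(\s+)([A-Za-z0-9._~+/=-]+)"),
]

# Hypothetical boundary markers (assumption): any distinctive pair works,
# provided the review prompt tells the agent to treat the wrapped text as data.
BEGIN = "<<<UNTRUSTED_CONTENT"
END = "UNTRUSTED_CONTENT>>>"


def redact_secrets(text: str) -> str:
    """Replace the value part of anything that looks like a credential."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub(
            lambda m: f"{m.group(1)}{m.group(2)}[REDACTED]", text
        )
    return text


def wrap_untrusted(text: str) -> str:
    """Fence untrusted text so it cannot close the fence itself."""
    cleaned = text.replace(BEGIN, "[marker removed]")
    cleaned = cleaned.replace(END, "[marker removed]")
    return f"{BEGIN}\n{cleaned}\n{END}"


def prepare_log_entry(raw: str) -> str:
    """Redact first, then fence, before writing to a learnings file."""
    return wrap_untrusted(redact_secrets(raw))
```

A call such as `prepare_log_entry(error_output)` would run before anything is appended to ERRORS.md. Markers alone do not neutralize injected instructions; they only help if the 'review' phase explicitly instructs the agent to treat fenced content as inert data.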
Audit Metadata
Risk Level: SAFE
Analyzed: Apr 21, 2026, 11:51 AM