systematic-debugging

Warn

Audited by Gen Agent Trust Hub on May 7, 2026

Risk Level: MEDIUMCREDENTIALS_UNSAFECOMMAND_EXECUTIONEXTERNAL_DOWNLOADSDATA_EXFILTRATIONPROMPT_INJECTION
Full Analysis
  • [CREDENTIALS_UNSAFE]: The SKILL.md file contains examples for diagnostic instrumentation that suggest logging sensitive information. Specifically, it includes commands like env | grep IDENTITY to display environment variable values and security list-keychains / security find-identity -v to inspect macOS keychain and signing identities. This practice can lead to the accidental exposure of secrets and private credentials in logs or agent conversation history.\n- [COMMAND_EXECUTION]: The find-polluter.sh script executes npm test on files discovered through a user-provided search pattern. If the codebase contains malicious files or the search pattern is manipulated, this could be used to execute arbitrary code. The skill also encourages the execution of various shell commands for diagnostic purposes (e.g., codesign).\n- [EXTERNAL_DOWNLOADS]: The 'Research phase' described in SKILL.md instructs the agent to use web search and fetch tools (WebSearch, WebFetch) to gather external information from the internet and GitHub. This behavior introduces a surface for the agent to process and act upon untrusted remote content.\n- [DATA_EXFILTRATION]: During the 'Research phase', the skill instructs the agent to share bug context (error messages, code snippets, and reproduction steps) with external subagents and web search tools. This results in the transmission of internal project data to third-party services.\n- [PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection via its 'Research phase'.\n
  • Ingestion points: Untrusted data enters via WebSearch and WebFetch tool outputs from external websites and GitHub issues.\n
  • Boundary markers: None identified in the instructions for processing external research findings.\n
  • Capability inventory: The skill has access to powerful tools including Bash for command execution (npm test, security, codesign) and file system access.\n
  • Sanitization: No evidence of sanitization or validation of the content retrieved from external sources before it is analyzed by the agent.
Audit Metadata
Risk Level
MEDIUM
Analyzed
May 7, 2026, 09:20 AM