diagnose-hard-problem

Pass

Audited by Gen Agent Trust Hub on May 14, 2026

Risk Level: SAFECOMMAND_EXECUTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill facilitates the execution of various diagnostic tools and scripts (e.g., git bisect, curl, Playwright, and custom shell scripts) to reproduce and debug issues. This is consistent with its stated purpose of software diagnosis.
  • [INDIRECT_PROMPT_INJECTION]: The skill exhibits an attack surface for indirect prompt injection due to the way it handles external data during the diagnosis process.
  • Ingestion points: The skill processes external data including captured network traces, error logs, and direct user responses from the hitl-loop.template.sh script.
  • Boundary markers: The instructions do not specify the use of boundary markers or instructions for the agent to ignore potentially malicious content within the ingested data.
  • Capability inventory: The agent has the capability to execute shell commands, run automated tests, perform network requests via curl, and modify the codebase.
  • Sanitization: There are no explicit sanitization or validation steps for the content of logs, traces, or user-provided strings before they are presented to the agent for analysis.
Audit Metadata
Risk Level
SAFE
Analyzed
May 14, 2026, 11:22 PM
Security Audit — agent-trust-hub — diagnose-hard-problem