gaia-debugging

Warn

Audited by Gen Agent Trust Hub on Jun 13, 2026

Risk Level: MEDIUMREMOTE_CODE_EXECUTIONCOMMAND_EXECUTIONEXTERNAL_DOWNLOADSPROMPT_INJECTION
Full Analysis
  • [REMOTE_CODE_EXECUTION]: The skill uses npx @claude-flow/cli@latest to download and execute code directly from the npm registry. This represents a significant supply chain risk as the package is not from a recognized trusted organization and the latest version is pulled dynamically at runtime.
  • [COMMAND_EXECUTION]: Multiple steps utilize node -e to execute inline JavaScript logic. This pattern allows for the execution of complex, dynamically generated code that can bypass simple static analysis and potentially perform unauthorized operations on the host system.
  • [EXTERNAL_DOWNLOADS]: The use of npx to fetch @claude-flow/cli involves downloading code from an external source. Without version pinning or source verification, this exposes the environment to potential malicious package updates or typosquatting attacks.
  • [PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection through its diagnostic workflow:
  • Ingestion points: It reads raw benchmark traces from ~/.cache/ruflo/gaia/results-latest.json (SKILL.md Step 1).
  • Boundary markers: There are no boundary markers or instructions to the agent to disregard embedded commands within the trace data.
  • Capability inventory: The skill has access to the Bash tool and broad execution capabilities via node and npx.
  • Sanitization: Trace data, which may contain content from external websites or tool outputs encountered during the benchmark, is displayed and processed without any sanitization or filtering.
Audit Metadata
Risk Level
MEDIUM
Analyzed
Jun 13, 2026, 01:22 PM
Security Audit — agent-trust-hub — gaia-debugging