gaia-debugging
Warn
Audited by Gen Agent Trust Hub on Jun 13, 2026
Risk Level: MEDIUMREMOTE_CODE_EXECUTIONCOMMAND_EXECUTIONEXTERNAL_DOWNLOADSPROMPT_INJECTION
Full Analysis
- [REMOTE_CODE_EXECUTION]: The skill uses
npx @claude-flow/cli@latestto download and execute code directly from the npm registry. This represents a significant supply chain risk as the package is not from a recognized trusted organization and the latest version is pulled dynamically at runtime. - [COMMAND_EXECUTION]: Multiple steps utilize
node -eto execute inline JavaScript logic. This pattern allows for the execution of complex, dynamically generated code that can bypass simple static analysis and potentially perform unauthorized operations on the host system. - [EXTERNAL_DOWNLOADS]: The use of
npxto fetch@claude-flow/cliinvolves downloading code from an external source. Without version pinning or source verification, this exposes the environment to potential malicious package updates or typosquatting attacks. - [PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection through its diagnostic workflow:
- Ingestion points: It reads raw benchmark traces from
~/.cache/ruflo/gaia/results-latest.json(SKILL.md Step 1). - Boundary markers: There are no boundary markers or instructions to the agent to disregard embedded commands within the trace data.
- Capability inventory: The skill has access to the
Bashtool and broad execution capabilities vianodeandnpx. - Sanitization: Trace data, which may contain content from external websites or tool outputs encountered during the benchmark, is displayed and processed without any sanitization or filtering.
Audit Metadata