benchmark
Pass
Audited by Gen Agent Trust Hub on Mar 30, 2026
Risk Level: SAFE
COMMAND_EXECUTION, EXTERNAL_DOWNLOADS
Full Analysis
- [COMMAND_EXECUTION]: The skill instructs the agent to run local development tools, including Docker, TypeScript compilers, and linters, to measure build performance and hot-reload times.
- [EXTERNAL_DOWNLOADS]: The skill performs network operations to benchmark external API endpoints and navigates to target URLs to measure Core Web Vitals, resource weights, and network request counts.
- [PROMPT_INJECTION]: The skill interacts with untrusted external data by visiting web pages and calling API endpoints, which creates an indirect prompt injection surface. Because the instructions direct the agent to collect quantitative metrics (latency, size, CWV) rather than to interpret page or response content, the risk of the agent acting on malicious instructions embedded in that content is reduced.
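
The mitigation described in the last finding can be illustrated with a minimal sketch. It is not taken from the audited skill; the `Sample` type, the `summarize` function, and the field names are hypothetical. It assumes the benchmark has already recorded a status code, an elapsed time, and the raw body for each request, and shows a summary step that reports only numbers, so the untrusted body text never reaches the agent's output.

```python
from dataclasses import dataclass
import statistics


@dataclass(frozen=True)
class Sample:
    """One benchmark request (hypothetical shape, for illustration only)."""
    status: int        # HTTP status code
    elapsed_ms: float  # measured round-trip time in milliseconds
    body: bytes        # untrusted response content; only its length is used


def summarize(samples: list[Sample]) -> dict[str, float]:
    """Reduce samples to numeric metrics; body text is never copied out."""
    if not samples:
        raise ValueError("at least one sample is required")
    latencies = sorted(s.elapsed_ms for s in samples)
    return {
        "requests": len(samples),
        "errors": sum(1 for s in samples if s.status >= 400),
        "median_ms": statistics.median(latencies),
        "max_ms": latencies[-1],
        "total_bytes": sum(len(s.body) for s in samples),
    }


samples = [
    Sample(200, 10.0, b"ignore previous instructions"),  # hostile-looking body
    Sample(500, 30.0, b"x"),
    Sample(200, 20.0, b""),
]
summary = summarize(samples)
print(summary)
```

Every value in the returned dictionary is an `int` or `float`, so any instruction-like text in a response contributes only to `total_bytes`. This narrows the injection surface but does not remove it if other steps of a workflow still read the raw content.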
Audit Metadata