evaluate-skill
Pass
Audited by Gen Agent Trust Hub on Jun 28, 2026
Risk Level: SAFE
Flags: COMMAND_EXECUTION, REMOTE_CODE_EXECUTION, EXTERNAL_DOWNLOADS
Full Analysis
- [EXTERNAL_DOWNLOADS]: The skill instructs the user to install the `caliper-eval` package via `pipx` and references a third-party Homebrew tap (`steipete/tap/summarize`) for the `summarize` tool used in evaluation examples.
- [REMOTE_CODE_EXECUTION]: The Caliper evaluation framework executes Python code snippets provided in the `assert` field of `.eval.yaml` files. This dynamic execution is used to deterministically verify task outcomes, such as checking whether a file exists or matches a pattern.
- [COMMAND_EXECUTION]: The skill uses shell commands to run evaluations, manage files, and execute helper scripts. This includes a reference to a screenshot helper that uses `powershell -ExecutionPolicy Bypass` on Windows to capture desktop images.
- [DATA_EXPOSURE]: The `summarize` reference skill provides instructions for configuring API keys for various LLM providers (OpenAI, Anthropic, Google, xAI) via environment variables or a local config file (`~/.summarize/config.json`).
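
To make the REMOTE_CODE_EXECUTION finding concrete, the following is a minimal sketch of the general pattern it describes: a Python expression stored as a string is evaluated to decide whether a task outcome holds. This is not Caliper's actual implementation. The helper `run_assert`, the variable name `path`, and the file name `report.md` are hypothetical and chosen only for illustration.

```python
import os
import tempfile


def run_assert(snippet: str, context: dict) -> bool:
    """Evaluate a Python expression string and return its truthiness.

    Hypothetical helper, not part of Caliper. eval() will run whatever
    code the string contains, which is why an audit flags this pattern.
    """
    # Expose only the os module as a global; names in `context` are locals.
    # Note: this does NOT sandbox the snippet, builtins remain available.
    return bool(eval(snippet, {"os": os}, dict(context)))


# Example: check that a file exists before and after a task writes it.
with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "report.md")  # hypothetical output file
    before = run_assert("os.path.exists(path)", {"path": target})
    with open(target, "w") as fh:
        fh.write("# Report\n")
    after = run_assert("os.path.exists(path)", {"path": target})
```

Because the snippet has the same privileges as the process evaluating it, an `.eval.yaml` file obtained from an untrusted source should be reviewed before it is run.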
Audit Metadata