benchmark-models
Pass
Audited by Gen Agent Trust Hub on May 20, 2026
Risk Level: SAFE
Full Analysis
- [DATA_EXPOSURE]: The skill checks for the presence of API keys in ~/.claude/.credentials.json and environment variables to determine which model adapters are available for benchmarking.
- [DYNAMIC_EXECUTION]: Shell environment configuration is dynamically handled by executing output from local binaries using eval (e.g., gstack-slug) and source (e.g., gstack-repo-mode).
- [COMMAND_EXECUTION]: Multiple local binaries located in the ~/.claude/skills/gstack/bin/ directory are invoked to manage skill updates, configuration, telemetry logging, and the core benchmarking engine.
- [DATA_EXFILTRATION]: Includes opt-in mechanisms to synchronize project artifacts with a private GitHub repository and to send usage telemetry to the vendor's infrastructure. These features are inactive unless the user explicitly consents during the interactive setup flow.
- [PROMPT_INJECTION]: The skill possesses an indirect prompt injection surface as it ingests user-provided prompts or content from local files to be processed by the models being benchmarked. It utilizes interactive checkpoints to confirm prompt sources.
- [COMMAND_EXECUTION]: Automatically modifies the local CLAUDE.md file (if the user opts in) to add skill routing rules, which involves standard Git operations (git add, git commit).
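For readers unfamiliar with the pattern named in the DYNAMIC_EXECUTION finding, the following is a minimal sketch of how a shell script can load configuration by evaluating a command's output. The emit_config function is a hypothetical stand-in for a local binary such as gstack-slug. The variable names and output format are assumptions, since this report does not document what the real binaries print.

```shell
# Hypothetical stand-in for a local binary (e.g., gstack-slug).
# The real binary's output format is not documented in this report;
# KEY=value lines are assumed here purely for illustration.
emit_config() {
  printf 'SLUG=%s\n' "demo-project"
  printf 'BRANCH=%s\n' "main"
}

# eval executes whatever the command prints as shell code in the
# current shell, so the assignments become variables in this session.
eval "$(emit_config)"

echo "slug=$SLUG branch=$BRANCH"
```

Because eval runs every line the command prints, the pattern is only as trustworthy as the binary producing the output, which is presumably why the audit lists it as a finding even at a SAFE risk level.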
Audit Metadata