The Agent Skills Directory

[COMMAND_EXECUTION]: The core logic is defined in run-benchmark.sh, which uses git worktree and tar to create clean, isolated environments for benchmarking scenarios. It also invokes the claude CLI and several Node.js scripts.
[COMMAND_EXECUTION]: Automated sessions run through the claude CLI with the --dangerously-skip-permissions flag, which suppresses interactive tool-use prompts so that benchmarks can run unattended. The skill cannot benchmark unattended without it, so its use here is legitimate.
[SAFE]: The skill operates entirely on local files and uses standard system utilities. No remote code execution, external downloads, or data exfiltration attempts were found.
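
For readers unfamiliar with the isolation pattern named in the first finding, the following is a minimal sketch of how git worktree and tar can be combined to produce a clean copy of a repository at a given ref. It is not taken from run-benchmark.sh: the function name make_isolated_env and its three arguments are hypothetical, and the actual script may order or parameterize these steps differently.

```shell
#!/usr/bin/env bash
# Hypothetical sketch only; NOT the contents of run-benchmark.sh.
set -euo pipefail

# make_isolated_env REPO REF DEST   (name and arguments are assumptions)
# Checks out REF from REPO into a temporary detached worktree, then copies
# the files into DEST with tar so DEST carries no link back to the repo.
make_isolated_env() {
  local repo="$1" ref="$2" dest="$3"
  local tmp wt
  tmp="$(mktemp -d)"
  wt="$tmp/wt"

  # A detached worktree leaves the caller's current branch and index untouched.
  git -C "$repo" worktree add --detach "$wt" "$ref" >/dev/null

  # Stream the checked-out files through tar, skipping the worktree's .git file.
  mkdir -p "$dest"
  tar -C "$wt" --exclude=.git -cf - . | tar -C "$dest" -xf -

  # Unregister the temporary worktree and remove its parent directory.
  git -C "$repo" worktree remove --force "$wt"
  rmdir "$tmp"
}
```

Calling make_isolated_env /path/to/repo HEAD /tmp/scenario-a would, under these assumptions, leave /tmp/scenario-a holding only the tracked files at HEAD. Both steps are local file operations, which is consistent with the [SAFE] finding above.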

ln-840-benchmark-compare