The Agent Skills Directory

[SAFE]: The skill spawns subagents to isolate execution environments, which prevents sensitive conversation context from leaking into test runs.
[COMMAND_EXECUTION]: Runs standard developer commands such as git status to audit the local worktree after a test run and to revert any unauthorized file modifications.
[SAFE]: Implements a 'blind' testing architecture that explicitly protects the agent from being influenced by expected outcomes or prior instructions during evaluation.
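
The worktree audit described in the COMMAND_EXECUTION finding can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the function names parse_porcelain and audit_worktree are hypothetical, and it assumes the audit reads the output of git status --porcelain (the stable, script-oriented format) rather than the human-readable output.

```python
import subprocess
from typing import List, Tuple


def parse_porcelain(output: str) -> List[Tuple[str, str]]:
    """Parse `git status --porcelain` (v1) output into (status, path) pairs.

    Each line has a two-character status code, one space, then the path.
    Lines are not stripped, because a leading space is part of the code.
    """
    entries = []
    for line in output.splitlines():
        if len(line) < 4:
            continue  # skip blank or malformed lines
        status, path = line[:2], line[3:]
        entries.append((status, path))
    return entries


def audit_worktree(repo_dir: str) -> List[Tuple[str, str]]:
    """Hypothetical helper: list files in repo_dir that are modified or untracked."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise if git fails, e.g. repo_dir is not a repository
    )
    return parse_porcelain(result.stdout)


if __name__ == "__main__":
    # Sample porcelain output: one modified tracked file, one untracked file.
    sample = " M src/app.py\n?? notes.txt\n"
    for status, path in parse_porcelain(sample):
        print(repr(status), path)
```

An empty result from audit_worktree would indicate that the test run left the worktree clean. The revert step is left out of the sketch because the finding does not say which commands the skill uses for it.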

eval-skills