The Agent Skills Directory

[SAFE]: The skill is a legitimate developer utility for model benchmarking and performance measurement.
[COMMAND_EXECUTION]: The included Python script, scripts/normalize_scores.py, performs local data processing. It relies on standard library modules and does not exhibit dangerous behaviors such as arbitrary code execution or unauthorized network access.
[DATA_EXFILTRATION]: The skill does not contain any patterns indicative of data exfiltration or unauthorized access to sensitive information.
[PROMPT_INJECTION]: Content related to 'adversarial inputs' is strictly pedagogical and intended for testing the robustness of other models, not for bypassing the host agent's safety controls.

agentic-eval-first-development