Evaluation

Pass

Audited by Gen Agent Trust Hub on Mar 23, 2026

Risk Level: SAFE
Full Analysis
  • [SAFE]: The skill implementation focuses on evaluation logic, metrics, and monitoring of agent performance. All operations are local and perform benign calculations based on agent responses.
  • [SAFE]: The Python script scripts/evaluator.py uses only safe standard libraries (typing, dataclasses, enum, time, random) for timestamping, data structuring, and sampling interaction logs.
  • [SAFE]: No network operations (curl, wget, requests), sensitive file access (.env, .ssh, .aws), or hardcoded credentials were detected in any of the skill's files.
  • [SAFE]: No dynamic execution patterns (such as eval(), exec(), or subprocess) or external code downloads were found.
  • [PROMPT_INJECTION]: The skill presents an indirect prompt-injection surface: it processes untrusted agent outputs for evaluation. However, it lacks exploitable capabilities (code execution, network access) that injected content could leverage.
      • Ingestion points: The output and query parameters of the AgentEvaluator.evaluate and ProductionMonitor classes in scripts/evaluator.py ingest untrusted agent-generated content.
      • Boundary markers: Absent; the script uses no delimiters to separate untrusted data from instructions during processing.
      • Capability inventory: None; the skill's scripts contain no dangerous functions such as subprocess.run, eval, exec, or network communication tools.
      • Sanitization: Absent; outputs are processed with plain string methods (.lower(), keyword matching) and no sanitization or escaping filters.
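To make the finding concrete, the following is a minimal, hypothetical sketch of the ingestion pattern described above (plain string methods over untrusted output) together with one possible boundary-marker mitigation. The class and function names here are illustrative stand-ins, not the audited code from scripts/evaluator.py.

```python
from dataclasses import dataclass, field

@dataclass
class EvalResult:
    passed: bool
    matched_keywords: list = field(default_factory=list)

class AgentEvaluator:
    """Illustrative evaluator: checks untrusted agent output for expected keywords."""

    def __init__(self, expected_keywords):
        self.expected_keywords = [k.lower() for k in expected_keywords]

    def evaluate(self, query: str, output: str) -> EvalResult:
        # Untrusted agent output is handled with plain string methods
        # (.lower() plus substring matching), mirroring the absence of
        # sanitization noted in the audit. This is benign here because
        # nothing downstream executes or forwards the content.
        text = output.lower()
        matched = [k for k in self.expected_keywords if k in text]
        return EvalResult(
            passed=len(matched) == len(self.expected_keywords),
            matched_keywords=matched,
        )

def wrap_untrusted(output: str) -> str:
    # A possible boundary-marker mitigation (absent in the audited skill):
    # delimit untrusted content so downstream consumers can distinguish
    # data from instructions.
    return f"<untrusted_output>\n{output}\n</untrusted_output>"

result = AgentEvaluator(["refund", "policy"]).evaluate(
    "What is the refund policy?",
    "Our refund policy allows returns within 30 days.",
)
print(result.passed)  # True
```

Because the evaluator only reads the text and returns a score, an injected instruction in the output string has no capability to exploit, which is why the overall risk level remains SAFE despite the missing boundary markers.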
Audit Metadata
  • Risk Level: SAFE
  • Analyzed: Mar 23, 2026, 01:30 PM