The Agent Skills Directory

[SAFE]: The skill implements automated performance logging and quality evaluation for production AI systems, which are essential for maintaining model reliability and safety.
[EXTERNAL_DOWNLOADS]: Mentions well-known industry platforms including Langtrace, Arize Phoenix, and Weights & Biases (Weave). These are established services for AI observability and their integration is standard practice.
[PROMPT_INJECTION]: The skill uses LLM-as-a-judge patterns to evaluate production traffic, which presents a surface for indirect prompt injection.
Ingestion points: Untrusted production data (user inputs and model outputs) is ingested from log files in sample_and_evaluate (SKILL.md) and daily_monitoring_check (examples.md).
Boundary markers: The provided prompt signatures (AssessQuality, SafetyCheck) do not include explicit delimiters or instructions to ignore potential commands embedded in the data being evaluated.
Capability inventory: The skill environment has capabilities for local file system writes and making network requests to observability platforms.
Sanitization: No explicit sanitization or filtering of the logged data is performed before it is passed to the judging LLM.

ai-monitoring