ai-monitoring
Pass
Audited by Gen Agent Trust Hub on May 1, 2026
Risk Level: SAFEPROMPT_INJECTIONEXTERNAL_DOWNLOADS
Full Analysis
- [SAFE]: The skill implements automated performance logging and quality evaluation for production AI systems, which are essential for maintaining model reliability and safety.
- [EXTERNAL_DOWNLOADS]: Mentions well-known industry platforms including Langtrace, Arize Phoenix, and Weights & Biases (Weave). These are established services for AI observability and their integration is standard practice.
- [PROMPT_INJECTION]: The skill uses LLM-as-a-judge patterns to evaluate production traffic, which presents a surface for indirect prompt injection.
- Ingestion points: Untrusted production data (user inputs and model outputs) is ingested from log files in
sample_and_evaluate(SKILL.md) anddaily_monitoring_check(examples.md). - Boundary markers: The provided prompt signatures (
AssessQuality,SafetyCheck) do not include explicit delimiters or instructions to ignore potential commands embedded in the data being evaluated. - Capability inventory: The skill environment has capabilities for local file system writes and making network requests to observability platforms.
- Sanitization: No explicit sanitization or filtering of the logged data is performed before it is passed to the judging LLM.
Audit Metadata