phoenix-evals
Pass
Audited by Gen Agent Trust Hub on May 4, 2026
Risk Level: SAFE
Full Analysis
- [SAFE]: The skill serves as a technical resource for implementing LLM observability and evaluation. All provided code snippets and instructions align with best practices for using the Arize Phoenix platform.
- [SAFE]: Dependencies listed (e.g., arize-phoenix, openai, scikit-learn) are official, well-known packages from industry-standard sources.
- [SAFE]: External domain references (e.g., app.phoenix.arize.com) are legitimate and correspond to the official services associated with the skill's purpose.
- [SAFE]: The skill promotes secure prompt engineering by recommending XML delimiters and boundary markers in LLM evaluation templates to mitigate prompt injection risks.
- [SAFE]: No evidence of malicious patterns such as credential exfiltration, persistence, or unauthorized privilege escalation was found in the documentation or provided scripts.
Audit Metadata