phoenix-evals

Pass

Audited by Gen Agent Trust Hub on May 4, 2026

Risk Level: SAFE
Full Analysis
  • [SAFE]: The skill serves as a technical resource for implementing LLM observability and evaluation. All provided code snippets and instructions align with best practices for using the Arize Phoenix platform.
  • [SAFE]: Dependencies listed (e.g., arize-phoenix, openai, scikit-learn) are official, well-known packages from industry-standard sources.
  • [SAFE]: External domain references (e.g., app.phoenix.arize.com) are legitimate and correspond to the official services associated with the skill's purpose.
  • [SAFE]: The skill promotes secure prompt engineering by recommending XML delimiters and boundary markers in LLM evaluation templates to mitigate prompt injection risks.
  • [SAFE]: No evidence of malicious patterns such as credential exfiltration, persistence, or unauthorized privilege escalation was found in the documentation or provided scripts.
Audit Metadata
Risk Level: SAFE
Analyzed: May 4, 2026, 01:41 AM