llm-evaluation

Pass

Audited by Gen Agent Trust Hub on May 12, 2026

Risk Level: SAFE
Full Analysis
  • [SAFE]: The skill is a collection of documentation and code snippets designed to help developers establish evaluation frameworks for AI applications. No malicious intent or dangerous capabilities were identified.
  • [EXTERNAL_DOWNLOADS]: The skill references several well-known machine learning models and libraries. It includes instructions to load models from Microsoft's official Hugging Face repositories, such as 'microsoft/deberta-xlarge-mnli' and 'microsoft/deberta-large-mnli', which are standard for natural language inference and semantic evaluation tasks.
  • [COMMAND_EXECUTION]: No shell commands, privilege escalation attempts, or autonomous execution patterns are present in the skill files.
  • [DATA_EXFILTRATION]: There are no patterns indicating the collection or exfiltration of sensitive user data, environment variables, or credentials.
Audit Metadata
Risk Level: SAFE
Analyzed: May 12, 2026, 01:42 PM