llm-evaluation
Pass
Audited by Gen Agent Trust Hub on May 12, 2026
Risk Level: SAFE
Full Analysis
- [SAFE]: The skill is a collection of documentation and code snippets designed to help developers establish evaluation frameworks for AI applications. No malicious intent or dangerous capabilities were identified.
- [EXTERNAL_DOWNLOADS]: The skill references several well-known machine learning models and libraries, and includes instructions to load models from Microsoft's official Hugging Face repositories, such as 'microsoft/deberta-xlarge-mnli' and 'microsoft/deberta-large-mnli'. These models are commonly used for natural language inference and semantic evaluation tasks.
- [COMMAND_EXECUTION]: No shell commands, privilege escalation attempts, or autonomous execution patterns are present in the skill files.
- [DATA_EXFILTRATION]: There are no patterns indicating the collection or exfiltration of sensitive user data, environment variables, or credentials.
Audit Metadata