deepeval
Pass
Audited by Gen Agent Trust Hub on May 12, 2026
Risk Level: SAFE
Full Analysis
- [SAFE]: The skill provides a legitimate integration for the DeepEval evaluation framework. It follows industry best practices for LLM testing, including the use of committed test suites, versioned datasets, and standard environment-based credential management.
- [COMMAND_EXECUTION]: The skill utilizes the `deepeval` CLI (e.g., `deepeval test run`, `deepeval generate`) to perform evaluation and synthetic data tasks. These commands are core to the skill's functionality and are executed on locally generated or user-provided files.
- [DATA_EXFILTRATION]: The skill includes functionality to upload traces, datasets, and evaluation results to the Confident AI platform. This is the official cloud service belonging to the skill's author (confident-ai) and is an intended feature for hosting reports and production monitoring, requiring explicit user authentication via `deepeval login` or environment variables.
- [PROMPT_INJECTION]: The skill's workflow ingests external data from files like `.dataset.json` (ingestion point) for processing in an evaluation loop. While this represents an indirect prompt injection surface (the provided templates lack explicit boundary markers or sanitization), the capability is limited to the `deepeval` testing environment and is fundamental to the task of AI evaluation.
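
To illustrate the boundary markers the PROMPT_INJECTION finding notes as missing, the following is a minimal sketch and not part of the audited skill or the DeepEval API. It assumes the dataset file is a JSON list of objects with an `input` field; that schema, the marker strings, and the function names are all hypothetical.

```python
import json

# Hypothetical boundary markers; they are not defined by DeepEval or the skill.
OPEN_TAG = "<untrusted_data>"
CLOSE_TAG = "</untrusted_data>"
PLACEHOLDER = "[marker removed]"


def wrap_untrusted(text: str) -> str:
    """Wrap dataset text in boundary markers so a prompt can treat it as data."""
    # Replace any marker already present in the text with a placeholder,
    # so embedded content cannot close the block early or open a new one.
    cleaned = text.replace(OPEN_TAG, PLACEHOLDER).replace(CLOSE_TAG, PLACEHOLDER)
    return f"{OPEN_TAG}\n{cleaned}\n{CLOSE_TAG}"


def load_wrapped_inputs(raw_json: str, field: str = "input") -> list[str]:
    """Parse dataset JSON (assumed: a list of objects) and wrap each input."""
    records = json.loads(raw_json)
    # Records without the assumed field are skipped rather than guessed at.
    return [wrap_untrusted(str(r[field])) for r in records if field in r]
```

Markers of this kind reduce the injection surface but do not remove it; the evaluated model still has to be instructed to treat the delimited text as data only.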
Audit Metadata