rag-eval

Pass

Audited by Gen Agent Trust Hub on Jun 14, 2026

Risk Level: SAFE
Full Analysis
  • [SAFE]: The skill is a standard development and benchmarking tool provided by NVIDIA. Its operations are consistent with its stated purpose of evaluating RAG (Retrieval-Augmented Generation) quality.
  • [COMMAND_EXECUTION]: The skill provides commands to execute a local evaluation script scripts/eval/evaluate_rag.py using the uv package manager. These commands are necessary for the skill's functionality and target internal repository paths.
  • [EXTERNAL_DOWNLOADS]: The skill uses uv sync to manage Python dependencies for the evaluation script. These dependencies are standard for the task and are loaded through conventional package management workflows.
  • [CREDENTIALS_UNSAFE]: The skill references the use of an NVIDIA_API_KEY for RAGAS evaluation. It includes a dedicated 'Credential hygiene' section that correctly advises users against hardcoding secrets, recommending the use of environment files and secrets managers instead.
  • [DATA_EXFILTRATION]: Network operations are restricted to communication with a RAG server (defaulting to localhost) and an official NVIDIA API for judge scoring. There are no signs of unauthorized data exfiltration.
Audit Metadata
Risk Level
SAFE
Analyzed
Jun 14, 2026, 04:52 PM
Security Audit — agent-trust-hub — rag-eval