langsmith-evaluator

Pass

Audited by Gen Agent Trust Hub on Apr 9, 2026

Risk Level: SAFEEXTERNAL_DOWNLOADSREMOTE_CODE_EXECUTIONPROMPT_INJECTION
Full Analysis
  • Indirect Prompt Injection Surface: The skill defines evaluators that process agent outputs which are potentially untrusted. This creates a surface where the agent's output could attempt to influence the LLM judge's scoring or reasoning. (1) Ingestion points: Agent outputs and dataset examples are processed in accuracy_evaluator and trajectory_evaluator in SKILL.md. (2) Boundary markers: There are no explicit delimiters or instructions to ignore embedded commands in the example prompt templates. (3) Capability inventory: The evaluators utilize the ChatOpenAI and OpenAI APIs for structured output. (4) Sanitization: The provided code snippets do not include explicit sanitization or validation of the agent outputs before interpolation.
  • Remote Script Execution: The setup section includes a command to download and execute an installation script for the LangSmith CLI directly from the official LangChain AI GitHub repository. This is a standard and recognized installation method for the vendor's tools.
  • Secure Credential Handling: The instructions guide the user to configure API keys for LangSmith and OpenAI using environment variables. The examples use standard placeholders and avoid hardcoding actual secrets.
Audit Metadata
Risk Level
SAFE
Analyzed
Apr 9, 2026, 07:09 AM