langsmith-evaluator
Pass
Audited by Gen Agent Trust Hub on Apr 9, 2026
Risk Level: SAFEEXTERNAL_DOWNLOADSREMOTE_CODE_EXECUTIONPROMPT_INJECTION
Full Analysis
- Indirect Prompt Injection Surface: The skill defines evaluators that process agent outputs which are potentially untrusted. This creates a surface where the agent's output could attempt to influence the LLM judge's scoring or reasoning. (1) Ingestion points: Agent outputs and dataset examples are processed in accuracy_evaluator and trajectory_evaluator in SKILL.md. (2) Boundary markers: There are no explicit delimiters or instructions to ignore embedded commands in the example prompt templates. (3) Capability inventory: The evaluators utilize the ChatOpenAI and OpenAI APIs for structured output. (4) Sanitization: The provided code snippets do not include explicit sanitization or validation of the agent outputs before interpolation.
- Remote Script Execution: The setup section includes a command to download and execute an installation script for the LangSmith CLI directly from the official LangChain AI GitHub repository. This is a standard and recognized installation method for the vendor's tools.
- Secure Credential Handling: The instructions guide the user to configure API keys for LangSmith and OpenAI using environment variables. The examples use standard placeholders and avoid hardcoding actual secrets.
Audit Metadata