langsmith-evaluator
Summary
Build evaluation pipelines for LangSmith with LLM-as-Judge and custom code evaluators.
- Three core components: creating evaluators (LLM-as-Judge or custom code), defining run functions that capture agent outputs and trajectories, and running evaluations locally or uploading evaluators so they run automatically (see the first sketch below)
- Supports both offline evaluators (comparing run outputs against dataset examples) and online evaluators (real-time quality checks on production runs)
- Requires a LangSmith API key and project configuration; includes Python and TypeScript examples with structured-output support for LLM judges (see the judge sketch below)
- Critical workflow: inspect the actual agent output structure and dataset schema before writing evaluators, and query LangSmith traces to verify that trajectory data and field names match what the evaluator expects (see the trace-inspection sketch below)
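A minimal sketch of the three components run locally, assuming a recent langsmith Python SDK; `my_agent`, the dataset name, and the `answer` field are placeholders, not part of this skill:

```python
# Sketch: run function + custom code evaluator + local evaluation run.
from langsmith import evaluate

def my_agent(question: str) -> str:
    # Stand-in for your real application.
    return "42"

def run_agent(inputs: dict) -> dict:
    # Run function: invoke the agent and return outputs for the evaluators.
    return {"answer": my_agent(inputs["question"])}

def exact_match(run, example) -> dict:
    # Custom code evaluator: compare the run's output to the dataset example.
    predicted = (run.outputs or {}).get("answer", "")
    expected = (example.outputs or {}).get("answer", "")
    return {"key": "exact_match", "score": float(predicted == expected)}

results = evaluate(
    run_agent,
    data="my-eval-dataset",        # dataset name in LangSmith (placeholder)
    evaluators=[exact_match],
    experiment_prefix="baseline",
)
```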
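For the LLM-as-Judge path, a hedged sketch using LangChain's structured output; it assumes `langchain-openai` is installed and OPENAI_API_KEY is set, and the `Grade` schema, model choice, and prompt are illustrative:

```python
# Sketch: LLM-as-Judge evaluator returning a structured verdict.
from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI

class Grade(BaseModel):
    """Structured verdict from the judge model."""
    correct: bool = Field(description="Whether the answer is factually correct")
    reasoning: str = Field(description="One-sentence justification")

judge = ChatOpenAI(model="gpt-4o-mini", temperature=0).with_structured_output(Grade)

def llm_judge(run, example) -> dict:
    grade = judge.invoke(
        f"Question: {example.inputs['question']}\n"
        f"Reference answer: {example.outputs['answer']}\n"
        f"Agent answer: {run.outputs['answer']}\n"
        "Is the agent answer correct?"
    )
    return {"key": "llm_judge_correct", "score": float(grade.correct), "comment": grade.reasoning}
```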
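Before writing either kind of evaluator, verify field names against real traces. A trace-inspection sketch using the SDK's `list_runs`; the project name is a placeholder:

```python
# Sketch: print the output shape of recent root runs so evaluator
# field names match what the agent actually produces.
from langsmith import Client

client = Client()  # reads LANGSMITH_API_KEY / LANGSMITH_PROJECT from the env

for run in client.list_runs(project_name="your-project-name", is_root=True, limit=3):
    print(run.name, list((run.outputs or {}).keys()))
```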
SKILL.md
```bash
LANGSMITH_API_KEY=lsv2_pt_your_api_key_here   # REQUIRED
LANGSMITH_PROJECT=your-project-name           # Check this to know which project has traces
LANGSMITH_WORKSPACE_ID=your-workspace-id      # Optional: for org-scoped keys
OPENAI_API_KEY=your_openai_key                # For LLM-as-Judge
```
Authentication is REQUIRED: either set the LANGSMITH_API_KEY environment variable, or pass the --api-key flag to CLI commands (the flag is preferred):
```bash
langsmith evaluator list --api-key $LANGSMITH_API_KEY
```
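The Python SDK accepts the same key explicitly as well; a sketch assuming the `api_key` parameter of `langsmith.Client`:

```python
# Sketch: pass the key directly instead of relying on the environment.
from langsmith import Client

client = Client(api_key="lsv2_pt_your_api_key_here")
```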
Related skills
More from langchain-ai/langsmith-skills
langsmith-trace
INVOKE THIS SKILL when working with LangSmith tracing OR querying traces. Covers adding tracing to applications and querying/exporting trace data. Uses the langsmith CLI tool.
langsmith-dataset
INVOKE THIS SKILL when creating evaluation datasets, uploading datasets to LangSmith, or managing existing datasets. Covers dataset types (final_response, single_step, trajectory, RAG), CLI management commands, SDK-based creation, and example management. Uses the langsmith CLI tool.