langsmith-evaluator

Summary

Build evaluation pipelines for LangSmith with LLM-as-Judge and custom code evaluators.

  • Three core components: creating evaluators (LLM-as-Judge or custom code), defining run functions to capture agent outputs and trajectories, and running evaluations locally or auto-running via uploaded evaluators
  • Supports both offline evaluators (comparing run outputs to dataset examples) and online evaluators (real-time quality checks on production runs)
  • Requires LangSmith API key and project configuration; includes Python and TypeScript examples with structured output support for LLM judges
  • Critical workflow: inspect actual agent output structure and dataset schema before writing evaluators; query LangSmith traces to verify trajectory data and field names match
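The custom-code evaluator path above can be sketched as a plain function that compares a run's outputs to a dataset example. This is a hypothetical, self-contained illustration: the `answer` field, the dataset name, and the dict-shaped arguments are assumptions — inspect your actual trace and dataset schema first (as the workflow note advises), and check your SDK version's expected evaluator signature.

```python
# Minimal sketch of a custom code evaluator (assumed schema: outputs
# contain an "answer" field on both the run and the dataset example).
def exact_match(run: dict, example: dict) -> dict:
    predicted = (run.get("outputs") or {}).get("answer", "")
    expected = (example.get("outputs") or {}).get("answer", "")
    # Return the result in the key/score shape LangSmith evaluators use.
    score = int(predicted.strip().lower() == expected.strip().lower())
    return {"key": "exact_match", "score": score}

# With the LangSmith SDK this would be wired up roughly as follows
# (commented out: requires an API key and an existing dataset):
# from langsmith import evaluate
# evaluate(my_run_function, data="my-dataset", evaluators=[exact_match])
```

A score of `1`/`0` keeps the metric aggregatable in the LangSmith UI; an LLM-as-Judge evaluator would return the same shape but derive the score from a judge model's structured output.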
SKILL.md
LANGSMITH_API_KEY=lsv2_pt_your_api_key_here          # REQUIRED
LANGSMITH_PROJECT=your-project-name                   # Check this to know which project has traces
LANGSMITH_WORKSPACE_ID=your-workspace-id              # Optional: for org-scoped keys
OPENAI_API_KEY=your_openai_key                        # For LLM as Judge
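The environment block above can be validated up front so evaluations fail fast with a clear message instead of mid-run. This is a sketch under the assumptions of that block: `LANGSMITH_API_KEY` and `OPENAI_API_KEY` are required, `LANGSMITH_PROJECT` and `LANGSMITH_WORKSPACE_ID` are optional; the `load_config` helper name is illustrative, not part of the SDK.

```python
import os

def load_config() -> dict:
    """Validate and collect the env vars listed above (helper name is ours)."""
    required = ["LANGSMITH_API_KEY", "OPENAI_API_KEY"]
    missing = [name for name in required if not os.environ.get(name)]
    if missing:
        raise RuntimeError(f"Missing required env vars: {', '.join(missing)}")
    return {
        "api_key": os.environ["LANGSMITH_API_KEY"],
        # Fall back to "default" when no project is set.
        "project": os.environ.get("LANGSMITH_PROJECT", "default"),
        # Only needed for org-scoped keys; None is fine otherwise.
        "workspace_id": os.environ.get("LANGSMITH_WORKSPACE_ID"),
    }
```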

Authentication is REQUIRED: either set the LANGSMITH_API_KEY environment variable, or (preferred) pass the --api-key flag to CLI commands:

langsmith evaluator list --api-key $LANGSMITH_API_KEY
Installs: 1.8K · GitHub Stars: 115 · First Seen: Mar 4, 2026