langsmith-dataset
Create, manage, and upload evaluation datasets to LangSmith for testing and validation.
- Supports four dataset types: final_response (full conversations), single_step (individual node behavior), trajectory (tool call sequences), and RAG (question/chunks/answer/citations)
- CLI commands for dataset lifecycle management: create, list, get, delete, export, and upload from local JSON files
- SDK-based dataset creation in Python and JavaScript with programmatic example addition
- Example management commands to add, list, and delete individual examples within datasets
- Complete workflow from trace export through processing to LangSmith upload with experiment tracking
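To make the four dataset types concrete, here is a minimal sketch of what one example record per type might look like as upload-ready JSON. The field names inside `inputs`/`outputs` are illustrative assumptions for this sketch, not the exact LangSmith schema:

```python
# Illustrative example records for each dataset type.
# Field names inside inputs/outputs are assumptions, not the LangSmith schema.

# final_response: full conversation in, final answer out
final_response_example = {
    "inputs": {"messages": [{"role": "user", "content": "What is LangSmith?"}]},
    "outputs": {"response": "LangSmith is a platform for tracing and evaluation."},
}

# single_step: behavior of one node/step in isolation
single_step_example = {
    "inputs": {"node": "router", "state": {"query": "refund status"}},
    "outputs": {"decision": "billing"},
}

# trajectory: expected sequence of tool calls
trajectory_example = {
    "inputs": {"question": "What's the weather in Paris?"},
    "outputs": {"tool_calls": [{"name": "get_weather", "args": {"city": "Paris"}}]},
}

# RAG: question, retrieved chunks, answer, and citations
rag_example = {
    "inputs": {"question": "Who wrote the report?"},
    "outputs": {
        "chunks": ["The 2024 report was written by the data team."],
        "answer": "The data team.",
        "citations": [0],  # index into chunks
    },
}
```

Records like these can be collected into a local JSON file and pushed with the upload command from the dataset lifecycle above.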
LANGSMITH_API_KEY=lsv2_pt_your_api_key_here # REQUIRED
LANGSMITH_PROJECT=your-project-name # Check this to know which project has traces
LANGSMITH_WORKSPACE_ID=your-workspace-id # Optional: for org-scoped keys
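A sketch of validating these variables at startup; `load_langsmith_env` is a hypothetical helper for illustration, not part of the langsmith package:

```python
import os

def load_langsmith_env(env=None):
    """Read LangSmith settings from a mapping (defaults to os.environ).

    LANGSMITH_API_KEY is required; the other two are optional.
    Hypothetical helper -- not part of the langsmith SDK.
    """
    env = os.environ if env is None else env
    api_key = env.get("LANGSMITH_API_KEY")
    if not api_key:
        raise RuntimeError("LANGSMITH_API_KEY is required")
    return {
        "api_key": api_key,
        "project": env.get("LANGSMITH_PROJECT"),            # which project holds the traces
        "workspace_id": env.get("LANGSMITH_WORKSPACE_ID"),  # only for org-scoped keys
    }
```

Failing fast here is cheaper than letting an unauthenticated CLI or SDK call fail mid-workflow.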
Authentication is REQUIRED: either set the LANGSMITH_API_KEY environment variable, or pass the --api-key flag to CLI commands (passing the flag is preferred):
langsmith dataset list --api-key $LANGSMITH_API_KEY
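The same pattern extends to the other lifecycle subcommands (create, get, delete, export, upload). A minimal sketch of building these invocations programmatically; `dataset_cmd` is a hypothetical convenience wrapper, and only the subcommand names and the --api-key flag come from this document:

```python
def dataset_cmd(subcommand, *args, api_key=None):
    """Build an argv list for the langsmith dataset CLI.

    Subcommands per this skill: create, list, get, delete, export, upload.
    Hypothetical wrapper -- not part of the CLI itself.
    """
    allowed = {"create", "list", "get", "delete", "export", "upload"}
    if subcommand not in allowed:
        raise ValueError(f"unknown subcommand: {subcommand}")
    argv = ["langsmith", "dataset", subcommand, *args]
    if api_key:
        argv += ["--api-key", api_key]  # per this doc, passing the flag is preferred
    return argv
```

The resulting list can be handed to `subprocess.run(argv)` without shell quoting concerns.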
IMPORTANT: Always check the environment variables or the .env file for LANGSMITH_PROJECT before querying or interacting with LangSmith; it tells you which project contains the relevant traces and data. If LANGSMITH_PROJECT is not set, use your best judgment to identify the right project.