langgraph-testing-evaluation

Installation

SKILL.md

LangGraph Testing & Evaluation

Practical workflows for validating agent quality with:

Unit/integration tests
Trajectory evaluation
LangSmith dataset evaluations
A/B-style comparisons between versions

Use this file for high-level flow. Load references/* for detailed implementation.

Start Here

Choose the smallest approach that answers your question:

Related skills

More from lubu-labs/langchain-agent-skills

langgraph-agent-patterns
Implement multi-agent coordination patterns (supervisor-subagent, router, orchestrator-worker, handoffs) for LangGraph applications. Use when users want to (1) implement multi-agent systems, (2) coordinate multiple specialized agents, (3) choose between coordination patterns, (4) set up supervisor-subagent workflows, (5) implement router-based agent selection, (6) create parallel orchestrator-worker patterns, (7) implement agent handoffs, (8) design state schemas for multi-agent systems, or (9) debug multi-agent coordination issues.
45
langgraph-state-management
Design state schemas, implement reducers, configure persistence, and debug state issues for LangGraph applications. Use when users want to (1) design or define state schemas for LangGraph graphs, (2) implement reducer functions for state accumulation, (3) configure persistence with checkpointers (InMemorySaver/MemorySaver, SqliteSaver, PostgresSaver), (4) debug state update issues or unexpected state behavior, (5) migrate state schemas between versions, (6) validate state schema structure, (7) choose between TypedDict and MessagesState patterns, (8) implement custom reducers for lists, dicts, or sets, (9) use the Overwrite type to bypass reducers, (10) set up thread-based persistence for multi-turn conversations, or (11) inspect checkpoints for debugging.
26
langgraph-error-handling
Implement LangGraph error handling with current v1 patterns. Use when users need to classify failures, add RetryPolicy for transient issues, build LLM recovery loops with Command routing, add human-in-the-loop with interrupt()/resume, handle ToolNode errors, or choose a safe strategy between retry, recovery, and escalation.
25
langgraph-project-setup
Initialize and configure LangGraph projects with proper structure, langgraph.json configuration, environment variables, and dependency management. Use when users want to (1) create a new LangGraph project, (2) set up langgraph.json for deployment, (3) configure environment variables for LLM providers, (4) initialize project structure for agents, (5) set up local development with LangGraph Studio, (6) configure dependencies (pyproject.toml, requirements.txt, package.json), or (7) troubleshoot project configuration issues.
21
deepagents-setup-configuration
Initialize, validate, and troubleshoot Deep Agents projects in Python or JavaScript using the `deepagents` package. Use when users need to create agents with built-in planning/filesystem/subagents, configure middleware/backends/checkpointing/HITL, migrate from `create_react_agent` or `create_agent`, scaffold projects with repo scripts, validate agent config files, and confirm compatibility with current LangChain/LangGraph/LangSmith docs.
21
langsmith-trace-analyzer
Fetch, organize, and analyze LangSmith traces for debugging and evaluation. Use when you need to: query traces/runs by project, metadata, status, or time window; download traces to JSON; organize outcomes into passed/failed/error buckets; analyze token/message/tool-call patterns; compare passed vs failed behavior; or investigate benchmark and production failures.
19

Installs

Repository

lubu-labs/langc…t-skills

GitHub Stars

First Seen

Feb 14, 2026

Security Audits

Gen Agent Trust HubFail

SocketPass

SnykWarn

langgraph-testing-evaluation

LangGraph Testing & Evaluation

Start Here

More from lubu-labs/langchain-agent-skills

langgraph-agent-patterns

langgraph-state-management

langgraph-error-handling

langgraph-project-setup

deepagents-setup-configuration

langsmith-trace-analyzer