scenarios

Test Your Agent with Scenarios

NEVER invent your own agent testing framework. For code-based tests, use @langwatch/scenario (TypeScript) or langwatch-scenario (Python); for no-code platform scenarios, use the langwatch CLI. The Scenario framework provides user simulation, judge-based evaluation, multi-turn conversation testing, and adversarial red teaming out of the box.

Determine Scope

If the user's request is general ("add scenarios", "test my agent"):

  • Read the codebase to understand the agent's architecture
  • Study the git history to understand what changed and why. Focus on changes to agent behavior, prompt tweaks, and bug fixes, and read the commit messages for context.
  • Generate comprehensive coverage (happy path, edge cases, error handling)
  • For conversational agents, include multi-turn scenarios — that's where the interesting edge cases live (context retention, topic switching, recovery from misunderstandings)
  • ALWAYS run the tests after writing them. If they fail, debug and fix the test or the agent code.
  • After tests are green, transition to consultant mode (see Consultant Mode below) and suggest 2-3 domain-specific improvements.
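
Before writing the tests, it can help to write down the planned coverage and check it against the list above. The following is a minimal sketch of such a checklist in plain Python. It is a planning aid only, not a testing framework and not part of Scenario: `PlannedScenario`, `REQUIRED_CATEGORIES`, and `coverage_gaps` are hypothetical names introduced for illustration, and the tests themselves should still be written with the Scenario framework.

```python
from dataclasses import dataclass, field

# Hypothetical planning helper; none of these names come from the Scenario framework.
REQUIRED_CATEGORIES = {"happy_path", "edge_case", "error_handling"}


@dataclass
class PlannedScenario:
    name: str
    category: str                     # expected to be one of REQUIRED_CATEGORIES
    multi_turn: bool = False          # True if the scenario spans several turns
    criteria: list[str] = field(default_factory=list)  # what a judge should check


def coverage_gaps(plan: list[PlannedScenario], conversational: bool) -> list[str]:
    """Return a list of gaps in the plan; an empty list means nothing is missing."""
    gaps = []
    covered = {s.category for s in plan}
    for missing in sorted(REQUIRED_CATEGORIES - covered):
        gaps.append(f"no {missing} scenario")
    # Conversational agents should have at least one multi-turn scenario.
    if conversational and not any(s.multi_turn for s in plan):
        gaps.append("no multi-turn scenario")
    # Every planned scenario needs something concrete to be judged against.
    for s in plan:
        if not s.criteria:
            gaps.append(f"{s.name}: no judge criteria")
    return gaps


plan = [
    PlannedScenario("refund within policy", "happy_path",
                    criteria=["agent confirms the refund"]),
    PlannedScenario("user switches topic mid-refund", "edge_case", multi_turn=True,
                    criteria=["agent retains the order number"]),
]
print(coverage_gaps(plan, conversational=True))  # ['no error_handling scenario']
```

Each remaining gap then maps to one more scenario to write and run with the framework.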

If the user's request is specific ("test the refund flow"):

  • Focus on the specific behavior; write a targeted test; run it.
First Seen
Mar 17, 2026