ai-sdk-testing
AI SDK Testing
You write deterministic, fast tests for code that uses the Vercel AI SDK. LLM calls are non-deterministic, slow, and expensive — never call real providers in tests. Instead, use the SDK's built-in mock providers (ai/test) to control outputs exactly, and assert on the behavior of your code around those outputs.
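The idea can be sketched without the SDK itself. In the sketch below, `TextModel`, `fixedModel`, and `summarizeTitle` are hypothetical names invented for illustration and are not part of the AI SDK; in a real test, the mock model from ai/test would play the role of the stub. What matters is the shape: the model's output is fixed in advance, and the assertions target the code around it.

```typescript
// Hypothetical stand-in for a model dependency; NOT the AI SDK's real interface.
interface TextModel {
  generate(prompt: string): Promise<string>;
}

// A deterministic stub: always returns the same text and records the prompts it saw.
function fixedModel(text: string): TextModel & { prompts: string[] } {
  const prompts: string[] = [];
  return {
    prompts,
    async generate(prompt: string): Promise<string> {
      prompts.push(prompt);
      return text;
    },
  };
}

// Code under test (hypothetical): asks the model for a title, then normalizes it.
async function summarizeTitle(model: TextModel, article: string): Promise<string> {
  const raw = await model.generate(`Write a short title for:\n${article}`);
  // Our own logic: trim whitespace, strip wrapping quotes, cap the length at 60.
  const cleaned = raw.trim().replace(/^["']+|["']+$/g, "");
  return cleaned.length > 60 ? cleaned.slice(0, 57) + "..." : cleaned;
}
```

A test would call `summarizeTitle(fixedModel('  "Hello World"  '), article)` and assert on the normalized title and on the recorded prompt. No network call is made, so the result is the same on every run.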
When to use this skill
- Any code imports from ai (generateText, streamText, generateObject, streamObject)
- Testing route handlers that proxy or transform LLM responses
- Testing structured output parsing (Zod schemas + Output.object)
- Testing streaming UIs or SSE endpoints that use AI SDK
- As part of /nightshift, /swarm, /ralph-tdd loops when the target code uses AI SDK
Core principles
- Never call real providers in tests. Use MockLanguageModelV3 for all language model tests and MockEmbeddingModelV3 for embeddings.
- Test your code, not the SDK. Assert on what your code does with the model's output (transformation, validation, storage, error handling), not that the SDK itself works.
- Test both non-streaming and streaming paths. If your code supports both generateText and streamText, test both. Streaming has different failure modes (partial chunks, mid-stream errors).
- Test structured output parsing. When using Output.object with Zod schemas, test that valid JSON parses correctly AND that your code handles malformed output gracefully.
- Mock at the model layer, not fetch. Prefer MockLanguageModelV3 over raw fetch mocking. It respects the SDK's internal protocol and is more resilient to SDK version changes.
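Two of the principles above, streaming failure modes and malformed structured output, can be illustrated in plain TypeScript. This is a sketch under stated assumptions: `failingStream`, `collectStream`, and `parseRecipe` are hypothetical names, the async generator stands in for a mocked model stream, and the hand-written checks stand in for a Zod schema. None of it is the SDK's API.

```typescript
// Hypothetical stand-in for a mocked text stream: yields chunks, optionally failing part-way.
async function* failingStream(chunks: string[], failAfter?: number): AsyncGenerator<string> {
  let i = 0;
  for (const chunk of chunks) {
    if (failAfter !== undefined && i === failAfter) {
      throw new Error("stream interrupted");
    }
    yield chunk;
    i++;
  }
}

interface StreamResult {
  text: string; // everything received before the stream ended or failed
  complete: boolean; // false when the stream failed part-way
  error?: string;
}

// Code under test (hypothetical): accumulates chunks and keeps partial text on failure.
async function collectStream(stream: AsyncIterable<string>): Promise<StreamResult> {
  let text = "";
  try {
    for await (const chunk of stream) {
      text += chunk;
    }
    return { text, complete: true };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { text, complete: false, error: message };
  }
}

interface Recipe {
  name: string;
  steps: string[];
}

type ParseResult = { ok: true; value: Recipe } | { ok: false; reason: string };

// Code under test (hypothetical): parses model output and validates its shape.
// The manual checks below stand in for a Zod schema.
function parseRecipe(raw: string): ParseResult {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return { ok: false, reason: "invalid JSON" };
  }
  if (typeof data !== "object" || data === null) {
    return { ok: false, reason: "not an object" };
  }
  const { name, steps } = data as { name?: unknown; steps?: unknown };
  if (typeof name !== "string") return { ok: false, reason: "name must be a string" };
  if (!Array.isArray(steps) || !steps.every((s) => typeof s === "string")) {
    return { ok: false, reason: "steps must be an array of strings" };
  }
  return { ok: true, value: { name, steps } };
}
```

Each function gets a happy-path case and a failure case: a stream that completes and one that throws after two chunks, a valid JSON payload and a truncated one. Returning a result object instead of throwing keeps the failure cases as easy to assert on as the successes.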