stepfun-asr
StepFun stepaudio-2.5-asr
Transcribe audio with StepFun's stepaudio-2.5-asr (released 2026-04, verified 2026-04-23). Long audio in one call, no chunking — but only if the request hits the right endpoint with the right body shape. The wrong endpoint returns an error that looks identical to "model doesn't exist", which is the #1 reason this skill exists.
Companion: for TTS with stepaudio-2.5-tts (the sibling model), use the stepfun-tts skill. The two share an API key but live on different endpoints with different body shapes.
Why this skill exists — three traps that cost hours
- Wrong endpoint, wrong error. stepaudio-2.5-asr does not live on /v1/audio/transcriptions (that endpoint serves the older step-asr family). It lives on /v1/audio/asr/sse: SSE streaming, JSON body, base64 audio. Sending it to the wrong endpoint returns {"error":{"message":"model stepaudio-2.5-asr not supported"}}, which is identical in structure to the error for a genuinely nonexistent model name. People waste hours filing whitelist tickets.
- Plan key vs Normal key, silent failure. StepFun's "Plan" subscription keys (cheap, text-only) cannot call audio endpoints, but the failure manifests as a 4xx with no auth-shaped error message. If your account has a Plan subscription, you need a separate "Normal" key from the same console.
- SSE error events are real. Censorship can fire on the ASR side too (rarely). Don't assume only transcript.text.delta and transcript.text.done events arrive; handle type: error events in the stream or you'll silently drop them.
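The first and third traps can be sketched in code. The snippet below is a minimal illustration, not StepFun's documented schema: the body field names ("model", "audio") and the event payload keys ("delta", "text") are assumptions, since the text above only states that the body is JSON with base64 audio and names the three event types. It makes no network call; it only builds a body string and folds already-received SSE lines into a transcript, raising on an error event instead of dropping it.

```python
import base64
import json
from typing import Iterable

# Path and model name come from the text above; the host is not stated there.
ASR_PATH = "/v1/audio/asr/sse"
MODEL = "stepaudio-2.5-asr"


class AsrStreamError(RuntimeError):
    """Raised when the stream carries an event whose type is 'error'."""


def build_asr_body(audio_bytes: bytes) -> str:
    """Serialize a JSON request body with base64-encoded audio."""
    # Hypothetical field names: the source only says "JSON body, base64 audio".
    payload = {
        "model": MODEL,
        "audio": base64.b64encode(audio_bytes).decode("ascii"),
    }
    return json.dumps(payload)


def collect_transcript(sse_lines: Iterable[str]) -> str:
    """Fold SSE 'data:' lines into a transcript, raising on error events."""
    parts = []
    for raw in sse_lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and "event:" lines
        data = line[len("data:"):].strip()
        if not data:
            continue
        event = json.loads(data)
        kind = event.get("type")
        if kind == "error":
            # Surface the whole event so nothing is silently dropped.
            raise AsrStreamError(json.dumps(event, ensure_ascii=False))
        if kind == "transcript.text.delta":
            # Assumed key: incremental text under "delta".
            parts.append(event.get("delta", ""))
        elif kind == "transcript.text.done":
            # Assumed key: final text under "text"; fall back to the deltas.
            return event.get("text") or "".join(parts)
    return "".join(parts)
```

Raising a dedicated exception for error events keeps a moderation failure distinguishable from an empty transcript, which is the failure mode the third trap warns about.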
Config and auth
API key resolves in this order (fail-fast, no defaults):