doubao-multimodal
Doubao Multimodal Understanding
A Bun + TypeScript CLI that wraps the Doubao-Seed multimodal chat completion endpoint. It resolves a single audio/video source (URL or local path) and normalizes it for Ark: a remote source is downloaded into a cache, and a local file is uploaded to TOS and referenced through a pre-signed URL. It then splits oversized media, fans out concurrent Ark calls, and merges the results.
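The split, fan-out, and merge steps above can be sketched as follows. This is a minimal illustration, not the skill's actual code: the names planSegments, mapWithConcurrency, and mergeResults are hypothetical, as are the time-based segmenting, the concurrency limit, and the newline-joined merge.

```typescript
// Hypothetical segment shape; the real CLI may split by size rather than by time.
interface Segment {
  index: number;
  startSec: number;
  endSec: number;
}

// Cut [0, durationSec) into consecutive windows of at most maxSec seconds.
function planSegments(durationSec: number, maxSec: number): Segment[] {
  if (durationSec <= 0 || maxSec <= 0) {
    throw new Error("durationSec and maxSec must be positive");
  }
  const segments: Segment[] = [];
  for (let start = 0, index = 0; start < durationSec; start += maxSec, index++) {
    segments.push({ index, startSec: start, endSec: Math.min(start + maxSec, durationSec) });
  }
  return segments;
}

// Run `worker` over `items` with at most `limit` calls in flight.
// Results are stored by input position, so output order matches input order.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;
  async function runner(): Promise<void> {
    while (next < items.length) {
      const i = next++; // claim the next index before awaiting
      results[i] = await worker(items[i]);
    }
  }
  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}

// Assumed merge strategy: join per-segment answers in segment order.
function mergeResults(parts: string[]): string {
  return parts.join("\n");
}
```

In a real run, the worker passed to mapWithConcurrency would issue one Ark request per segment. Keeping results indexed by input position means the merged output stays in media order even when later segments finish first.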
Script Directory
{baseDir} is the directory containing this SKILL.md. The main entry point is {baseDir}/scripts/main.ts. Invoke it as bun run {baseDir}/scripts/main.ts ... after installing dependencies, which are declared in {baseDir}/scripts/package.json (run bun install inside that folder once).
Required Environment
ARK_API_KEY=... # Volcengine Ark API key (required)
ARK_MODEL=... # Multimodal endpoint ID or model name, e.g. doubao-seed-2-0-lite-260428 or doubao-seed-1-6-flash-250928
ARK_BASE_URL=... # Optional; defaults to https://ark.cn-beijing.volces.com/api/v3
ARK_REASONING_EFFORT=minimal # Optional
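One way these variables might be read and validated at startup is sketched below. The ArkConfig shape and the loadArkConfig helper are hypothetical, not taken from main.ts. Treating ARK_MODEL as required is also an assumption, made only because the list above does not mark it optional.

```typescript
// Hypothetical config shape; field names are illustrative only.
interface ArkConfig {
  apiKey: string;
  model: string;
  baseUrl: string;
  reasoningEffort?: string;
}

// Default taken from the ARK_BASE_URL comment above.
const DEFAULT_BASE_URL = "https://ark.cn-beijing.volces.com/api/v3";

function loadArkConfig(env: Record<string, string | undefined>): ArkConfig {
  const apiKey = env.ARK_API_KEY?.trim();
  if (!apiKey) throw new Error("ARK_API_KEY is required");

  // Assumption: ARK_MODEL is mandatory because it is not marked optional.
  const model = env.ARK_MODEL?.trim();
  if (!model) throw new Error("ARK_MODEL is required");

  return {
    apiKey,
    model,
    // Empty or unset values fall back to the documented default.
    baseUrl: env.ARK_BASE_URL?.trim() || DEFAULT_BASE_URL,
    // Left undefined when unset, so the caller can omit it from the request.
    reasoningEffort: env.ARK_REASONING_EFFORT?.trim() || undefined,
  };
}
```

Under Bun, calling loadArkConfig(process.env) before any media is processed would surface a missing key immediately rather than after a download or upload has already happened.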