doubao-multimodal

Doubao Multimodal Understanding

A Bun + TypeScript CLI that wraps the Doubao-Seed multimodal chat completion endpoint. It resolves a single audio or video source (URL or local path), normalizes it for Ark (remote sources are downloaded and cached; local files are uploaded to TOS and referenced by a pre-signed URL), splits oversized media, fans out concurrent Ark calls, and merges the results.
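
The split, fan-out, and merge steps described above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: `Segment`, `planSegments`, `mapWithConcurrency`, and `mergeResults` are hypothetical names, and fixed-length time windows are an assumed splitting strategy.

```typescript
// Hypothetical sketch of split -> concurrent fan-out -> merge.
// None of these names come from the skill's source code.

interface Segment {
  index: number;
  startSec: number;
  endSec: number;
}

// Split a media duration into fixed-length windows (assumed strategy).
function planSegments(durationSec: number, maxSegmentSec: number): Segment[] {
  if (!(durationSec > 0) || !(maxSegmentSec > 0)) {
    throw new RangeError("durationSec and maxSegmentSec must be positive");
  }
  const segments: Segment[] = [];
  for (let start = 0, i = 0; start < durationSec; start += maxSegmentSec, i++) {
    segments.push({
      index: i,
      startSec: start,
      endSec: Math.min(start + maxSegmentSec, durationSec),
    });
  }
  return segments;
}

// Run `worker` over all items with at most `limit` calls in flight.
// Results keep the input order regardless of completion order.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;
  async function run(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i], i);
    }
  }
  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => run()));
  return results;
}

// Merge per-segment answers back into one text, ordered by segment index.
function mergeResults(
  parts: readonly { segment: Segment; text: string }[],
): string {
  return [...parts]
    .sort((a, b) => a.segment.index - b.segment.index)
    .map((p) => `[${p.segment.startSec}s-${p.segment.endSec}s] ${p.text}`)
    .join("\n");
}
```

In a real run the `worker` passed to `mapWithConcurrency` would issue one Ark request per segment; here it is left abstract so the concurrency limit and ordering logic stay visible.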

Script Directory

{baseDir} is the directory containing this SKILL.md. The main entry point is {baseDir}/scripts/main.ts; run it with bun run {baseDir}/scripts/main.ts ... Dependencies are declared in {baseDir}/scripts/package.json (run bun install inside that folder once).

Required Environment

ARK_API_KEY=...                 # Volcano Engine Ark API key (required)
ARK_MODEL=...                   # Multimodal endpoint ID or model name, e.g. doubao-seed-2-0-lite-260428 or doubao-seed-1-6-flash-250928
ARK_BASE_URL=...                # Optional; defaults to https://ark.cn-beijing.volces.com/api/v3
ARK_REASONING_EFFORT=minimal    # Optional
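
As an illustration of how a script might consume these variables, the sketch below validates them and applies the documented default base URL. `ArkConfig` and `loadArkConfig` are hypothetical names, and treating `ARK_MODEL` as required is an assumption based on it being listed here without an "optional" note.

```typescript
// Hypothetical config loader; not taken from the skill's source code.

interface ArkConfig {
  apiKey: string;
  model: string;
  baseUrl: string;
  reasoningEffort?: string;
}

// Default documented in the environment table above.
const DEFAULT_ARK_BASE_URL = "https://ark.cn-beijing.volces.com/api/v3";

function loadArkConfig(env: Record<string, string | undefined>): ArkConfig {
  const apiKey = env.ARK_API_KEY?.trim();
  const model = env.ARK_MODEL?.trim();

  // Assumption: both the key and the model are mandatory.
  if (!apiKey || !model) {
    const missing = [
      apiKey ? null : "ARK_API_KEY",
      model ? null : "ARK_MODEL",
    ].filter((name): name is string => name !== null);
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }

  return {
    apiKey,
    model,
    // Fall back to the default when the variable is unset or blank.
    baseUrl: env.ARK_BASE_URL?.trim() || DEFAULT_ARK_BASE_URL,
    reasoningEffort: env.ARK_REASONING_EFFORT?.trim() || undefined,
  };
}
```

Under Bun, a call such as `loadArkConfig(process.env)` at startup would fail fast with a readable message before any network request is made.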
First Seen
May 7, 2026
doubao-multimodal — jimliu/doubao-multimodal-skill