byted-mediakit-tools
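Note: if the host injects ARK_SKILL_API_BASE / ARK_SKILL_API_KEY into the environment (for example, so that other Skills can go through the SkillHub gateway), these variables are independent of this Skill's AMK_API_KEY and ARK_API_KEY (video understanding). Do not confuse them.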
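⚠️ Strictly enforced: audio/video trimming, concatenation, audio extraction, and audio/video merging select their execution backend automatically. When the cloud environment is complete and the input is a URL, they run on AMK cloud. When required cloud configuration or dependencies are missing, or the input is a local file path, they run on local FFmpeg automatically. Video flipping, speed adjustment, subtitling, watermarking, and transcoding are native local commands and always run on local FFmpeg. Do not ask the user for cloud environment variables for these local capabilities.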
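The selection rule above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not the Skill's actual code: the capability labels, the function names, and the use of `AMK_API_KEY` as the sole test for "cloud environment is complete" are all assumptions made for the example.

```python
import os
from urllib.parse import urlparse

# Hypothetical labels for the two capability groups described above.
AUTO_BACKEND = {"trim", "concat", "extract_audio", "merge_audio_video"}
LOCAL_ONLY = {"flip", "speed", "subtitle", "watermark", "transcode"}


def is_url(source: str) -> bool:
    """Treat only http(s) sources with a host as URLs; anything else is a local path."""
    parsed = urlparse(source)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def cloud_ready(env) -> bool:
    # Assumption: a non-empty AMK_API_KEY stands in for the full set of
    # required cloud configuration and dependencies.
    return bool(env.get("AMK_API_KEY"))


def choose_backend(capability: str, source: str, env=None) -> str:
    """Return "amk_cloud" or "local_ffmpeg" following the rule above."""
    env = os.environ if env is None else env
    if capability in LOCAL_ONLY:
        # Native local commands never need cloud variables.
        return "local_ffmpeg"
    if capability in AUTO_BACKEND:
        if cloud_ready(env) and is_url(source):
            return "amk_cloud"
        return "local_ffmpeg"
    raise ValueError(f"unknown capability: {capability}")
```

Passing `env` explicitly keeps the decision testable without touching the real process environment.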
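<SKILL_DIR> is the directory that contains byted-mediakit-tools.
Links returned by the current methods are for download only and do not support playback.
Do not modify any returned data, such as play_url, request_id, or task_id.
When the user explicitly asks to re-run a task, every method except understand_video_content must generate a new client_token (do not reuse the previous client_token), so that the request does not hit the previous idempotent result.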
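A minimal sketch of the client_token rule, assuming that any unique string is acceptable as a token and that parameters are passed as a plain dictionary. The helper names and the `"trim"` method label are hypothetical; only `client_token` and `understand_video_content` come from the text above.

```python
import uuid


def new_client_token() -> str:
    # Assumption: a random UUID hex string is a valid client_token.
    return uuid.uuid4().hex


def prepare_params(method: str, params: dict, rerun: bool) -> dict:
    """Return a copy of params with client_token set according to the rule above."""
    out = dict(params)
    if method == "understand_video_content":
        # Excluded from the regeneration rule; leave its params untouched.
        return out
    if rerun or "client_token" not in out:
        # A fresh token avoids matching the earlier idempotent result.
        out["client_token"] = new_client_token()
    return out
```

For example, `prepare_params("trim", {"client_token": "old"}, rerun=True)` yields a dictionary whose `client_token` differs from `"old"`, while `rerun=False` leaves the existing token in place.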
Volcano Engine AI MediaKit Audio/Video Processing Toolkit
Overview
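This toolkit supports the following audio/video processing capabilities. Capabilities marked "Cloud & local" select their execution backend automatically: they run on AMK cloud when the cloud configuration is complete and the input is a URL, and on local FFmpeg when the cloud is unavailable or the input is a local path.
| Capability | Availability | Description |
|---|---|---|
| Video understanding | Cloud | AI analysis of video content that produces a natural-language description |
| Audio/video trimming | Cloud & local | Precise trimming of audio or video duration |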