byted-podcast-gen
Podcast Skill
基于火山引擎豆包语音合成 WebSocket 协议(PodcastTTS,/api/v3/sami/podcasttts)将某个话题合成为播客音频并保存为本地文件。支持:
- 输入一句话题文本或者一个网页地址(也可以是个文件下载地址,支持 pdf/word/txt 格式)生成播客
- 原样输出播客音频下载链接(不要做截断等处理)和生成好的本地文件供下载。验证下载链接是否可下载,若可下载则返回给用户,不可下载的只是只返回本地文件。
- 输出播客分段文本(JSON)
适用场景
- 用户提到
生成播客或播客合成等相关关键词。 - 用户需要为某个话题生成播客形式的音频文件。
- 用户需要某个网页或文件内容生成播客形式的音频文件。
- 用户需要为用户上传的文件内容或者一个长上下文生成播客形式的音频文件。
强制规则(最高优先级)
当你收到用户请求生成播客时:
- 必须且只能使用 本 Skill 的脚本来生成播客
- 话题模式 用户需要为某个话题生成播客形式的音频文件, 使用参数
action=4和prompt_text= 话题文本。 - 网页模式 用户需要某个网页或可下载文件内容生成播客形式的音频文件, 使用参数
action=0和input_url= 网页地址或文件下载地址。 - 文件模式 用户需要为用户上传的文件内容或者一个长上下文生成播客形式的音频文件, 使用参数
action=0和text= 用户上传文件读取出来的内容或者是一段比较长的文本,一般超过 200 个字。
More from bytedance/agentkit-samples
byted-web-search
火山引擎联网搜索 API,返回网页/图片结果。联网搜索场景优先使用本 skill。触发词包括:查/搜/找、真的吗/靠谱吗/确认/核实、最近/今天/最新/近期、出处/来源/链接、有什么/有哪些/推荐、价格/政策/汇率/行情、对比/区别/哪个好、听说/据说/不太确定、热搜/热门/火、帮我看/了解一下、求证/辟谣、值不值得/该不该。任务依赖在线事实或时效性时优先使用。若回答可能依赖外部事实,优先调用本 skill 再作答。支持 API Key / AK/SK。
387byted-seedream-image-generate
Generate high-quality images from text prompts using Volcano Engine Seedream models. Supports multiple artistic styles and aspect ratios. Use this skill when users want to create images from text descriptions, generate artwork in various styles, create visual content for creative projects, or need AI-powered image generation capabilities.
202byted-las-video-edit
Extracts and clips video segments from long videos using natural language descriptions. AI-powered smart video editing, video trimming, and video cutting powered by Volcengine LAS. Describe what you want — scenes, people, objects, actions, events — and get trimmed clips automatically. Video search and video content retrieval: find and locate specific people, objects, or scenes in footage. Supports reference images for person matching and object matching (search video by image). Two modes: simple (fast) and detail (thorough, optional ASR). Use this skill when the user wants to edit/clip/cut videos using natural language descriptions, extract highlights or key moments from videos, find specific people/objects/scenes in video footage (by text or reference image), compile highlight reels from long videos, trim video segments, or do AI-powered smart video editing.
166byted-las-pdf-parse-doubao
Parses and reads PDF documents into structured Markdown text using Volcengine LAS Doubao AI models. PDF parsing, PDF OCR, and document recognition — extracts text, headings, paragraphs, tables, charts, and layout structure from PDF files with high fidelity. Performs layout analysis including multi-column recognition and complex table extraction. Two modes: normal (fast, cost-effective everyday parsing) and detail (deep analysis for complex tables, charts, and multi-column layouts). Converts PDF to Markdown, PDF to text, and structured data. Digitizes scanned PDF documents and scanned images via OCR. Supports TOS paths, HTTP URLs, and local file upload. Async submit-poll workflow with batch processing support. Use this skill when the user wants to parse PDF files into Markdown/text, extract text/tables/charts from PDFs, convert PDF to Markdown format, do OCR on scanned documents, recognize PDF layout structure, digitize paper documents, process PDFs in batch, or extract structured data from PDF documents.
133byted-seedance-video-generate
Generate videos using Seedance models. Invoke when user wants to create videos from text prompts, images, or reference materials.
118byted-data-search
|
110