ai-multimodal
AI Multimodal Processing Skill
Process audio, images, videos, documents, and generate images using Google Gemini's multimodal API. Unified interface for all multimedia content understanding and generation.
Core Capabilities
Audio Processing
- Transcription with timestamps (up to 9.5 hours)
- Audio summarization and analysis
- Speech understanding and speaker identification
- Music and environmental sound analysis
- Text-to-speech generation with controllable voice
Image Understanding
- Image captioning and description
- Object detection with bounding boxes (2.0+)
- Pixel-level segmentation (2.5+)
- Visual question answering
- Multi-image comparison (up to 3,600 images)
More from microck/ordinary-claude-skills
crypto-research
Comprehensive cryptocurrency market research and analysis using specialized AI agents. Analyzes market data, price trends, news sentiment, technical indicators, macro correlations, and investment opportunities. Use when researching cryptocurrencies, analyzing crypto markets, evaluating digital assets, or investigating blockchain projects like Bitcoin, Ethereum, Solana, etc.
135alex-hormozi-pitch
Create irresistible offers and pitches using Alex Hormozi's methodology from $100M Offers. Guides through value equation, guarantee frameworks, pricing psychology, and creating offers "too good not to take" for any product or service.
129novelweave-workflow
使用 NovelWeave 进行小说创作的完整工作流程,包括命令使用、最佳实践和高效创作技巧。适用于规划小说项目、组织创作过程或学习 NovelWeave 功能。
90moon-dev-trading-agents
Master Moon Dev's Ai Agents Github with 48+ specialized agents, multi-exchange support, LLM abstraction, and autonomous trading capabilities across crypto markets
63dnd5e-srd
Retrieval-augmented generation (RAG) skill for the D&D 5e System Reference Document (SRD). Use when answering questions about D&D 5e core rules, spells, combat, equipment, conditions, monsters, and other SRD content. This skill provides agentic search-based access to the SRD split into page-range markdown files.
52analyzing-financial-statements
This skill calculates key financial ratios and metrics from financial statement data for investment analysis
31