Vision Agents Skill
Product Summary
Vision Agents is an open-source Python framework for building real-time voice and video AI agents. Agents join sessions, connect to AI providers through swappable plugins (LLM, STT, TTS, vision models), and respond in real time. The framework handles call lifecycle, audio/video routing, turn-taking, and deployment. You configure an Agent class with provider plugins, define create_agent() and join_call() functions, and run the agent through Runner in console or HTTP server mode.
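The create_agent() / join_call() / Runner split described above can be illustrated with a minimal, self-contained sketch. It does not import the real framework: the Agent, EchoLLM, and Runner classes below are hypothetical stand-ins written only to show the shape of the pattern, and their fields, method names, and signatures are assumptions rather than the actual Vision Agents API.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable, List


@dataclass
class EchoLLM:
    """Hypothetical stand-in for an LLM plugin; a real plugin would call a provider."""
    prefix: str = "echo"

    async def reply(self, text: str) -> str:
        return f"{self.prefix}: {text}"


@dataclass
class Agent:
    """Hypothetical stand-in for the framework's Agent class (fields are assumptions)."""
    instructions: str
    llm: EchoLLM
    transcript: List[str] = field(default_factory=list)

    async def respond(self, text: str) -> str:
        # Delegate to the swappable plugin and record what was said.
        answer = await self.llm.reply(text)
        self.transcript.append(answer)
        return answer


async def create_agent() -> Agent:
    # Factory: build a fresh agent and wire in its provider plugins.
    return Agent(instructions="Be brief.", llm=EchoLLM())


async def join_call(agent: Agent, call_id: str) -> None:
    # Session logic: what the agent does once it has joined a call.
    await agent.respond(f"joined {call_id}")


@dataclass
class Runner:
    """Hypothetical stand-in runner that pairs the factory with the session handler."""
    create_agent: Callable[[], Awaitable[Agent]]
    join_call: Callable[[Agent, str], Awaitable[None]]

    async def run_once(self, call_id: str) -> Agent:
        agent = await self.create_agent()
        await self.join_call(agent, call_id)
        return agent


def run_demo(call_id: str = "demo-call") -> List[str]:
    # Drive one session end to end and return what the agent said.
    runner = Runner(create_agent=create_agent, join_call=join_call)
    agent = asyncio.run(runner.run_once(call_id))
    return agent.transcript


if __name__ == "__main__":
    print(run_demo())
```

Keeping the factory separate from the session handler means a runner can create one agent per session, which is presumably what lets server mode host multiple agents. In a real project, the framework's own Agent, plugin, and Runner classes would replace these stand-ins; check the primary docs for the actual imports and signatures.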
Key files and commands:
- agent.py: Main agent configuration file (created by uvx vision-agents init)
- pyproject.toml: Dependencies and project metadata
- .env: API keys for providers (Stream, LLM, STT, TTS)
- uv run agent.py run: Console mode (single agent, browser demo)
- uv run agent.py serve: HTTP server mode (production, multiple agents)
- Primary docs: https://visionagents.ai