The Agent Skills Directory

[EXTERNAL_DOWNLOADS]: Instructs the user to clone an external code repository from a third-party GitHub account (github.com/debpalash/OmniVoice-Studio) and run uv sync to set up the local environment and install dependencies.
[EXTERNAL_DOWNLOADS]: Fetches configuration and large machine learning model assets (approximately 2.4 GB) from HuggingFace's official repository (k2-fsa/OmniVoice) during the initial synthesis operation.
[COMMAND_EXECUTION]: Provides management scripts (scripts/start-backend.sh, scripts/stop-backend.sh) that execute shell commands to manage the local FastAPI backend, including starting the server with uvicorn and terminating processes using kill.
[DATA_EXFILTRATION]: Includes a utility (scripts/record-reference.sh) for macOS that accesses the system microphone via ffmpeg to record reference audio for voice cloning.
[PROMPT_INJECTION]: The generate_speech tool interpolates user-supplied text into the synthesis engine without defined boundary markers or sanitization, creating a surface for indirect prompt injection.
  (1) Ingestion points: Untrusted data enters the context through the text parameter of the generate_speech tool.
  (2) Boundary markers: None identified in the instructions.
  (3) Capability inventory: Local command execution, file system access, and microphone access.
  (4) Sanitization: No validation or escaping of external content is documented.
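The missing boundary markers and sanitization noted in the PROMPT_INJECTION finding could be addressed along the following lines. This is a minimal sketch under stated assumptions, not the skill's actual code: the function names sanitize_text and wrap_untrusted, the marker strings, and the 2000-character limit are all hypothetical, and how the real generate_speech tool consumes its text parameter is not documented in this report.

```python
import unicodedata

# Hypothetical values chosen for illustration; not taken from the skill itself.
MAX_CHARS = 2000
BEGIN_MARK = "<<<UNTRUSTED_TEXT_BEGIN>>>"
END_MARK = "<<<UNTRUSTED_TEXT_END>>>"


def sanitize_text(text: str, max_chars: int = MAX_CHARS) -> str:
    """Return a cleaned copy of untrusted text intended for speech synthesis."""
    if not isinstance(text, str):
        raise TypeError("text must be a string")

    # Turn newlines and tabs into spaces, and drop every other character in a
    # Unicode "C" category (control, format, surrogate, private use, unassigned).
    whitespace = "\n\t\r"
    cleaned = "".join(
        " " if ch in whitespace else ch
        for ch in text
        if ch in whitespace or not unicodedata.category(ch).startswith("C")
    )

    # Remove any copies of the boundary markers so the input cannot close the
    # untrusted region early. Loop because one removal can expose another.
    while BEGIN_MARK in cleaned or END_MARK in cleaned:
        cleaned = cleaned.replace(BEGIN_MARK, "").replace(END_MARK, "")

    # Collapse runs of whitespace and enforce the length limit.
    cleaned = " ".join(cleaned.split())
    if not cleaned:
        raise ValueError("text is empty after sanitization")
    return cleaned[:max_chars]


def wrap_untrusted(text: str) -> str:
    """Wrap sanitized text in explicit boundary markers."""
    return f"{BEGIN_MARK}\n{sanitize_text(text)}\n{END_MARK}"
```

Wrapping the text this way only helps if the agent instructions also state that content between the markers is data to be spoken and must never be treated as instructions.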

omnivoice