speech-to-text
Pass
Audited by Gen Agent Trust Hub on May 14, 2026
Risk Level: SAFEPROMPT_INJECTIONCOMMAND_EXECUTIONEXTERNAL_DOWNLOADS
Full Analysis
- [PROMPT_INJECTION]: The skill possesses a surface for indirect prompt injection because it ingests and processes untrusted audio and video data. \n
- Ingestion points: Audio and video sources are ingested via the
audio_urlandvideo_urlparameters used inbeltcommands throughoutSKILL.md. \n - Boundary markers: The instructions do not define delimiters or specific constraints to prevent the agent from obeying instructions that might be embedded within the transcribed speech. \n
- Capability inventory: The skill has command execution capabilities via the
beltCLI tool, as defined in theallowed-toolssection ofSKILL.md. \n - Sanitization: There is no evidence of sanitization, filtering, or validation of the transcribed text before it is returned to the agent context. \n- [COMMAND_EXECUTION]: The skill utilizes the
beltCLI to run transcription and translation models. Access is restricted to this specific tool through theallowed-toolsfrontmatter. \n- [EXTERNAL_DOWNLOADS]: The skill references external resources, including installation instructions and documentation, from theinference.shdomain and its associated GitHub repository. These references are essential for the operation of the CLI tool described.
Audit Metadata