speech-to-text

Pass

Audited by Gen Agent Trust Hub on May 14, 2026

Risk Level: SAFEPROMPT_INJECTIONCOMMAND_EXECUTIONEXTERNAL_DOWNLOADS
Full Analysis
  • [PROMPT_INJECTION]: The skill possesses a surface for indirect prompt injection because it ingests and processes untrusted audio and video data. \n
  • Ingestion points: Audio and video sources are ingested via the audio_url and video_url parameters used in belt commands throughout SKILL.md. \n
  • Boundary markers: The instructions do not define delimiters or specific constraints to prevent the agent from obeying instructions that might be embedded within the transcribed speech. \n
  • Capability inventory: The skill has command execution capabilities via the belt CLI tool, as defined in the allowed-tools section of SKILL.md. \n
  • Sanitization: There is no evidence of sanitization, filtering, or validation of the transcribed text before it is returned to the agent context. \n- [COMMAND_EXECUTION]: The skill utilizes the belt CLI to run transcription and translation models. Access is restricted to this specific tool through the allowed-tools frontmatter. \n- [EXTERNAL_DOWNLOADS]: The skill references external resources, including installation instructions and documentation, from the inference.sh domain and its associated GitHub repository. These references are essential for the operation of the CLI tool described.
Audit Metadata
Risk Level
SAFE
Analyzed
May 14, 2026, 11:31 AM
Security Audit — agent-trust-hub — speech-to-text