speech-to-text

Pass

Audited by Gen Agent Trust Hub on Jun 19, 2026

Risk Level: SAFEEXTERNAL_DOWNLOADSREMOTE_CODE_EXECUTIONCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [EXTERNAL_DOWNLOADS]: The skill references external resources for installation, including a GitHub-hosted markdown file for CLI instructions and the 'belt-sh/cli' package from the npm registry.
  • [REMOTE_CODE_EXECUTION]: The skill's primary functionality involves running remote AI applications (such as Whisper and ElevenLabs Scribe) on the inference.sh platform via the 'belt app run' command.
  • [COMMAND_EXECUTION]: Instructs the agent to use the 'belt' CLI tool for logging in, running applications, and sampling inputs, which is necessary for the intended audio processing workflow.
  • [PROMPT_INJECTION]: The skill processes untrusted audio and video content from external URLs; this represents a surface for indirect prompt injection if the transcribed text is subsequently used as instructions for an AI agent without proper sanitization or boundary markers.
  • Ingestion points: 'audio_url' and 'video_url' parameters in 'SKILL.md'.
  • Boundary markers: Not specified in the provided instructions.
  • Capability inventory: Network and compute access via the 'belt' CLI.
  • Sanitization: No explicit sanitization of transcribed content is mentioned.
Audit Metadata
Risk Level
SAFE
Analyzed
Jun 19, 2026, 02:17 AM
Security Audit — agent-trust-hub — speech-to-text