byted-voice-to-text

Pass

Audited by Gen Agent Trust Hub on Mar 26, 2026

Risk Level: SAFEPROMPT_INJECTIONDATA_EXFILTRATION
Full Analysis
  • [PROMPT_INJECTION]: The skill exhibits an indirect prompt injection surface by converting untrusted audio input into text that the agent processes as user instructions.\n
  • Ingestion points: scripts/asr.py ingests audio through local file paths, remote URLs, and Feishu file keys.\n
  • Boundary markers: Absent. The transcribed output is not enclosed in delimiters or marked as potentially untrusted content.\n
  • Capability inventory: scripts/asr.py utilizes requests for network operations and performs file reads for audio processing.\n
  • Sanitization: Absent. The skill does not filter or sanitize the transcription before presenting it to the agent.\n- [DATA_EXFILTRATION]: The skill transmits audio data to an external service for speech processing.\n
  • Evidence: scripts/asr.py sends audio data to https://openspeech.bytedance.com, which is the official API endpoint for the author's speech recognition service.\n
  • Evidence: The script fetches audio from Feishu's official API (https://open.feishu.cn) when a file key is provided.
Audit Metadata
Risk Level
SAFE
Analyzed
Mar 26, 2026, 03:03 PM