audio-transcribe

Pass

Audited by Gen Agent Trust Hub on May 2, 2026

Risk Level: SAFE
Full Analysis
  • [EXTERNAL_DOWNLOADS]: The environment setup scripts (setup_env.sh, setup_mimo.sh) download Python dependencies and model weights from well-known repositories, including HuggingFace, official GitHub releases (Dao-AILab/flash-attention), and the vendor's repository (XiaomiMiMo). These operations are essential for the skill's local inference capabilities.
  • [COMMAND_EXECUTION]: The pipeline uses the system utilities ffmpeg and ffprobe for audio preprocessing and duration validation. These calls pass argument lists rather than shell strings, so file names and other inputs are never interpreted by a shell, which mitigates shell injection risks.
  • [SAFE]: The skill includes a utility (scripts/patch_clustering.py) that applies a performance optimization to the installed funasr library. This patch replaces a computationally expensive O(N³) operation with a more efficient O(N²*k) sparse implementation, which is a legitimate and documented optimization for processing long recordings.
  • [DATA_EXFILTRATION]: The skill provides an optional LLM-based transcript cleanup phase that transmits content to AWS Bedrock, Anthropic, or OpenAI. This behavior is clearly disclosed in the documentation, requires explicit user configuration (API keys/credentials), and can be disabled to keep all processing strictly local.
  • [CREDENTIALS_UNSAFE]: The skill relies on standard environment variables and the local AWS credential chain for authentication with LLM providers. It follows security best practices by not hardcoding any secrets or access tokens.
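The argument-list pattern noted under [COMMAND_EXECUTION] can be illustrated with a minimal sketch. This is not the skill's actual code: the function names build_ffprobe_cmd, parse_duration, and probe_duration are hypothetical, and the ffprobe flags shown are one common way to read a file's duration, assumed here for illustration.

```python
import subprocess


def build_ffprobe_cmd(audio_path):
    """Return the ffprobe argument list for reading a file's duration.

    Hypothetical helper: the path is passed as one list element after -i,
    so no shell ever parses it.
    """
    return [
        "ffprobe",
        "-v", "error",
        "-show_entries", "format=duration",
        "-of", "default=noprint_wrappers=1:nokey=1",
        "-i", str(audio_path),
    ]


def parse_duration(ffprobe_stdout):
    """Convert ffprobe's plain-text duration output to seconds."""
    return float(ffprobe_stdout.strip())


def probe_duration(audio_path):
    """Run ffprobe without a shell and return the duration in seconds."""
    result = subprocess.run(
        build_ffprobe_cmd(audio_path),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if ffprobe exits non-zero
    )
    return parse_duration(result.stdout)
```

Because subprocess.run receives a list and shell=True is never set, a file name such as "talk; rm -rf ~.wav" reaches ffprobe as a single argument instead of being split into separate commands.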
Audit Metadata
Risk Level: SAFE
Analyzed: May 2, 2026, 08:08 PM