audio-transcribe
Pass
Audited by Gen Agent Trust Hub on May 2, 2026
Risk Level: SAFE
Full Analysis
- [EXTERNAL_DOWNLOADS]: The environment setup scripts (setup_env.sh, setup_mimo.sh) download Python dependencies and model weights from well-known repositories, including HuggingFace, official GitHub releases (Dao-AILab/flash-attention), and the vendor's repository (XiaomiMiMo). These operations are essential for the skill's local inference capabilities.
- [COMMAND_EXECUTION]: The pipeline utilizes system utilities such as ffmpeg and ffprobe for audio preprocessing and duration validation. These calls are implemented using argument lists rather than shell strings, effectively mitigating shell injection risks.
- [SAFE]: The skill includes a utility (scripts/patch_clustering.py) that applies a performance optimization to the installed funasr library. This patch replaces a computationally expensive O(N³) operation with a more efficient O(N²*k) sparse implementation, which is a legitimate and documented optimization for processing long recordings.
- [DATA_EXFILTRATION]: The skill provides an optional LLM-based transcript cleanup phase that transmits content to AWS Bedrock, Anthropic, or OpenAI. This behavior is clearly disclosed in the documentation, requires explicit user configuration (API keys/credentials), and can be disabled to keep all processing strictly local.
- [CREDENTIALS_UNSAFE]: The skill relies on standard environment variables and the local AWS credential chain for authentication with LLM providers. It follows security best practices by not hardcoding any secrets or access tokens.
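The COMMAND_EXECUTION finding depends on passing arguments as a list rather than as a shell string. The following is a minimal sketch of that pattern, not the skill's actual code: the function names build_ffprobe_cmd and probe_duration are hypothetical, and the ffprobe flags shown are one common way to read a file's duration.

```python
import subprocess


def build_ffprobe_cmd(path: str) -> list[str]:
    """Build an ffprobe argument list that prints the duration in seconds.

    Hypothetical helper for illustration; the audited skill may use
    different flags. Each element is passed to the process as-is, so the
    path is never parsed by a shell.
    """
    return [
        "ffprobe",
        "-v", "error",                               # suppress banner and info logs
        "-show_entries", "format=duration",          # only report container duration
        "-of", "default=noprint_wrappers=1:nokey=1",  # print the bare value
        "-i", path,                                  # input file as a single argument
    ]


def probe_duration(path: str) -> float:
    """Run ffprobe without a shell and return the duration as a float."""
    result = subprocess.run(
        build_ffprobe_cmd(path),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return float(result.stdout.strip())


# A file name containing shell metacharacters stays a single argument.
hostile_name = "talk; rm -rf ~.wav"
cmd = build_ffprobe_cmd(hostile_name)
```

Because subprocess.run receives a list and shell=True is not set, the semicolon in the hostile file name above is never interpreted as a command separator; ffprobe simply receives it as part of the file path.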
Audit Metadata