The Agent Skills Directory

[COMMAND_EXECUTION]: The skill uses shell variables like $INPUT and $OUTPUT directly within Bash tool blocks (e.g., python3 scripts/audio_enhance.py separate "$INPUT"). While the use of double quotes provides some protection, this pattern remains a potential surface for command injection if input filenames contain sophisticated shell metacharacters or if the agent environment performs unsafe expansion.
[PROMPT_INJECTION]: This skill is vulnerable to Indirect Prompt Injection (Category 8). It processes untrusted audio/video files to generate transcripts and speaker labels (via WhisperX and pyannote-audio).
Ingestion points: External media files provided as $INPUT in SKILL.md.
Boundary markers: None identified. The transcript output is processed by the agent without explicit instructions to ignore embedded commands.
Capability inventory: The skill has access to the Bash, Read, and Write tools, allowing for command execution and file system modification.
Sanitization: No evidence of sanitization or filtering of the generated transcript text before it is returned to the agent context.
[CREDENTIALS_UNSAFE]: The skill instructions suggest passing sensitive API tokens (HuggingFace, ElevenLabs, OpenAI) via command-line flags (e.g., --hf-token TOKEN). This is a sub-optimal security practice as it may expose credentials in process lists or shell history logs, although the skill also correctly mentions the use of environment variables.

claude-video-enhance-audio