The Agent Skills Directory

[COMMAND_EXECUTION]: The scripts/media_optimizer.py script executes ffmpeg and ffprobe as subprocesses to optimize media files and extract metadata. This allows the skill to perform complex media processing tasks on the host system.
[REMOTE_CODE_EXECUTION]: The skill utilizes dynamic execution in multiple locations. The scripts/media_optimizer.py script uses the eval() function to calculate frame rates from ffprobe output, which is a discouraged practice that can lead to code execution if input data is manipulated. Additionally, the scripts/check_setup.py script uses the __import__ function for dynamic module loading to verify dependencies.
[DATA_EXFILTRATION]: The skill's orchestration scripts access sensitive local configuration files, including local .env files and shared environment configurations in the ~/.claude/ directory, to retrieve API keys and configuration parameters.
[PROMPT_INJECTION]: The skill processes untrusted multimedia files (images, audio, video, documents) through the Gemini API. This ingestion of external data creates a surface for indirect prompt injection, where instructions hidden within the media could attempt to manipulate the agent's logic or behavior.
Ingestion points: Files processed through scripts/gemini_batch_process.py and scripts/document_converter.py.
Boundary markers: The prompts sent to the Gemini API do not use specific delimiters or instructions to isolate the untrusted file content from the task instructions.
Capability inventory: Subprocess execution (FFmpeg), file system write access for generated assets, and network communication with Google Gemini services.
Sanitization: The skill does not perform validation or sanitization on the content extracted from files before it is processed by the generative models.

ai-multimodal