alicloud-ai-multimodal-qwen-vl
Category: provider
Model Studio Qwen VL (Image Understanding)
Validation
mkdir -p output/alicloud-ai-multimodal-qwen-vl
python -m py_compile skills/ai/multimodal/alicloud-ai-multimodal-qwen-vl/scripts/analyze_image.py && echo "py_compile_ok" > output/alicloud-ai-multimodal-qwen-vl/validate.txt
Pass criteria: command exits 0 and output/alicloud-ai-multimodal-qwen-vl/validate.txt is generated.
Output And Evidence
- Save raw model responses and normalized extraction results to
output/alicloud-ai-multimodal-qwen-vl/. - Include input image reference and prompt for traceability.
Use Qwen VL models for image input + text output understanding tasks via DashScope compatible-mode API.
More from cinience/alicloud-skills
alicloud-ai-audio-tts-voice-clone
Voice cloning workflows with Alibaba Cloud Model Studio Qwen TTS VC models. Use when creating cloned voices from sample audio and synthesizing text with cloned timbre.
396alicloud-ai-image-qwen-image
Generate images with Model Studio DashScope SDK using Qwen Image generation models (qwen-image, qwen-image-plus, qwen-image-max, qwen-image-2.0 series and snapshots). Use when implementing or documenting image.generate requests/responses, mapping prompt/negative_prompt/size/seed/reference_image, or integrating image generation into the video-agent pipeline.
366alicloud-observability-sls-log-query
Query and troubleshoot logs in Alibaba Cloud Log Service (SLS) using query|analysis syntax and the Python SDK. Use for time-bounded log search, error investigation, and root-cause analysis workflows.
340alicloud-ai-image-qwen-image-edit
Edit images with Alibaba Cloud Model Studio Qwen Image Edit models (qwen-image-edit, qwen-image-edit-plus, qwen-image-edit-max, qwen-image-2.0 series and snapshots). Use when modifying existing images (inpaint, replace, style transfer, local edits), preserving subject consistency, or documenting image edit request/response mappings.
335alicloud-ai-audio-tts
Generate human-like speech audio with Model Studio DashScope Qwen TTS models (qwen3-tts-flash, qwen3-tts-instruct-flash). Use when converting text to speech, producing voice lines for short drama/news videos, or documenting TTS request/response fields for DashScope.
314alicloud-ai-video-wan-video
Generate videos with Model Studio DashScope SDK using Wan video generation models (wan2.6-t2v, wan2.6-i2v-flash, wan2.6-i2v and regional variants). Use when implementing or documenting video.generate requests/responses, mapping prompt/negative_prompt/duration/fps/size/seed/reference_image/motion_strength, or integrating video generation into the video-agent pipeline.
307