alicloud-ai-multimodal-qwen-ocr
Category: provider
Model Studio Qwen OCR
Validation
mkdir -p output/alicloud-ai-multimodal-qwen-ocr
python -m py_compile skills/ai/multimodal/alicloud-ai-multimodal-qwen-ocr/scripts/prepare_ocr_request.py && echo "py_compile_ok" > output/alicloud-ai-multimodal-qwen-ocr/validate.txt
Pass criteria: command exits 0 and output/alicloud-ai-multimodal-qwen-ocr/validate.txt is generated.
Output And Evidence
- Save request payloads, selected OCR task name, and normalized output expectations under output/alicloud-ai-multimodal-qwen-ocr/.
- Keep the exact model, image source, and task configuration with each saved run.
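One possible way to keep that evidence together is sketched below. It is not part of the skill's scripts: the function name `save_run_evidence`, the record field names, and the file naming scheme are all assumptions made for illustration, and the model and task values in the usage section are placeholders rather than confirmed identifiers.

```python
import json
import time
from pathlib import Path


def save_run_evidence(output_dir, model, image_source, task, payload, run_name=None):
    """Write one JSON file holding the request payload and the settings used.

    Hypothetical helper: the field names below are illustrative, not a
    schema defined by the skill.
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Default to a UTC timestamp so successive runs get distinct file names
    # (runs started within the same second would need an explicit run_name).
    if run_name is None:
        run_name = time.strftime("run-%Y%m%dT%H%M%SZ", time.gmtime())

    record = {
        "model": model,                # exact model name used for the call
        "image_source": image_source,  # URL or local path of the input image
        "task": task,                  # selected OCR task name
        "request_payload": payload,    # request body exactly as sent
    }

    path = out / f"{run_name}.json"
    path.write_text(
        json.dumps(record, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return path


if __name__ == "__main__":
    # Placeholder values; substitute the real model, image, and task.
    saved = save_run_evidence(
        "output/alicloud-ai-multimodal-qwen-ocr",
        model="example-ocr-model",
        image_source="samples/invoice.png",
        task="example-task",
        payload={"input": "placeholder"},
    )
    print(f"saved {saved}")
```

Writing one file per run keeps the model, image source, and task configuration next to the payload they belong to, so a saved run can be reviewed without consulting other files.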
Use Qwen OCR when the task is primarily text extraction or document structure parsing rather than broad visual reasoning.