DOCX → Markdown

DOCX is structured XML, so text and tables can be extracted losslessly without OCR. Embedded images (architecture diagrams, flowcharts, screenshots), however, carry information that text-only extractors silently drop. This skill extracts them to disk and references them with standard Markdown image syntax at their original position, so you can describe them inline using your built-in Vision capability.
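
To illustrate why no OCR is needed: a .docx file is a ZIP archive whose body text lives in word/document.xml and whose embedded images usually sit under word/media/. The sketch below is not this skill's extractor. `inspect_docx` is a hypothetical helper, written with only the Python standard library, that lists paragraph text and embedded media names. It ignores headings, tables, and image positions, all of which the real script handles.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by <w:p> (paragraph) and <w:t> (text run) elements.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def inspect_docx(path):
    """Return (paragraph_texts, media_names) for a .docx path or file-like object.

    Hypothetical helper for illustration only; not part of docx_to_md.py.
    """
    with zipfile.ZipFile(path) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
        paragraphs = []
        for p in root.iter(f"{{{W_NS}}}p"):
            # A paragraph's text is split across one or more <w:t> runs.
            text = "".join(t.text or "" for t in p.iter(f"{{{W_NS}}}t"))
            if text:
                paragraphs.append(text)
        # Embedded images are stored as ordinary files inside the archive.
        media = sorted(n for n in zf.namelist() if n.startswith("word/media/"))
    return paragraphs, media
```

A text-only extractor stops at the paragraph list. The media entries are what this skill additionally writes to disk.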

Workflow (agent mode — default, zero config)

Step 1 — Run the extractor

python "${CLAUDE_SKILL_DIR}/scripts/docx_to_md.py" \
  --input <docx_or_dir> \
  --output <output_dir>

Output:

<output_dir>/<stem>.md — headings, paragraphs, tables in document order, plus ![](<stem>/imgs/img_NNN.png) placeholders at each large image's original position
<output_dir>/<stem>/imgs/ — extracted image files
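
Once the extractor has run, the image placeholders still need descriptions. A minimal sketch for locating them in the generated Markdown follows. `find_placeholders` is a hypothetical helper, and the pattern assumes that NNN in `img_NNN.png` is a run of digits and that an undescribed image has empty alt text, as in the placeholder form shown above.

```python
import re

# Matches Markdown images with empty alt text, e.g. ![](report/imgs/img_001.png).
# Assumption: NNN in img_NNN.png is a sequence of digits.
PLACEHOLDER = re.compile(r"!\[\]\(([^)\s]+/imgs/img_\d+\.png)\)")


def find_placeholders(markdown):
    """Return (line_number, image_path) for each image that has no description yet."""
    found = []
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        for match in PLACEHOLDER.finditer(line):
            found.append((lineno, match.group(1)))
    return found


if __name__ == "__main__":
    sample = "# Report\n\nIntro text.\n\n![](report/imgs/img_001.png)\n"
    for lineno, path in find_placeholders(sample):
        print(f"line {lineno}: {path}")
```

Each returned path is relative to the directory that contains the .md file, so the image can be opened and described in place.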

Step 2 — Fill in image descriptions
