Multimodal Corpus Ingestion

Overview

Mixed corpora break down when everything is treated like plain text. Ingest code, prose, visuals, and transcripts according to what each artifact can actually tell you, then normalize them into one corpus with provenance intact.
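A minimal sketch of what "one corpus with provenance intact" could look like in practice. The extension-to-kind mapping, the `CorpusRecord` fields, and the function names are all hypothetical choices made for illustration, not part of this skill's definition. The idea is only that every artifact gets a record naming where it came from, what kind of thing it is, and a content hash, before any type-specific extraction happens.

```python
import hashlib
from dataclasses import dataclass
from pathlib import Path

# Hypothetical mapping from file extension to a coarse artifact kind.
# A real pipeline would likely also sniff content, not only the suffix.
KIND_BY_SUFFIX = {
    ".py": "code", ".ts": "code",
    ".md": "prose", ".txt": "prose",
    ".pdf": "document",
    ".png": "visual", ".jpg": "visual", ".svg": "visual",
    ".vtt": "transcript", ".srt": "transcript",
}


@dataclass(frozen=True)
class CorpusRecord:
    """One normalized corpus entry; the provenance fields travel with it."""
    source_path: str   # path relative to the ingested folder
    kind: str          # coarse artifact kind, or "unknown"
    sha256: str        # content hash, to detect changes and duplicates
    size_bytes: int


def classify(path: Path) -> str:
    """Return the artifact kind for a path, based on its suffix only."""
    return KIND_BY_SUFFIX.get(path.suffix.lower(), "unknown")


def build_manifest(root: Path) -> list[CorpusRecord]:
    """Walk a folder and emit one record per file, in sorted path order."""
    records = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        data = path.read_bytes()
        records.append(CorpusRecord(
            source_path=path.relative_to(root).as_posix(),
            kind=classify(path),
            sha256=hashlib.sha256(data).hexdigest(),
            size_bytes=len(data),
        ))
    return records
```

Calling `build_manifest(Path("some_folder"))` returns a list that downstream extractors can branch on by `kind`, while `source_path` and `sha256` let any later chunk or embedding be traced back to the exact file it came from.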

When to Use

  • A task spans code, docs, PDFs, screenshots, or diagrams
  • You need one queryable corpus instead of scattered files
  • The user gives a folder with mixed artifact types
  • Architecture or product understanding depends on visuals and prose together
  • Retrieval quality is poor because source types are inconsistent

Source Classes

Structural Sources

Use deterministic extraction first:
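As one illustration of deterministic extraction, assuming the structural sources include Python files: the standard-library `ast` module can pull out classes and functions without any model inference, so the same input always yields the same output. The function name `extract_symbols` and the shape of the returned dictionaries are hypothetical.

```python
import ast


def extract_symbols(source: str, source_path: str) -> list[dict]:
    """Parse Python source and list its classes and functions.

    Each entry keeps the originating path and line number so the
    extracted symbol can be traced back to its source.
    """
    tree = ast.parse(source)
    symbols = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            symbols.append({
                "name": node.name,
                "type": "class" if isinstance(node, ast.ClassDef) else "function",
                "line": node.lineno,
                "docstring": ast.get_docstring(node),  # None when absent
                "source_path": source_path,
            })
    # ast.walk does not guarantee source order, so sort by line number.
    return sorted(symbols, key=lambda s: s["line"])
```

Other structural formats would need their own parsers; the point of the sketch is only that a parser, where one exists, is tried before any heuristic or model-based reading.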
