data-catalog
Purpose
This skill manages metadata for data assets, enabling discovery, governance, and lineage tracking in data engineering workflows. It catalogs datasets, schemas, and dependencies to support data-driven projects.
When to Use
Use this skill when you need to track data assets in a project, such as during ETL processes, data governance audits, or when building data pipelines. Apply it in scenarios involving large-scale data repositories, compliance requirements, or collaborative data teams.
Key Capabilities
- Register and update metadata for datasets using JSON structures, e.g., {"name": "sales_data", "schema": {"columns": ["id", "date"]}}.
- Search and query assets via full-text or tag-based filters, supporting lineage queries like tracing data origins.
- Enforce governance policies, such as access controls, by associating tags like "sensitive" to assets.
- Generate lineage graphs in JSON format, e.g., {"source": "raw_logs", "target": "processed_reports"}.
- Integrate with storage systems like S3 or databases, using connectors that require API keys via $DATA_CATALOG_API_KEY.
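The tag filter and lineage query described above can be illustrated with a small sketch. It is an in-memory stand-in, not the skill's actual API: the `assets` and `lineage` structures and the helpers `find_by_tag` and `trace_origins` are hypothetical names chosen for illustration, and the `tags` field is an assumed extension of the JSON shape shown in the list.

```python
# Hypothetical in-memory stand-in for catalog contents; not the skill's real API.
assets = {
    "raw_logs": {
        "name": "raw_logs",
        "schema": {"columns": ["id", "date"]},
        "tags": ["sensitive"],  # assumed field for governance tags
    },
    "sales_data": {
        "name": "sales_data",
        "schema": {"columns": ["id", "date"]},
        "tags": [],
    },
    "processed_reports": {
        "name": "processed_reports",
        "schema": {"columns": ["id", "date"]},
        "tags": [],
    },
}

# Lineage edges in the same shape as the JSON example above.
lineage = [{"source": "raw_logs", "target": "processed_reports"}]


def find_by_tag(assets, tag):
    """Return the names of all assets carrying the given tag, sorted."""
    return sorted(
        name for name, meta in assets.items() if tag in meta.get("tags", [])
    )


def trace_origins(lineage, target):
    """Walk lineage edges backwards and collect every upstream source."""
    origins = set()
    stack = [target]
    while stack:
        node = stack.pop()
        for edge in lineage:
            if edge["target"] == node and edge["source"] not in origins:
                origins.add(edge["source"])
                stack.append(edge["source"])
    return sorted(origins)
```

Calling `find_by_tag(assets, "sensitive")` lists the assets a governance policy would apply to, and `trace_origins(lineage, "processed_reports")` answers the "where did this data come from" question for a single target.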
Usage Patterns
To use this skill, first authenticate by setting the API key as an environment variable, e.g., export DATA_CATALOG_API_KEY=your_key. Then follow this pattern: initialize the catalog, register assets, query as needed, and handle updates. For pipelines, embed it in scripts so that outputs are registered automatically. Always validate metadata before registering or updating an asset to avoid conflicts such as duplicate names or malformed schemas.
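As a rough sketch of that pattern (authenticate, initialize, validate, register, update), the following uses an in-memory catalog. Only the DATA_CATALOG_API_KEY variable and the metadata shape come from the text above; `InMemoryCatalog`, `CatalogError`, `validate_metadata`, and `catalog_from_env` are hypothetical names, and the validation rules are assumptions about what a conflict might look like.

```python
import os


class CatalogError(ValueError):
    """Raised for invalid metadata or conflicting operations (hypothetical)."""


def validate_metadata(meta):
    """Check the assumed minimal shape: a name and a non-empty column list."""
    name = meta.get("name")
    if not isinstance(name, str) or not name:
        raise CatalogError("metadata needs a non-empty 'name'")
    schema = meta.get("schema")
    if not isinstance(schema, dict):
        raise CatalogError("metadata needs a 'schema' object")
    columns = schema.get("columns")
    if not isinstance(columns, list) or not columns:
        raise CatalogError("schema needs a non-empty 'columns' list")
    if len(set(columns)) != len(columns):
        raise CatalogError("schema has duplicate column names")


class InMemoryCatalog:
    """Illustrative stand-in for the catalog; not the skill's real client."""

    def __init__(self, api_key):
        if not api_key:
            raise CatalogError("DATA_CATALOG_API_KEY is not set")
        self._assets = {}

    def register(self, meta):
        validate_metadata(meta)  # validate before any write
        if meta["name"] in self._assets:
            raise CatalogError(f"{meta['name']!r} already registered; use update")
        self._assets[meta["name"]] = dict(meta)

    def update(self, meta):
        validate_metadata(meta)
        if meta["name"] not in self._assets:
            raise CatalogError(f"{meta['name']!r} is not registered")
        self._assets[meta["name"]] = dict(meta)

    def get(self, name):
        return self._assets[name]


def catalog_from_env():
    """Initialize the catalog from the environment variable named in the text."""
    return InMemoryCatalog(os.environ.get("DATA_CATALOG_API_KEY", ""))
```

A pipeline step would call `catalog_from_env()` once, then `register(...)` for each new output and `update(...)` when a schema changes. Rejecting duplicate registrations in `register` is one possible way to surface conflicts early; a real catalog may handle them differently.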