etl-pipelines

Purpose

This skill enables OpenClaw to design and implement ETL (extract, transform, load) pipelines for data engineering workflows. It focuses on structured data sources such as databases, files, and APIs, ensuring efficient data flow for analytics and reporting.

When to Use

Use this skill when building data pipelines for batch processing, real-time data ingestion, or data migration. Apply it in scenarios involving large datasets (e.g., >1TB), integrating with tools like Apache Spark or AWS Glue, or automating ETL for BI dashboards.

Key Capabilities

  • Extract data from sources like CSV, JSON files, SQL databases, or APIs using connectors (e.g., JDBC for databases).
  • Transform data with operations such as filtering, aggregation, or SQL queries (e.g., via Pandas or Spark DataFrames).
  • Load data into targets like PostgreSQL, BigQuery, or S3 buckets with schema validation and error logging.
  • Schedule pipelines with cron-like expressions, or integrate with orchestration tools like Airflow.
  • Handle incremental loads by tracking last processed timestamps or change data capture (CDC).
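The capabilities above can be sketched as a minimal batch pipeline in Python with pandas. This is an illustrative example under assumptions: the in-memory CSV source, the `region_totals` table name, the column names, and the timestamp cutoff are all hypothetical, standing in for a real file, API, or JDBC source and a real warehouse target.

```python
import sqlite3
from io import StringIO

import pandas as pd

# Hypothetical source: a CSV of orders. In practice this could be a file on
# disk, an API response, or a database query via a JDBC/SQLAlchemy connector.
CSV_SOURCE = StringIO(
    "order_id,amount,region,updated_at\n"
    "1,120.50,eu,2026-03-01T10:00:00\n"
    "2,80.00,us,2026-03-02T11:30:00\n"
    "3,15.25,eu,2026-03-03T09:15:00\n"
)


def extract(last_seen: str) -> pd.DataFrame:
    """Extract only rows newer than the last processed timestamp
    (a simple incremental-load strategy; CDC would replace this check)."""
    df = pd.read_csv(CSV_SOURCE, parse_dates=["updated_at"])
    return df[df["updated_at"] > pd.Timestamp(last_seen)]


def transform(df: pd.DataFrame) -> pd.DataFrame:
    """Transform step: filter out non-positive amounts, then aggregate
    total order amount per region."""
    return (
        df[df["amount"] > 0]
        .groupby("region", as_index=False)["amount"]
        .sum()
    )


def load(df: pd.DataFrame, conn: sqlite3.Connection) -> None:
    """Load step: write the result into the target table. SQLite stands in
    for a real target such as PostgreSQL or BigQuery."""
    df.to_sql("region_totals", conn, if_exists="replace", index=False)


conn = sqlite3.connect(":memory:")
batch = extract(last_seen="2026-03-01T12:00:00")  # picks up orders 2 and 3
load(transform(batch), conn)
rows = conn.execute(
    "SELECT region, amount FROM region_totals ORDER BY region"
).fetchall()
print(rows)
```

In a scheduled deployment, the `last_seen` watermark would be persisted (e.g., in a metadata table) and advanced after each successful run, so that each cron or Airflow invocation processes only new rows.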