airflow
Purpose
Airflow is an open-source workflow orchestration tool for defining, scheduling, and monitoring data pipelines as code. It uses Python to create Directed Acyclic Graphs (DAGs) that represent task dependencies and execution flows.
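To make the DAG idea concrete, here is a minimal sketch that uses only Python's standard library (graphlib, Python 3.9+), not Airflow's own API. The task names and the two helper functions are hypothetical. It shows how declared dependencies determine a valid execution order, and why a cycle leaves no valid order at all.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical ETL pipeline: each task maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "notify": {"load"},
}


def execution_order(deps):
    """Return the tasks in an order that respects every dependency."""
    return list(TopologicalSorter(deps).static_order())


def is_acyclic(deps):
    """Return True if the graph has no cycles, i.e. it is a valid DAG."""
    try:
        execution_order(deps)
        return True
    except CycleError:
        return False


order = execution_order(dependencies)
print(order)
```

Airflow's scheduler does considerably more than this (scheduling, retries, state tracking), but the ordering constraint it enforces between tasks is the same one shown here.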
When to Use
Use Airflow for recurring data tasks such as ETL processes, batch jobs, or multi-step workflows with dependencies between steps. It fits best when you need scheduling, automatic retries, and monitoring for production data pipelines, for example ones that drive Spark jobs or database loads.
Key Capabilities
- Define workflows as DAGs in Python, specifying tasks, dependencies, and schedules.
- A built-in scheduler that triggers tasks at defined intervals (e.g., cron-style expressions).
- Web UI for real-time monitoring, including task logs and DAG status, via endpoints like /admin/ (the Airflow 1.x path; Airflow 2.x serves the UI at /home).
- Operators like BashOperator for shell commands or PythonOperator for custom functions.
- Extensible hooks for integrations, such as PostgresHook for database connections.
- Configuration via the airflow.cfg file, e.g., set executor = LocalExecutor in the [core] section for local testing.
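The airflow.cfg file uses INI syntax, so the executor setting from the last item can be written and inspected as in the sketch below. It uses only Python's standard configparser module rather than Airflow's own configuration loader. The fragment and the read_executor helper are illustrative assumptions, not a complete configuration.

```python
from configparser import ConfigParser

# Fragment in airflow.cfg's INI format, using the setting named in the list above.
AIRFLOW_CFG_FRAGMENT = """
[core]
executor = LocalExecutor
"""


def read_executor(cfg_text):
    """Return the [core] executor value from airflow.cfg-style text."""
    # interpolation=None keeps any literal '%' characters in values untouched.
    parser = ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    return parser.get("core", "executor")


print(read_executor(AIRFLOW_CFG_FRAGMENT))
```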
Usage Patterns
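To use Airflow, install it with pip install apache-airflow, then initialize the metadata database with airflow db init (deprecated since Airflow 2.7 in favor of airflow db migrate). Define DAGs as Python files in the dags folder of your Airflow home directory. Always install into a virtual environment to avoid dependency conflicts. When running the official Docker Compose setup, set the AIRFLOW_UID environment variable to your host user ID so that files written to mounted volumes have the correct ownership; it controls file ownership and is not an authentication mechanism.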