Nextflow
Overview
Nextflow is a workflow language and runtime for building reproducible, portable, and scalable data pipelines. It is most widely used in bioinformatics but works for any data-heavy computation. nf-core is a community effort that curates production-grade Nextflow pipelines, reusable modules, and tooling built on top of Nextflow.
Key ideas:
- Dataflow programming: pipelines are `process` tasks connected by channels. Nextflow infers execution order and parallelism from data dependencies; there is no explicit scheduler to write.
- Write once, run anywhere: the same pipeline runs locally, on HPC (SLURM, SGE, LSF, PBS), and on cloud (AWS Batch, Google Batch, Azure Batch, Kubernetes) by changing config/profiles, not code.
- Reproducibility: per-task containers or environments (Docker/Singularity/Apptainer/Conda/Wave) + `-resume` caching + pinned pipeline revisions.
- DSL2 is the modern, required syntax: modular `process`/`workflow`/`include` definitions.
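
As a rough illustration of the dataflow idea, here is a minimal DSL2 sketch. The file name `main.nf`, the process name `TO_UPPER`, and the `params.greetings` parameter are hypothetical, chosen only for this example; a real pipeline would usually also declare a container or Conda environment for each process.

```nextflow
// main.nf: a minimal DSL2 sketch (all names here are hypothetical)

// Default input values for the example
params.greetings = ['hello', 'bonjour', 'hola']

// A process describes one kind of task; Nextflow launches one task
// per item received on its input channel.
process TO_UPPER {
    input:
    val greeting

    output:
    stdout

    script:
    """
    echo '${greeting}' | tr '[:lower:]' '[:upper:]'
    """
}

workflow {
    // A channel that emits each greeting as a separate item
    greetings_ch = Channel.fromList(params.greetings)

    // Passing the channel to the process declares the data dependency.
    // The three tasks do not depend on each other, so they may run in parallel.
    TO_UPPER(greetings_ch)

    // Print each task's output as it arrives (order is not guaranteed)
    TO_UPPER.out.view()
}
```

Assuming Nextflow is installed, this could be launched with `nextflow run main.nf`. Rerunning with `-resume` should reuse cached task results, and `-profile` selects a named configuration profile if one is defined in the pipeline's config.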
This skill covers both running existing pipelines and developing your own (Nextflow language + nf-core conventions, testing with nf-test, configuration, and deployment).