dask
Dask
Overview
Dask is a Python library for parallel and distributed computing that enables three critical capabilities:
- Larger-than-memory execution on single machines for data exceeding available RAM
- Parallel processing for improved computational speed across multiple cores
- Distributed computation supporting terabyte-scale datasets across multiple machines
Dask scales from laptops (processing ~100 GiB) to clusters (processing ~100 TiB) while maintaining familiar Python APIs.
When to Use This Skill
This skill should be used when:
- Process datasets that exceed available RAM
- Scale pandas or NumPy operations to larger datasets
- Parallelize computations for performance improvements
- Process multiple files efficiently (CSVs, Parquet, JSON, text logs)
- Build custom parallel workflows with task dependencies
More from jimmc414/kosmos
scientific-schematics
Create publication-quality scientific diagrams, flowcharts, and schematics using Python (graphviz, matplotlib, schemdraw, networkx). Specialized in neural network architectures, system diagrams, and flowcharts. Generates SVG/EPS in figures/ folder with automated quality verification.
32pptx
Presentation toolkit (.pptx). Create/edit slides, layouts, content, speaker notes, comments, for programmatic presentation creation and modification.
18ensembl-database
Query Ensembl genome database REST API for 250+ species. Gene lookups, sequence retrieval, variant analysis, comparative genomics, orthologs, VEP predictions, for genomic research.
17docx
Document toolkit (.docx). Create/edit documents, tracked changes, comments, formatting preservation, text extraction, for professional document processing.
15scientific-slides
Build slide decks and presentations for research talks. Use this for making PowerPoint slides, conference presentations, seminar talks, research presentations, thesis defense slides, or any scientific talk. Provides slide structure, design templates, timing guidance, and visual validation. Works with PowerPoint and LaTeX Beamer.
13scientific-visualization
Create publication figures with matplotlib/seaborn/plotly. Multi-panel layouts, error bars, significance markers, colorblind-safe, export PDF/EPS/TIFF, for journal-ready scientific plots.
12