analyzing-data
Analyzing Data
Use this skill for exploratory data analysis and visualization: understanding dataset structure, identifying patterns, choosing the right visualization approach, and communicating insights effectively.
When to use this skill
- New dataset — need orientation on structure, types, distributions
- Choosing visualization libraries and chart types for a project
- Data quality investigation — find anomalies, missing patterns, outliers
- Statistical hypothesis testing — validate assumptions about data
- Creating publication-quality figures or exploratory charts
- Large dataset exploration — sampling and aggregation strategies
- Understanding missing value mechanisms (MCAR/MAR/MNAR)
- Before feature engineering — understand variable relationships
- Model preparation — validate assumptions about data
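As a rough illustration of the orientation and data-quality items above, here is a minimal first-pass profiling sketch. It uses only the Python standard library and assumes the dataset is already loaded as a list of dictionaries. The helper names (`to_float`, `profile_column`, `profile_rows`) and the 1.5 × IQR outlier rule are illustrative choices, not something this skill prescribes.

```python
import statistics


def to_float(value):
    """Return value as a float, or None if it does not parse as a number."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def profile_column(values):
    """Summarize one column: missing count, inferred kind, IQR outliers."""
    # Treat None and empty strings as missing (an assumption of this sketch).
    present = [v for v in values if v not in (None, "")]
    summary = {"count": len(values), "missing": len(values) - len(present)}

    numbers = [to_float(v) for v in present]
    if not present or any(n is None for n in numbers):
        # At least one value is non-numeric: report it as categorical.
        summary["kind"] = "categorical"
        summary["distinct"] = len(set(present))
        return summary

    summary["kind"] = "numeric"
    summary["median"] = statistics.median(numbers)
    if len(numbers) >= 4:
        # Flag values outside 1.5 * IQR of the quartiles as candidate outliers.
        q1, _, q3 = statistics.quantiles(numbers, n=4)
        iqr = q3 - q1
        low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
        summary["outliers"] = [n for n in numbers if n < low or n > high]
    return summary


def profile_rows(rows):
    """Profile every column found in a list of dict rows."""
    columns = {key for row in rows for key in row}
    return {c: profile_column([row.get(c) for row in rows]) for c in sorted(columns)}
```

Calling `profile_rows(rows)` returns one summary dictionary per column. Flagged outliers and missing counts are starting points for investigation, not verdicts; for larger datasets a dataframe library would be the more practical choice.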
When NOT to use this skill
- Building interactive dashboards or data applications → use @building-data-apps