exploratory-data-analysis


Exploratory Data Analysis for Scientific Data

Overview

Exploratory data analysis (EDA) is the systematic examination of scientific data files to understand their structure, content, quality, and characteristics before formal analysis. This knowhow covers how to detect file types, select an appropriate analysis approach, assess data quality, and generate reports across the major scientific data domains.
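As a rough illustration of the file-type detection step, the sketch below maps a file extension to one of the data categories listed under Key Concepts. The function name `detect_category`, the lookup table, and the exact extension spellings are assumptions made for illustration, not part of the skill itself. JCAMP and FID are left out because their file naming varies between instruments. Real detection would likely also inspect file contents.

```python
from pathlib import Path

# Hypothetical lookup table. The categories come from this document; the
# exact extension spellings (".tif", ".dcm", ".h5", ".nc", ...) are assumptions.
EXTENSION_TO_CATEGORY = {
    ".csv": "Tabular", ".tsv": "Tabular", ".xlsx": "Tabular", ".parquet": "Tabular",
    ".fasta": "Sequence", ".fastq": "Sequence", ".sam": "Sequence", ".bam": "Sequence",
    ".tif": "Image/Microscopy", ".tiff": "Image/Microscopy",
    ".nd2": "Image/Microscopy", ".czi": "Image/Microscopy", ".dcm": "Image/Microscopy",
    ".mzml": "Spectral", ".spc": "Spectral",
    ".pdb": "Structural", ".cif": "Structural", ".mol": "Structural", ".sdf": "Structural",
    ".npy": "Array/Tensor", ".h5": "Array/Tensor", ".hdf5": "Array/Tensor",
    ".zarr": "Array/Tensor", ".nc": "Array/Tensor",
    ".h5ad": "Omics", ".mtx": "Omics", ".vcf": "Omics", ".bed": "Omics",
}

# Compression wrappers that should be ignored when reading the extension.
COMPRESSION_SUFFIXES = {".gz", ".bz2", ".xz"}


def detect_category(filename: str) -> str:
    """Return the data category for a filename, or "Unknown" if unrecognized."""
    suffixes = [s.lower() for s in Path(filename).suffixes]
    # Drop a trailing compression suffix so "reads.fastq.gz" resolves to ".fastq".
    if suffixes and suffixes[-1] in COMPRESSION_SUFFIXES:
        suffixes = suffixes[:-1]
    if not suffixes:
        return "Unknown"
    return EXTENSION_TO_CATEGORY.get(suffixes[-1], "Unknown")
```

The returned category can then be used to choose which analysis routine and library to apply. Extension matching alone is a weak signal, so treating "Unknown" as a prompt for manual inspection is a reasonable default.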

Key Concepts

Scientific Data Type Categories

| Category | Common Formats | Typical Analysis | Key Libraries |
|---|---|---|---|
| Tabular | CSV, TSV, XLSX, Parquet | Summary statistics, distributions, correlations, missing values | pandas, polars |
| Sequence | FASTA, FASTQ, SAM/BAM | Length distribution, quality scores, GC content, alignment stats | BioPython, pysam |
| Image/Microscopy | TIFF, ND2, CZI, DICOM | Dimensions (XYZCT), intensity stats, metadata, calibration | tifffile, aicsimageio, nd2reader |
| Spectral | mzML, SPC, JCAMP, FID | Peak detection, baseline, S/N ratio, resolution | pymzml, nmrglue, pyteomics |
| Structural | PDB, CIF, MOL, SDF | Atom counts, bond validation, B-factors, completeness | BioPython, RDKit, MDAnalysis |
| Array/Tensor | NPY, HDF5, Zarr, NetCDF | Shape, dtype, value range, NaN/Inf check, chunk structure | numpy, h5py, zarr, xarray |
| Omics | H5AD, MTX, VCF, BED | Feature/sample counts, sparsity, annotation completeness | scanpy, pyranges, cyvcf2 |
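
To make the "Typical Analysis" entries more concrete, here is a minimal sketch of two items from the Sequence category: length distribution and GC content for FASTA input. It uses a small standard-library parser in place of BioPython so that it runs without extra dependencies. That substitution, and the helper names `read_fasta`, `gc_fraction`, and `summarize_sequences`, are assumptions for illustration only.

```python
def read_fasta(handle):
    """Yield (header, sequence) pairs from an iterable of FASTA text lines."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def gc_fraction(seq):
    """Return the G+C fraction among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    # Ambiguous symbols such as N are excluded from the denominator.
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def summarize_sequences(records):
    """Summarize record count, length range, mean length, and mean GC fraction."""
    lengths, gcs = [], []
    for _header, seq in records:
        lengths.append(len(seq))
        gcs.append(gc_fraction(seq))
    if not lengths:
        return {"n_records": 0, "min_length": 0, "max_length": 0,
                "mean_length": 0.0, "mean_gc": 0.0}
    return {
        "n_records": len(lengths),
        "min_length": min(lengths),
        "max_length": max(lengths),
        "mean_length": sum(lengths) / len(lengths),
        "mean_gc": sum(gcs) / len(gcs),
    }
```

A typical call would be `summarize_sequences(read_fasta(open(path)))`, where `path` is a hypothetical FASTA file. For large or compressed files, or for FASTQ quality scores, the libraries listed for the Sequence category are the more appropriate choice.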
More from jaechang-hits/sciagent-skills

First Seen
Mar 16, 2026