data-analysis

Data Analysis

This skill enables an AI agent to perform statistical analysis on structured datasets. The agent loads data, computes descriptive and inferential statistics, identifies trends and correlations, tests hypotheses, and reports the findings. It accepts CSV, Excel, Parquet, and JSON inputs and uses pandas, scipy, and statsmodels for the analysis.
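As a rough illustration of the input handling described above, the sketch below dispatches on file extension to the matching pandas reader. The `load_dataset` helper and the `_READERS` table are hypothetical names introduced for this example and are not part of the skill itself. Reading Excel and Parquet files also assumes an optional engine such as openpyxl or pyarrow is installed.

```python
from pathlib import Path

import pandas as pd

# Hypothetical lookup table (not defined by the skill): file extension -> pandas reader.
_READERS = {
    ".csv": pd.read_csv,
    ".xlsx": pd.read_excel,      # assumes an Excel engine such as openpyxl is installed
    ".xls": pd.read_excel,
    ".parquet": pd.read_parquet,  # assumes pyarrow or fastparquet is installed
    ".json": pd.read_json,
}


def load_dataset(path):
    """Load a CSV, Excel, Parquet, or JSON file into a DataFrame based on its extension."""
    suffix = Path(path).suffix.lower()
    try:
        reader = _READERS[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file type: {suffix!r}") from None
    return reader(path)
```

Each reader is called with its defaults here; in practice, options such as `encoding`, `sheet_name`, or `orient` may need to be passed through for a given file.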

Workflow

  1. Load and profile the data. Read the dataset into a pandas DataFrame and inspect its shape, column types, and memory usage. Display the first and last rows to confirm the data loaded correctly. Check for obvious structural issues such as shifted columns or encoding problems.

  2. Compute descriptive statistics. Generate summary statistics for all numeric columns, including mean, median, standard deviation, skewness, and kurtosis. For categorical columns, compute value counts and the mode. This step establishes a baseline understanding of each variable's distribution.

  3. Identify trends and patterns. Apply rolling averages, percentage changes, and seasonal decomposition to time-indexed data. For non-temporal data, use group-by aggregations and pivot tables to surface patterns across categories. Flag any monotonic trends or cyclical behavior.

  4. Perform correlation and hypothesis testing. Calculate Pearson and Spearman correlation matrices to quantify relationships between variables. Conduct hypothesis tests (t-tests, chi-square, ANOVA) where appropriate to determine statistical significance. Report p-values and confidence intervals alongside effect sizes.

  5. Detect anomalies and outliers. Use the IQR method and z-scores to identify data points that lie far from the bulk of the distribution (conventionally, more than 1.5 × IQR beyond the quartiles, or an absolute z-score above about 3). Cross-reference outliers with domain context to determine whether they represent errors, rare events, or meaningful signals.

  6. Synthesize findings into a report. Summarize the key insights in plain language, supported by specific numbers. Rank findings by business impact or statistical significance. Include limitations and caveats such as sample size constraints or confounding variables.
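The numeric parts of steps 1 through 5 can be sketched with pandas and scipy roughly as follows. This is a minimal illustration, not the skill's actual implementation: every function name, the default window, and the outlier thresholds are assumptions chosen for the example. Seasonal decomposition, chi-square tests, ANOVA, and confidence intervals are left out to keep the sketch short.

```python
import numpy as np
import pandas as pd
from scipy import stats

# All helper names below are hypothetical; they only illustrate the workflow steps.


def profile(df):
    """Step 1: shape, column types, and memory usage."""
    return {
        "shape": df.shape,
        "dtypes": df.dtypes.astype(str).to_dict(),
        "memory_bytes": int(df.memory_usage(deep=True).sum()),
    }


def describe_numeric(df):
    """Step 2: one row per numeric column with mean, median, std, skew, kurt."""
    numeric = df.select_dtypes(include="number")
    return numeric.agg(["mean", "median", "std", "skew", "kurt"]).T


def trend_features(series, window=3):
    """Step 3: rolling average and period-over-period percentage change."""
    return pd.DataFrame({
        "rolling_mean": series.rolling(window).mean(),
        "pct_change": series.pct_change(),
    })


def correlations(df):
    """Step 4: Pearson and Spearman correlation matrices for numeric columns."""
    numeric = df.select_dtypes(include="number")
    return numeric.corr(method="pearson"), numeric.corr(method="spearman")


def welch_t_test(a, b):
    """Step 4: Welch's two-sample t-test plus Cohen's d (pooled SD) as effect size."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    result = stats.ttest_ind(a, b, equal_var=False)
    pooled_var = ((len(a) - 1) * a.var(ddof=1) + (len(b) - 1) * b.var(ddof=1)) / (
        len(a) + len(b) - 2
    )
    cohens_d = (a.mean() - b.mean()) / np.sqrt(pooled_var)
    return {
        "t": float(result.statistic),
        "p_value": float(result.pvalue),
        "cohens_d": float(cohens_d),
    }


def iqr_outliers(series, k=1.5):
    """Step 5: boolean mask of points more than k * IQR outside the quartiles."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)


def zscore_outliers(series, threshold=3.0):
    """Step 5: boolean mask of points whose absolute z-score exceeds the threshold."""
    z = (series - series.mean()) / series.std(ddof=0)
    return z.abs() > threshold
```

The two outlier functions return masks rather than dropping rows, so flagged points stay available for the domain review that step 5 calls for. Note that in very small samples no point can reach an absolute z-score of 3, so the IQR rule or a lower threshold may be the more useful check there.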

First Seen
Mar 19, 2026