Data Anomaly Detection

A skill for identifying anomalies, outliers, and suspicious patterns in research datasets. Combines classical statistical methods with modern machine learning approaches to flag data points that deviate significantly from expected distributions, helping researchers maintain data integrity and uncover genuine scientific findings.

Overview

Anomalous data points in research datasets can arise from measurement errors, instrument malfunction, data entry mistakes, or genuine rare phenomena. Distinguishing between these sources is critical: blindly removing outliers can bias results, while ignoring measurement errors introduces noise. This skill provides a structured framework for detecting, classifying, and handling anomalies in univariate, multivariate, and time-series research data.

The approach follows a three-stage pipeline: detection (flagging candidate anomalies), diagnosis (determining likely cause), and decision (remove, transform, or retain with justification). Every decision is logged for reproducibility and transparent reporting.
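The three stages and the decision log can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the `AnomalyDecision` record, the `run_pipeline` function, and the three toy stage functions are all hypothetical names introduced here, and the thresholds in the toy detector are arbitrary.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable, List, Sequence, Tuple

# The three outcomes named in the decision stage.
ALLOWED_ACTIONS = ("remove", "transform", "retain")


@dataclass
class AnomalyDecision:
    """One logged decision; the field names are illustrative assumptions."""
    index: int
    value: float
    likely_cause: str
    action: str
    justification: str


def run_pipeline(
    data: Sequence[float],
    detect: Callable[[Sequence[float]], List[int]],
    diagnose: Callable[[Sequence[float], int], str],
    decide: Callable[[str], Tuple[str, str]],
) -> List[AnomalyDecision]:
    """Run detection, diagnosis, and decision, returning a decision log."""
    log = []
    for i in detect(data):                      # stage 1: flag candidates
        cause = diagnose(data, i)               # stage 2: likely cause
        action, justification = decide(cause)   # stage 3: what to do
        if action not in ALLOWED_ACTIONS:
            raise ValueError(f"unknown action: {action!r}")
        log.append(AnomalyDecision(i, float(data[i]), cause, action, justification))
    return log


# Toy stand-ins for the three stages (assumptions, for illustration only).
def toy_detect(data):
    # Flag values outside an arbitrary plausible range.
    return [i for i, x in enumerate(data) if x < 0 or x > 50]


def toy_diagnose(data, i):
    return "data entry mistake" if data[i] < 0 else "possible rare phenomenon"


def toy_decide(cause):
    if cause == "data entry mistake":
        return "remove", "negative value is impossible for this quantity"
    return "retain", "plausible extreme value; keep and report"


readings = [12.1, 11.8, -4.0, 12.4, 97.3]
decisions = run_pipeline(readings, toy_detect, toy_diagnose, toy_decide)

# Serialize the log so every decision can be reported alongside the results.
print(json.dumps([asdict(d) for d in decisions], indent=2))
```

Keeping the stages as separate callables means a detector can be swapped without touching the diagnosis or decision logic, and the log records the justification for each action next to the data point it applies to.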

Statistical Detection Methods

Univariate Outlier Detection

import numpy as np
from scipy import stats

def detect_univariate_outliers(data: np.ndarray, method: str = 'iqr') -> dict:
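A minimal sketch of how a function with this signature might be implemented is given below. It is an assumption, not the skill's own code: only the `'iqr'` default comes from the signature above, while the `'zscore'` option, the `threshold` parameter, the `_sketch` suffix, and the keys of the returned dictionary are hypothetical.

```python
from typing import Optional

import numpy as np
from scipy import stats


def detect_univariate_outliers_sketch(
    data: np.ndarray, method: str = "iqr", threshold: Optional[float] = None
) -> dict:
    """Flag univariate outliers (hypothetical sketch, not the skill's code)."""
    data = np.asarray(data, dtype=float)
    if method == "iqr":
        # Tukey's fences: points beyond k * IQR from the quartiles.
        k = 1.5 if threshold is None else threshold
        q1, q3 = np.percentile(data, [25, 75])
        iqr = q3 - q1
        lower, upper = q1 - k * iqr, q3 + k * iqr
        mask = (data < lower) | (data > upper)
    elif method == "zscore":
        # Standard scores using the population standard deviation (ddof=0).
        k = 3.0 if threshold is None else threshold
        mask = np.abs(stats.zscore(data)) > k
        lower = upper = None
    else:
        raise ValueError(f"unsupported method: {method!r}")
    return {
        "method": method,
        "outlier_mask": mask,
        "outlier_indices": np.flatnonzero(mask),
        "bounds": (lower, upper),
    }


sample = np.array([10, 11, 12, 11, 10, 12, 11, 100.0])
iqr_result = detect_univariate_outliers_sketch(sample)
z_result = detect_univariate_outliers_sketch(sample, method="zscore", threshold=2.0)
z_default = detect_univariate_outliers_sketch(sample, method="zscore")
print(iqr_result["outlier_indices"], z_result["outlier_indices"])
```

In this small sample the IQR rule flags the value 100, while the z-score rule with the conventional cutoff of 3 flags nothing: with eight points, no z-score computed with the population standard deviation can exceed √7 ≈ 2.65. This is one reason quartile-based rules are often preferred for small samples.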
First seen: Mar 31, 2026