dataset-finder-guide

Installation
SKILL.md

Dataset Finder Guide

Search, evaluate, and download research datasets from major repositories including Kaggle, Hugging Face, Google Dataset Search, Zenodo, UCI Machine Learning Repository, and domain-specific archives. This skill helps researchers locate the right data for their experiments efficiently.

Overview

Finding suitable datasets is often one of the most time-consuming phases of empirical research. Datasets are scattered across dozens of platforms, each with different APIs, licensing terms, download mechanisms, and metadata standards. A single research project might require datasets from Kaggle for benchmarking, Hugging Face for NLP tasks, Zenodo for supplementary materials from published papers, and government open data portals for demographic or economic variables.

This skill provides a unified approach to dataset discovery: formulating search queries, evaluating dataset quality and suitability, understanding licensing implications, and efficiently downloading and organizing data. It covers both general-purpose repositories and domain-specific archives that researchers in various fields need.

The emphasis is on reproducibility -- every dataset used in research should be citable, versioned, and documented. This skill includes patterns for recording dataset provenance, creating data cards, and managing dataset versions across experiments.

Dataset Repositories

General-Purpose Repositories

Installs
3
GitHub Stars
227
First Seen
Mar 31, 2026
dataset-finder-guide — wentorai/research-plugins