hugging-face-datasets

Installation

Summary

Create, query, and manage datasets on Hugging Face Hub with SQL-based transformation and streaming updates.

Initialize new dataset repositories with template-based schemas (chat, classification, QA, completion, tabular) and custom system prompts
Query any Hugging Face dataset using DuckDB SQL via the hf:// protocol, including filtering, aggregations, joins, and regex operations
Stream rows efficiently without downloading entire datasets, with JSON validation and batch processing for large uploads
Export query results locally (Parquet, JSONL) or push transformed subsets directly to new Hub repositories with optional privacy settings
Designed to complement the HF MCP server: use MCP for discovery and metadata, use this skill for creation, editing, and data transformation

SKILL.md

Overview

This skill provides tools to manage datasets on the Hugging Face Hub with a focus on creation, configuration, content management, and SQL-based data manipulation. It is designed to complement the existing Hugging Face MCP server by providing dataset editing and querying capabilities.

Integration with HF MCP Server

Use HF MCP Server for: Dataset discovery, search, and metadata retrieval
Use This Skill for: Dataset creation, content editing, SQL queries, data transformation, and structured data formatting

Version

2.1.0

Dependencies

This skill uses PEP 723 scripts with inline dependency management

Scripts auto-install requirements when run with: uv run scripts/script_name.py

uv (Python package manager)
Getting Started: See "Usage Instructions" below for PEP 723 usage

Core Capabilities

Related skills

More from huggingface/skills

Installs

368

Repository

huggingface/skills

GitHub Stars

10.5K

First Seen

Jan 20, 2026

Security Audits

Gen Agent Trust HubPass

SocketPass

SnykWarn

hugging-face-datasets

Overview

Integration with HF MCP Server

Version

Dependencies

This skill uses PEP 723 scripts with inline dependency management

Scripts auto-install requirements when run with: uv run scripts/script_name.py

Core Capabilities

More from huggingface/skills

hf-cli

huggingface-gradio

transformers-js

huggingface-datasets

hugging-face-model-trainer

huggingface-llm-trainer