tabular-eda
Tabular EDA — Done Right
Whenever you are handed a new tabular dataset, stop. Do not jump
straight to XGBClassifier(). Ten minutes of EDA will catch problems
that would otherwise wreck your downstream model: target leakage,
high-cardinality columns that blow up the feature space, values that
are missing at random (MAR), and non-linear features that Pearson
correlation scores as useless. This skill is the workflow.
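As a rough illustration of that first pass, here is a minimal sketch using only pandas and NumPy. The function name `quick_profile`, the synthetic columns, and the choice of a binned correlation ratio (eta squared) as the non-linear check are assumptions made for this example, not part of the skill itself. It reports missingness and cardinality per column, then compares Pearson's r with the share of target variance explained by quantile-bin means.

```python
import numpy as np
import pandas as pd


def quick_profile(df: pd.DataFrame, target: str, bins: int = 10) -> pd.DataFrame:
    """Per-column missingness, cardinality, and linear vs. binned association."""
    y = df[target].astype(float)  # assumes a numeric target with no missing values
    rows = []
    for col in df.columns.drop(target):
        s = df[col]
        row = {
            "column": col,
            "dtype": str(s.dtype),
            "missing_frac": s.isna().mean(),
            "n_unique": s.nunique(dropna=True),
            "pearson_r": np.nan,
            "eta_sq": np.nan,
        }
        if pd.api.types.is_numeric_dtype(s) and s.nunique() > 1:
            x = s.astype(float)
            row["pearson_r"] = x.corr(y)  # NaN pairs are dropped by Series.corr
            # Group by raw values for low-cardinality columns, quantile bins otherwise.
            keys = x if x.nunique() <= bins else pd.qcut(x, q=bins, duplicates="drop")
            grp = y.groupby(keys, observed=True)  # rows with a missing key are skipped
            y_obs = y[x.notna()]
            between = (grp.count() * (grp.mean() - y_obs.mean()) ** 2).sum()
            total = ((y_obs - y_obs.mean()) ** 2).sum()
            # Correlation ratio: fraction of target variance explained by bin means.
            row["eta_sq"] = between / total if total > 0 else np.nan
        rows.append(row)
    return pd.DataFrame(rows).set_index("column")


# Synthetic demo data (hypothetical columns, for illustration only).
rng = np.random.default_rng(0)
n = 2000
x_linear = rng.normal(size=n)
x_quadratic = rng.uniform(-3, 3, size=n)
demo = pd.DataFrame(
    {
        "x_linear": x_linear,
        "x_quadratic": x_quadratic,
        "user_id": [f"u{i}" for i in range(n)],  # one distinct value per row
        "y": x_linear + x_quadratic**2 + rng.normal(scale=0.1, size=n),
    }
)
demo.loc[:399, "x_linear"] = np.nan  # blank out 400 of 2000 rows

profile = quick_profile(demo, target="y")
print(profile)
```

On this synthetic data, `x_quadratic` should show a Pearson r near zero alongside a large `eta_sq`, which is the pattern that suggests a feature is informative but not linearly so. The bin count of 10 is an arbitrary default; coarser or finer bins trade noise against resolution.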
When to use this skill
- You just received a new dataset and have no idea what's in it
- You're about to train a model and want to validate the data first
- A model is performing suspiciously well (or poorly) and you suspect a data quality issue
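For the "suspiciously well" case, one quick screen is to score each numeric feature on its own against a binary target and look for any that nearly separate the classes. The sketch below is one possible approach, not the skill's prescribed method: `single_feature_auc`, the column names, and the 0.95 threshold are all hypothetical. It computes AUC from ranks (the Mann-Whitney U statistic), so it needs only pandas and NumPy.

```python
import numpy as np
import pandas as pd


def single_feature_auc(x: pd.Series, y: pd.Series) -> float:
    """Rank-based AUC of one numeric feature against a 0/1 target."""
    mask = x.notna()
    x, y = x[mask], y[mask].astype(int)
    n_pos, n_neg = int((y == 1).sum()), int((y == 0).sum())
    if n_pos == 0 or n_neg == 0:
        return float("nan")
    ranks = x.rank(method="average")  # ties share their average rank
    u = ranks[y == 1].sum() - n_pos * (n_pos + 1) / 2  # Mann-Whitney U
    auc = u / (n_pos * n_neg)
    return float(max(auc, 1 - auc))  # direction-agnostic


# Synthetic demo: one plausible feature, one that is nearly a copy of the target.
rng = np.random.default_rng(1)
n = 1000
y = pd.Series(rng.integers(0, 2, size=n), name="target")
features = pd.DataFrame(
    {
        "honest_feature": y + rng.normal(scale=2.0, size=n),
        "leaky_feature": y + rng.normal(scale=0.01, size=n),
    }
)

aucs = {col: single_feature_auc(features[col], y) for col in features.columns}
THRESHOLD = 0.95  # arbitrary cutoff chosen for this example
suspects = [col for col, auc in aucs.items() if auc > THRESHOLD]
print(aucs)
print("possible leakage:", suspects)
```

A high single-feature AUC is a prompt to ask how and when that column was populated, not proof of leakage; some legitimate features really are that strong.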