managing-database-sharding
Database Sharding Manager
Overview
Implement and manage horizontal database sharding strategies across PostgreSQL, MySQL, and MongoDB. This skill covers shard key selection, data distribution analysis, cross-shard query routing, and rebalancing operations for databases that have outgrown single-node capacity.
Prerequisites
- Database admin credentials with CREATE DATABASE, CREATE TABLE, and replication permissions
psql,mysql, ormongoshCLI tools installed and configured- Network connectivity between all shard nodes
- Current table sizes and growth rate data (query
pg_total_relation_sizeorinformation_schema.TABLES) - Application query patterns documented or access to slow query logs
- Enough disk and memory on target shard nodes to handle redistributed data
Instructions
-
Analyze the current database size and identify tables exceeding single-node capacity thresholds (typically >500GB or >1B rows). Run
SELECT pg_size_pretty(pg_total_relation_size('table_name'))for PostgreSQL orSELECT data_length + index_length FROM information_schema.TABLESfor MySQL. -
Evaluate candidate shard keys by examining query WHERE clauses, JOIN patterns, and data distribution. A good shard key has high cardinality, even distribution, and appears in most queries. Run
SELECT shard_key_column, COUNT(*) FROM table GROUP BY shard_key_column ORDER BY COUNT(*) DESC LIMIT 20to check distribution.