clickhouse-incident-runbook

ClickHouse Incident Runbook

Overview

Step-by-step procedures for triaging and resolving ClickHouse incidents using built-in system tables and SQL commands.

Severity Levels

| Level | Definition | Response time | Examples |
|-------|------------|---------------|----------|
| P1 | ClickHouse unreachable / all queries failing | < 15 min | Server down, OOM, disk full |
| P2 | Degraded performance / partial failures | < 1 hour | Slow queries, merge backlog |
| P3 | Minor impact / non-critical errors | < 4 hours | Single table issue, warnings |
| P4 | No user impact | Next business day | Monitoring gaps, optimization |

Quick Triage (Run First)
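
A minimal sketch of a first-pass triage, assuming a reasonably recent ClickHouse server and a user with read access to the `system` database. The choice of checks, the `LIMIT 10` cutoffs, and the column aliases are illustrative assumptions, not prescribed by this runbook. Each query maps to one of the example symptoms in the severity table (server down, slow queries, disk full, merge backlog, single-table issues, errors).

```sql
-- 1. Liveness: if this fails or times out, treat the incident as P1.
SELECT version() AS version, uptime() AS uptime_seconds;

-- 2. Slow queries: longest-running queries currently executing.
SELECT
    query_id,
    user,
    round(elapsed, 1) AS elapsed_s,
    formatReadableSize(memory_usage) AS memory,
    substring(query, 1, 120) AS query_head
FROM system.processes
ORDER BY elapsed DESC
LIMIT 10;

-- 3. Disk full: free space per configured disk.
SELECT
    name,
    path,
    formatReadableSize(free_space) AS free,
    formatReadableSize(total_space) AS total,
    round(100 * free_space / total_space, 1) AS free_pct
FROM system.disks;

-- 4. Merge backlog: merges in flight, oldest first.
SELECT
    database,
    table,
    round(elapsed) AS elapsed_s,
    round(progress * 100, 1) AS progress_pct,
    num_parts
FROM system.merges
ORDER BY elapsed DESC
LIMIT 10;

-- 5. Single-table issues: tables with the most active parts.
SELECT database, table, count() AS active_parts
FROM system.parts
WHERE active
GROUP BY database, table
ORDER BY active_parts DESC
LIMIT 10;

-- 6. Errors: most recently raised server error codes.
SELECT name, code, value AS error_count, last_error_time
FROM system.errors
ORDER BY last_error_time DESC
LIMIT 10;
```

If query 1 does not return, the remaining queries cannot run either, so fall back to host-level checks (process status, memory, disk) before continuing.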

First Seen
Mar 30, 2026