root-cause-analysis
Root Cause Analysis with Kopai
A guide to debugging production issues using telemetry data (traces, logs, metrics) via the Kopai CLI.
Prerequisites
Ensure you have access to the Kopai app backend and that your services are set up to send their OpenTelemetry data to Kopai. See the otel-instrumentation skill for setup.
RCA Workflow
- Find error traces:
  npx @kopai/cli traces search --status-code ERROR --limit 20 --json
  If empty: broaden the time range, check the service name, or search logs with --severity-min 17.
- Get full trace context:
  npx @kopai/cli traces get <traceId> --json
  Check Duration, StatusCode, and the span hierarchy for bottlenecks.
- Correlate logs:
  npx @kopai/cli logs search --trace-id <traceId> --json
  Look for error messages, stack traces, and timestamps.
- Check metrics: run
  npx @kopai/cli metrics discover --json
  then, to look for anomalies,
  npx @kopai/cli metrics search --type <type> --name <name> --json
- Present findings: summarize the root cause with evidence (specific traceIds, log entries, metric anomalies), impact, and a suggested fix.
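The fallback in the first step (search logs when no error traces are found) can be sketched as a small shell helper. This is a minimal sketch, not part of the Kopai CLI: kopai and find_errors are hypothetical function names, and treating no output or an empty JSON array as "no results" is an assumption about the CLI's JSON output. Only the subcommands and flags listed in the workflow are used.

```shell
# Hypothetical wrapper so the launcher can be swapped out in one place.
kopai() { npx @kopai/cli "$@"; }

# Step 1 with its fallback: look for error traces first, and if the
# search comes back empty, search logs at severity 17 and above instead.
find_errors() {
  out=$(kopai traces search --status-code ERROR --limit 20 --json)
  # Assumption: an empty result is either no output or an empty JSON array.
  case "$(printf '%s' "$out" | tr -d '[:space:]')" in
    '' | '[]')
      kopai logs search --severity-min 17 --json
      ;;
    *)
      printf '%s\n' "$out"
      ;;
  esac
}
```

Calling find_errors prints the trace search results when there are any, and the log search results otherwise. Broadening the time range and checking the service name are left as manual steps.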
Quick Example
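As an illustration, the two per-trace steps of the workflow (get the full trace, then correlate its logs) can be chained for a single trace ID. This is a sketch under stated assumptions: kopai and inspect_trace are hypothetical helper names, not Kopai commands, and the trace ID is expected to come from an earlier traces search.

```shell
# Hypothetical wrapper around the CLI invocation used in this guide.
kopai() { npx @kopai/cli "$@"; }

# Fetch the full trace, then the logs correlated with it.
inspect_trace() {
  trace_id=$1
  if [ -z "$trace_id" ]; then
    echo "usage: inspect_trace <traceId>" >&2
    return 2
  fi
  echo "== trace $trace_id =="
  kopai traces get "$trace_id" --json
  echo "== logs for $trace_id =="
  kopai logs search --trace-id "$trace_id" --json
}
```

Run it as inspect_trace <traceId>, then review Duration, StatusCode, and the span hierarchy in the first section and the error messages and timestamps in the second.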