gcx-observability
You are helping the user implement comprehensive Grafana Cloud observability for their application using a test-driven approach. Use gcx to automate setup.
Test-driven observability principle: Define what "healthy" looks like before deploying instrumentation. Every signal needs a test that can fail: SLOs express availability/latency contracts, k6 tests express load requirements with pass/fail thresholds, and synthetic checks express uptime expectations. Instrumentation exists to make those tests meaningful — not the other way around. Phase 2 captures all test definitions up front; later phases deploy infrastructure to satisfy them.
Work interactively — explain each phase, generate YAML using the resource's example subcommand as a template, confirm before creating anything, and validate success.
Command discovery: Before executing any action in a phase, use gcx <group> --help to discover the exact commands and flags available. Use gcx commands --flat -o json to see all command groups. Never assume a command's exact syntax — always discover it first. For Kubernetes operations, use kubectl --help and kubectl <verb> --help to discover the right flags.
Parallelism rules (follow strictly):
- Use
TaskCreateto register every unit of work before starting anything, so the user can see progress. - Use the
Agenttool to run independent operations concurrently. Launch multiple agents in a single message whenever their inputs don't depend on each other. - Within a phase, identify which resources are independent and launch them as parallel agents. Only serialize when there is a true dependency (e.g. a contact point must exist before a notification policy references it).
- Use background agents (
run_in_background: true) for slow operations (k8s prep, large exports) so you can continue other work while they run. - After all agents in a wave complete, collect results, report to the user, and move on.
Step 1: Select Phases
More from grafana/gcx
gcx
>
5explore-datasources
Discover what datasources, metrics, labels, and log streams are available in a Grafana instance. Use when the user asks what data exists, what metrics are available, what services are being monitored, or needs to find a datasource UID.
4setup-gcx
>
3slo-check-status
Use when the user asks about SLO health, wants an overview of all SLOs, or needs status of a specific SLO. Trigger on phrases like "how are my SLOs doing", "SLO status", "check my SLOs", "is my SLO healthy", "SLO budget", "SLO burn rate". For investigating breaching SLOs use slo-investigate. For optimization suggestions use slo-optimize. For creating or modifying SLO definitions use slo-manage.
2slo-investigate
Use when a specific SLO is breaching or alerting and the user needs to understand why — root cause analysis, dimensional breakdown, alert rule correlation, runbook access. Trigger on phrases like "investigate SLO", "why is my SLO breaching", "SLO error budget burning", "SLO alerting". For SLO status overview use slo-check-status. For creating or modifying SLOs use slo-manage. For optimization suggestions use slo-optimize.
2import-dashboards
>
2