gke-inference
GKE AI/ML Inference
This reference covers deploying AI/ML inference workloads on GKE with the GKE Inference Quickstart (GIQ), along with best practices for LLM serving.
MCP Tools:
`apply_k8s_manifest`, `get_k8s_resource`, `get_k8s_logs`, `get_k8s_rollout_status`, `describe_k8s_resource`, `list_k8s_events`. CLI-only: `gcloud container ai profiles *`
When to Use
- Deploy an AI model (Llama, Gemma, Mistral, etc.) to GKE
- Generate optimized Kubernetes manifests for inference
- Select GPU/TPU accelerators for model serving
- Configure autoscaling for LLM inference