gke-inference

GKE AI/ML Inference

This reference covers deploying AI/ML inference workloads on GKE with GKE Inference Quickstart (GIQ), along with best practices for LLM serving.

MCP Tools: apply_k8s_manifest, get_k8s_resource, get_k8s_logs, get_k8s_rollout_status, describe_k8s_resource, list_k8s_events

CLI-only: gcloud container ai profiles *

When to Use

  • Deploy an AI model (Llama, Gemma, Mistral, etc.) to GKE
  • Generate optimized Kubernetes manifests for inference
  • Select GPU/TPU accelerators for model serving
  • Configure autoscaling for LLM inference
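
To make the use cases above concrete, here is a minimal Python sketch of the kind of manifest pair they lead to: a Deployment that requests GPUs and a HorizontalPodAutoscaler that scales it. This is not what GIQ generates. The helper build_inference_manifests, the image, the model ID, the port, and the CPU-based scaling target are all illustrative assumptions; only the Kubernetes field names, the nvidia.com/gpu resource, and the cloud.google.com/gke-accelerator node label are standard.

```python
import json


def build_inference_manifests(name, image, model_id, gpu_type,
                              gpu_count=1, min_replicas=1, max_replicas=4):
    """Return (deployment, hpa) dicts for a single-model inference server.

    Hypothetical helper for illustration; all argument values are chosen
    by the caller and are not taken from GIQ output.
    """
    labels = {"app": name}
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": min_replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Schedule onto GKE nodes that carry the chosen accelerator.
                    "nodeSelector": {"cloud.google.com/gke-accelerator": gpu_type},
                    "containers": [{
                        "name": "server",
                        "image": image,
                        # Assumed CLI shape of the model server; adjust as needed.
                        "args": ["--model", model_id],
                        "ports": [{"containerPort": 8000}],  # assumed port
                        "resources": {"limits": {"nvidia.com/gpu": str(gpu_count)}},
                    }],
                },
            },
        },
    }
    hpa = {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            "scaleTargetRef": {"apiVersion": "apps/v1", "kind": "Deployment", "name": name},
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            # Placeholder metric: CPU utilization is used only because it needs
            # no extra setup; a serving-specific signal may be a better fit.
            "metrics": [{
                "type": "Resource",
                "resource": {
                    "name": "cpu",
                    "target": {"type": "Utilization", "averageUtilization": 70},
                },
            }],
        },
    }
    return deployment, hpa


if __name__ == "__main__":
    # Example values are assumptions, not recommendations.
    dep, hpa = build_inference_manifests(
        "llm-server", "example.com/model-server:latest", "example-org/example-model", "nvidia-l4"
    )
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": [dep, hpa]}, indent=2))
```

The printed JSON is a Kubernetes List object, so it could be saved to a file and applied with kubectl or passed to a manifest-applying tool. Treat it as a starting shape to compare against a generated manifest rather than a tuned configuration.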

Prerequisites

Repository: google/skills