gke-inference-quickstart
GKE Inference Quickstart (GIQ)
Purpose
This skill guides the deployment of AI/ML inference workloads on GKE using GIQ. It uses the `gcloud container ai profiles manifests create` command to generate optimized Kubernetes manifests based on Google's best practices and benchmarks.
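As a rough illustration of the command named above, the sketch below only assembles and prints an invocation so it can be reviewed before anything runs. The helper name `giq_manifest_cmd`, the flag names (`--model`, `--model-server`, `--accelerator-type`), and the example values are assumptions, not taken from this skill; confirm them with `gcloud container ai profiles manifests create --help`.

```shell
# Hypothetical helper: prints a GIQ manifest-generation command for review.
# Flag names are assumptions; verify them against the command's --help output.
giq_manifest_cmd() {
  # $1 = model, $2 = model server, $3 = accelerator type
  printf 'gcloud container ai profiles manifests create --model=%s --model-server=%s --accelerator-type=%s\n' \
    "$1" "$2" "$3"
}

# Example values are placeholders, not recommendations.
giq_manifest_cmd "google/gemma-2-9b-it" "vllm" "nvidia-l4"
```

Once the printed command looks right, one option is to run it with its output redirected to a file and inspect the generated manifest before applying it to the cluster.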
When to Use
- Goal: Deploy an AI model (e.g., Llama, Gemma, Mistral) to GKE.
- Goal: Generate a Kubernetes manifest for inference.
- Context: User asks about "GIQ", "Inference Quickstart", or "AI benchmarks" on GKE.
Prerequisites
- A GKE cluster (preferably with GPU/TPU node pools, though GIQ can help identify requirements).
- `gcloud` CLI installed and authenticated (for discovery commands).
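The `gcloud` prerequisite can be checked up front. This is a minimal sketch, assuming a POSIX shell; the function name `check_prereqs` and its messages are hypothetical, and it does not verify the cluster or its node pools.

```shell
# Hypothetical pre-flight check for the gcloud prerequisite.
check_prereqs() {
  # Is the gcloud binary reachable on PATH?
  if ! command -v gcloud >/dev/null 2>&1; then
    echo "gcloud not found on PATH"
    return 1
  fi
  # Is there an active (authenticated) account?
  account="$(gcloud auth list --filter=status:ACTIVE --format='value(account)' 2>/dev/null)"
  if [ -z "$account" ]; then
    echo "no active gcloud account; run: gcloud auth login"
    return 1
  fi
  echo "gcloud ready as $account"
}

check_prereqs || echo "Resolve the issue above before running discovery commands."
```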