gke-inference-quickstart


GKE Inference Quickstart (GIQ)

Purpose

This skill guides the deployment of AI/ML inference workloads on GKE using GIQ. It runs the gcloud container ai profiles manifests create command to generate Kubernetes manifests optimized according to Google's best practices and benchmarks.
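As a minimal sketch of that command, the script below wraps it in a small function. The model, model server, and accelerator values are hypothetical placeholders, not recommendations from this skill, and the flag names (--model, --model-server, --accelerator-type) are assumed to match your installed gcloud version; check gcloud container ai profiles manifests create --help before relying on them. The GIQ_RUN switch, the giq_create_manifest helper, and the output file name are likewise illustrative.

```shell
#!/bin/sh
set -eu

# Hypothetical example values; replace them with a combination that GIQ
# reports as supported for your project.
MODEL="${MODEL:-google/gemma-2-9b-it}"
MODEL_SERVER="${MODEL_SERVER:-vllm}"
ACCELERATOR_TYPE="${ACCELERATOR_TYPE:-nvidia-l4}"
OUTPUT_FILE="${OUTPUT_FILE:-giq-manifest.yaml}"

# Illustrative helper: pass model, model server, and accelerator type as
# separate quoted arguments. Flag names are assumed; verify with --help.
giq_create_manifest() {
  gcloud container ai profiles manifests create \
    --model="$1" \
    --model-server="$2" \
    --accelerator-type="$3"
}

if [ "${GIQ_RUN:-0}" = "1" ]; then
  # Save the generated manifest to a file so it can be reviewed first.
  giq_create_manifest "$MODEL" "$MODEL_SERVER" "$ACCELERATOR_TYPE" > "$OUTPUT_FILE"
  echo "Wrote $OUTPUT_FILE; review it before running: kubectl apply -f $OUTPUT_FILE"
else
  # Default is a dry run that does not call gcloud.
  echo "Dry run. Set GIQ_RUN=1 to call gcloud with model=$MODEL server=$MODEL_SERVER accelerator=$ACCELERATOR_TYPE"
fi
```

Writing the manifest to a file rather than piping it straight into kubectl leaves room to inspect resource requests and node selectors before anything is applied to the cluster.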

When to Use

  • Goal: Deploy an AI model (e.g., Llama, Gemma, Mistral) to GKE.
  • Goal: Generate a Kubernetes manifest for inference.
  • Context: User asks about "GIQ", "Inference Quickstart", or "AI benchmarks" on GKE.

Prerequisites

  • A GKE cluster (preferably with GPU/TPU node pools, though GIQ can help identify requirements).
  • gcloud CLI installed and authenticated (for discovery commands).
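One way to confirm the gcloud prerequisite before starting is sketched below. The have_cmd and check_prereqs helpers are hypothetical names introduced for illustration; the sketch only checks that gcloud is on PATH and that an active account is configured, and it does not verify cluster access or GPU/TPU node pools.

```shell
#!/bin/sh

# Illustrative helper: succeed if the named command is available.
have_cmd() { command -v "$1" >/dev/null 2>&1; }

# Illustrative helper: report whether gcloud is installed and authenticated.
check_prereqs() {
  if ! have_cmd gcloud; then
    echo "gcloud CLI not found on PATH"
    return 1
  fi
  # Ask gcloud for the currently active account, if any.
  account="$(gcloud auth list --filter=status:ACTIVE --format='value(account)')"
  if [ -z "$account" ]; then
    echo "no active gcloud account; run: gcloud auth login"
    return 1
  fi
  echo "gcloud authenticated as $account"
}

check_prereqs || echo "prerequisites not met"
```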

Workflow

First Seen
May 7, 2026
gke-inference-quickstart — googlecloudplatform/gke-mcp