databricks-model-serving
Installation
SKILL.md
Model Serving Endpoints
FIRST: Use the parent databricks-core skill for CLI basics, authentication, and profile selection.
Model Serving provides managed endpoints for serving LLMs, custom ML models, and external models as scalable REST APIs. Endpoints are identified by name (unique per workspace).
Endpoint Types
| Type | When to Use | Key Detail |
|---|---|---|
| Pay-per-token | Foundation Model APIs (Llama, DBRX, etc.) | Uses system.ai.* catalog models, simplest setup |
| Provisioned throughput | Dedicated GPU capacity | Guaranteed throughput, higher cost |
| Custom model | Your own MLflow models or containers | Deploy any model with an MLflow signature |