Put Your AI Behind an API

Wrap a DSPy program in a web API so other services or a frontend can call it over HTTP. Defaults to FastAPI but adapts to the user's existing framework.

Step 1: Gather context

Ask the user:

What DSPy program are you serving? (classification, RAG, extraction, pipeline, etc.)
Is it optimized? (do you have an optimized.json from /ai-improving-accuracy?)
What endpoints do you need? (single query, batch, health check, etc.)
Do you have an existing web framework? (FastAPI, Flask, Django — default to FastAPI)

When NOT to serve via API

Internal script or notebook only — if only your team calls the AI from Python, skip the API layer. Import the module directly. An API adds latency, deployment complexity, and a failure surface for no benefit.
Batch-only workloads — if you process data on a schedule (nightly re-classification, weekly report generation), use a script or job runner (cron, Airflow). An HTTP API implies real-time request/response, which is overkill for batch.
Frontend can call the LM provider directly — if your app is a thin wrapper around a single LM call with no optimization or custom logic, the frontend can call the provider API directly (with a proxy for auth). You only need a DSPy API when you have optimized prompts, multi-step pipelines, or retrieval logic worth encapsulating.
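For the first two cases above, a plain script is usually enough. The sketch below shows the shape of a scheduled batch job that calls the program in-process instead of over HTTP. The `classify` function is a hypothetical stand-in for a direct call into your DSPy module, and the CSV layout is an assumption chosen only for illustration.

```python
import csv
import sys


def classify(text: str) -> str:
    """Hypothetical stand-in for calling the DSPy program directly."""
    return "long" if len(text) > 20 else "short"


def run_batch(rows, out) -> None:
    """Label every row and write the results as CSV to a file-like object."""
    writer = csv.writer(out)
    writer.writerow(["text", "label"])
    for text in rows:
        # Direct in-process call: no HTTP hop, no server to deploy.
        writer.writerow([text, classify(text)])


if __name__ == "__main__":
    # Example inputs; a real job would read these from a file or database.
    examples = [
        "reset my password",
        "the invoice total does not match my order",
    ]
    run_batch(examples, sys.stdout)
```

A job runner such as cron or Airflow can then invoke the script on its schedule, and the results land wherever `out` points.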
