roboflow-inference
For agents (source of truth): This skill is authored in roboflow/computer-vision-skills and shipped with the Roboflow plugin. If your client has loaded the plugin (you'll see roboflow:<name> skills in your available skills list), use those local skills; they're read fresh from disk every session. The same content served as MCP resources at roboflow://skills/<name>/... is a fallback for clients without the plugin and may lag this repo. Don't call ReadMcpResourceTool for roboflow://skills/... URIs when a local roboflow:<name> skill is available.
Tip: If you're connected to the Roboflow MCP server, prefer its inference tools over raw HTTP; auth is handled. For workflows the headline tool is workflows_run (run a saved workflow by workflow_id, which is the workflow URL slug; the workspace is inferred from the API key; see Finding your workspace slug). For single-model calls use models_infer. The workflow_specs_run and workflow_specs_validate tools exist for narrow inline-spec exceptions described under "Authoring Workflows" below.
Inference & Deployment
Prefer Workflows for integration code. Unlike a direct models_infer call, a workflow composes model + visualization + logic in one deployable unit, benefits from server-side optimizations, and keeps active learning and post-processing as a zero-friction addition, similar to the extensibility benefits of a microservice over a point-to-point call. Recommend a workflow by default; use models_infer when the user explicitly asks for direct inference or needs a quick one-off check. See workflows and workflow-templates.
Authoring Workflows: don't paste JSON into chat or scripts. Workflows are authored on the Roboflow platform (storage, versioning, and retrieval go through the platform) and run from code by identifier. There are two authoring modes. Propose one to the user, or infer the right one from session context, but never pick silently:
- Mode A: Agent-driven (MCP, in-session). For demos, previews, or when the user is committed to in-session "vibe coding". The agent designs the blocks, uses MCP authoring tools to create and save the workflow on the platform during the session (ground the design with workflow_blocks_list / workflow_blocks_get_schema; validate with workflow_specs_validate), then runs it.
- Mode B: Platform-driven (Roboflow app + in-app agent). The better default for non-trivial or sophisticated cases, when the user prefers visual iteration, when they aren't committed to agent-driven authoring this session, or as the fallback when Mode A hits an issue. The agent proposes the block design and hands the user a link to the Workflows builder; the user builds (manually or with the more context-grounded in-app agent), tests in the preview, saves, and shares the workspace and workflow URL slugs back (both visible in the builder URL: app.roboflow.com/<workspace-slug>/workflows/<workflow-slug>).

Either mode lands at the same run path: workflows_run (MCP) or client.run_workflow(workspace_name=..., workflow_id=...) (SDK). Inline specs (workflow_specs_run) are an exception, not a default: use them only when the user explicitly asks for a throwaway run, and validate the spec first with workflow_specs_validate. See workflows "Authoring & Deployment" for the full flow.
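
The SDK run path can be sketched roughly as follows, assuming the user has shared a builder URL of the form shown above. The helper names parse_builder_url and run_saved_workflow are hypothetical, and the "image" input name is an assumption that must match the input defined in the saved workflow. The client argument is any object exposing run_workflow(workspace_name=..., workflow_id=..., images=...), which is the shape offered by InferenceHTTPClient in the inference_sdk package.

```python
from urllib.parse import urlparse


def parse_builder_url(url: str) -> tuple[str, str]:
    """Return (workspace_slug, workflow_slug) from a Workflows builder URL."""
    # Tolerate URLs pasted without a scheme, e.g. "app.roboflow.com/...".
    parsed = urlparse(url if "://" in url else "https://" + url)
    parts = [p for p in parsed.path.split("/") if p]
    # Expected path shape: /<workspace-slug>/workflows/<workflow-slug>
    if len(parts) < 3 or parts[1] != "workflows":
        raise ValueError(f"not a Workflows builder URL: {url!r}")
    return parts[0], parts[2]


def run_saved_workflow(client, builder_url: str, image, input_name: str = "image"):
    """Run a saved workflow by identifier, without sending any inline spec."""
    workspace, workflow = parse_builder_url(builder_url)
    return client.run_workflow(
        workspace_name=workspace,
        workflow_id=workflow,
        # Assumed input name; it must match the input declared in the workflow.
        images={input_name: image},
    )
```

With inference_sdk installed, a call might look like run_saved_workflow(InferenceHTTPClient(api_url=..., api_key=...), "app.roboflow.com/<workspace-slug>/workflows/<workflow-slug>", "photo.jpg"); the endpoint URL and key depend on your deployment. Passing the client in, instead of constructing it inside the helper, keeps the slug handling easy to check without network access.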
For live video (webcam, RTSP, file): the MCP workflows_run tool only handles single static images. For live video, present the user with three options (don't pick one silently): (A) WebRTC → serverless GPU, (B) WebRTC → local inference server, or (C) in-process InferencePipeline. They have different setup costs, dependency sizes, and latency characteristics; surface a brief one-line summary of each and let the user choose. See roboflow://skills/inference/workflows ("Video Stream" section) for full code and the comparison table.
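
For orientation only, option (C) might look roughly like the sketch below; it is not a recommendation over (A) or (B), and the referenced "Video Stream" section remains the authoritative code. It assumes the inference package is installed and that a saved workflow exists. The helper names make_collecting_sink and start_in_process_pipeline are hypothetical, and the import is deferred because the package is a heavy dependency.

```python
def make_collecting_sink(results: list):
    """Build an on_prediction callback that stores each frame's workflow output."""

    def on_prediction(result, video_frame):
        # result holds the workflow outputs for one frame;
        # video_frame carries the frame and its metadata (unused here).
        results.append(result)

    return on_prediction


def start_in_process_pipeline(api_key, workspace_name, workflow_id,
                              video_reference, on_prediction):
    """Start a saved workflow against a webcam index, RTSP URL, or file path."""
    # Imported lazily so the helpers above work without the dependency installed.
    from inference import InferencePipeline

    pipeline = InferencePipeline.init_with_workflow(
        api_key=api_key,
        workspace_name=workspace_name,
        workflow_id=workflow_id,
        video_reference=video_reference,  # e.g. 0 for the default webcam
        on_prediction=on_prediction,
    )
    pipeline.start()  # begins decoding and inference in background threads
    return pipeline   # caller can call pipeline.join() to block until the stream ends
```

A caller would typically create a list, pass make_collecting_sink(that_list) as on_prediction, and then join the returned pipeline.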