vllm-studio-backend

Installation
SKILL.md

vLLM Studio Backend Architecture

Overview

This skill explains how the backend is wired: controller runtime, OpenAI-compatible proxy, Pi-mono agent loop, LiteLLM gateway, and inference process management.

When To Use

  • Modifying controller routes or run streaming.
  • Debugging OpenAI-compatible endpoint behavior.
  • Updating Pi-mono agent runtime or tool execution.
  • Understanding how inference + LiteLLM fit together.

Quick Start

  • Read references/backend-architecture.md for the component map and data flow.
  • Read references/openai-compat.md for /v1/models and /v1/chat/completions behavior.
  • Read references/backend-commands.md for useful run/debug commands.
Installs
8
GitHub Stars
388
First Seen
Feb 10, 2026
vllm-studio-backend — 0xsero/vllm-studio