MiniMax M-series Production Wiring

OpenAI-compatible chat-completion endpoint at https://api.minimax.io/v1 with the M2.7-highspeed reasoning model (premium tier as of 2026-04-29). It looks like OpenAI but silently drops 6 OpenAI parameters and exposes <think> reasoning traces inside content. Beyond chat-completion, the API is MiniMax-native: different URLs, different body shapes, and different error envelopes (HTTP 200 + base_resp.status_code instead of HTTP 4xx).
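A minimal sketch of the two defensive checks this behaviour calls for: stripping <think> traces from content, and treating a non-zero base_resp.status_code as an error even when the HTTP status is 200. The helper names (`strip_think`, `check_base_resp`, `MiniMaxAPIError`) are hypothetical, and both the `status_msg` field and the "0 means success" convention are assumptions to verify against real responses.

```python
import re

# Matches one complete <think>...</think> block, newlines included.
_THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)


def strip_think(content: str) -> str:
    """Remove <think> reasoning traces from a chat-completion content string."""
    cleaned = _THINK_RE.sub("", content)
    # An unterminated trace (for example, output cut off mid-reasoning) is
    # dropped from its opening tag onward; only the text before it is kept.
    head, _, _ = cleaned.partition("<think>")
    return head.strip()


class MiniMaxAPIError(RuntimeError):
    """Raised when base_resp reports a failure inside an HTTP 200 response."""

    def __init__(self, status_code: int, status_msg: str) -> None:
        super().__init__(f"MiniMax error {status_code}: {status_msg}")
        self.status_code = status_code
        self.status_msg = status_msg


def check_base_resp(body: dict) -> dict:
    """Return body unchanged, or raise if base_resp.status_code is non-zero.

    Assumption: status_code == 0 (or a missing base_resp) means success.
    """
    base = body.get("base_resp") or {}
    code = base.get("status_code", 0)
    if code != 0:
        raise MiniMaxAPIError(code, base.get("status_msg", ""))
    return body
```

Call `check_base_resp` on the parsed JSON body before reading any other field, and apply `strip_think` to the message content before showing or parsing it.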

The model itself is a competent qualitative judge, theory explainer, and tool orchestrator for finance/quant work, but it cannot do raw math on realistic data sizes (it saturates its reasoning budget) and it hallucinates plausible details under input uncertainty (6 documented instances). For production: pair it with Python for math, sandbox validators for code, and deterministic detectors for pattern recognition.
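One way to read "pair it with Python for math" is to compute every number deterministically and hand the model only the finished statistics to judge. The sketch below assumes that split; `summarize_returns` and `build_judge_prompt` are hypothetical names, and the choice of statistics is illustrative only.

```python
import statistics


def summarize_returns(returns: list[float]) -> dict[str, float]:
    """Compute summary statistics in Python, outside the model."""
    equity = 1.0
    peak = 1.0
    max_drawdown = 0.0
    for r in returns:
        equity *= 1.0 + r          # compound the period return
        peak = max(peak, equity)   # running high-water mark
        max_drawdown = max(max_drawdown, 1.0 - equity / peak)
    return {
        "n": len(returns),
        "mean": statistics.fmean(returns),
        "stdev": statistics.stdev(returns),  # needs at least two returns
        "max_drawdown": max_drawdown,
    }


def build_judge_prompt(stats: dict[str, float]) -> str:
    """Format precomputed statistics into a prompt for qualitative review."""
    lines = [f"{name}: {value:.6g}" for name, value in stats.items()]
    header = (
        "Assess the following precomputed return statistics qualitatively. "
        "Do not recompute them."
    )
    return header + "\n" + "\n".join(lines)
```

The model then receives a few short lines of numbers rather than the raw series, which keeps the arithmetic out of its reasoning budget.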

Self-Evolving Skill: This skill improves through use. If instructions are wrong, parameters have drifted, or a workaround was needed, fix this file immediately; don't defer. Only update for real, reproducible issues. Source-of-truth campaign archive: ~/own/amonic/minimax/ (read-only reference; do not modify from this skill).

🆕 MiniMax-M3 is live (2026-06-01). This file covers M2.7. For M3, use the sibling skill ../m3/SKILL.md and the evidence doc ../../references/M3-EMPIRICAL.md; they cover native vision, reasoning_split clean output, response_format acceptance, the ~1M input ceiling (reliable retrieval ≤ ~256K) and 524K output cap, n=1, and the docs-vs-reality discrepancies. The defensive snippets below (<think> strip, base_resp retry, cached-token reader) apply to M3 unchanged.

When to use M2.7 vs not — the decision table