This skill teaches Claude to build reliable, long-horizon supply chain agent systems using the SupChain-ReAct framework from the SupChain-Bench paper. The core technique replaces brittle, hand-authored Standard Operating Procedures (SOPs) with autonomous multi-path ReAct reasoning and majority-vote aggregation. Agents synthesize their own executable procedures for tool orchestration across complex supply chain workflows: order management, fulfillment tracking, warehouse operations, cancellation analysis, and error diagnosis.

When to Use

When the user asks to build an agent that orchestrates 10-30+ sequential tool calls to resolve supply chain or e-commerce order issues
When designing a diagnostic pipeline that traces orders through trade, fulfillment, and warehouse layers
When the user needs an agent framework that works without hand-authored SOPs or rigid procedural scripts
When implementing multi-step tool-calling workflows where early termination and execution drift are failure risks
When building order investigation systems that must handle branching logic (cancelled vs. error vs. in-transit statuses)
When the user wants to improve tool-calling reliability through parallel reasoning paths and consensus voting
When creating agents for any domain requiring long-horizon, multi-entity traversal across linked database records

Key Technique

The Problem: LLMs performing multi-step tool orchestration in supply chain settings suffer from three failure modes: (1) premature termination, where the model stops calling tools before exhausting all entities; (2) schema mismatches, where field names drift between tool calls; and (3) faithfulness errors, where the model's final response contradicts what tools actually returned. Providing hand-written SOPs helps but requires expensive domain expertise and still fails for models that prioritize conversational brevity over exhaustive coverage.
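Premature termination, the first failure mode above, can be detected mechanically by comparing the entities a task should cover against the entities that actually appeared in tool calls. The following is a minimal sketch, not part of the paper: the function name `unvisited_entities` and the tool-call log shape (a list of dicts with an `"arguments"` mapping) are assumptions made for illustration.

```python
from typing import Iterable


def unvisited_entities(expected_ids: Iterable[str],
                       tool_calls: Iterable[dict]) -> set[str]:
    """Return the expected IDs that never appeared as an argument value.

    Assumes a hypothetical log format where each tool call is a dict like
    {"tool": "get_fulfillment", "arguments": {"fulfillment_id": "F1"}}.
    """
    seen: set[str] = set()
    for call in tool_calls:
        # Collect every argument value passed to any tool.
        for value in call.get("arguments", {}).values():
            seen.add(str(value))
    # Anything expected but never passed to a tool was skipped.
    return {str(i) for i in expected_ids} - seen
```

A non-empty result suggests the trajectory stopped before exhausting its entities, which is one possible signal for discarding or retrying that run.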

SupChain-ReAct: Instead of authoring SOPs, run N independent ReAct trajectories (the paper uses N=5) in parallel against the same task prompt and tool schema. Each trajectory alternates between a reasoning step ("I need to check the fulfillment status for each ID") and a tool invocation, continuing until it produces a final answer or hits a step limit. The final output is selected by majority vote over the textual answers from successful trajectories. This approach works because: (a) different trajectories explore different orderings and branching paths, reducing the chance that all paths prematurely terminate at the same point; (b) majority voting filters out hallucinated or unfaithful answers since they are unlikely to appear in a majority of independent runs; and (c) the model leverages its existing domain knowledge and tool-schema understanding to self-organize procedural steps without external instruction.
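The multi-path procedure described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the paper's implementation: `run_trajectory` is a hypothetical caller-supplied function that executes one full ReAct loop (reasoning plus tool calls) and returns the final answer text, or `None` if the path failed or hit its step limit. The whitespace and case normalization before voting is also an assumption, since the source does not specify how textual answers are compared.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional, Sequence

# Hypothetical signature: (task_prompt, path_index) -> final answer or None.
RunTrajectory = Callable[[str, int], Optional[str]]


def normalize(answer: str) -> str:
    """Collapse whitespace and case so trivially different answers match."""
    return " ".join(answer.lower().split())


def majority_vote(answers: Sequence[Optional[str]]) -> Optional[str]:
    """Pick the most frequent answer among successful (non-None) paths."""
    successful = [a for a in answers if a is not None]
    if not successful:
        return None
    counts = Counter(normalize(a) for a in successful)
    winner, _ = counts.most_common(1)[0]
    # Return the first original answer whose normalized form won the vote.
    return next(a for a in successful if normalize(a) == winner)


def run_supchain_react(task: str,
                       run_trajectory: RunTrajectory,
                       n_paths: int = 5) -> Optional[str]:
    """Run n_paths independent trajectories in parallel, then vote."""
    with ThreadPoolExecutor(max_workers=n_paths) as pool:
        answers = list(pool.map(lambda i: run_trajectory(task, i),
                                range(n_paths)))
    return majority_vote(answers)
```

Threads are used here because each trajectory is typically bound by model and tool latency rather than CPU. Ties fall to whichever answer was seen first, which is a simplification a real system may want to handle explicitly.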

Results: SupChain-ReAct consistently outperformed both SOP-free and SOP-guided baselines across models. For example, Gemini-2.5-Pro jumped from 11.22% (no SOP) to 72.44% (SupChain-ReAct), and Claude-4-Sonnet went from 31.63% to 75.51%. The technique is model-agnostic and requires no training or fine-tuning.
