Resource-Aware Optimization
Not every task requires the most capable, most expensive model. Resource-Aware Optimization (also called Dynamic Routing) classifies the complexity of a user request and routes it to the cheapest model tier that can handle it. This keeps you from using a sledgehammer to crack a nut, which lowers cost and reduces latency.
When to Use
- High Volume APIs: When most requests are simple and only a small share are complex (for example, 90% simple and 10% complex).
- Latency Sensitivity: Routing simple "Hello" or "Stop" commands to instant, small models.
- Budget Constraints: Ensuring high-end models (like GPT-4 or Opus) are only used when absolutely necessary.
- Fallback: Using a small model first, and only upgrading to a large model if the small one fails or expresses low confidence.
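The fallback idea can be sketched in a few lines. This is a minimal illustration, not a reference implementation: `ModelReply`, `cascade`, and the two stub models are hypothetical names, the stubs stand in for real API calls, and the sketch assumes each model can return some confidence score in the range 0 to 1 (how that score is produced is left open).

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ModelReply:
    text: str
    confidence: float  # assumed to lie in [0, 1]; the scoring method is up to you


# A "model" here is any callable that maps a prompt to a ModelReply.
Model = Callable[[str], ModelReply]


def cascade(prompt: str, small: Model, large: Model,
            threshold: float = 0.8) -> tuple[str, ModelReply]:
    """Try the small model first; escalate if it fails or is not confident."""
    try:
        reply = small(prompt)
        if reply.confidence >= threshold:
            return "small", reply
    except Exception:
        # Any failure of the small model is treated as a reason to escalate.
        pass
    return "large", large(prompt)


# Stub models standing in for real API calls (hypothetical behaviour).
def small_stub(prompt: str) -> ModelReply:
    confident = len(prompt.split()) <= 3
    return ModelReply(f"[small] {prompt}", 0.95 if confident else 0.4)


def large_stub(prompt: str) -> ModelReply:
    return ModelReply(f"[large] {prompt}", 0.99)


if __name__ == "__main__":
    for prompt in ("Hello", "Explain why this recursive function overflows"):
        tier, reply = cascade(prompt, small_stub, large_stub)
        print(tier, "->", reply.text)
```

The threshold is a tuning knob: raising it sends more traffic to the large model, lowering it saves more money at the risk of weaker answers.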
Use Cases
- Tiered Chatbot:
  - Simple (Greetings, FAQs) -> gpt-4o-mini
  - Medium (Summarization, extraction) -> gpt-4o
  - Complex (Coding, Reasoning) -> o1-preview
- Cascade: Try Llama-70B -> if confidence < 0.8 -> Try GPT-4.
- SLA-based: Free users -> Small Model. Paid users -> Large Model.
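A tiered router combined with an SLA check might look roughly like the sketch below. The model names come from the tiered chatbot example above; everything else is an assumption made for illustration: the keyword lists, the `classify` and `route` functions, and the choice to cap free users at the smallest tier. A real router would more likely use a small classifier model than keyword matching.

```python
from enum import Enum


class Tier(Enum):
    # Model names taken from the tiered chatbot use case.
    SIMPLE = "gpt-4o-mini"
    MEDIUM = "gpt-4o"
    COMPLEX = "o1-preview"


# Hypothetical keyword hints; replace with a trained classifier in practice.
COMPLEX_HINTS = ("code", "debug", "prove", "algorithm")
MEDIUM_HINTS = ("summarize", "summarise", "extract")


def classify(request: str) -> Tier:
    """Guess the complexity tier of a request from simple keyword hints."""
    text = request.lower()
    if any(hint in text for hint in COMPLEX_HINTS):
        return Tier.COMPLEX
    if any(hint in text for hint in MEDIUM_HINTS):
        return Tier.MEDIUM
    return Tier.SIMPLE


def route(request: str, paid: bool = True) -> str:
    """Return the model name for a request.

    Assumption: free users are always served by the smallest tier,
    while paid users get the tier that matches the request.
    """
    if not paid:
        return Tier.SIMPLE.value
    return classify(request).value


if __name__ == "__main__":
    print(route("Hi there!"))
    print(route("Summarize this article"))
    print(route("Debug this code for me"))
    print(route("Debug this code for me", paid=False))
```

Keeping classification and the SLA rule in separate functions means either one can be swapped out (for example, replacing the keyword check with a model call) without touching the other.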