bedrock-inference
Installation
SKILL.md
Amazon Bedrock Inference
Overview
Amazon Bedrock Runtime provides APIs for invoking foundation models including Claude (Opus, Sonnet, Haiku), Nova (Amazon), Titan (Amazon), and third-party models (Cohere, AI21, Meta). Supports both synchronous and asynchronous inference with streaming capabilities.
Purpose: Production-grade model inference with unified API across all Bedrock models
Pattern: Task-based (independent operations for different inference modes)
Key Capabilities:
- Model Invocation - Direct model calls with native or Converse API
- Streaming - Real-time token streaming for low latency
- Async Invocation - Long-running tasks up to 24 hours
- Token Counting - Cost estimation before inference
- Guardrails - Runtime content filtering and safety
- Inference Profiles - Cross-region routing and cost optimization