embedding-optimization
Embedding Optimization
Optimize embedding generation for cost, performance, and quality in RAG and semantic search systems.
When to Use This Skill
Trigger this skill when:
- Building RAG (Retrieval Augmented Generation) systems
- Implementing semantic search or similarity detection
- Reducing embedding API costs (by 70-90%)
- Improving document retrieval quality through better chunking
- Processing large document corpora (thousands to millions of documents)
- Choosing between API-based and local embedding models
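As a rough illustration of two of the levers named above (chunking and API cost), the sketch below splits text into overlapping character windows and caches embeddings by content hash so that repeated chunks are sent to the embedding backend only once. It is a minimal sketch, not this skill's implementation: `chunk_text`, `CachedEmbedder`, and `fake_embed_batch` are hypothetical names, the chunk sizes are arbitrary, and the stand-in backend would be replaced by a real API or local model call. Actual savings depend on how much duplicate content the corpus contains.

```python
import hashlib
from typing import Callable, Dict, List, Sequence


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into fixed-size character windows that overlap (hypothetical helper)."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


class CachedEmbedder:
    """Wrap a batch embedding function with an in-memory content-hash cache."""

    def __init__(self, embed_batch: Callable[[Sequence[str]], List[List[float]]]):
        self._embed_batch = embed_batch  # assumed: returns one vector per input text
        self._cache: Dict[str, List[float]] = {}
        self.texts_sent = 0  # number of texts actually sent to the backend

    @staticmethod
    def _key(text: str) -> str:
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    def embed(self, texts: Sequence[str]) -> List[List[float]]:
        # Collect each unseen text once, preserving first-seen order.
        missing: Dict[str, str] = {}
        for text in texts:
            key = self._key(text)
            if key not in self._cache and key not in missing:
                missing[key] = text
        if missing:
            vectors = self._embed_batch(list(missing.values()))
            self.texts_sent += len(missing)
            for key, vector in zip(missing, vectors):
                self._cache[key] = vector
        return [self._cache[self._key(text)] for text in texts]


def fake_embed_batch(texts: Sequence[str]) -> List[List[float]]:
    """Stand-in backend for illustration only: a one-dimensional 'embedding'."""
    return [[float(len(text))] for text in texts]


if __name__ == "__main__":
    embedder = CachedEmbedder(fake_embed_batch)
    chunks = chunk_text("lorem ipsum " * 100, chunk_size=120, overlap=24)
    embedder.embed(chunks)
    print(f"{len(chunks)} chunks, {embedder.texts_sent} sent to the backend")
```

An in-memory dictionary keeps the sketch short; a persistent store keyed by the same hash would let the cache survive across runs.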
Model Selection Framework
Choose the optimal embedding model based on requirements: