Evaluation Methodology
This document is the authoritative reference for how PluginEval measures plugin and skill quality. It covers the three evaluation layers, all ten scoring dimensions, the composite formula, badge thresholds, anti-pattern flags, Elo ranking, and actionable improvement tips.
Related: Full rubric anchors
The Three Evaluation Layers
PluginEval stacks three complementary layers. Each layer produces a score between 0.0 and 1.0 for each applicable dimension, and later layers override or blend with earlier ones according to per-dimension blend weights.
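As a rough illustration of how per-dimension blending could work, the sketch below combines the scores that each layer produced for one dimension using a weighted average. The function name `blend_dimension`, the weight values, and the renormalization over missing layers are assumptions made for this example; they are not taken from PluginEval's actual implementation or its real blend weights.

```python
from typing import Optional, Sequence


def blend_dimension(
    layer_scores: Sequence[Optional[float]],
    weights: Sequence[float],
) -> float:
    """Blend one dimension's scores across layers (hypothetical helper).

    layer_scores: one entry per layer, in layer order; None means the layer
                  did not run or does not apply to this dimension.
    weights:      assumed per-dimension blend weights, one per layer.
    """
    if len(layer_scores) != len(weights):
        raise ValueError("need exactly one weight per layer")

    # Keep only layers that produced a score and carry a positive weight.
    pairs = [
        (score, weight)
        for score, weight in zip(layer_scores, weights)
        if score is not None and weight > 0
    ]
    if not pairs:
        raise ValueError("no layer produced a score for this dimension")

    for score, _ in pairs:
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score {score} is outside [0.0, 1.0]")

    # Renormalize so that skipped layers do not drag the result down.
    total_weight = sum(weight for _, weight in pairs)
    return sum(score * weight for score, weight in pairs) / total_weight


# Blend: layers 1 and 2 ran, layer 3 was skipped.
blended = blend_dimension([0.8, 0.6, None], [0.2, 0.5, 0.3])

# Override: all weight on layer 2, so the layer 1 score is ignored.
overridden = blend_dimension([0.8, 0.6, None], [0.0, 1.0, 0.0])
```

Under these assumptions, an "override" is simply the special case where a later layer carries all of the weight, while any other weight split yields a blend that stays within the 0.0 to 1.0 range.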
Layer 1 — Static Analysis
Speed: < 2 seconds. No LLM calls. Deterministic.