bdistill-xray
When to use
- Understand actual vs claimed behavior. Discover how your AI model actually behaves compared with how it describes itself, and surface hidden defaults, biases, and blind spots.
- Compare models for a specific task. Run x-ray on two or more models and compare their reports side-by-side to pick the best fit for extraction, prediction, or content generation.
- Debug unexpected refusals, hallucinations, or formatting issues. When a model over-refuses, fabricates facts, or produces surprising output formats, the x-ray pinpoints which behavioral dimension is responsible.
- Document a model's behavioral profile for your team. Generate a shareable HTML report that captures a model's strengths, weaknesses, and edge-case behavior in a standardized format.
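A side-by-side comparison of two x-ray results could be sketched as below. This is a minimal illustration, not part of the skill itself: it assumes each result's `dimensions` object maps a dimension name directly to a score in the 0.0-1.0 range, and the function name `compare_dimensions` and the dimension names `dim_one` and `dim_two` are hypothetical placeholders.

```python
def compare_dimensions(report_a: dict, report_b: dict) -> dict:
    """Return per-dimension score differences (a minus b) for shared dimensions.

    Assumes each report has a "dimensions" mapping of name -> float score.
    """
    dims_a = report_a["dimensions"]
    dims_b = report_b["dimensions"]
    # Only compare dimensions present in both reports.
    shared = sorted(set(dims_a) & set(dims_b))
    return {name: round(dims_a[name] - dims_b[name], 3) for name in shared}


# Hypothetical dimension names and scores, for illustration only.
model_a = {"dimensions": {"dim_one": 0.8, "dim_two": 0.4}}
model_b = {"dimensions": {"dim_one": 0.6, "dim_two": 0.7}}

diff = compare_dimensions(model_a, model_b)
for name, delta in diff.items():
    print(f"{name}: {delta:+.3f}")
```

A positive value means the first model scored higher on that dimension. Whether a higher score is better depends on what the dimension measures, so read the differences alongside the reports rather than on their own.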
Input contract
required:
  model_name: string          # Self-identified by the model (e.g. "claude-opus-4-6", "gpt-4o")
output:
  dimensions: object          # 6 scored dimensions (0.0-1.0 each)
  behavioral_summary: object  # Aggregate metrics and notable patterns
  report_path: string         # Path to the generated HTML report
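A consumer of this output might check it against the contract before using it. The sketch below is one way to do that, under the assumption that the result arrives as a Python dict and that `dimensions` maps each name directly to a numeric score. The helper `validate_xray_output` and the sample dimension names are hypothetical and not provided by the skill.

```python
from numbers import Real


def validate_xray_output(result: dict) -> list[str]:
    """Return a list of contract violations; an empty list means the result conforms."""
    problems = []

    dims = result.get("dimensions")
    if not isinstance(dims, dict):
        problems.append("dimensions must be an object")
    else:
        if len(dims) != 6:
            problems.append(f"expected 6 dimensions, got {len(dims)}")
        for name, score in dims.items():
            # bool is a subclass of int, so reject it explicitly.
            is_number = isinstance(score, Real) and not isinstance(score, bool)
            if not is_number or not 0.0 <= score <= 1.0:
                problems.append(f"dimension {name!r} must be a number in [0.0, 1.0]")

    if not isinstance(result.get("behavioral_summary"), dict):
        problems.append("behavioral_summary must be an object")
    if not isinstance(result.get("report_path"), str):
        problems.append("report_path must be a string")

    return problems


# Hypothetical sample output; dimension names are placeholders.
sample = {
    "dimensions": {f"dim_{i}": 0.5 for i in range(6)},
    "behavioral_summary": {},
    "report_path": "report.html",
}
print(validate_xray_output(sample))
```

Returning a list of problems instead of raising on the first one makes it easier to see every mismatch at once when debugging a malformed result.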