vision-bench
Vision Bench — LLM Image Evaluation
Compare images by scoring them with one or more vision LLM judges against structured rubric criteria.
Quick Start
# Install dependencies
pip install pyyaml openai anthropic mistralai
# Score a single image
python bench.py image.png --criteria photorealism --judge gemini-2.5-flash
# Compare two AI-generated images
python bench.py img_a.png img_b.png \
--criteria text_to_image \
--prompt "a fox in a snowy forest" \
--judge gpt-4o
More from glebis/claude-skills
google-image-search
Search and download images via Google Custom Search API with LLM-powered selection. This skill should be used when finding images for articles, presentations, research documents, or enriching Obsidian notes with relevant visuals. Supports simple queries, batch processing from JSON config, automatic config generation from terms, and full note enrichment with automatic image insertion below headings.
383pdf-generation
Professional PDF generation from markdown using Pandoc with Eisvogel template and EB Garamond fonts. Use when converting markdown to PDF, creating white papers, research documents, marketing materials, or technical documentation. Supports both English and Russian documents with professional typography and color-coded themes. Mobile-optimized layout (6x9) by default for Telegram bot context, desktop/print layout (A4) for other contexts.
272firecrawl-research
This skill should be used when the user requests to research topics using FireCrawl, enrich notes with web sources, search and scrape information, or write scientific/academic papers. It extracts research topics from markdown files, creates research documents with scraped sources, generates BibTeX bibliographies from research results, and provides Pandoc/MyST templates for academic writing with citation management.
244decision-toolkit
Generate structured decision-making tools — step-by-step guides, bias checkers, scenario explorers, and interactive dashboards. Use when facing significant choices requiring systematic analysis. Supports multiple cognitive styles and output formats.
173presentation-generator
Generate interactive HTML presentations with neobrutalism styling, ASCII art decorations, and Agency brand colors. Outputs HTML (interactive with navigation), PNG (individual slides via Playwright), and PDF. References brand-agency skill for colors and typography. Use when creating presentations, slide decks, pitch materials, or visual summaries.
172elevenlabs-tts
This skill converts text to high-quality audio files using ElevenLabs API. Use this skill when users request text-to-speech generation, audio narration, or voice synthesis with customizable voice parameters (stability, similarity boost) and voice presets (rachel, adam, bella, elli, josh, arnold, ava).
169