hugging-face-evaluation

Originally from huggingface/skills

Overview

This skill provides tools to add structured evaluation results to Hugging Face model cards. It supports multiple methods for adding evaluation data:

  • Extracting existing evaluation tables from README content
  • Importing benchmark scores from Artificial Analysis
  • Running custom model evaluations with vLLM or accelerate backends (lighteval/inspect-ai)
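As an illustration of the kind of structured data these methods produce, here is a minimal sketch of building a model-index payload for a model card. The helper name and the example scores are hypothetical; the dict layout follows the model-index specification used by Hugging Face model cards:

```python
# Sketch: assemble a model-index entry mapping benchmark metrics to scores.
# The helper name and example values are hypothetical; the layout follows
# the model-index schema used in Hugging Face model card metadata.

def build_model_index(model_name, task_type, dataset_name, dataset_type, metrics):
    """Return a model-index payload for one evaluation result set."""
    return {
        "model-index": [
            {
                "name": model_name,
                "results": [
                    {
                        "task": {"type": task_type},
                        "dataset": {"name": dataset_name, "type": dataset_type},
                        "metrics": [
                            {"type": metric_type, "value": value}
                            for metric_type, value in metrics.items()
                        ],
                    }
                ],
            }
        ]
    }

payload = build_model_index(
    model_name="my-org/my-model",   # hypothetical repo id
    task_type="text-generation",
    dataset_name="MMLU",
    dataset_type="cais/mmlu",
    metrics={"accuracy": 0.71},     # hypothetical score
)
```

A payload like this would then be merged into the card's YAML front matter (for example via `huggingface_hub`'s `metadata_update`), which is how Hub leaderboards discover the scores.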

Integration with HF Ecosystem

  • Model Cards: Updates model-index metadata for leaderboard integration
  • Artificial Analysis: Direct API integration for benchmark imports
  • Papers with Code: Compatible with their model-index specification
  • Jobs: Run evaluations directly on Hugging Face Jobs with uv integration
  • vLLM: Efficient GPU inference for custom model evaluation
  • lighteval: Hugging Face's evaluation library with vLLM/accelerate backends
  • inspect-ai: UK AI Safety Institute's evaluation framework

Version

1.3.0

Dependencies

None listed.

Related skills

None listed.


Stats

  • Installs: 14
  • GitHub Stars: 35.0K
  • First Seen: Jan 31, 2026