Evaluation Methodology
Methods for evaluating foundation model outputs.
Evaluation Approaches
1. Exact Evaluation
| Method | Use Case | Example |
|---|---|---|
| Exact Match | QA, Math | "5" == "5" |
| Functional Correctness | Code | Pass test cases |
| BLEU/ROUGE | Translation, summarization | N-gram overlap with a reference |
| Semantic Similarity | Open-ended | Cosine similarity of embeddings |
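The first and third rows of the table can be sketched with the standard library alone. This is a minimal illustration, not a reference implementation: the helper names `normalize`, `exact_match`, and `ngram_f1` are hypothetical, and `ngram_f1` is a simplified overlap score in the spirit of BLEU/ROUGE rather than either metric as formally defined.

```python
import re
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, collapse whitespace, and trim surrounding punctuation."""
    collapsed = " ".join(text.lower().split())
    return collapsed.strip(".,!?;:\"'")


def exact_match(prediction: str, reference: str) -> bool:
    """Exact match after light normalization (an assumption, not a standard)."""
    return normalize(prediction) == normalize(reference)


def ngram_counts(tokens: list[str], n: int) -> Counter:
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_f1(prediction: str, reference: str, n: int = 1) -> float:
    """F1 over clipped n-gram overlap between prediction and reference."""
    pred = ngram_counts(re.findall(r"\w+", prediction.lower()), n)
    ref = ngram_counts(re.findall(r"\w+", reference.lower()), n)
    overlap = sum((pred & ref).values())  # multiset intersection clips counts
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match(" 5. ", "5"))
    print(ngram_f1("the cat sat", "the cat ran"))
```

How much normalization to apply before an exact match is a design choice: trimming only the surrounding punctuation keeps numeric answers such as `5.0` intact, whereas stripping all punctuation would not.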
# Semantic Similarity
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity