together-evaluations

Together AI Evaluations

Overview

Use Together AI evaluations when the user wants a managed LLM-as-a-judge workflow rather than an ad hoc prompt loop.

Core evaluation types:

Classify: assign outputs to labels
Score: grade outputs on a numeric scale
Compare: compare two candidate outputs with bias controls
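The three evaluation types can be sketched in plain Python to show what each one returns. This is an illustrative sketch only, not Together AI's evaluation API: the `Judge` callable, the function names, the prompt wording, and the position-swap check used as a bias control in `compare` are all assumptions made for this example.

```python
from typing import Callable, Optional, Sequence

# Hypothetical judge interface: takes a prompt, returns the judge model's raw reply.
# In a real workflow this would wrap a call to a judge model; here it is a stub.
Judge = Callable[[str], str]


def classify(judge: Judge, output: str, labels: Sequence[str]) -> Optional[str]:
    """Classify: return one of `labels`, or None if the reply is not a valid label."""
    reply = judge(f"Assign exactly one label from {list(labels)}.\n{output}").strip()
    return reply if reply in labels else None


def score(judge: Judge, output: str, low: float = 1, high: float = 5) -> Optional[float]:
    """Score: return a number in [low, high], or None if the reply is unusable."""
    reply = judge(f"Grade this output from {low} to {high}. Reply with a number.\n{output}")
    try:
        value = float(reply.strip())
    except ValueError:
        return None
    return value if low <= value <= high else None


def _pair_prompt(first: str, second: str) -> str:
    return f"Which response is better? Reply with A or B only.\n[A]\n{first}\n[B]\n{second}"


def compare(judge: Judge, a: str, b: str) -> str:
    """Compare: ask twice with positions swapped (an assumed bias control).

    A candidate wins only if the judge prefers it in both orderings;
    any disagreement is reported as a tie.
    """
    first = judge(_pair_prompt(a, b)).strip()   # "A" here means candidate a
    second = judge(_pair_prompt(b, a)).strip()  # "A" here means candidate b
    if first == "A" and second == "B":
        return "a"
    if first == "B" and second == "A":
        return "b"
    return "tie"


# Stub judges for demonstration only.
def always_a(prompt: str) -> str:
    """A position-biased judge that always picks the first slot."""
    return "A"


def longer_wins(prompt: str) -> str:
    """A judge that prefers the longer response regardless of position."""
    a_text = prompt.split("[A]\n")[1].split("\n[B]\n")[0]
    b_text = prompt.split("\n[B]\n")[1]
    return "A" if len(a_text) >= len(b_text) else "B"
```

With these stubs, `compare(always_a, x, y)` yields a tie because the position-biased judge contradicts itself once the candidates are swapped, while `compare(longer_wins, "short", "a much longer answer")` picks the second candidate in both orderings.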

This skill also covers external providers used as judges or targets when the workflow still runs through Together AI's evaluation system.

When This Skill Wins

Benchmark prompt variants, models, or product responses

Related skills

More from zainhas/togetherai-skills

together-code-interpreter
Use this skill for Together AI Code Interpreter workflows: remote Python execution, session reuse, file uploads, data analysis, plots, and stateful notebook-like runs through the TCI API. Reach for it whenever the user wants managed remote Python execution on Together AI instead of local execution, raw clusters, or full model hosting.
together-audio
Text-to-speech and speech-to-text via Together AI, including REST, streaming, and realtime WebSocket TTS, plus transcription, translation, diarization, timestamps, and live STT. Reach for it whenever the user needs audio in or audio out on Together AI rather than chat generation, image or video creation, or model training.
together-images
Text-to-image generation and image editing via Together AI, including FLUX and Kontext models, LoRA-based styling, reference-image guidance, and local image downloads. Reach for it whenever the user wants to generate or edit images on Together AI rather than create videos or build text-only chat applications.
together-chat-completions
Real-time and streaming text generation via Together AI's OpenAI-compatible chat/completions API, including multi-turn conversations, tool and function calling, structured JSON outputs, and reasoning models. Reach for it whenever the user wants to build or debug text generation on Together AI, unless they specifically need batch jobs, embeddings, fine-tuning, dedicated endpoints, dedicated containers, or GPU clusters.
together-dedicated-endpoints
Single-tenant GPU endpoints on Together AI with autoscaling and no rate limits. Deploy fine-tuned or uploaded models, size hardware, and manage endpoint lifecycle. Reach for it whenever the user needs predictable always-on hosting rather than serverless inference, custom containers, or raw clusters.
together-video
Text-to-video and image-to-video generation via Together AI, including keyframe control, model and dimension selection, asynchronous job polling, and video downloads. Reach for it whenever the user wants motion generation on Together AI rather than still-image generation or text-only inference.

Repository

zainhas/togethe…i-skills

First Seen

Feb 27, 2026

Security Audits

Gen Agent Trust Hub: Pass
Socket: Pass
Snyk: Pass
