together-evaluations


Together AI Evaluations

Overview

Use Together AI evaluations when the user wants a managed LLM-as-a-judge workflow rather than an ad hoc prompt loop.

Core evaluation types:

  • Classify: assign outputs to labels
  • Score: grade outputs on a numeric scale
  • Compare: compare two candidate outputs with bias controls
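The three modes above can be sketched locally. This is an illustrative harness only, not Together AI's actual API: the `judge` callable stands in for an LLM judge call, and all function names are assumptions made for the example. It shows the shape of each mode, including a simple position-swap bias control for Compare.

```python
def classify(judge, output, labels):
    """Classify: ask the judge to assign one of `labels` to `output`."""
    verdict = judge(f"Label this output as one of {labels}: {output}")
    # Reject labels outside the allowed set.
    return verdict if verdict in labels else None

def score(judge, output, lo=1, hi=5):
    """Score: grade `output` on a numeric scale, clamped to [lo, hi]."""
    raw = judge(f"Score this output from {lo} to {hi}: {output}")
    return max(lo, min(hi, int(raw)))

def compare(judge, a, b):
    """Compare: judge two candidates, swapping order to control position bias."""
    first = judge(f"Which is better, A or B?\nA: {a}\nB: {b}")
    # Bias control: ask again with the candidates in the opposite order.
    second = judge(f"Which is better, A or B?\nA: {b}\nB: {a}")
    swapped = {"A": "B", "B": "A"}[second]
    # Only accept a verdict the judge gives consistently in both orders.
    return first if first == swapped else "tie"
```

A judge that always answers "A" regardless of order fails the swap check and yields a tie, which is the kind of position-bias failure the Compare mode's controls are designed to catch.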

This skill also covers external providers used as judges or targets when the workflow still runs through Together AI's evaluation system.

When This Skill Wins

  • Benchmark prompt variants, models, or product responses
Repository: zainhas/skills
First Seen: Mar 30, 2026