run-evals

Skill: Run Evals

Run the OpenWork UI evaluation flows against a real Electron app. Prefer a fresh Daytona sandbox for each run, with a local test fallback when Daytona is unavailable.

When to use

User says "run evals on Daytona" or "run this flow on Daytona"
User wants to verify a UI change end-to-end
User wants to test the onboarding, session, or settings flows

Prerequisites

The daytona CLI is installed and you are logged in (daytona login)
The "Different AI" organization is selected (daytona organization use "Different AI")
The .devcontainer/ files exist in the repo
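The prerequisites above could be checked before a run with a small pre-flight script. This is only a sketch: the function names (has_daytona_cli, has_devcontainer, pick_eval_target) are hypothetical and not part of the skill, and it tests only that the daytona binary is on PATH and that a .devcontainer/ directory exists. It does not verify login state or the selected organization, so those still need to be confirmed by hand.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check; these helper names are illustrative only.

# Succeeds if the daytona CLI is found on PATH.
has_daytona_cli() {
  command -v daytona >/dev/null 2>&1
}

# Succeeds if the given repo root (default: current dir) has a .devcontainer/ directory.
has_devcontainer() {
  [ -d "${1:-.}/.devcontainer" ]
}

# Prints "daytona" when both checks pass, otherwise "local" (the fallback path).
pick_eval_target() {
  if has_daytona_cli && has_devcontainer "${1:-.}"; then
    echo "daytona"
  else
    echo "local"
  fi
}

# Report which target this checkout would use.
pick_eval_target "$PWD"
```

If the script prints local, the run would take the local test fallback rather than a Daytona sandbox.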

Workflow

Step 1: Create sandbox

Repository

different-ai/openwork

First Seen

May 14, 2026
