gepa-demo
What is prompt optimization?
Prompt optimization is the process of improving the quality of prompts used with language models. It is often done manually, but frameworks such as DSPy are increasingly used to have LLMs do this automatically.
In essence, the user provides a dataset and a grader or reward model to judge an LLM's output. A prompt's performance on the dataset is measured, the gaps in its performance are identified, and a new prompt is proposed and tested. This runs in a loop until an end state is reached, such as a target score or an exhausted evaluation budget.
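The loop described above can be sketched in a few lines of Python. This is a minimal illustration and not GEPA or DSPy code: `optimize_prompt`, `run_model`, `grade`, and `propose` are hypothetical names, and the toy stand-ins at the bottom replace real LLM calls so the example runs on its own. It assumes a non-empty dataset and a grader that returns a score between 0 and 1.

```python
from typing import Callable, Sequence

Example = tuple[str, str]                 # (input, expected output)
Result = tuple[Example, str, float]       # (example, model output, score)


def optimize_prompt(
    seed_prompt: str,
    dataset: Sequence[Example],
    run_model: Callable[[str, str], str],        # hypothetical: (prompt, input) -> output
    grade: Callable[[str, str], float],          # hypothetical: (output, expected) -> score
    propose: Callable[[str, list[Result]], str], # hypothetical: (prompt, failures) -> new prompt
    budget: int = 10,
) -> tuple[str, float]:
    def evaluate(prompt: str) -> tuple[float, list[Result]]:
        # Measure the prompt on every example in the dataset.
        results = []
        for example in dataset:
            output = run_model(prompt, example[0])
            results.append((example, output, grade(output, example[1])))
        mean = sum(score for _, _, score in results) / len(results)
        return mean, results

    best_prompt = seed_prompt
    best_score, best_results = evaluate(seed_prompt)
    for _ in range(budget):
        if best_score == 1.0:  # end state: nothing left to fix
            break
        # Identify the gaps, then propose and test a new prompt.
        failures = [r for r in best_results if r[2] < 1.0]
        candidate = propose(best_prompt, failures)
        score, results = evaluate(candidate)
        if score > best_score:
            best_prompt, best_score, best_results = candidate, score, results
    return best_prompt, best_score


# Toy stand-ins so the sketch runs without an LLM.
def toy_model(prompt: str, text: str) -> str:
    return text.upper() if "uppercase" in prompt else text


def toy_grade(output: str, expected: str) -> float:
    return 1.0 if output == expected else 0.0


def toy_propose(prompt: str, failures: list[Result]) -> str:
    return prompt + " Answer in uppercase."


best_prompt, best_score = optimize_prompt(
    "Echo the input.", [("hi", "HI"), ("ok", "OK")],
    toy_model, toy_grade, toy_propose,
)
print(best_prompt, best_score)
```

In a real run, `propose` would be an LLM call that reads the failing cases and rewrites the prompt, and the stopping rule would usually be a budget rather than a perfect score.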
What is GEPA?
GEPA (which stands for GEnetic PAreto) is a prompt optimization algorithm that follows the process above. It is an increasingly popular approach to prompt optimization, and it differs from other algorithms in two key design choices:
- Text Feedback: most prompt optimization algorithms, and RL, simply use a scalar reward. GEPA also uses textual feedback, including log traces, which can be used to identify the root cause of issues in a prompt. This makes it particularly well suited to LLM-as-a-judge graders, which may not be well calibrated on scores but can provide high-quality text feedback identifying what needs to be improved.
- Pareto Frontiers: GEPA does not simply select the prompt with the best overall score; it checks which prompts dominate other candidates on most test cases in the validation dataset. This makes you more likely to find a prompt that performs reliably across a wide variety of cases, rather than one that spikes on some scenarios but underperforms on others.
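To make the Pareto idea concrete, here is a small sketch that keeps every candidate prompt not dominated by another candidate across per-case validation scores. It uses the textbook definition of Pareto dominance as an assumption; the selection rule inside the GEPA library may differ in detail. The function name `pareto_frontier` and the score table are hypothetical.

```python
def dominates(a: list[float], b: list[float]) -> bool:
    """True if `a` is at least as good as `b` on every case and better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_frontier(scores: dict[str, list[float]]) -> set[str]:
    """Return the candidates that no other candidate dominates."""
    return {
        name
        for name, own in scores.items()
        if not any(
            dominates(other, own)
            for other_name, other in scores.items()
            if other_name != name
        )
    }


# Hypothetical per-case scores for four candidate prompts on three validation cases.
scores = {
    "A": [1.0, 1.0, 0.0],  # highest mean, but fails case 3
    "B": [0.0, 0.0, 1.0],  # low mean, but the only one that solves case 3
    "C": [1.0, 0.0, 0.0],  # dominated by A
    "D": [0.6, 0.6, 0.6],  # consistent across all cases
}

frontier = pareto_frontier(scores)
best_by_mean = max(scores, key=lambda name: sum(scores[name]) / len(scores[name]))
print(sorted(frontier), best_by_mean)
```

Selecting by mean score alone would keep only `A`. The frontier also keeps `B` and `D`, so what they do well on cases `A` fails can still feed into later candidates.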
More information about GEPA is available in this GitHub repo.
The goal for this skill
Help users improve their prompts using our custom implementation of GEPA. Ask them for a dataset and create a grader for them (if they don't already have one), then optimize the prompt.
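When a user has no grader, a simple one can be sketched as a function that returns both a score and a piece of text feedback, since GEPA makes use of both. The `Grade` type and `grade_exact_match` function below are hypothetical illustrations, not the interface of our implementation; adapt the signature to whatever the optimizer expects.

```python
from dataclasses import dataclass


@dataclass
class Grade:
    score: float   # scalar reward between 0 and 1
    feedback: str  # text explaining what to improve


def grade_exact_match(output: str, expected: str) -> Grade:
    """Hypothetical grader: compare answers case-insensitively, ignoring outer whitespace."""
    got = output.strip().lower()
    want = expected.strip().lower()
    if got == want:
        return Grade(1.0, "Output matches the expected answer.")
    if want in got:
        # Partial credit: the answer is present but wrapped in extra text.
        return Grade(
            0.5,
            f"Output contains the expected answer {expected!r} but adds extra text; "
            "return only the answer.",
        )
    return Grade(0.0, f"Expected {expected!r} but got {output!r}.")


print(grade_exact_match("The answer is Paris.", "Paris"))
```

For open-ended tasks, the same shape works with an LLM as the judge: the judge's written critique goes in `feedback`, and its rating is mapped to `score`.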
Our custom GEPA implementation forks the original library to provide observability into its runs. The GitHub repo also packages a web server that is used to visualize the process.