Essentials
Benchmarks
Now that you’ve created some gists, added test cases, and learned about evaluators, we’re ready to run benchmarks to calculate the success rates of the gists.
How it works
- Click on the benchmark button on the gist variant page
- Select all the variants you want to benchmark
- Select evaluators you want to enable that are applicable to your gist
- Select test cases that you want to run
- Choose how many times you want to run the test cases
- Click on Run
Gists calls OpenAI while running the benchmarks, which uses your API quota.
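To get a rough sense of how much quota a benchmark might use, you can estimate the number of test-case runs before clicking Run. The sketch below assumes each selected variant is run against each selected test case the chosen number of times; the function name `estimate_runs` is hypothetical and not part of Gists, and the actual number of OpenAI calls may differ (for example, if an evaluator makes calls of its own).

```python
def estimate_runs(num_variants: int, num_test_cases: int, repetitions: int) -> int:
    """Estimate test-case runs for one benchmark.

    Assumption (not confirmed by the docs): every variant runs every
    selected test case `repetitions` times.
    """
    if min(num_variants, num_test_cases, repetitions) < 0:
        raise ValueError("counts must be non-negative")
    return num_variants * num_test_cases * repetitions


# Example: 3 variants, 5 test cases, 10 repetitions each
print(estimate_runs(3, 5, 10))  # 150
```

Under this assumption, trimming the number of variants or repetitions is the quickest way to keep a benchmark small.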
It can take a few minutes to finish running and evaluating all the test cases against your variants.
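As an illustration of what a success rate means here, the following sketch computes one per variant from a list of run outcomes. It is not how Gists computes its numbers internally; it assumes a run counts as successful when its enabled evaluators pass, and the `success_rates` helper and the sample data are hypothetical.

```python
from collections import defaultdict


def success_rates(results):
    """Return {variant: passed_runs / total_runs}.

    `results` is an iterable of (variant_name, passed) pairs, one per run.
    Assumption: `passed` is True when the run satisfied its evaluators.
    """
    totals = defaultdict(int)
    passes = defaultdict(int)
    for variant, passed in results:
        totals[variant] += 1
        if passed:
            passes[variant] += 1
    return {variant: passes[variant] / totals[variant] for variant in totals}


# Made-up outcomes for two variants
runs = [
    ("A", True), ("A", False), ("A", True), ("A", True),
    ("B", True), ("B", False),
]
print(success_rates(runs))  # {'A': 0.75, 'B': 0.5}
```

Running each test case more than once matters because model output can vary between runs, so a rate based on a single run per test case may be misleading.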
Great job! We’re done with the essentials.