Benchmarks - Gists Doc

Now that you’ve created some gists, added test cases, and learned about evaluators, you’re ready to run benchmarks to calculate the success rates of your gists.

How it works

Click on the benchmark button on the gist variant page
Select all the variants you want to benchmark
Select evaluators you want to enable that are applicable to your gist
Select test cases that you want to run
Choose how many times you want to run the test cases
Click on Run
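
The steps above amount to running each selected test case against each selected variant a fixed number of times, then scoring every output with the enabled evaluators. The following is a minimal Python sketch of how a per-variant success rate could be computed from such results. The `RunResult` record and the `success_rates` function are hypothetical names used only for illustration; they are not part of Gists, and the product may aggregate results differently.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RunResult:
    """One execution of one test case against one variant (hypothetical shape)."""
    variant: str
    test_case: str
    passed: bool  # assumed: True when every enabled evaluator accepted the output


def success_rates(results):
    """Return a dict mapping each variant to the fraction of its runs that passed."""
    totals = defaultdict(int)
    passes = defaultdict(int)
    for result in results:
        totals[result.variant] += 1
        if result.passed:
            passes[result.variant] += 1
    return {variant: passes[variant] / totals[variant] for variant in totals}


# Two variants, two test cases; variant-a is run twice per test case.
results = [
    RunResult("variant-a", "case-1", True),
    RunResult("variant-a", "case-1", False),
    RunResult("variant-a", "case-2", True),
    RunResult("variant-a", "case-2", True),
    RunResult("variant-b", "case-1", False),
    RunResult("variant-b", "case-2", True),
]
rates = success_rates(results)
print(rates)
```

Running each test case more than once matters because model output can vary between runs, so a single pass or fail is a weak signal.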

Gists calls OpenAI while running benchmarks, which uses your API quota.
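
To get a rough sense of quota usage before clicking Run, you can multiply out your selections. The sketch below assumes one model call per variant, per test case, per run; that is an assumption, not documented behavior, and evaluators may trigger additional calls. The function name `estimated_generation_calls` is hypothetical.

```python
def estimated_generation_calls(num_variants, num_test_cases, runs_per_test_case):
    """Rough lower bound on model calls, assuming one call per variant/test case/run."""
    return num_variants * num_test_cases * runs_per_test_case


# Example: 3 variants, 10 test cases, 5 runs each.
calls = estimated_generation_calls(3, 10, 5)
print(calls)  # 150
```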

It can take a few minutes to finish running and evaluating all the test cases against your variants.

Great job! We’re done with the essentials.
