
Human ratings, claims, critiques (*LLM-summarized) – publicly shared

Warning: This page is provided as a quick overview of the kinds of critiques our human evaluators generate. Be careful: it was generated by ChatGPT and may contain substantial errors. Confirm all statements against the original evaluations (follow the links within the linked ‘eval summary’). While most of what I’ve seen checks out, I have already noticed some hallucinations, and supposed direct quotes are not always accurate.
Quick overview of human evals (LLM-summarized - be careful)

