Warning: This page is provided as a quick overview of the nature of the critiques our human evaluators generate. Be careful: it was generated by ChatGPT and may contain substantial errors. Confirm all statements against the original evaluations (links are within the linked ‘eval summary’). While most of what I’ve seen checks out, I have already noticed some hallucinations, and supposed direct quotes are not always accurate.
Quick overview of human evals (LLM-summarized - be careful)