You'll see a question and two anonymized answers — Answer A and Answer B — from two different models. You won't be told which model produced which answer.
Pick the better answer, or mark both as equally good or both as bad.
Optionally add a comment explaining your choice.
Click Next › to move on. Your progress saves automatically, so you can stop and resume at any time using the same name.
When you're done, you can leave overall feedback at the bottom of the page.
Choose what to compare
Pick the two answer sets you want to evaluate against each other.
Question
Answer A
Answer B
No comparison items were loaded. Add CSV files to data/ and check config.json, then restart the server.