Team: see discussion space
Goals
- Increase quality of evaluations and ratings
- Make the ratings more consistent across raters
- Understand how evaluators are thinking of the ratings (so we can improve the template etc.)
- Generate new insights through collaborative discussion
Basic proposal
Ask evaluators to:
- confer after their initial evaluations
- discuss what they intended by their metrics, and share their conversation and consensus with us
- revise their individual ratings after the conversation if they wish; we will report both sets of ratings, making the revised ratings focal
Justification and context
This would bring us closer to the “IDEA protocol” that RepliCATS used in the SCORE project. (Note: this is related to the Delphi method, but we don’t aim for complete unanimity.) According to Anca:
- mathematical aggregation (MA) = a “mechanistic rule” for combining individual ratings
- there are studies showing that MA of the 2nd round [after feedback and discussion amongst raters/evaluators] is better than MA of the 1st round (in terms of informativeness, calibration, and accuracy of best estimates)
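To make the terminology concrete, here is a minimal sketch (with hypothetical numbers; the median is just one example of a mechanistic rule) of mathematical aggregation applied to first-round and second-round ratings:

```python
import statistics

def aggregate(ratings):
    """One possible 'mechanistic rule': the median of individual ratings.
    (A mean or trimmed mean would also count as mathematical aggregation.)"""
    return statistics.median(ratings)

# Hypothetical ratings from three evaluators for one paper, on a 0-100 scale.
round_1 = [62, 80, 45]   # independent initial ratings
round_2 = [66, 74, 60]   # revised ratings after the facilitated discussion

print("MA of 1st round:", aggregate(round_1))   # 62
print("MA of 2nd round:", aggregate(round_2))   # 66
```

Under this proposal we would report both aggregates, with the second-round figure as the focal one.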
Why might this “increase quality of evaluations and ratings”? Evaluators may...
- Take these more seriously and put in more effort, knowing they will need to discuss them with another researcher
- Highlight and correct each other’s misunderstandings
- Notice things and raise points the other evaluator did not consider
- Ask for clarifications and corrections, improving the clarity and writing
Why might this “make the ratings more consistent across raters”?
- Discussion of the ratings could lead to a better understanding of what we intended
- Different perspectives may moderate judgments, leading to convergence (not sure that is what we want, though)
- As noted, we will read their comments and better understand how evaluators are thinking of the ratings. Where they misunderstand our intentions, we can revise our language
- We can get insights from the language they use in explaining the ratings to each other
- We can revise/drop rating categories with consistent misunderstandings
Proposed pilot plan (details)
Pre-evaluation: Select (particularly relevant?) papers to try this on. When we invite evaluators, ask (in advance) whether they would be willing to do this, noting the additional compensation.
Facilitated discussion: Share the (initial) evaluations and ratings with the other evaluators. Provide a shared discussion space and template with some seeded questions (Gdoc?), to include discussion of ‘how they considered the ratings’. Facilitate (anonymous, if desired) chat. Give them ~2 weeks to respond interactively to each other’s evaluations and discuss them.
Post-discussion: (Encourage evaluators to write an aggregate/synthesis report?) Allow them to adjust both their initial evaluation discussion and their initial ratings, reminding them that both the initial and revised ratings (and discussion?) will be made public.
Compensation: Additional $100 - $250 per evaluator, anticipating 2-5 hours of additional work.