Team: see discussion space
Goals
- Increase quality of evaluations and ratings
- Make the ratings more consistent across raters
- Understand how evaluators are thinking of the ratings (so we can improve the template etc.)
- Generate new insights through collaborative discussion
Basic proposal
Ask evaluators to:
- confer after their initial evaluations
- discuss what they intended by their metrics, and share their conversation and consensus with us
- revise their individual ratings after the conversation if they wish; we will report both sets of ratings, making the revised ratings focal
Justification and context
This would bring us closer to the “IDEA protocol” that RepliCATS used in the SCORE project. (Note: this is related to the Delphi method, but we don’t aim for complete unanimity.) According to Anca:
- mathematical aggregation (MA) = a “mechanistic rule” for combining individual ratings
- there are studies showing that MA of the 2nd round [after feedback and discussion amongst raters/evaluators] is better than MA of the 1st round (in terms of informativeness, calibration, and accuracy of best estimates)
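To make the terminology concrete, here is a minimal sketch (with hypothetical numbers; the median is just one example of a mechanistic rule) of mathematical aggregation applied to first-round and second-round ratings:

```python
import statistics

def aggregate(ratings):
    """One possible 'mechanistic rule': the median of individual ratings.
    (A mean or trimmed mean would also count as mathematical aggregation.)"""
    return statistics.median(ratings)

# Hypothetical ratings from three evaluators for one paper, on a 0-100 scale.
round_1 = [62, 80, 45]   # independent initial ratings
round_2 = [66, 74, 60]   # revised ratings after the facilitated discussion

print("MA of 1st round:", aggregate(round_1))   # 62
print("MA of 2nd round:", aggregate(round_2))   # 66
```

Under this proposal we would report both aggregates, with the second-round figure as the focal one.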
Why might this “increase quality of evaluations and ratings”? Evaluators may...
- Take these more seriously and put in more effort, knowing they will need to discuss them with another researcher
- Highlight and correct each other’s misunderstandings
- Notice things and raise points the other evaluator did not consider
- Ask for clarifications and corrections, improving the clarity and writing
Why might this “make the ratings more consistent across raters”?
- Discussion of the ratings could lead to a better understanding of what we intended
- Different perspectives may moderate judgments, leading to convergence (not sure that is what we want, though)
- As noted, we will read their comments and better understand how evaluators are thinking of the ratings. Where they misunderstand our intentions, we can revise our language
- We can get insights from the language they use in explaining the ratings to each other
- We can revise/drop rating categories with consistent misunderstandings
Proposed pilot plan (details)
Pre-evaluation: Select (particularly relevant?) papers to try this on. When we invite evaluators, ask (in advance) whether they would be willing to do this, noting the additional compensation.
Facilitated discussion: Share the (initial) evaluations and ratings with the other evaluators. Provide a shared discussion space and template with some seeded questions (Gdoc?), to include discussion of ‘how they considered the ratings’. Facilitate (anonymous, if desired) chat. Give them ~2 weeks to respond interactively to each other’s evaluations and discuss them.
Post-discussion: (Encourage evaluators to write an aggregate/synthesis report?) Allow them to adjust both their initial evaluation discussion and their initial ratings, reminding them that both the initial and revised ratings (and discussion?) will be made public.
Compensation: Additional $100 - $250 per evaluator, anticipating 2-5 hours of additional work.