
Info: See the updated protocol and share your feedback. See the database of pivotal questions under consideration.
We intend to track our progress continuously in this status log.

Implementation and further details

Pilot idea: The Unjournal will elicit, curate, and evaluate questions that are relevant to funding and policy (for example, “psychotherapy increases happiness more per dollar than cash transfers”). This would integrate with our commissioning of research-paper evaluations. (We would also continue to evaluate research that is not linked to these questions.)
Aside: This relates to a different project, but that project focuses on identifying claims within a paper, whereas this one is about sourcing claims of interest.

Process Outline

Select a general topic → reach out to organizations to elicit question(s)/claim(s) → evaluate* several papers/projects relating to the claim → write up a review of the evidence
* By ‘evaluate’ we mean ‘manage the expert evaluation of this’

Process Description

Select general topic(s): Choose (for the pilot, 1-2) areas that are highly impactful and likely to fit the needs of the research organizations in the (~EA) space we reach out to.

Reach out to organizations: Contact relevant impactful organizations that use external research to make decisions about funding, grant-making, and the nature of interventions. Ask them to specify concrete, quantitative/quantifiable, research-relevant questions that have the highest value-of-information (VOI) for their work.
Criteria:
Timing: Because evaluating papers takes time, the question should not be relevant only to an immediate decision.
Scope: The claim should be within The Unjournal’s remit (economics, social science, impact quantification, etc.).
Research: Some research should be available; the person or organization suggesting the target question should provide at least one relevant research example.
To offer context, we will give some specific examples of questions that came up in our work and/or related initiatives, and claims/questions taken or adapted from the organizations’ own work.

Select, refine, and get feedback on the ‘target question(s)’:
Rank the target questions, prioritizing at least one for each organization.
Help the organization specify these precisely and in a useful way.
Make these lists public and get general feedback (on their relevance, their framing, key sub-questions, and relevant research informing them), as well as particular feedback on the targeted questions.
Elicit EA and non-EA feedback
Optional: Make this target question into a claim on a prediction market (Metaculus etc.) to be resolved by the evaluations and synthesis below. (This might be a later addition.)

Source and prioritize research informing the target question
Finalize 2-5 relevant research papers informing this question.
These may be suggested by the organization or sourced by The Unjournal or through public community feedback.
These papers or projects might not fall within the UJ’s typical scope in terms of publication stage and permissions. To deal with this:
If they are already published in a peer-reviewed journal, we can still prioritize them; this should not be a constraint.
If authors are junior and don’t want public evaluation, we may still pursue evaluation if the impact outweighs other issues.
We could introduce additional confidentiality options in especially challenging cases.
To alleviate concerns, we may ask evaluators to focus on the paper’s implications for the claim but not to rate the paper.

Offer public evaluation informing the target question(s)
Ask 1-2 evaluation managers to consider the question and contact potential evaluators.
Ask evaluators to focus on the targeted research question and the target organization’s priorities, and to give our standard ‘evaluation of the paper’.
We should reduce the expectations for the latter, or offer additional compensation.
The research question could be combined with our “claim evaluation” section.

Author and target org feedback: Share evaluations with authors for their response (as in our standard practice). Also share these with others, including the target org. (These will be provisionally shared on our PubPub.)

Synthesis report: Ask one or more evaluation managers (or other members of The Unjournal) to write a report summarizing the research investigated.
They should synthesize what the research, evaluations, and responses say about the question/claim.
For prediction markets and for general usefulness, they should provide an ‘overall metric’ relating to the truth value of the target question, or a measured parameter (see the illustrative sketch after this list).
Potentially, they should ‘resolve the prediction market claim’.
This basically substitutes for our usual discussions within each ‘evaluation manager’s summary’ for evaluated papers. However, this is a lot more ambitious and involved, and will merit further compensation. We might ask other members of The Unjournal’s broader team to help with this.
Share again with authors (and organizations) for feedback, but ‘read the room’ about when to share, being mindful of project timing (or postpone this until after general publication). We don’t want to ask too much of authors, and we don’t want to delay this project. (We can also ask for responses after we make these packages publicly available.)
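Purely as an illustration of what the ‘overall metric’ mentioned above might look like, here is a minimal sketch in Python; the structure and field names are hypothetical, not an Unjournal standard.

```python
# Hypothetical structure for a synthesis report's 'overall metric' entry.
# Field names are illustrative only, not an Unjournal standard.
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class SynthesisMetric:
    target_question: str                          # the finalized pivotal question/claim
    p_true: Optional[float] = None                # credence that a binary claim is true (0-1)
    estimate: Optional[float] = None              # point estimate, if the target is a parameter
    ci_90: Optional[Tuple[float, float]] = None   # 90% interval for that estimate
    resolves_market: bool = False                 # whether this is used to resolve a market claim

# Example using the claim from the pilot idea above (the number is illustrative only).
example = SynthesisMetric(
    target_question="Psychotherapy increases happiness more per dollar than cash transfers",
    p_true=0.6,
    resolves_market=False,
)
```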

Complete and publish evaluation and target question ‘packages’
These are linked by the target question under investigation and by the ‘synthesis report’ mentioned above.
Provide, share, and promote further summaries of these packages (forums, blogs, online editions, the ‘Base Rate Times’, etc.).

Pilot goals: To consider the impact and the operational costs and benefits of this approach (evaluating research clustered around claims, or target questions, rather than just individual research papers). If successful, we will build a more systematic approach to identifying target questions, integrating this with our internal prioritization procedures.

Further notes

Forms

Potential partners and other resources to consider

Charity Entrepreneurship 'research process' doc (limited sharing) could give us some ideas/guidance for our "Pivotal Questions" approach and outreach. But there is a lot of process to wade through there. I'll try to glean some of the substantive highlights.
See notes on the conversation with Metaculus (internally shared).
(Clancy) ... these are already fairly well operationalized.

Other notes and key considerations

Aggregating claims and forecasts; Metaculus Index project as an example

Metaculus Index — this gives some detail on the elicitation and weighting of these questions, with some first-pass ad hoc judgments that may be reasonable (see responses to comments). Flagship example case: ~“Do forecasters expect us to be prepared for AGI if it arrives in 2030?”, which contains 8 component questions.
The index value is the weighted average of its component questions.
[An explicit formula would be helpful here; a sketch is given below.] The "-" and "+" signs next to the weights indicate whether the component question is 'good' or 'bad' for AI safety (or indicate the direction of the relevant issue). I guess their weights are fixed and roughly ad hoc, but I can imagine ways to fit them more explicitly and allow them to be adjusted over time; e.g., a flexible predictive model of participants' answers to the overall 'issue question' as a function of the predictions they give for each component question (sketched below).
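As a first pass at the requested formula: a minimal sketch, assuming the index is a signed weighted average of the component forecast probabilities p_i with weights w_i and signs s_i ∈ {−1, +1} (the "+"/"-" markers above). This is our reading of ‘weighted average’, not Metaculus’s published definition:

\[ \text{Index} = \frac{\sum_{i=1}^{n} s_i \, w_i \, p_i}{\sum_{i=1}^{n} w_i} \]

And a hedged sketch of the weight-fitting idea, in Python. The data and function names are hypothetical: we assume each participant j gives forecasts X[j, i] on the n component questions and an answer y[j] to the overall ‘issue question’, and we fit a simple linear model to recover data-driven weights that can be compared with (or substituted for) the ad hoc ones.

```python
# Illustrative only: fitting index weights from participants' forecasts.
# Assumes X[j, i] = participant j's probability forecast on component question i,
# and y[j] = participant j's answer to the overall 'issue question'.
import numpy as np

def fit_index_weights(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Ordinary least-squares fit of y on X (with an intercept)."""
    X1 = np.column_stack([np.ones(len(X)), X])     # add intercept column
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)  # [intercept, w_1, ..., w_n]
    return coef

def index_value(p: np.ndarray, coef: np.ndarray) -> float:
    """Apply the fitted model to a vector of current community forecasts p."""
    return float(coef[0] + p @ coef[1:])

# Hypothetical data: 200 participants, 8 component questions (as in the example above).
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 8))                    # component forecasts
true_w = np.array([0.3, -0.2, 0.1, 0.25, -0.15, 0.05, 0.2, -0.1])
y = 0.5 + X @ true_w + rng.normal(0, 0.05, size=200)    # simulated 'issue' answers

coef = fit_index_weights(X, y)
print("fitted weights:", np.round(coef[1:], 2))
print("index at mean forecasts:", round(index_value(X.mean(axis=0), coef), 3))
```

The fitted signs should recover the "+"/"-" directions, and refitting as new survey data arrive would let the weights adjust over time rather than staying fixed.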
