ReplicationMarkets is launching another project – this time users are asked to predict and bet on the success of preprint papers on Covid-19. The process is simple: there is a survey period (ending soon) and a market period (running until November 18th). During the survey period you are asked to make predictions like “How likely is this paper to get published in a journal with an Impact Factor above 10?” or “How many citations will this paper get within a year?” (4 items in total). During the market period you can buy virtual ‘yes’ or ‘no’ shares using token money provided by the project.
You can enter completely for free, but the project will hand out a total of 14,520 USD in prize money (3,600 for the surveys and 10,920 for the markets). Payouts are a bit lower than they were for the original ReplicationMarkets project, but I think you can still earn a decent wage filling out surveys if you’re not terrible (my estimate would be on average around 10 USD/h, though it could be higher or lower depending on your performance). I would highly recommend filling out as many batches of questions as possible to be eligible for higher prizes. Payments in the markets should be even higher.
Prize money aside, this is just a really cool project and I encourage you to participate!
Here is an overview of how I personally approach these questions: I would recommend looking up the base rate to inform your predictions. Basically, you should start from the base rate and adjust it slightly for the paper at hand.
Question 1 asks you to give a probability that the paper will a) not be published, b) be published in a journal with an Impact Factor < 10, or c) be published in a journal with an Impact Factor > 10.
For a distribution of Journal Impact Factors that might help answer this question, I found the following, for example (I am not too sure about its accuracy, but the overall picture looks right):
My personal guess is that most research will eventually be published. Authors have already put a lot of effort into it and there are really quite a lot of journals – one of them will publish it. My baseline for “not published” would be around 5%. My baseline for “published in a journal with impact factor > 10” would be around 1%, given the small number of journals in that category. Edit: possibly, this should be somewhat higher (see Mike’s comment below).
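To make the arithmetic explicit, here is a minimal sketch of how these baselines combine into a full answer for Question 1. The 5% and 1% figures are the rough guesses from above, not measured base rates; the remaining probability mass falls on the middle outcome.

```python
# Hypothetical baseline for Question 1, using the post's rough guesses.
p_not_published = 0.05        # guess: ~5% of preprints never get published
p_published_jif_gt_10 = 0.01  # guess: ~1% land in a journal with JIF > 10

# The three outcomes are exhaustive and mutually exclusive,
# so the middle category gets whatever probability is left over.
p_published_jif_le_10 = 1.0 - p_not_published - p_published_jif_gt_10

baseline = {
    "not published": p_not_published,
    "published, JIF <= 10": p_published_jif_le_10,
    "published, JIF > 10": p_published_jif_gt_10,
}

for outcome, p in baseline.items():
    print(f"{outcome}: {p:.0%}")
```

Under these guesses the bulk of the probability (94%) goes to “published in a journal with Impact Factor < 10”; you would then nudge the split up or down for the specific paper.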
Question 2) asks how many citations the preprint will get relative to other preprints in the study.
Your baseline answer should of course be 50%, adjusted upwards or downwards depending on whether you feel the article made an important point.
Question 3) asks whether the results presented in the preprint are helpful to mitigate the impact of the COVID pandemic.
Again, go with the middle if in doubt and adjust upwards or downwards. Honestly, I think most research will probably not be extremely helpful, but there are multiple ways to interpret this question: 5 could mean “quite helpful” or 5 could mean “just mediocre”. As 5 is the default, I would expect the median prediction to land around 5 or maybe slightly lower – so going with 5 seems like a sensible default.
Question 4) asks for a probability estimate that the findings presented in the preprint agree with the majority of results from similar future studies.
My guess here is that the baseline should be around 70 percent. Many of the articles seem to make quite sensible (and often quite similar) points, so I would assume that there is some kind of consensus.
For more help, here are some clarifications regarding questions 3 and 4.
Good thinking to look at the base rates, e.g., how many journals have impact factor > 10, etc.
However, keep in mind that the journals may publish different numbers of articles, that articles that begin as preprints may be different than those that skip preprint, and that the preprints in Replication Markets are not a random sample of preprints, but were selected because they received more social media attention (Altmetric) than most preprints.
Thanks for pointing that out. Do you have any data on what a sensible baseline would look like (e.g. past preprints on Covid that were then published)?
Scott Leibrand has started a thread on Reddit at r/ReplicationMarkets/.
The best I can think of would be to look at all the published COVID preprints in bioRxiv and attempt to predict their journal impact factor. That covers the conditional $\Pr(JIF>10 \mid published)$. It then remains to estimate $\Pr(published \mid preprint, Altmetric, t=1yr)$. COVID papers have not yet hit the 1-year mark, but perhaps the estimate could be made using pre-COVID papers and then supplemented.
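The decomposition in the comment above can be sketched in a few lines: the unconditional probability of “published in a JIF > 10 journal within a year” factors into the two pieces named there. Both input numbers below are illustrative placeholders, not estimates from any dataset.

```python
# Sketch of the factorization:
#   Pr(JIF > 10 and published) =
#       Pr(JIF > 10 | published) * Pr(published | preprint, Altmetric, t = 1yr)
# Both values are hypothetical placeholders for illustration only.
pr_jif_gt_10_given_published = 0.05  # would come from a JIF distribution
pr_published_within_1yr = 0.60       # would come from pre-COVID preprint data

pr_jif_gt_10_and_published = (
    pr_jif_gt_10_given_published * pr_published_within_1yr
)
print(f"{pr_jif_gt_10_and_published:.3f}")  # 0.030
```

The point of the factorization is that each piece can be estimated from a different data source: the conditional from published preprints, the publication rate from older (pre-COVID) cohorts that have had a full year to be published.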
Hoping our talented forecaster team will find clever ways.
Interesting thread, thanks for the reference!