Peer reviewing in ML
My notes from the amazing *What to do about NeurIPS Reviewer 2* talk:
- Junior reviewers are as good as senior ones.
- Review quality declines over the years.
- Reviewer training also loses its positive impact over the years.
- If a reviewer knows the paper was rejected before, their score drops significantly.
- Single-blind vs. double-blind experiments show a bias toward top authors/universities, but no country, gender, or academia-vs-industry bias.
- Only 10-20% of review scores change after rebuttal.
- After seeing others’ scores, reviewers tend to decrease rather than increase their own score.
- Reviewers who are friends with the authors may reveal who is rejecting their paper; senior authors may then pressure those reviewers.
- Review examples:
    - Dr. Fox effect: “if you can’t convince reviewers, confuse them”.
    - Surprisingness bias: reviewers tend to find results “expected” and reject. Hence, authors (should?) stress the unpredictability of their results and make the reader think about the counterfactual.
    - Confirmation bias: reviewers favor papers whose main claim is aligned with their own views.
    - Positive-outcome bias: if a paper reports negative findings, reviewers are far more likely to reject it.
    - Citation bias: reviewers whose work is cited in the paper tend to give higher scores.
- Meta-reviewers’ and ACs’ judgments of whether a review is good or bad are independent of the review’s score.
- Meta-reviewers find unnecessarily long reviews more useful.
- Reviewers tend to reject when rejecting other papers seems to increase their own submissions’ chances of acceptance.
- PC members and authors sometimes try, with ill intent, to get assigned to one another’s papers and proposals.
- In other disciplines, reviewers catch on average 30% of all errors; an experiment at a major ML conference showed that only 1 out of 80 reviewers was suspicious of the error.
- [2014] 57% of the papers accepted by one committee would be rejected by another.
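    - Where the 57% roughly comes from (my own back-of-envelope from the published NIPS 2014 experiment figures, not from the talk; numbers recalled from memory): 166 papers were reviewed by two independent committees, the acceptance rate was about 22.5%, and the committees disagreed on about 25.9% of decisions. Assuming disagreements split evenly between the two directions, $\frac{0.259/2}{0.225} \approx 0.57$, i.e., about 57% of one committee’s accepts were the other’s rejects.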
- [2021] More than half of the spotlights would be rejected by another committee.
- Reviewer scores are uncorrelated with citations.
- Review scores for “impact” do not correlate with citation counts, but they do correlate with social media counts.
- Even authors don’t always agree on how impactful their co-authored papers are.