## Peer Review Of Research Paper


**PhD2Published has several informative posts about writing journal articles, and more recently has featured a post outlining a potentially revolutionary collaborative peer review process for this kind of publishing. Today's post offers an alternative perspective: that of the journal article peer reviewer. Doing peer reviews provides important experience for those writing their own papers and may help writers consider what to include based on what peer reviewers are looking for.**

At some point in your scholarly career, you likely will get asked to review an article for a journal. In this post, I explain how I usually go about doing a peer review. I imagine that each scholar has their own way of doing this, but it might be helpful to talk openly about this task, which we generally complete in isolation.

**Step One: Accept the invitation to peer review**. The first step in reviewing a journal article is to accept the invitation. When deciding whether or not to accept, take three things into consideration: 1) Is the article within your area of expertise? 2) Do you have time to do the review? 3) Can you realistically complete the review by the deadline? Once you accept the invitation, set aside some time in your schedule to read the article and write the review.

**Step Two: Read the article**. I usually read the article with a pen in hand so that I can write my thoughts in the margins as I read. As I read, I underline parts of the article that seem important, write down any questions I have, and correct any mistakes I notice.

**Step Three: Write a brief summary of the article and its contribution**. When I am doing a peer review, I sometimes do it all in one sitting – which takes me about two hours – or I read the article one day and write the review the next. Often I prefer the latter, to give myself some time to think about the article and to process my thoughts. When writing a draft of the review, the first thing I do is summarize the article as best I can in three to four sentences. If I think favorably of the article and believe it should be published, I often write a longer summary and highlight the strengths of the article. Remember that even if you don't have any (or very many) criticisms, you still need to write a review. Your critique and accolades may help convince the editor of the importance of the article. As you write up this summary, take into consideration the suitability of the article for the journal. If you are reviewing for the top journal in your field, for example, an article simply being factually correct and having a sound analysis is not enough for it to be published there. Instead, it would need to change the way we think about some aspect of the field.

**Step Four: Write out your major criticisms of the article**. When doing a peer review, I usually begin with the larger issues and end with minutiae. Here are some major areas of criticism to consider:

– Is the article well-organized?

– Does the article contain all of the components you would expect (introduction, methods, theory, analysis, etc.)?

– Are the sections well-developed?

– Does the author do a good job of synthesizing the literature?

– Does the author answer the questions he/she sets out to answer?

– Is the methodology clearly explained?

– Does the theory connect to the data?

– Is the article well-written and easy to understand?

– Are you convinced by the author’s results? Why or why not?

**Step Five: Write out any minor criticisms of the article**. Once you have laid out the pros and cons of the article, it is perfectly acceptable (and often welcome) for you to point out that the table on page 3 is mislabeled, that the author wrote “compliment” instead of “complement” on page 7, or other minutiae. Correcting those minor errors will make the author's paper look more professional if it goes out for another peer review, and they will certainly have to be corrected before the paper is accepted for publication.

**Step Six: Review**. Go over your review and make sure that it makes sense and that you are communicating your critiques and suggestions in as helpful a way as possible.

Finally, when writing a review, be mindful that you are critiquing the article in question – not the author. Thus, make sure your critiques are constructive. For example, it is not appropriate to write: “The author clearly has not read any Foucault.” Instead, say: “The analysis of Foucault is not as developed as I would expect to see in an academic journal article.” Also, be careful not to write: “The author is a poor writer.” Instead, you can say: “This article would benefit from a close editing. I found it difficult to follow the author's argument due to the many stylistic and grammatical errors.” Although you are an anonymous reviewer, the editor knows who you are, and it never looks good when you make personal attacks on others. Being constructive, then, is not just courteous; it is in your best interest.

**Tanya Golash-Boza is Associate Professor of Sociology and American Studies at the University of Kansas. She Tweets as @tanyagolashboza and has her own website.**


The comment by Amrhein *et al.* criticizes a paper by Bradley Efron that discusses Bayesian statistics (Efron, 2013a), focusing on a particular example that was also discussed in Efron (2013b). The example concerns a woman who is carrying twins, both male (as determined by sonogram; we ignore the possibility that the gender has been observed incorrectly). The parents-to-be ask Efron to tell them the probability that the twins are identical.

This is my first open review, so I'm not sure of the protocol. But given that there appear to be errors in both Efron (2013b) and the paper under review, I am sorry to say that my review might actually be longer than Efron (2013a) – the primary focus of the critique – and longer than the critique itself. I apologize in advance for this. To start, I will outline the problem being discussed for the sake of readers.

This problem has various parameters of interest. The primary parameter is the genetic composition of the twins in the mother’s womb. Are they identical (which I describe as the state *x* = 1) or fraternal twins (*x* = 0)? Let *y* be the data, with *y* = 1 to indicate the twins are the same gender. Finally, we wish to obtain Pr(*x* = 1 | *y* = 1), the probability the twins are identical given they are the same gender¹. Bayes’ rule gives us an expression for this:

Pr(*x* = 1 | *y* = 1) = Pr(*x*=1) Pr(*y* = 1 | *x* = 1) / {Pr(*x*=1) Pr(*y* = 1 | *x* = 1) + Pr(*x*=0) Pr(*y* = 1 | *x* = 0)}

Now we know that Pr(*y* = 1 | *x* = 1) = 1; twins must be the same gender if they are identical. Further, Pr(*y* = 1 | *x* = 0) = 1/2; if twins are not identical, the probability of them being the same gender is 1/2.

Finally, Pr(*x* = 1) is the prior probability that the twins are identical. The bone of contention in the Efron papers and the critique by Amrhein *et al.* revolves around how this prior is treated. One can think of Pr(*x* = 1) as the population-level proportion of twins that are identical for a mother like the one being considered.

However, if we ignore other forms of twins that are extremely rare (equivalent to ignoring coins finishing on their edges when flipping them), one incontrovertible fact is that Pr(*x* = 0) = 1 − Pr(*x* = 1); the probability that the twins are fraternal is the complement of the probability that they are identical.

The above expressions for Pr(*y* = 1 | *x* = 1), Pr(*y* = 1 | *x* = 0), and Pr(*x* = 0) lead to a simpler expression for the probability that we seek – the probability that the twins are identical given they have the same gender:

Pr(*x* = 1 | *y* = 1) = 2 Pr(*x*=1) / [1 + Pr(*x*=1)] (1)

We see that the answer depends on the prior probability that the twins are identical, Pr(*x*=1). The paper by Amrhein *et al.* points out that this is a mathematical fact. For example, if identical twins were impossible (Pr(*x* = 1) = 0), then Pr(*x* = 1| *y* = 1) = 0. Similarly, if all twins were identical (Pr(*x* = 1) = 1), then Pr(*x* = 1| *y* = 1) = 1. The “true” prior lies somewhere in between. Apparently, the doctor knows that one third of twins are identical². Therefore, if we assume Pr(*x* = 1) = 1/3, then Pr(*x* = 1| *y* = 1) = 1/2.

Now, what would happen if we didn't have the doctor's knowledge? Laplace's “Principle of Insufficient Reason” would suggest that we give equal prior probability to all possibilities, so Pr(*x* = 1) = 1/2 and Pr(*x* = 1| *y* = 1) = 2/3, an answer different from 1/2 that was obtained when using the doctor's prior of 1/3.
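Equation (1) makes both of these calculations easy to check numerically. A minimal Python sketch (the function name `posterior_eq1` is mine):

```python
def posterior_eq1(prior):
    """Equation (1): Pr(x = 1 | y = 1) = 2 Pr(x = 1) / (1 + Pr(x = 1))."""
    return 2 * prior / (1 + prior)

print(posterior_eq1(1 / 3))  # doctor's prior of 1/3 gives 0.5
print(posterior_eq1(0.5))    # Laplace's prior of 1/2 gives 0.666...
```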

Efron (2013a) highlights this sensitivity to the prior, representing someone who defines an uninformative prior as a “violator”, with Laplace as the “prime violator”. In contrast, Amrhein *et al.* correctly point out that the difference in the posterior probabilities is merely a consequence of mathematical logic. No one is violating logic – they are merely expressing ignorance by specifying equal probabilities to all states of nature. Whether this is philosophically valid is debatable (Colyvan 2008), but this example does not lend much weight to that question, and it is well beyond the scope of this review. But setting Pr(*x* = 1) = 1/2 is not a violation; it is merely an assumption with consequences (and one that in hindsight might be incorrect²).

Alternatively, if we don't know Pr(*x* = 1), we could describe that probability by its own probability distribution. Now the problem has two aspects that are uncertain. We don’t know the true state *x*, and we don’t know the prior (except in the case where we use the doctor’s knowledge that Pr(*x* = 1) = 1/3). Uncertainty in the state of *x* refers to uncertainty about this particular set of twins. In contrast, uncertainty in Pr(*x* = 1) reflects uncertainty in the population-level frequency of identical twins. A key point is that the state of one particular set of twins is a different parameter from the frequency of occurrence of identical twins in the population.

Without knowledge about Pr(*x* = 1), we might use Pr(*x* = 1) ~ dunif(0, 1), which is consistent with Laplace. Alternatively, Efron (2013b) notes another alternative for an uninformative prior: Pr(*x* = 1) ~ dbeta(0.5, 0.5), which is the Jeffreys prior for a probability.

Here I disagree with Amrhein *et al.*; I think they are confusing the two uncertain parameters. Amrhein *et al.* state:

*“We argue that this example is not only flawed, but useless in illustrating Bayesian data analysis because it does not rely on any data. Although there is one data point (a couple is due to be parents of twin boys, and the twins are fraternal), Efron does not use it to update prior knowledge. Instead, Efron combines different pieces of expert knowledge from the doctor and genetics using Bayes’ theorem.”*

This claim might be correct when describing uncertainty in the population-level frequency of identical twins. The data about the twin boys are not useful by themselves for this purpose – they are a biased sample (the data have come to light because their gender is the same; they are not a random sample of twins). Further, a sample of size one, especially if biased, is not a firm basis for inference about a population parameter. While the data are biased, the claim by Amrhein *et al.* that there are no data is incorrect.

However, the data point (the twins have the same gender) is entirely relevant to the question about the state of this particular set of twins. And it does update the prior. This updating of the prior is given by equation (1) above. The doctor’s prior probability that the twins are identical (1/3) becomes the posterior probability (1/2) when using the information that the twins are the same gender. The prior is clearly updated, with Pr(*x* = 1| *y* = 1) ≠ Pr(*x* = 1) in all but trivial cases; Amrhein *et al.*’s statement that I quoted above is incorrect in this regard.

This possible confusion between uncertainty about these twins and uncertainty about the population level frequency of identical twins is further suggested by Amrhein *et al.*’s statements:

“Second, for the uninformative prior, Efron mentions erroneously that he used a uniform distribution between zero and one, which is clearly different from the value of 0.5 that was used. Third, we find it at least debatable whether a prior can be called an uninformative prior if it has a fixed value of 0.5 given without any measurement of uncertainty.”

Note, if the prior for Pr(*x* = 1) is specified as 0.5, or dunif(0,1), or dbeta(0.5, 0.5), the posterior probability that these twins are identical is 2/3 in all cases. Efron (2013b) says the different priors lead to different results, but this is incorrect; the correct answer (2/3) is given in Efron (2013a)³. Nevertheless, a prior that specifies Pr(*x* = 1) = 0.5 does indicate uncertainty about whether this particular set of twins is identical (but certainty in the population-level frequency of twins). And Efron’s (2013a) result is consistent with Pr(*x* = 1) having a uniform prior. Therefore, both claims in the quote above are incorrect.

It is probably easiest to show the (lack of) influence of the prior using MCMC sampling, coding the model in WinBUGS with Pr(*x* = 1) set to 0.5. Running that model, the posterior mean of *x* is 2/3; this is the posterior probability that *x* = 1.
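A rough equivalent of that model – my own sketch in Python rather than the original WinBUGS, with the function name `posterior_identical_mc` my invention – estimates the posterior by direct Monte Carlo simulation:

```python
import random

def posterior_identical_mc(prior_draw, n=200_000, seed=1):
    """Monte Carlo estimate of Pr(x = 1 | y = 1): the probability that the
    twins are identical (x = 1) given they are the same gender (y = 1)."""
    random.seed(seed)
    same = identical_and_same = 0
    for _ in range(n):
        p = prior_draw()                 # population frequency of identical twins
        x = random.random() < p          # are these particular twins identical?
        y = x or random.random() < 0.5   # same gender: certain if identical, else 1/2
        if y:
            same += 1
            identical_and_same += x
    return identical_and_same / same

# Fixed prior Pr(x = 1) = 0.5, matching the case described above
print(round(posterior_identical_mc(lambda: 0.5), 2))  # about 0.67, i.e. 2/3
```

Swapping the fixed value for `lambda: random.random()` (the analogue of dunif(0,1)) or `lambda: random.betavariate(0.5, 0.5)` (the Jeffreys prior) leaves the estimate at about 2/3.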

Instead of using pr_ident_twins <- 0.5, we could set this probability as being uncertain and define pr_ident_twins ~ dunif(0,1), or pr_ident_twins ~ dbeta(0.5,0.5). In either case, the posterior mean value of *x* remains 2/3 (contrary to Efron 2013b, but in accord with the correction in Efron 2013a).

Note, however, that the value of the population-level parameter pr_ident_twins is different in all three cases. In the first, it remains unchanged at 1/2, where it was set. In the cases where the prior distribution for pr_ident_twins is uniform or beta, the posterior distributions remain broad, but they differ depending on the prior (as they should – different priors lead to different posteriors⁴). However, given the biased sample of size one, the posterior distribution for this parameter is likely to be misleading as an estimate of the population-level frequency of identical twins.

So why doesn’t the choice of prior influence the posterior probability that these twins are identical? Well, for these three priors, the prior probability that any single set of twins is identical is 1/2 (this is essentially the mean of the prior distributions in these three cases).

If, instead, we set the prior as dbeta(1,2), which has a mean of 1/3, then the posterior probability that these twins are identical is 1/2. This is the same result as if we had set Pr(*x* = 1) = 1/3. In both these cases (choosing dbeta(1,2) or 1/3), the prior probability that a single set of twins is identical is 1/3, so the posterior is the same (1/2) given the data (the twins have the same gender).
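This happens because both the numerator and denominator of Bayes’ rule are linear in the population-level frequency, so the posterior for these particular twins depends on the prior distribution only through its mean; equation (1) applied to the prior mean gives the answer. A quick numerical check (my sketch, using Python’s `random.betavariate` as the analogue of dbeta(1,2)):

```python
import random

random.seed(3)
draws = [random.betavariate(1, 2) for _ in range(100_000)]
prior_mean = sum(draws) / len(draws)

print(round(prior_mean, 2))                         # mean of dbeta(1,2): about 0.33
print(round(2 * prior_mean / (1 + prior_mean), 2))  # equation (1): about 0.5
```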

Further, Amrhein *et al.* also seem to misunderstand the data. They note:

“Although there is one data point (a couple is due to be parents of twin boys, and the twins are fraternal)...”

This is incorrect. The parents simply know that the twins are both male. Whether they are fraternal is unknown (fraternal twins being the complement of identical twins) – that is the question the parents are asking. This error of interpretation makes the calculations in Box 1 and subsequent comments irrelevant.

Box 1 also implies Amrhein *et al.* are using the data to estimate the population frequency of identical twins rather than the state of this particular set of twins. This is different from the aim of Efron (2013a) and the stated question.

Efron suggests that Bayesian calculations should be checked with frequentist methods when priors are uncertain. However, this is a good example where this cannot be done easily, and Amrhein *et al.* are correct to point this out. In this case, we are interested in the probability that the hypothesis is true given the data (an inverse probability), not the probabilities that the observed data would be generated given particular hypotheses (frequentist probabilities). If one wants the inverse probability (the probability the twins are identical given they are the same gender), then Bayesian methods (and therefore a prior) are required. A logical answer simply requires that the prior is constructed logically. Whether that answer is “correct” will be, in most cases, only known in hindsight.

However, one possible way to analyse this example using frequentist methods would be to assess the likelihood of obtaining the data under each of the two hypotheses (the twins are identical or fraternal). The likelihood of the twins having the same gender under the hypothesis that they are identical is 1. The likelihood of the twins having the same gender under the hypothesis that they are fraternal is 0.5. Therefore, the weight of evidence in favour of identical twins is twice that of fraternal twins. Scaling these weights so they sum to one (Burnham and Anderson 2002) gives a weight of 2/3 for identical twins and 1/3 for fraternal twins. These scaled weights have the same numerical values as the posterior probabilities based on either a Laplace or Jeffreys prior. Thus, one might argue that the weight of evidence for each hypothesis when using frequentist methods is equivalent to the posterior probabilities derived from an uninformative prior. So, as a final aside in reference to Efron (2013a), if we are being “violators” when using a uniform prior, are we also being “violators” when using frequentist methods to weigh evidence? Regardless of the answer to this rhetorical question, “checking” the results with frequentist methods doesn’t give any more insight than using uninformative priors (in this case). However, this analysis shows that the question can be analysed using frequentist methods; the single data point is not a problem for this. The claim in Amrhein *et al.* that a frequentist analysis "is impossible because there is only one data point, and frequentist methods generally cannot handle such situations" is not supported by this example.
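The arithmetic of the scaled weights is trivial to spell out (a sketch):

```python
# Likelihood of the observed data (same gender) under each hypothesis
like_identical = 1.0   # identical twins are always the same gender
like_fraternal = 0.5   # fraternal twins are the same gender half the time

# Scale the weights so they sum to one (Burnham and Anderson 2002)
total = like_identical + like_fraternal
print(like_identical / total)  # weight of evidence for identical: 2/3
print(like_fraternal / total)  # weight of evidence for fraternal: 1/3
```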

In summary, the comment by Amrhein *et al.* raises some interesting points that seem worth discussing, but it makes important errors in analysis and interpretation, and misrepresents the results of Efron (2013a). This means the current version should not be approved.

References

Burnham, K.P. & D.R. Anderson. 2002. *Model Selection and Multi-model Inference: a Practical Information-theoretic Approach*. Springer-Verlag, New York.

Colyvan, M. 2008. Is Probability the Only Coherent Approach to Uncertainty? *Risk Anal.* 28: 645-652.

Efron B. (2013a) Bayes’ Theorem in the 21st Century. *Science* 340(6137): 1177-1178.

Efron B. (2013b) A 250-year argument: Belief, behavior, and the bootstrap. *Bull. Amer. Math. Soc.* 50: 129-146.

Footnotes

1. The twins are both male. However, if the twins were both female, the statistical results would be the same, so I will simply use the data that the twins are the same gender.
2. In reality, the frequency of twins that are identical is likely to vary depending on many factors, but we will accept 1/3 for now.
3. Efron (2013b) reports the posterior probability for these twins being identical as “a whopping 61.4% with a flat Laplace prior” but as 2/3 in Efron (2013a). The latter (I assume 2/3 is “even more whopping”!) is the correct answer, which I confirmed via email with Professor Efron. Therefore, Efron (2013b) incorrectly claims the posterior probability is sensitive to the choice between a Jeffreys or Laplace uninformative prior.
4. When the data are very informative relative to the different priors, the posteriors will be similar, although not identical.
