(Perhaps not so) surprising findings regarding the reality of NIH and NSF grant proposal reviews

Almost everyone knows that grant proposals submitted to both the NIH and the NSF (and the USDA and NASA, among others) are evaluated for scientific quality by a process termed peer review. Conceptually, what this means is that your peers (your reviewers) are best qualified to evaluate the merits of your research ideas. However, exactly what constitutes a “peer” and how “close” to the area of the research that “peer” reviewer should be is an interesting question. The issue is important because it may well have a significant impact on the evaluation of those ideas. Very recently, Kevin Boudreau and his colleagues undertook a comprehensive study designed specifically to answer these questions. Their results may very well surprise you!

The traditional perspective regarding peer review (and one endorsed by most federal funding agencies) is that reviewer expertise closely related to the subject area of the proposal is a necessary condition for optimally fair evaluations. This does, of course, seem reasonable, if not intrinsically obvious. However, is it possible for a reviewer to be too close to the subject area to provide a fair review? Moreover, can a reviewer’s very close expertise actually be detrimental to an applicant’s evaluation score? In addition, most federal funding agencies stress the importance of novelty and innovation. Indeed, “Innovation” is a key review criterion for NIH grant applications. Can an applicant’s ideas be considered too novel and/or too innovative? Can this actually have a negative impact on an applicant’s evaluation score?

Both of these issues have been critically addressed in a very recent article by K.J. Boudreau, E.C. Guinan, K.R. Lakhani and C. Riedl, published in May in Management Science (publisher: INFORMS); see http://pubsonline.informs.org/doi/10.1287/mnsc.2015.2285 for the full article. These authors point out that the capacity to accurately select, from among numerous competing projects, those that are truly innovative is a core management task for any forward-thinking organization. Clearly, this applies to the NIH and the NSF as well. Based upon this fundamental concept, the authors explore the relationship between reviewer expertise and evaluation outcomes of NIH and NSF proposals. Considerable attention was devoted to ensuring that extraneous variables that might confound the results were excluded from the analysis. The actual study involved 150 research proposals in the field of endocrine-related disease submitted to the NIH and the NSF from investigators at a leading medical research institute. As reviewers, 142 “world-class researchers” (the authors’ words) were recruited, some considered to be experts in the field and some not. Each was asked to review 15 proposals, resulting in 2130 unique evaluator-proposal pairs.

The authors’ basic conclusions from this comprehensive study were essentially twofold. First, the authors determined that reviewers gave “systematically lower scores (given) to research proposals that were closer to their own areas of expertise.” Moreover, the differences were not trivial and can be attributed to true cause-effect relationships. Indeed, the authors point out that the “relationships are strikingly large and driven by behaviors across a wide mainstream of the population.” Among the reasons cited for these differences is that reviewers with expertise very close to the area encompassed by the proposal were more likely to be highly discriminating in identifying potential issues or problems likely to arise during the proposed tenure of the study.

The second basic conclusion was that the submission of highly innovative research proposals was likely to be detrimental to an applicant’s evaluation score. In this respect, the authors found that, across the range from relatively low to modest levels of innovation, evaluation scores rose as perceived innovation increased. However, the relationship was biphasic: proposals perceived to have relatively high levels of innovation actually scored lower than proposals with lower levels of perceived innovation. The magnitude of these effects was found to be comparable to the differences in evaluation outcomes based upon the intellectual distance between the reviewer’s knowledge base and the subject matter of the proposal. Once again, the effects were determined to be true cause-effect relationships, and not an artifact of unrelated factors.

What exactly does this mean for you, the applicant? Obviously, there is no easy answer to this question and, within the framework of the NSF, there is probably not too much that can be done about it. Applicants should, however, probably be a little cautious in deciding exactly whom to recommend as reviewers. For the NIH, there is likely to be a little more flexibility in that applicants do have the opportunity to recommend a specific peer review group, or Study Section. In this respect, if two, or even three, Study Sections are possibilities, careful attention to who exactly is serving as a reviewer in each may well allow you to identify individuals who have interests related to the field of your own proposal but are not perceived to be “true experts”. In addition, applicants may want to think long and hard about submitting proposals likely to be perceived by individuals with expertise in the field as being “too novel”.

(A special thank you to Dr. Arthur Roberts, University of Georgia, for bringing this article to our attention.)