Scientific research in the age of omics: the good, the bad, and the sloppy
- 1Department of Biostatistics, University of Washington, Seattle, Washington, USA
- 2Department of Statistics, Stanford University, Stanford, California, USA
- Correspondence to Dr Daniela M Witten,
- Received 27 March 2012
- Accepted 13 September 2012
- Published Online First 4 October 2012
It has been claimed that most research findings are false, and it is known that large-scale studies involving omics data are especially prone to errors in design, execution, and analysis. The situation is alarming because taxpayer dollars fund a substantial amount of biomedical research, and because the publication of a research article that is later determined to be flawed can erode the credibility of an entire field, with severe negative consequences for years to come. Here, we urge the development of an online, open-access, postpublication, peer review system that will increase the accountability of scientists for the quality of their research and the ability of readers to distinguish good from sloppy science.
A messy situation
A spate of recent articles has bemoaned the state of scientific research and, in particular, large-scale biomedical studies. It has now been established that many published research findings are false,1 that large studies based on high-dimensional datasets are especially prone to problems in design, analysis, and reporting,2 and that a change in attitudes and practices by funders, academic journals, and researchers is required in order for the situation to improve.3 Fundamentally, a constellation of incentives leads to sloppy science: researchers are constantly enticed to publish new, exciting, and positive results for professional advancement, journals are motivated to publish hot new findings and not to seriously consider or address criticisms of published work, and funders are encouraged to fund the most prolific researchers who publish in the top journals. This vicious circle leads to a constant rush to publish exaggerated claims, few negative repercussions for researchers who perform shoddy work, and little reward for those who perform careful and high-quality science. Outside researchers who attempt to perform a thorough and critical reanalysis of published studies are hampered by a lack of access to the detailed study protocol and the original data or code used to perform the analyses, are often unable to publish work calling into question the results of high-profile publications, and gain little from such endeavours owing to a lack of funding and a general sentiment in the scientific community that such work is unoriginal or derivative.
Here we propose the development of an online, open-access, postpublication, peer review system aimed at introducing a higher level of accountability and quality into the scientific research process.
Prepublication peer review: a failed system?
Prepublication peer review is intended to ensure that only correct and valuable papers are published in top journals. But it can be successful only in fields in which a peer reviewer is familiar with all of the relevant techniques and tools; in our experience, this is not the case in much of biomedical research.
We live in an era of big science that requires collaboration between large teams of researchers with complementary and often non-overlapping areas of expertise. Consequently, a single paper can have many possible points of failure, and so a great number of peer reviewers whose areas of expertise cover all of the relevant disciplines would be needed in order to ensure accurate review. Of course, this is infeasible. Peer reviewers are volunteers, and it simply is not possible for each paper to have a half-dozen peer reviewers. Instead, journals typically rely on one to three peer reviewers whose areas of expertise almost certainly do not span the full spectrum required to evaluate the paper.
Furthermore, in our experience, proper review of a complex biomedical study requires far more time than one could possibly expect an unpaid reviewer to contribute. And in general, truly adequate peer review is precluded by the fact that authors often do not provide sufficient detail for an outside researcher to understand how the experiments and analyses were performed, and almost never provide raw data and code at the prepublication peer review stage, if at all. Some of this is changing: for instance, there have been moves to improve the standards of reporting in large-scale omics studies, and Science recently introduced a policy requiring that authors make data and code available whenever possible.3–7 But given the substantial computational requirements of many omics analyses, re-running an analysis can be prohibitive in computation, cost, and time for an independent researcher even if the data and code are made available!8,9 Consequently, it is not possible to catch all or even most of a paper's potential problems at the peer review stage. Peer reviewers can be expected to spot grossly implausible claims or obviously flawed statistical analyses, but more subtle problems will almost certainly go unnoticed.
We wish to emphasize that when we refer to problems in papers that may go unnoticed in peer review, we are specifically not referring to scientific misconduct or intentional misrepresentation of data or results. We are instead referring to what is, in our experience, far more common: innocent mistakes by well-meaning and talented scientists, stemming from poorly controlled experimental conditions, conceptual misunderstanding of the assumptions underlying a particular statistical technique, bugs in computer programs, and so forth. Such events rarely make headlines, yet they are remarkably widespread. Unfortunately, once published, a paper generally becomes enshrined in the academic literature. Some publications have immediately elicited a strong negative reaction from the scientific community, leading to retraction (a recent example is documented by Mandavilli10). But this is the exception rather than the norm.
Typically, once a paper makes it through peer review, the experimental and statistical methods used are considered justified by other investigators because they have been published. Sometimes those investigators’ motives may be cynical: we have encountered collaborators who have stubbornly insisted on using a particular statistical approach, despite our arguments that it is invalid and leads to inflated significance levels, solely on the basis that the method was published before and so is ‘good enough’ to lead to more publications. (Our collaborations with those individuals tend to be short-lived.) But more often than not, investigators innocently rely on a method or a result from a publication without being aware of its flaws. For instance, in recent years, quite a bit of effort has been devoted to devising a principled statistical approach to assess the extent to which genes sharing gene ontology terms tend to be co-expressed.11–14 Unfortunately, hypergeometric tests have repeatedly been used for this purpose in the literature, both before and after it was rigorously shown15 that the resulting p values do not measure the quantity of interest and yield grossly inflated significance estimates.
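To make the flaw concrete: the calculation in question is an upper-tail hypergeometric probability for the overlap between a gene ontology gene set and a co-expressed gene cluster. The sketch below, using made-up and purely illustrative numbers, computes that probability from scratch. The null model treats genes as independent, uniform draws from the background set, and it is precisely this independence assumption that fails for co-expressed genes, which is why the resulting p values are inflated.

```python
from math import comb

def hypergeom_sf(k, N, K, n):
    """Upper-tail P(X >= k) for X ~ Hypergeometric(N, K, n).
    The null assumes the n cluster genes are drawn independently and
    uniformly from the N background genes; real co-expression data
    violate this, so the resulting p value overstates significance."""
    denom = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1)) / denom

# Illustrative numbers only, not taken from any real study:
N = 20000   # genes in the background set
K = 150     # genes annotated with a given GO term
n = 300     # genes in a co-expression cluster
k = 12      # observed overlap (null expectation: n*K/N = 2.25)

p = hypergeom_sf(k, N, K, n)
print(f"hypergeometric p = {p:.1e}")  # far below 0.05
```

The tiny p value looks compelling, but it answers the wrong question when the genes in the cluster are not independent draws, which is the point made in the critique cited above.15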
As statisticians, we have often stumbled upon publications that suffer from problems in the statistical analysis, whether glaringly obvious or subtle (but nonetheless important). In our experience, it requires a huge amount of work to convince a journal to publish even a well-reasoned and verifiably correct comment criticizing a published paper, presumably because the journal has little to gain from doing so. The second author has had two experiences of this kind,16,17 and the heroic efforts of Baggerly and Coombes2 deserve special mention. In most cases, the work involved in publicizing a problem with a published paper acts as a deterrent to pursuing the issue. Consequently, there is essentially no good way to get the word out to the scientific community about a problem in a published paper.
In view of the above problems, some people believe that we should abolish our current journal system in favour of a more informal, open, and dynamic publication model. We don't agree: we strongly believe in the essential role of high-quality, peer-reviewed journals in science. Expert and anonymous reviews are gatekeepers for quality. However, we feel that the current system is in need of a major upgrade.
A solution: postpublication peer review
We argue that there is a need for the creation and widespread adoption of a free website intended to promote rigorous postpublication peer review of academic papers. For those familiar with the popular ‘Internet Movie Database’ (IMDb), this can be thought of as an IMDb for scientific publications.
Such a site would have several key components:
- The site automatically indexes all papers deposited on PubMed, arXiv, and other repositories, and every paper is given its own page on the site. The paper's page displays an average user-driven rating, along with comments by users.
- An individual researcher who provides an academic or professional email address as well as his or her real name, contact information, and professional qualifications can create an account on the site. All of the researcher's actions on the site will be linked to this information.
- Any user of the site can review any published paper by posting a comment and providing a numeric score.
- Any user can indicate approval or disapproval of a given review; this information will be presented alongside the original review.
- Authors are encouraged to post responses to comments about their own papers.
- Average ratings for the papers published within a given journal can be used to compute an overall rating for the journal, which may also be displayed.
- More sophisticated weighted-average ratings may also be available, weighting each user's rating by some measure of the quality of that user's reviews.
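As one hypothetical instance of such a weighting scheme (the function and the weights below are our own illustration, not part of any existing site), each user's rating of a paper could be scaled by a quality weight derived, say, from the approval and disapproval votes on that user's past reviews:

```python
def weighted_rating(ratings, weights):
    """Quality-weighted average of user ratings for one paper.
    Each weight is a per-user quality measure, for example derived
    from approval/disapproval votes on that user's past reviews."""
    total = sum(weights)
    if total == 0:
        return None  # no credibly weighted ratings yet
    return sum(r * w for r, w in zip(ratings, weights)) / total

# Three users rate a paper 9, 4, and 8 (out of 10); the second
# user's past reviews have drawn many disapprovals, so his or her
# rating carries a low weight.
print(weighted_rating([9, 4, 8], [1.0, 0.2, 0.8]))  # ≈ 8.1
```

Under this toy scheme, a burst of low-weight ratings (for example, from accounts with poorly regarded review histories) moves the displayed score far less than a few ratings from users whose reviews the community has consistently endorsed.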
Some of these features are already implemented in various forms in existing websites. For instance, some journals (such as those in the Public Library of Science series) do provide readers with an opportunity to comment on published papers in an online forum. But such comments and associated ratings generally are not prominently displayed, and so the author of a published paper does not suffer serious consequences if valid criticisms of his or her work are raised. The websites Community of Science and ResearchGate allow researchers to create accounts, and ResearchGate indexes all papers deposited on PubMed and arXiv.
But to the best of our knowledge, there is no prominent forum for users to comment upon and rate each other's publications. The Faculty of 1000 provides postpublication peer review by a select group of experts, but does not facilitate comments and ratings by the broader scientific community. Perhaps for this reason, it has only one comment—and a positive one at that—about a high-profile, extensively criticized, and subsequently retracted article on the genetic basis for human longevity.10 Furthermore, Faculty of 1000 suffers from the same problem as traditional prepublication peer review: an individual member of the Faculty cannot reasonably be expected to put in the necessary time to deeply evaluate a paper, and an individual who does evaluate a paper and identifies a problem but who is not part of the Faculty cannot provide a review.
If broadly adopted, such a postpublication peer review website could be valuable for an investigator wishing to gauge the value of a published paper. Rather than relying on the prestige of the publishing journal, the investigator could log onto the site and view the ratings and comments about the paper in order to determine whether valid criticisms have been raised.
Moreover, this system could disrupt the current vicious circle of scientific research. From the perspective of researchers, with such a system in place, the goal is no longer simply to publish a paper in a top-tier journal by wowing the editor and satisfying a couple of peer reviewers: the authors must now think about the paper's long-term viability. Likewise, journals are no longer motivated just by a hot new paper but by high-quality science that will stand the test of time. And finally, funders can evaluate a researcher's productivity and track record based not only on the journals in which that researcher has published but also on the overall user-driven ratings of the researcher's publications, which indicate the long-term quality and impact of his or her work.
Today, citation counts are used as evidence of a paper's impact or validity, and some may argue that they serve much the same role in certifying a paper's quality and long-term impact as the user-driven rating system that we have just described. But citation counts can be highly problematic: often authors do not thoroughly read the papers that they cite, and there are instances of papers receiving many hundreds of citations before being completely discredited and retracted. In some cases, a retracted paper continues to be cited for years after its retraction! On the other hand, with the proposed system in place, a publication's value could be assessed using its user-driven rating, and the recent ratings for a paper that has been discredited would be very low.
Thus far, we have focused on the ability of a postpublication peer review system to help identify problems with published papers. But such a system will also provide additional benefits to the scientific community. For instance, it will allow researchers to comment on the novelty of a publication, on its connections to previous work, and on its relationship to unanswered scientific questions. This will be useful for other investigators seeking perspective on how a given publication fits into the scientific literature.
In order for a postpublication peer review website to succeed, a critical mass of users would need to adopt the site, and would need to be honest and fair in their ratings. We believe that such honesty would be self-enforced by the transparency and community-driven nature of the website that we have proposed. It is reasonable to worry that friends and colleagues might rate each other's publications highly as part of a mutual agreement intended to boost their own publications’ ratings. However, we feel that given the vast size of the scientific community, efforts such as these to game the system are unlikely to have a substantial and lasting effect on a publication's overall rating.
Will our proposal solve all of the problems that are plaguing scientific research in the era of omics? Of course not. But we are optimistic that such a system could help fix what is increasingly becoming a dysfunctional system, by better aligning researchers, journals, and funders towards the common goal of high-quality, careful research that stands the test of time.
Contributions DW and RT wrote the manuscript.
Competing interests None.
Provenance and peer review Not commissioned; externally peer reviewed.