Importance of Funding Replication Studies: NIH on Credibility of Basic Biomedical Studies
Curator: Aviva Lev-Ari, PhD, RN
Policy: NIH plans to enhance reproducibility
Francis S. Collins and Lawrence A. Tabak discuss initiatives that the US National Institutes of Health is exploring to restore the self-correcting nature of preclinical research.
A growing chorus of concern, from scientists and laypeople, contends that the complex system for ensuring the reproducibility of biomedical research is failing and is in need of restructuring1, 2. As leaders of the US National Institutes of Health (NIH), we share this concern and here explore some of the significant interventions that we are planning.
Science has long been regarded as ‘self-correcting’, given that it is founded on the replication of earlier work. Over the long term, that principle remains true. In the shorter term, however, the checks and balances that once ensured scientific fidelity have been hobbled. This has compromised the ability of today’s researchers to reproduce others’ findings.
Let’s be clear: with rare exceptions, we have no evidence to suggest that irreproducibility is caused by scientific misconduct. In 2011, the Office of Research Integrity of the US Department of Health and Human Services pursued only 12 such cases3. Even if this represents only a fraction of the actual problem, fraudulent papers are vastly outnumbered by the hundreds of thousands published each year in good faith.
Instead, a complex array of other factors seems to have contributed to the lack of reproducibility. Factors include poor training of researchers in experimental design; increased emphasis on making provocative statements rather than presenting technical details; and publications that do not report basic elements of experimental design4. Crucial experimental design elements that are all too frequently ignored include blinding, randomization, replication, sample-size calculation and the effect of sex differences. And some scientists reputedly use a ‘secret sauce’ to make their experiments work — and withhold details from publication or describe them only vaguely to retain a competitive edge5. What hope is there that other scientists will be able to build on such work to further biomedical progress?
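To make one of these neglected design elements concrete, a sample-size calculation for a two-group comparison can be sketched in a few lines. This is a simplified normal-approximation formula; the effect size, significance level and power below are illustrative choices, not figures from the article:

```python
import math
from statistics import NormalDist

def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sample comparison of means,
    using the standard normal approximation to the power formula."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

# Detecting a 'medium' standardized effect (Cohen's d = 0.5)
# at 80% power and alpha = 0.05:
print(sample_size_per_group(0.5))  # 63 per group
```

A study run with far fewer subjects per group than this kind of calculation suggests is underpowered, which is one route to the irreproducible findings the authors describe.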
Exacerbating this situation are the policies and attitudes of funding agencies, academic centres and scientific publishers. Funding agencies often uncritically encourage the overvaluation of research published in high-profile journals. Some academic centres also provide incentives for publications in such journals, including promotion and tenure, and in extreme circumstances, cash rewards6.
Then there is the problem of what is not published. There are few venues for researchers to publish negative data or papers that point out scientific flaws in previously published work. Further compounding the problem is the difficulty of accessing unpublished data — and the failure of funding agencies to establish or enforce policies that insist on data access.

Preclinical problems
Reproducibility is potentially a problem in all scientific disciplines. However, human clinical trials seem to be less at risk because they are already governed by various regulations that stipulate rigorous design and independent oversight — including randomization, blinding, power estimates, pre-registration of outcome measures in standardized, public databases such as ClinicalTrials.gov and oversight by institutional review boards and data safety monitoring boards. Furthermore, the clinical trials community has taken important steps towards adopting standard reporting elements7.
Preclinical research, especially work that uses animal models1, seems to be the area that is currently most susceptible to reproducibility issues. Many of these failures have simple and practical explanations: different animal strains, different lab environments or subtle changes in protocol. Some irreproducible reports are probably the result of coincidental findings that happen to reach statistical significance, coupled with publication bias. Another pitfall is overinterpretation of creative ‘hypothesis-generating’ experiments, which are designed to uncover new avenues of inquiry rather than to provide definitive proof for any single question. Still, there remains a troubling frequency of published reports that claim a significant result, but fail to be reproducible.
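The interplay of chance significance and publication bias described above can be illustrated with a small Monte Carlo sketch. The prior, power and significance values are illustrative assumptions, not figures from the article:

```python
import random

random.seed(0)

PRIOR = 0.10   # fraction of tested hypotheses that are actually true
POWER = 0.35   # chance an underpowered study detects a real effect
ALPHA = 0.05   # chance a null effect reaches p < 0.05 by luck

def simulate(n_studies=100_000):
    """Fraction of 'significant' (hence publishable) results
    that are in fact false positives."""
    significant = false_positives = 0
    for _ in range(n_studies):
        is_true = random.random() < PRIOR
        detected = random.random() < (POWER if is_true else ALPHA)
        if detected:
            significant += 1
            if not is_true:
                false_positives += 1
    return false_positives / significant

# Analytic expectation:
# ALPHA*(1-PRIOR) / (ALPHA*(1-PRIOR) + POWER*PRIOR) ~= 0.56
print(round(simulate(), 2))
```

Under these assumptions, more than half of the statistically significant results that reach publication are false positives, even though every study was run in good faith, which is why attempted replications so often fail.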
Proposed NIH actions
As a funding agency, the NIH is deeply concerned about this problem. Because poor training is probably responsible for at least some of the challenges, the NIH is developing a training module on enhancing reproducibility and transparency of research findings, with an emphasis on good experimental design. This will be incorporated into the mandatory training on responsible conduct of research for NIH intramural postdoctoral fellows later this year. Informed by this pilot, final materials will be posted on the NIH website by the end of this year for broad dissemination, adoption or adaptation, on the basis of local institutional needs.
“Efforts by the NIH alone will not be sufficient to effect real change in this unhealthy environment.”
Several of the NIH’s institutes and centres are also testing the use of a checklist to ensure a more systematic evaluation of grant applications. Reviewers are reminded to check, for example, that appropriate experimental design features have been addressed, such as an analytical plan, plans for randomization, blinding and so on. A pilot was launched last year that we plan to complete by the end of this year to assess the value of assigning at least one reviewer on each panel the specific task of evaluating the ‘scientific premise’ of the application: the key publications on which the application is based (which may or may not come from the applicant’s own research efforts). This question will be particularly important when a potentially costly human clinical trial is proposed, based on animal-model results. If the antecedent work is questionable and the trial is particularly important, key preclinical studies may first need to be validated independently.
Informed by feedback from these pilots, the NIH leadership will decide by the fourth quarter of this year which approaches to adopt agency-wide, which should remain specific to institutes and centres, and which to abandon.
The NIH is also exploring ways to provide greater transparency of the data that are the basis of published manuscripts. As part of our Big Data initiative, the NIH has requested applications to develop a Data Discovery Index (DDI) to allow investigators to locate and access unpublished, primary data (see go.nature.com/rjjfoj). Should an investigator use these data in new work, the owner of the data set could be cited, thereby creating a new metric of scientific contribution unrelated to journal publication, such as downloads of the primary data set. If sufficiently meritorious applications to develop the DDI are received, a funding award of up to three years in duration will be made by September 2014. Finally, in mid-December, the NIH launched an online forum called PubMed Commons (see go.nature.com/8m4pfp) for open discourse about published articles. Authors can join and rate or contribute comments, and the system will be evaluated and refined in the coming months. More than 2,000 authors have joined to date, contributing more than 700 comments.
Community responsibility
Clearly, reproducibility is not a problem that the NIH can tackle alone. Consequently, we are reaching out broadly to the research community, scientific publishers, universities, industry, professional organizations, patient-advocacy groups and other stakeholders to take the steps necessary to reset the self-corrective process of scientific inquiry. Journals should be encouraged to devote more space to research conducted in an exemplary manner that reports negative findings, and should make room for papers that correct earlier work.
We are pleased to see that some leading journals have begun to change their review practices. For example, Nature Publishing Group, the publisher of this journal, announced8 the following changes in May 2013: restrictions on the length of methods sections have been abolished, to ensure the reporting of key methodological details; authors use a checklist to help editors and reviewers verify that critical experimental design features have been incorporated into the report; and editors scrutinize the statistical treatment of the studies reported more thoroughly, with the help of statisticians. Furthermore, authors are encouraged to provide more raw data to accompany their papers online.
Similar requirements have been implemented by the journals of the American Association for the Advancement of Science — Science Translational Medicine in 2013 and Science earlier this month9 — based in part on the efforts of the NIH’s National Institute of Neurological Disorders and Stroke to increase the transparency of how work is conducted10.
Perhaps the most vexed issue is the academic incentive system, which currently over-emphasizes publishing in high-profile journals. No doubt worsened by current budgetary woes, this encourages rapid submission of research findings to the detriment of careful replication. To address this, the NIH is contemplating modifying the format of its ‘biographical sketch’ form, which grant applicants are required to complete, to emphasize the significance of advances resulting from work in which the applicant participated, and to delineate the part played by the applicant. Other organizations, such as the Howard Hughes Medical Institute, have used this format and found it more revealing of actual contributions to science than the traditional list of unannotated publications. The NIH is also considering providing greater stability for investigators at certain discrete career stages, using grant mechanisms that allow more flexibility and a longer period than the current average of approximately four years of support per project.
In addition, the NIH is examining ways to anonymize the peer-review process to reduce the effect of unconscious bias (see go.nature.com/g5xr3c). Currently, the identities and accomplishments of all research participants are known to the reviewers. The committee will report its recommendations within 18 months.
Efforts by the NIH alone will not be sufficient to effect real change in this unhealthy environment. University promotion and tenure committees must resist the temptation to use arbitrary surrogates, such as the number of publications in journals with high impact factors, when evaluating an investigator’s scientific contributions and future potential.
The recent evidence showing the irreproducibility of significant numbers of biomedical-research publications demands immediate and substantive action. The NIH is firmly committed to making systematic changes that should reduce the frequency and severity of this problem — but success will come only with the full engagement of the entire biomedical-research enterprise.
http://www.nature.com/news/policy-nih-plans-to-enhance-reproducibility-1.14586
Rethinking Reproducibility, as reported in GenomeWeb.com
Officials at the National Institutes of Health are contemplating changes to grant applications that would require researchers to validate some experimental procedures and results, “such as the foundational work that leads to costly clinical trials,” Nature News reports this week.
These measures are intended to combat the reproducibility problem that plagues many NIH-funded experiments and to help ensure that the agency’s tight research budget is spent on “verifiable science,” the article states.
Among other things, officials are considering “modifying peer review to bring greater scrutiny to the work a grant application is based on — perhaps just for applications that are likely to lead to clinical trials” as well as requiring that “independent labs validate the results of important preclinical studies as a condition of receiving grant funding,” Nature reports.
“There is certainly sufficient information now that the NIH feels it’s appropriate to look at this at a central-agency level,” Lawrence Tabak, the agency’s principal deputy director, tells Nature. He and other senior NIH officials are currently “assessing input gathered from the directors of the agency’s 27 institutes and centers” prior to meeting with NIH director Francis Collins, “who will decide what steps to take,” Nature adds.
Reactions to the possibility of a validation requirement are mixed. “It’s a disaster,” Peter Sorger, a systems biologist at Harvard Medical School, tells Nature, arguing that “frontier science often relies on ideas, tools, and protocols that do not exist in run-of-the-mill labs, let alone in companies that have been contracted to perform verification.”
Others, such as Elizabeth Iorns, chief executive of Science Exchange, a company in Palo Alto, California, say that requiring validation “either through random audits or selecting the highest-profile papers” would be a good idea. In fact, her company has launched a program with a German reagent vendor to independently validate research antibodies.
NIH to Researchers: Credibility Counts

The NIH is planning “significant interventions” to ensure that basic biomedical studies stand the test of time, its two top officials say.
In the long term, science remains self-correcting, according to NIH Director Francis Collins, MD, PhD, and Principal Deputy Director Lawrence Tabak, DDS, PhD.
But in the short term — and especially in preclinical research using animal models — “the checks and balances that once ensured scientific fidelity have been hobbled,” they argue in a Comment article in Nature.
One report has suggested that “as many as two-thirds of studies related to preclinical animal trials were not able to be reproduced,” Tabak told MedPage Today.
“Anecdotally, of course, we hear of other such circumstances,” he said, adding: “The truth is we don’t really know what the full scope of the problem is.”
Collins and Tabak said the problem is not scientific fraud, which is rare, but a combination of factors — including the pressure to publish rapidly and poor training in experimental design — that lead to lack of reproducibility.
The issue is significant because subsequent, more advanced research is often built on an insubstantial foundation, wasting effort and resources, Collins and Tabak argued.
The NIH is planning to investigate several approaches to try to improve matters, Collins and Tabak wrote, including:
- Mandatory training on responsible conduct of research for its intramural postdoctoral fellows.
- Checklists for its reviewers to make evaluation of grant applications more systematic.
- Ways to “anonymize” the peer-review process to reduce the effect of unconscious bias.
- Rejigging the biographical sketch that grant applicants are required to fill in so that it emphasizes advances resulting from previous work.
- At some career stages, offering flexible or longer funding to provide “greater stability” for investigators.
However, Tabak told MedPage Today, “NIH alone can’t solve this — this is something that requires the efforts of all our stakeholders, the academic community, those that publish scientific journals, and, of course, the scientists themselves.”
The NIH position met with a mixed reaction from investigators who have tackled the issue of reproducibility.
The article by Collins and Tabak is “really very welcome,” commented John Ioannidis, MD, PhD, of Stanford University School of Medicine in Stanford, Calif.
“Everything I read in the NIH comment seems very reasonable,” Ioannidis told MedPage Today. But he cautioned that it’s not clear which interventions will work and which will not.
The Nature piece comes just a few days after Ioannidis and colleagues published a series of articles in The Lancet outlining the issue of reproducibility and suggesting solutions, some of which are similar to those proposed by the NIH.
Ioannidis has long been known as a provocative and skeptical critic of much published biomedical research; his famous 2005 article, “Why Most Published Research Findings Are False,” cemented that reputation.
The field of “meta-research” — research into research — remains observational, he noted. But because it is large and varied, the NIH is actually in a position to conduct experiments and randomized trials to test what interventions work to improve reproducibility, he said.
“I feel a little bit uneasy about having experts — like myself — say what needs to be done and really not have the best evidence for that,” he said.
Another critic of the research enterprise, however, said the NIH doesn’t go far enough and because of that won’t know if any of its interventions succeeds.
“All of the suggestions are steps in the right direction,” said Elizabeth Iorns, PhD, CEO of Science Exchange in Palo Alto, Calif., which calls itself an “online marketplace for science experiments.”
But, she told MedPage Today, “What really needs to happen is for replication studies to be funded.”
In the absence of an NIH attempt to replicate a large number of studies, Iorns said, “There isn’t any baseline … so we won’t know if any of those changes actually made any difference.”
Her organization, she said, has just been given private funding to replicate 50 cancer biology studies, all from high-impact journals — a project she hopes to have completed within a year. That will help clarify the landscape, she said.
Tabak told MedPage Today that the reason the issue is at the forefront today is because of concern from the scientific community and “feedback” from scientists will show the NIH whether it’s on the right track.
But he added that if one of the NIH institutes is planning to invest in a major clinical trial based on preclinical animal studies, it might first replicate that basic research.
“That investment would not be small,” he said, but “it is much smaller than the actual cost to do a trial.”
REFERENCES in Nature 505, 612–613 (30 January 2014) doi:10.1038/505612a
References
1. Prinz, F., Schlange, T. & Asadullah, K. Nature Rev. Drug Discov. 10, 712–713 (2011).
2. The Economist ‘Trouble at the Lab’ (19 October 2013); available at http://go.nature.com/dstij3
3. US Department of Health and Human Services, Office of Research Integrity Annual Report 2011 (US HHS, 2011); available at http://go.nature.com/t7ykcv
4. Carp, J. NeuroImage 63, 289–300 (2012).
5. Vasilevsky, N. A. et al. PeerJ 1, e148 (2013).
6. Franzoni, C., Scellato, G. & Stephan, P. Science 333, 702–703 (2011).
7. Moher, D., Jones, A. & Lepage, L., for the CONSORT Group. J. Am. Med. Assoc. 285, 1992–1995 (2001).
8. Nature 496, 398 (2013).
9. McNutt, M. Science 343, 229 (2014).
10. Landis, S. C. et al. Nature 490, 187–191 (2012).
SOURCES
NIH mulls rules for validating key results, by Meredith Wadman
US biomedical agency could enlist independent labs for verification
http://www.nature.com/news/nih-mulls-rules-for-validating-key-results-1.13469