Giant study finds untrustworthy trials pollute gold-standard medical reviews
A huge collaboration has confirmed growing concerns that fake or flawed research is polluting medical systematic reviews, which summarize evidence from multiple clinical trials and shape treatment guidelines worldwide. The study is part of an effort to address the problem by creating a short checklist that will help researchers to spot untrustworthy trials. Combined with automated integrity tools, this could help those conducting systematic reviews to filter out flawed work — in medicine and beyond.
In the study, which took two years and was posted on 26 November to the medRxiv preprint server [1], a team of more than 60 researchers trawled through 50 systematic reviews published under the aegis of Cochrane, an organization renowned for its gold-standard reviews of medical evidence.
After applying a barrage of checks, the authors — many of whom are themselves editors or authors of Cochrane reviews — reported that they had “some concerns” about 25% of the clinical trials in the reviews, and “serious concerns” about 6% of them.
The study can’t provide an overall estimate of problematic trials in Cochrane reviews because the sample selected — for the purpose of trialling integrity checks — wasn’t random or representative, says co-author Lisa Bero, a senior research-integrity editor at Cochrane.
Still, “we definitely picked up some dodgy trials”, says Jack Wilkinson, a health and biostatistics researcher at the University of Manchester, UK, who led the project, titled INSPECT-SR. He adds that the proportion found in the study might be an overestimate, because some of the checks turned out to be subjective or difficult to implement.
A protocol for ‘trustworthiness’
The results echo previous concerns about rising numbers of problematic studies corrupting systematic reviews in medicine and other research fields [2–5], probably owing to paper mills that produce fake science.
Recognizing this problem, Cochrane introduced guidance three years ago that researchers should try to spot untrustworthy trials and exclude them from reviews. But although scientists have used a variety of protocols to do this, there is no universally agreed tool to help identify an untrustworthy study, says Bero, who is also a bioethicist at the University of Colorado Anschutz Medical Campus in Aurora.
“Frankly, these tools haven’t been tested at all,” she says, adding that researchers are unlikely to use methods that are difficult, lengthy or unclear. One integrity checklist aimed at journal editors proposed more than 50 questions, for instance, which some scientists say is too many.
Testing red flags
The aim of the INSPECT-SR study was to test 72 potential integrity checks that might help to identify untrustworthy work, garnered from a previous wide-ranging consultation. They range from specific statistical checks on a trial’s data and methods to details of funding and grants, the date a trial was registered and its authors’ publication records.
The study found that some checks are too cumbersome or infeasible to apply in practice. The list has now been narrowed to 21 questions in four areas: a study’s post-publication record (such as retractions or expressions of concern); its methods, governance and transparency (such as study registration, ethical approval and how participants were recruited); whether it has plagiarized text or manipulated figures; and detailed ways to check for discrepancies in the data and results. These might be whittled down further before the shortlist is published.
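For illustration only, the kind of structured assessment such a checklist implies could be recorded as a simple data structure, with one answer per check and an overall judgement that escalates to the worst concern found. This is a hypothetical sketch in Python; the domain and question names are placeholders, and the aggregation rule is an assumption, not the published INSPECT-SR tool:

    # Hypothetical sketch of an INSPECT-SR-style assessment record.
    # All domain and question names are illustrative placeholders.
    from dataclasses import dataclass, field

    DOMAINS = {
        "post_publication": ["retraction_or_expression_of_concern"],
        "governance_and_transparency": ["prospective_registration",
                                        "ethical_approval",
                                        "plausible_recruitment"],
        "text_and_figures": ["plagiarized_text", "manipulated_figures"],
        "data_and_results": ["data_discrepancies"],
    }

    LEVELS = ["no_concerns", "some_concerns", "serious_concerns"]

    @dataclass
    class TrialAssessment:
        trial_id: str
        answers: dict = field(default_factory=dict)  # check name -> level

        def overall(self) -> str:
            """Escalate to the worst concern recorded across all checks."""
            worst = max((LEVELS.index(v) for v in self.answers.values()),
                        default=0)
            return LEVELS[worst]

    assessment = TrialAssessment(trial_id="example-trial-001")
    assessment.answers["prospective_registration"] = "some_concerns"
    assessment.answers["data_discrepancies"] = "serious_concerns"
    print(assessment.overall())  # -> serious_concerns

The concern levels mirror the language of the preprint, but how the authors combine individual answers into an overall judgement is not specified in the article.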
Wilkinson’s team is working on a similar checklist that journal editors might apply to papers, and a third checklist for statistical tests that could be used if a reviewer can access the individual participant data in a trial, rather than just summary results. Although many researchers argue that it should be mandatory to provide such data for reviewers, medical journals rarely require it.
Automated tools
Scientists’ main concern about checklists is the time it takes to assess each study, says Wilkinson. A Cochrane review might involve anything from a few studies to dozens of trials, but in systematic reviews outside medicine, there could be hundreds of papers to examine. That’s a bigger worry, say Kim Wever, a meta-science researcher who specializes in analysing systematic reviews, and René Aquarius, a neurosurgery researcher, both at Radboud University Medical Center in the Netherlands. In work not yet published, they have found many flawed papers among more than 600 studies on animal models of haemorrhagic stroke, as Retraction Watch and Science have reported.
Reviewers of preclinical work also typically have less funding for their studies, and must examine papers that tend to have fewer signals of stringent reporting — such as being recorded on an official registry — than do clinical trials, adds Torsten Rackoll, a systematic-review methodologist at the Berlin Institute of Health.
In the past few years, however, automated software tools have sprung up that can help with some checks. Software such as Imagetwin looks for duplicated images in papers, for instance, and tools such as Signals, Papermill Alarm and Argos raise alarm bells about author retraction records, the studies a paper is citing and other warning signs.
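As a generic illustration of the kind of check such image tools automate (and not Imagetwin’s actual method), duplicated or reused figures can be flagged by comparing perceptual hashes, which give near-identical fingerprints to visually similar images:

    # Generic sketch of duplicate-image screening with perceptual hashes.
    # This only illustrates the underlying idea; it is not how any
    # named commercial tool works.
    from itertools import combinations
    from PIL import Image      # pip install pillow
    import imagehash           # pip install imagehash

    def find_near_duplicates(paths, max_distance=5):
        """Flag image pairs whose perceptual hashes differ by only a
        few bits (a small Hamming distance)."""
        hashes = {p: imagehash.phash(Image.open(p)) for p in paths}
        return [(a, b, hashes[a] - hashes[b])   # '-' gives bit distance
                for a, b in combinations(paths, 2)
                if hashes[a] - hashes[b] <= max_distance]

    for a, b, dist in find_near_duplicates(["fig1.png", "fig2.png",
                                            "fig3.png"]):
        print(f"Possible duplicate: {a} vs {b} (hash distance {dist})")

A real screening tool must also catch rotated, cropped or spliced images, which simple hashing misses; handling those cases is part of what dedicated tools add.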
At a meeting in London on 3 December, computer scientist Daniel Acuña at the University of Colorado Boulder described a new tool called Reviewer Zero, which promises to check for statistical inconsistencies as well as image manipulation.
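One published, widely used example of a statistical-consistency check is the GRIM test, which asks whether a reported mean of integer-valued data (such as Likert-scale responses) is arithmetically possible for the stated sample size. The sketch below is a simplified generic version of that test, not code from Reviewer Zero:

    # Simplified GRIM test: for integer data, the mean must equal an
    # integer sum divided by n, so only certain rounded means are
    # possible. Generic illustration, not Reviewer Zero's method.
    import math

    def grim_consistent(mean: float, n: int, decimals: int = 2) -> bool:
        """Return True if `mean` (rounded to `decimals`) could arise
        from a sample of n integer values."""
        target = round(mean, decimals)
        # Only integer sums adjacent to mean * n can round back to it.
        for total in (math.floor(mean * n), math.ceil(mean * n)):
            if round(total / n, decimals) == target:
                return True
        return False

    print(grim_consistent(3.27, 17))  # False: no integer sum / 17 rounds to 3.27
    print(grim_consistent(3.24, 17))  # True: 55 / 17 = 3.235... rounds to 3.24

Checks of this kind apply only when the underlying data are integers and the sample is small enough for rounding to leave detectable gaps.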