The Cochrane Collaboration should be congratulated for its dedication to documenting continuing clinical trials, teaching critical appraisal, and supporting research into new methods of reviewing the literature. However, Cochrane Reviews can be created by untrained people who simply follow an algorithmic approach and are unaware of important methodological issues. Therefore, the objective of this article is to highlight important limitations of Cochrane Reviews, including the Review Manager software1 that is required,2 the inappropriate use of a summary statistic, and finally the restriction to only randomised controlled trial (RCT) data. To illustrate these points, I have used a 1% random sample of Cochrane Reviews—that is, 16 studies numbered 1, 101, 201…1501 of 1596 of the Cochrane Database on 3 April 2003.3–18
LIMITATIONS OF THE COCHRANE APPROACH
There are important limitations to the software required by the Cochrane Collaboration (Review Manager). Most important, Review Manager cannot include (a) results based on survival analyses (for example, the most appropriate analysis for time to next injury) and instead calculates relative risks based on simple proportions, which leads to inappropriate estimates when patients have different lengths of follow up,19 or (b) analyses adjusted for confounding (that is, multiple regression analysis). Other software that can do so is available, but it requires statistical expertise that would preclude the algorithmic approach.
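The follow up problem can be illustrated with a minimal sketch. The numbers are invented for illustration and do not come from any of the sampled reviews, the function names are hypothetical, and a rate ratio per person-year is used here only as a simple stand-in for a full survival analysis.

```python
def risk_ratio(events_a, n_a, events_b, n_b):
    """Relative risk from simple proportions (ignores follow-up time)."""
    return (events_a / n_a) / (events_b / n_b)


def rate_ratio(events_a, person_years_a, events_b, person_years_b):
    """Ratio of incidence rates (events per unit of follow-up time)."""
    return (events_a / person_years_a) / (events_b / person_years_b)


# Hypothetical data: 100 patients per arm with the same underlying rate of
# 0.1 injuries per person-year, but arm A is followed for 2 years and
# arm B for 1 year.
events_a, n_a, person_years_a = 20, 100, 200.0
events_b, n_b, person_years_b = 10, 100, 100.0

rr = risk_ratio(events_a, n_a, events_b, n_b)
ir = rate_ratio(events_a, person_years_a, events_b, person_years_b)
print(f"risk ratio from proportions: {rr:.2f}")
print(f"rate ratio using follow-up:  {ir:.2f}")
```

In this invented example the simple proportions suggest a doubling of risk in arm A, whereas the rates per person-year are identical; the apparent effect comes entirely from the longer follow up.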
Whereas the Cochrane Collaboration could improve its software, the other two major problems are process oriented. The Cochrane Collaboration promotes wide participation, and this leads to inexperienced authors and peer reviewers. For example, it is not always appropriate to pool data into one overall summary statistic,20 and, even when it is, there are different methods—that is, fixed and random effects models—to be used depending on the structure of the data.20,21 In the 1% sample of Cochrane reviews examined, five out of five authors used the fixed effects model when it was clearly inappropriate to do so.10–12,16,17 Secondly, only two of these five authors followed the Cochrane recommendations2 to discuss differences in methodology and outcomes as a source of discrepancy between studies, and the papers still passed the Cochrane peer review process.12,16
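To show why the choice of model matters, the following is a minimal sketch of inverse-variance fixed effect pooling next to the DerSimonian-Laird random effects estimator, one common random effects method. The study data are hypothetical, the function names are illustrative, and nothing here is meant to represent how Review Manager implements either model.

```python
import math


def fixed_effect(effects, variances):
    """Inverse-variance fixed effect pooled estimate and its variance."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return pooled, 1.0 / sum(weights)


def random_effects_dl(effects, variances):
    """DerSimonian-Laird random effects estimate, variance, and tau^2."""
    weights = [1.0 / v for v in variances]
    pooled_fe, _ = fixed_effect(effects, variances)
    # Cochran's Q: weighted squared deviations from the fixed effect estimate.
    q = sum(w * (y - pooled_fe) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum(weights) - sum(w * w for w in weights) / sum(weights)
    tau2 = max(0.0, (q - df) / c)  # between-study variance, truncated at 0
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    return pooled, 1.0 / sum(re_weights), tau2


# Hypothetical log relative risks and their variances for four studies.
log_rr = [-0.60, -0.10, 0.30, -0.45]
var = [0.04, 0.02, 0.05, 0.03]

fe, fe_var = fixed_effect(log_rr, var)
re, re_var, tau2 = random_effects_dl(log_rr, var)
for name, est, v in [("fixed", fe, fe_var), ("random", re, re_var)]:
    half = 1.96 * math.sqrt(v)
    print(f"{name:6s} RR = {math.exp(est):.2f} "
          f"(95% CI {math.exp(est - half):.2f} to {math.exp(est + half):.2f})")
```

When the studies disagree by more than chance would explain, the estimated between-study variance is greater than zero and the random effects confidence interval is wider than the fixed effect interval. When the studies agree, the two methods give the same answer.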
Finally, Cochrane Reviews currently include only RCTs, a decision that remains controversial.22–31 What is the result? Of the 16 Cochrane Reviews retrieved, six included only zero, one, or two RCTs3–8 (the Cochrane Collaboration publishes reviews in which zero studies meet the inclusion criteria). In an additional seven reviews, almost every analysis included two or fewer RCTs because of the particular question being addressed (158/183 analyses).9,13–18 Given the limited number of RCTs included in 13/16 reviews, the inclusion of other types of evidence (for example, cohort, case-control, basic science) would probably have provided a stronger foundation for a rational treatment decision. The counter argument that disease management should not change without RCT evidence ignores the clinical reality facing the physician. When a condition has not been appropriately studied with RCTs, and the currently prescribed treatment is based on a theory that has been proven incorrect, the physician has three options: (a) to continue to use a treatment that is unsupported by clinical, basic science, or theoretical evidence; (b) to switch to a treatment that is based on basic science or observational studies; (c) to abandon treatment altogether. If the operative word in “evidence based medicine” is “evidence”, then option (b) is the most appropriate.
Alternative approaches to the Cochrane Review include best evidence synthesis, exploratory meta-analysis with or without meta-regression, and the Critical Review, all of which explore and focus on reasons for the differences in results between studies. Although some of these may be used within a Cochrane Review,2 they are not required. Further, these approaches acknowledge that clinicians first and foremost need to know whether they should use a specific treatment; the precision of the estimate is less important. For example, a clinician will prescribe a drug whether it lowers blood pressure by 20 mm Hg (95% confidence interval (CI) 10 to 30 mm Hg) or by 15 mm Hg (95% CI 13 to 17 mm Hg). Where treatment effects are small but important, and the pooling of studies is appropriate, a summary statistic can still be provided.
The best evidence synthesis approach is one used by the legal system and has been proposed for the medical literature.24,32,33 As stronger types of evidence become available, weaker types of evidence are discarded. In legal terms, written notes supersede oral recall. The best evidence synthesis may be the basis of the Cochrane Review approach; if one RCT is available then all “weaker” data are ignored. However, it is not clear how much stronger the data need to be before “weaker” data can be ignored.32 Further, this approach ignores the fact that other types of studies could help the reviewer discuss treatment effectiveness in other populations (for example, different age groups) or in similar but not identical conditions (for example, patellar tendinopathy versus Achilles tendinopathy).
An exploratory meta-analysis uses standard summary plots to look for outlying studies,34 even if data pooling is appropriate. Reasons for the discrepancies between studies are explored, and conclusions are always cautious because this is a post hoc data driven approach. Preferably, authors look at subgroups based on their a priori knowledge of the condition.2 A quantitative version of the exploratory meta-analysis is meta-regression. In the regression model, each aspect of the study (for example, study type, ejection fraction) is an independent variable, and the outcome of interest (for example, five year re-injury rate) is the dependent variable. In general, meta-regression requires a minimum of five studies, and this minimum increases as the number of variables increases.
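As an illustration only, a meta-regression with a single study level variable can be sketched as an inverse-variance weighted least squares fit. The five studies, their coding (0 = cohort, 1 = RCT), and the function name are all hypothetical, and this simplified fixed effect version ignores residual between-study heterogeneity, which a full analysis would model.

```python
def weighted_meta_regression(x, y, variances):
    """Weighted least squares fit of y = intercept + slope * x,
    with each study weighted by the inverse of its variance."""
    w = [1.0 / v for v in variances]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar)
              for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data for five studies.
study_type = [0, 0, 1, 1, 1]                  # independent variable
log_rr = [-0.05, 0.02, -0.40, -0.55, -0.30]   # dependent variable
variances = [0.03, 0.02, 0.05, 0.04, 0.06]    # within-study variances

intercept, slope = weighted_meta_regression(study_type, log_rr, variances)
print(f"pooled log RR in cohort studies: {intercept:.3f}")
print(f"difference for RCTs (slope):     {slope:.3f}")
```

With a single 0/1 variable, the intercept is the weighted mean effect in the studies coded 0 and the slope is the difference between the two weighted means. Each added variable adds a coefficient to estimate, which is why the minimum number of studies rises with the number of variables.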
The Critical Review approach35 is similar to that used in Critical Thinking in Logic.36 In contrast with the quantitative and exploratory meta-analysis approaches, the Critical Review approach suggests that one should first separate studies into those that show a positive effect and those that show no effect or a negative effect (based on both point estimates and confidence intervals). The author then examines the two groups for differences in methodology. If a consistent difference is noted between these groups of studies, the author investigates further. Because this is data driven, the traditional statistical approach would be to test the hypothesis using an independent data set.37 Because the original review already included all the RCTs (or RCT and cohort studies, etc), the Critical Review examines an independent type of data set (for example, basic science data, or cohort data if only RCTs were originally used) to see if the newly generated hypothesis is supported. Of course, this is still a post hoc approach and interpretations should be cautious.
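The first two steps of this approach can be sketched as follows. The classification rule (whether the 95% CI for the relative risk excludes 1) is my assumption about one reasonable way to combine point estimates and confidence intervals, and the studies and the co-intervention flag are hypothetical.

```python
from collections import defaultdict


def classify(rr, ci_low, ci_high):
    """Label a study from its relative risk and 95% CI (assumed rule)."""
    if ci_high < 1.0:
        return "positive effect"   # whole interval favours treatment
    if ci_low > 1.0:
        return "negative effect"   # whole interval favours control
    return "no clear effect"       # interval includes RR = 1


# Hypothetical studies: (label, RR, CI low, CI high, had co-intervention)
studies = [
    ("A", 0.60, 0.40, 0.90, True),
    ("B", 0.70, 0.50, 0.95, True),
    ("C", 1.02, 0.80, 1.30, False),
    ("D", 0.95, 0.70, 1.25, False),
    ("E", 1.40, 1.05, 1.90, False),
]

# Step 1: separate the studies by direction of effect.
groups = defaultdict(list)
for name, rr, lo, hi, co_intervention in studies:
    groups[classify(rr, lo, hi)].append((name, co_intervention))

# Step 2: look for a methodological feature that differs between groups.
for label, members in groups.items():
    with_co = sum(1 for _, co in members if co)
    print(f"{label}: {[n for n, _ in members]}, "
          f"{with_co}/{len(members)} with co-intervention")
```

A feature that lines up with the grouping, as the co-intervention does in this invented data, is only a hypothesis at this stage and still needs to be checked against an independent type of data.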
This method was used to investigate whether stretching immediately before exercise prevents injury.38 At the time of the review, the results showed that three of four RCTs suggested that stretching immediately before exercise was beneficial, but all had the co-intervention of warm up. On the other hand, the investigators of the cohort studies adjusted for a warm up co-intervention, and almost all of these studies suggested that stretching had no effect. The basic science data and other theoretical arguments were reviewed and suggested that stretching immediately before exercise would not decrease injuries. Shortly thereafter, a new RCT was published that used stretching without warm up, and suggested that there was no effect.39 In two other studies that have been cited as evidence in favour of stretching before exercise, the intervention used was actually stretching outside periods of exercise. A review of basic science data suggests that long term stretching outside periods of exercise could theoretically increase the strength of tissue.40 In effect, the Critical Review approach allowed for a unifying theory for all the results of the different studies in this situation.
Cochrane Reviews currently enjoy a credibility level that may not be warranted. The required software is lacking in vital areas, the algorithmic approach to systematic review writing appears to lead to published papers that do not follow the Cochrane Collaboration’s own recommendations, and reviews that are based on zero, one, or two RCTs without reference to other types of evidence may be omitting important information that would be useful to a clinician who needs to treat actual patients. Another option is to promote critical thinking through the alternative approaches mentioned above.