On the effects of antibiotic stewardship: I met a analysis

Yet another meta-analysis telling us that we are doing something very valuable: antibiotic stewardship (AS). Nobody wants to (or should) question that good AS is important for our patients, just like hand hygiene, being sober at work and keeping up with the latest professional developments. How nice it would be if we could reliably quantify the effects of our good practice. One study is no study (say those who usually don’t perform studies), so the meta-analysis was invented. But what does a meta-analysis actually tell us?

The purpose of a meta-analysis (in the beginning) was to collect and pool data from studies that were very similar in design, intervention and patient population, that showed a similar trend of effect, but that were all underpowered to be conclusive. The pooling created a “new and larger” study, with less statistical uncertainty around the reported pooled effect, which could assist in designing the decisive study-to-be-done or even support recommending or rejecting a treatment if the pooled effect was sufficiently convincing or disappointing.

As meta-analyses have become a standard package, these original principles seem to have eroded, as I have complained before. I was surprised to see how the new meta-analysis on AS in LID (previously addressed on this page) pooled studies that have the term “AS” in common, but – not surprisingly – differ widely in intervention, in patient populations, in study designs and in methods. The greatest commonality is in the quality of the individual studies: 4% is good, the rest moderate or poor (which would make some (not me) say: “garbage in, garbage out”). Inevitably, such a collection of studies reflects, as the authors state, large clinical heterogeneity between studies. Yet there is no unit in which to quantify clinical heterogeneity.

Whether individual studies suggest a similar direction of effect can be quantified objectively, for instance with I², which reflects the percentage of total variation across studies that is attributable to heterogeneity rather than chance; 0% is no heterogeneity, <25% low, 25-50% moderate and >50% high. This is called statistical heterogeneity.
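For readers who like to see the arithmetic: a minimal sketch of how I² is commonly derived from Cochran's Q, assuming inverse-variance (fixed-effect) weights. The function name and the five study results (log odds ratios with their variances) are made up for illustration and are not taken from the AS meta-analysis.

```python
def i_squared(effects, variances):
    """Return (Q, I2 in percent) from per-study effects and their variances."""
    # Inverse-variance weights: precise studies count more.
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled effect.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I2 is the share of Q in excess of its expectation under homogeneity (df),
    # truncated at zero.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2


# Hypothetical log odds ratios and variances for five studies (illustration only).
effects = [-0.9, -0.1, -0.6, 0.2, -1.2]
variances = [0.04, 0.05, 0.03, 0.06, 0.05]

q, i2 = i_squared(effects, variances)
print(f"Q = {q:.2f}, I2 = {i2:.1f}%")
```

With these invented numbers I² lands well above 50%, so in the categories above it would count as high.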

A meta-analysis predicts that the next study (or your practice) will have the reported effect. A few years ago we simulated how well a meta-analysis predicts and how much the I² adds to the reliability of that prediction. The bottom line: if there is substantial clinical heterogeneity the predictive value is low, regardless of the I² value. And with little clinical heterogeneity the predictive value is high, regardless of the I² value. Yet, in most meta-analyses the clinical heterogeneity is difficult to categorize and then the I² may help: if it is high (>50%) the predictive value is low, the outcome should be interpreted with great caution, and the strength of a recommendation should be downgraded (as in the GRADE approach).

Real experts, such as Hans Reitsma in our department, are not even happy with I². It is a single number, but a relative one, and therefore sensitive to the sample sizes of the included individual studies. More informative from a clinical decision perspective is the prediction interval, which reflects the range in which the result of your next study is expected to lie (similar to a confidence interval (CI) around an effect estimate). In an analysis of 920 Cochrane meta-analyses reporting a statistically significant effect of something (p<0.05), 479 had an estimated I²>0, and 72% (347/479) of these had both the 95% CI and the 95% prediction interval including the null effect (=no effect). With I² >60% this accounted for 98% (5/206). That would be a very bad diagnostic test!
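To make the difference between the two intervals concrete, here is a rough sketch of one common way to obtain a 95% prediction interval: a DerSimonian-Laird random-effects model, with the interval based on a t distribution with k-2 degrees of freedom. This is an assumption on my side about method, not necessarily what the Cochrane analysis used. The function name, the small table of t critical values and the five study results are hypothetical.

```python
import math

# Two-sided 95% critical values of Student's t, indexed by degrees of freedom.
# Small lookup table to keep the sketch dependency-free (covers 3 to 12 studies).
T_975 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
         6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}


def random_effects_summary(effects, variances):
    """Return (pooled effect, 95% CI, 95% prediction interval)."""
    k = len(effects)
    w = [1.0 / v for v in variances]
    fixed = sum(wi * y for wi, y in zip(w, effects)) / sum(w)
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, effects))
    # DerSimonian-Laird estimate of the between-study variance tau^2.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Random-effects weights add tau^2 to every study's variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    mu = sum(wi * y for wi, y in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    # The CI describes uncertainty about the average effect.
    ci = (mu - 1.96 * se, mu + 1.96 * se)
    # The prediction interval adds tau^2: where a new study's true effect may lie.
    half = T_975[k - 2] * math.sqrt(tau2 + se ** 2)
    pi = (mu - half, mu + half)
    return mu, ci, pi


# Hypothetical log odds ratios and variances for five studies (illustration only).
effects = [-0.9, -0.1, -0.6, 0.2, -1.2]
variances = [0.04, 0.05, 0.03, 0.06, 0.05]

mu, ci, pi = random_effects_summary(effects, variances)
print(f"pooled = {mu:.2f}")
print(f"95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"95% PI = ({pi[0]:.2f}, {pi[1]:.2f})")
```

With these invented numbers the confidence interval excludes zero (a "significant" pooled effect) while the prediction interval comfortably spans it, which is exactly the situation described above.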

Back to the AS meta-analysis: Clinical heterogeneity was high and I² was 76.2%-92.2% for the main analyses (Fig2-4) and 64.5%-94.6% for the subgroup analyses (Suppl Fig 4-9). Prediction intervals were not provided.

So, what now? I guess 99% of the readers will only see the abstract (and probably 95% of them only the conclusion), where space restrictions do not allow mentioning these uncertainties. It’s too simple to say “don’t pool in case of heterogeneity”, as experts also plead for synthesizing (pooling) even such data, since a cutoff for not pooling does not exist. But they also emphasize that the consequences should be explained carefully. When reporting or reading a meta-analysis, the first thing to do is to address or look for these caveats.

Still, I think that antibiotic stewardship is a good thing and that systematic reviews add a lot.