| Quality
of Evidence:
Secondary Studies
Contents
- Systematic Reviews: Quality & Searching Tips
- Systematic Reviews: Untrustable "Trustable" Sources?
- Includes Delfini advice for working with secondary studies
- Review of Cochrane Groups’ Assessment of Bias in Studies
- Includes our issues with the Jadad Scale
- Includes a restatement on the Delfini approach to working with secondary studies
- Bell's Palsy Update
- Delfini Letter to BMJ: Corticosteroids are Not Proven for Treatment of Bell's Palsy
- Delfini Letter to NEJM: Bell's Palsy - REDUX!
- Substandard Evidence: POEMS and Diabetes: Readers, Beware!
- Quality of Systematic Reviews: Misleading “POEM” on Hormone Therapy
Related DelfiniClicks™
Primary Studies
- 05/22/08: A Cautionary Tale of Harms versus Benefits: Misleading Findings Due to Potentially Inadequate Data Capture (Aprotinin Example)
|
Systematic Reviews: Quality & Searching Tips
We believe you should still do a very thorough search, because the grey literature can dramatically change how effective a treatment appears. Even so, here's a nice piece on dealing with inaccessible literature and literature of low quality when doing a systematic review. The article makes a good case for concentrating more on a thorough quality review of what you can easily get than on digging too deeply to make sure you've caught everything. What is clear from the article is that including poorly designed studies can have a substantial impact on treatment effects, often making treatments appear more beneficial than they actually are.
Health Technology Assessment 2003; Vol. 7: No. 1 (Executive Summary)
How important are comprehensive literature searches and the assessment of trial quality in systematic reviews? Empirical study
Egger M, Jüni P, Bartlett C, Holenstein F, Sterne J.
http://www.ncchta.org/execsumm/summ701.htm
Also, see our Recommended Reading on Meta-analyses. |
| Systematic
Reviews: Untrustable "Trustable" Sources?
We
write this in grieving…
For
evidence-based information, there are three sources that are highly
respected and fairly universally considered “trustable.”
These sources are the Cochrane Collaboration, Clinical Evidence
(from the British Medical Journal) and the Database of Abstracts
of Reviews of Effects (DARE) from the Centre for Reviews and Dissemination
at the University of York, England.
- Cochrane
is billed as, “The reliable source of evidence in health
care.”
- Clinical
Evidence refers to itself as the “International source of
the best available evidence for effective health care.”
- DARE
(possibly wisely) offers up no such claim.
We’ve always known that there is variation in quality even among these most respected evidence-based information sources. However, because their reputation is so high, coupled with transparent and reasonably robust procedures, many users go directly to these sources and rely on what they find there.
In our efforts to provide a solid footing for our evidence reviews, Delfini frequently audits even these sources, which are considered the most rigorous of the evidence-based sources. Regrettably, we routinely find that we cannot rely on the conclusions of the reviewers. The problems we frequently encounter fall into three camps:
1) Poor assessment of study quality. This problem seems to happen for a couple of reasons: a) the method used to assess the quality of the original studies is poor (e.g., reliance on the Jadad scale); or b) potentially, the reviewers lack critical appraisal skills.
2) Failure to exclude studies of uncertain validity or usefulness, which is likely to bias results in favor of interventions.
3) Most astonishingly, cause-and-effect conclusions drawn from evidence that has been solidly declared invalid.
| Example
1: Cochrane Review
Reference:
Fouque D, Wang P, Laville M, Boissel JP. Low protein diets
for chronic renal failure in non-diabetic adults. The Cochrane
Database of Systematic Reviews 2000, Issue 4. Art. No.: CD001892.
DOI: 10.1002/14651858.CD001892.
Quality
Assessment: Data collected for each trial included inclusion
and exclusion criteria, patient details (age, gender), type
of diet prescribed (level of proposed protein intake, nature
of proteins, supplementation in energy or amino-acids), time
to the start of dialysis if available. The nature of renal
disease was recorded to verify that the distribution of prognostic
factors was balanced between the groups. No quality
assessment of the studies was performed.
Main
results: Two hundred and forty two renal deaths were recorded,
101 in the low protein diet and 141 in the higher protein
diet group, giving an odds ratio of 0.62 with a 95% confidence
interval of 0.46 to 0.83 (p=0.006). To avoid one renal death,
four to 56 patients need to be treated with a low protein
diet during one year.
Authors'
conclusions: Reducing protein intake in patients with chronic
renal failure reduces the occurrence of renal death by about
40% as compared with higher or unrestricted protein intake.
The optimal level of protein intake cannot be confirmed from
these studies.
Comment:
Assessment of study quality is an “absolute must”
and defines critical appraisal of the medical literature.
It is well known that including low quality studies in a systematic
review is likely to yield invalid results and conclusions.
Example
2: Cochrane Review of serenoa repens (saw palmetto) for BPH
Reference:
Wilt T, Ishani A, MacDonald R. Serenoa repens for benign prostatic hyperplasia. The Cochrane Database of Systematic Reviews 2002, Issue 3. Art. No.: CD001423. DOI: 10.1002/14651858.CD001423.
The
authors of this review concluded that, “The evidence
suggests that Serenoa repens provides mild to moderate improvement
in urinary symptoms and flow measures.”
However,
a well-done RCT — Bent S, Kane C, Shinohara K, et al.
Saw palmetto for benign prostatic hyperplasia. N Engl J Med
2006;354:557-66 — demonstrated that saw palmetto did
not improve symptoms or objective measures of benign prostatic
hyperplasia.
Comment:
We believe the explanation for the differing results is that poor quality studies were included in the Cochrane review.
Example
3: Clinical Evidence
A systematic review of primary prevention of cardiovascular disease utilized two studies which, when audited by Delfini, were found to have lethal threats to validity:
Study 1: Blood Pressure Lowering Treatment Trialists' Collaboration. Effects of ACE inhibitors, calcium antagonists, and other blood pressure lowering drugs: results of prospectively designed randomised trials. Lancet 2000;355:1955–1964.
Study 2: Staessen JA, Wang JG & Thijs L. Cardiovascular prevention and blood pressure reduction: a quantitative overview updated until 1 March 2003. Journal of Hypertension 2003, 21: 1055–1076.
Example
4: DARE
In
an appraisal of a systematic review of hypertension, DARE
rated the review as “good.” Delfini’s audit
of this DARE appraisal did not find the audited article to
be valid. (See above reference to study 1: Trialists). |
Our
Advice for Working with Secondary Studies
Our
advice is as follows for Cochrane, Clinical Evidence and for secondary
studies that appear to pass a critique by DARE (or which pass your
own critical appraisal):
First,
determine if they made their conclusions on the basis of studies
they deemed high quality or not.
If
they included low quality studies, such that you have to disregard
their conclusions as untrustable, can you otherwise benefit from
any of their efforts?
a) Can you accept their search output? (Remember to update following the date of their search.)
b) Are you comfortable accepting their exclusions?
c) Are you comfortable accepting their designation of a low grade so that you can exclude these studies as well?
d) Are you comfortable with accepting their inclusions as the basis for your critical appraisals? If yes, appraise every article they give a high score to.
If
they appear to have only included studies they grade high:
a) Review their methods for grading — do you agree?
b) From the included studies, select a study or two that is deemed
of high quality and a study or two of the lowest quality, and critically
appraise your selection yourself. There will probably be agreement
over a low quality study — but since you want to determine
if low quality studies are being misgraded high, it’s their
high ground you want to head for.
c) If you agree that the studies you audited merit a high quality grade, do a face validity check of the number of studies getting high marks. Since probably over 80% of the medical literature achieves a Delfini Grade U for uncertain validity and/or usefulness, there are likely to be very few studies that are worthy of inclusion in any review.
And
then, let us pray for a quality source we can all trust.
|
Review
of Cochrane Groups’ Assessment of Bias in Studies
05/08/08
The Cochrane Collaboration publishes systematic reviews that are developed by various groups and, after peer review, edited by Cochrane Review Groups, which use their own assessment recommendations as well as those published in the Cochrane Handbook for Systematic Reviews of Interventions. An important study from the Nordic Cochrane Centre has examined how the different Cochrane Review Groups currently recommend assessing and handling the risk of bias in studies evaluated for inclusion in systematic reviews. The authors focus on the use of study components versus numerical scales, and suggest possible improvements[1], as there is significant variation in the quality of approaches used by Cochrane groups. Some key points from the study are presented below.
Background
It is important in developing systematic reviews to evaluate for bias each study being considered for inclusion in the review. There are four major areas of bias that should be considered: selection bias, performance bias, attrition bias and assessment bias. Anything that leads away from "truth" is a bias, except for chance effects. The reason for this assessment is that biased studies are likely to result in unreliable estimates of effect (i.e., misleading results). Unreliable conclusions then lead to inaccurate predictions of benefits and harms for users. Assessing studies for bias has generally been done by Cochrane groups using two different approaches:
- Component
Approach, meaning assessing the study components for bias: Evaluating
the methodological areas of each study for bias, e.g., details
of randomization, blinding, similarity of care experiences except
for the intervention being studied, handling of drop-outs, methods
for calculating outcomes. This approach is supported by empirical
evidence.
- Scale
Approach, meaning assessing the study by assigning an overall
quality score: This requires providing numerical values to validity
threats. The Jadad scale is an example of this approach. The use
of scales is not well-supported by empirical research. For example,
Jüni et al[2] used 25 existing scales to identify high-quality
trials, and found that the effect estimates and conclusions of
the same meta-analysis varied substantially with the scale used.
The Cochrane Collaboration advises against use of scales for assessing
studies.
Methods
The authors from the Nordic Cochrane Centre examined the instructions
to authors of the 50 Cochrane Review Groups that focus on clinical
interventions for recommendations on methodological quality assessment
of studies.
Results
The following table summarizes the main findings.
| COMPONENT APPROACH | SCALE APPROACH |
| Groups recommending a component approach, 41/50 (82%) | Groups recommending a scale approach, 9/50 (18%) |
| 23 groups had their own checklists ranging from 4 to 23 items | 5 groups recommend Jadad |
| Areas Recommended for Assessment by Component Group (%) | Areas Recommended for Assessment by Scale Group (%) |
| Sequence generation, 26 (63) | 9 (100) |
| Concealment of allocation, 41 (100) | 2 (22) |
| Blinding of patients, 33 (80) | 9 (100) |
| Blinding of caregivers, 32 (78) | 1 (11) |
| Blinding of outcome assessors, 39 (95) | 9 (100) |
| Follow-up, 38 (93) | 9 (100) |
| Intention-to-treat analysis, 20 (49) | 1 (11) |
| Recommendations for using quality assessments of individual studies in reviews by Component Group (%) | Recommendations for using quality assessments of individual studies in reviews by Scale Group (%) |
| Analytical Approach, e.g., sensitivity analysis to test if including only trials of higher methodological quality changes the effect estimates, 20 (49) | 8 (89) |
| Descriptive Approach, 1 (2) | 0 (0) |
| No information, 20 (49) | 1 (11) |
| Recommendations for Type of Analysis by Component Group | Recommendations for Type of Analysis by Scale Group |
| Sensitivity analysis, 17 (85) | 7 (88) |
| Threshold, 4 (20) | 3 (38) |
| Subgroup analysis, 4 (20) | 3 (38) |
| Cumulative analysis, 1 (5) | 2 (25) |
| Weights, 1 (5) | 1 (13) |
| Meta-regression, 0 (0) | 1 (13) |
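The "analytical approach" row in the table (a sensitivity analysis restricted to trials of higher methodological quality) can be illustrated with a minimal sketch. It assumes a fixed-effect, inverse-variance pooling of log odds ratios, and every trial value below is a hypothetical number chosen only to show the mechanics. Neither the pooling method nor the data come from the Nordic Cochrane Centre study.

```python
import math

# Hypothetical trials (illustration only):
# (label, odds ratio, standard error of the log odds ratio, judged high quality?)
TRIALS = [
    ("Trial A", 0.95, 0.10, True),
    ("Trial B", 1.05, 0.12, True),
    ("Trial C", 0.55, 0.20, False),
    ("Trial D", 0.60, 0.25, False),
]


def pooled_odds_ratio(trials):
    """Fixed-effect inverse-variance pooled odds ratio."""
    if not trials:
        raise ValueError("no trials to pool")
    # Each trial is weighted by the inverse of the variance of its log odds ratio.
    weights = [1.0 / se ** 2 for _, _, se, _ in trials]
    weighted_sum = sum(
        w * math.log(odds_ratio)
        for w, (_, odds_ratio, _, _) in zip(weights, trials)
    )
    return math.exp(weighted_sum / sum(weights))


pooled_all = pooled_odds_ratio(TRIALS)
pooled_high_quality = pooled_odds_ratio([t for t in TRIALS if t[3]])

print(f"All trials:        OR = {pooled_all:.2f}")
print(f"High quality only: OR = {pooled_high_quality:.2f}")
```

With these made-up inputs, the apparent benefit shrinks toward no effect once the lower quality trials are dropped, which is the pattern a sensitivity analysis is designed to expose.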
Authors’
Conclusions
- Cochrane
Reviews are undertaken by authors with different levels of methodological
training, and following Cochrane Review Groups’ guidelines
for assessing bias can be problematic if the guidelines are not
in accordance with the empirical research on bias.
- Despite
advising against scales, the Cochrane Handbook actually recommends
a ranking scale. The scale distinguishes between low risk of bias
(all criteria met), moderate risk of bias (one or more criteria
partly met) and high risk of bias (one or more criteria not met).
However, the Cochrane Handbook does not specify the criteria to
be used. Instead it states that the criteria used should be few
and address substantive threats to the validity of the study results.
- The Jadad scale consists of three items: up to two points are given for randomization, two for double blinding and one for description of withdrawals and dropouts. An overall score between zero and five is assigned, where three is commonly regarded as adequate trial quality. The Jadad scale is problematic for a variety of reasons (we agree and present some of our observations below). Also, studies have shown low interrater agreement, particularly for withdrawals and dropouts, where kappa values below zero have been reported, meaning agreement worse than that expected by chance.
- The
Cochrane Handbook recommends analyzing all data according to the
intention-to-treat principle using different methods without specifying
what those methods should be. The Handbook should give clearer
recommendations to ensure a more homogeneous methodology.
- The
authors list multiple errors made by the various Cochrane groups
— for example, the grading system recommended by the Back
Group has five levels of evidence and was developed using a consensus
method. Consistent findings among multiple, low-quality non-randomized
studies are considered to be the same level of evidence as one
high-quality randomized trial, which is not in accordance with
findings from empirical studies or with the Cochrane Handbook.
The four-level grading system used by the Musculoskeletal Group
is also based on consensus and is also highly problematic. The
system is based on arbitrary cut-points such as sample size above
50 and more than 80% follow-up, which are not based on empirical
evidence. The only difference between platinum and gold evidence
is that platinum requires two randomized trials and gold requires
one, which is not reasonable because, for example, the platinum
trials could involve 60 patients each and the gold trial 500
patients. Silver level can be either a randomized trial with a
'head-to-head' comparison of agents or a high-quality case-control
study, which is not supported by empirical research, and bronze
level can be a high-quality case series without controls or expert
opinion.
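The remark above about kappa values below zero can be made concrete with a small sketch of Cohen's kappa for two raters. The ratings are hypothetical and serve only to show how agreement can fall below the chance-expected level. They are not data from any of the cited studies.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty rating lists")
    n = len(rater_a)
    # Proportion of items on which the raters actually agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in set(freq_a) | set(freq_b)) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical scores for the "withdrawals and dropouts" item
# (1 = judged as described, 0 = judged as not described) on ten trials.
rater_1 = [1, 1, 1, 0, 0, 1, 0, 1, 0, 0]
rater_2 = [0, 0, 1, 1, 1, 0, 1, 0, 0, 1]

print(f"kappa = {cohen_kappa(rater_1, rater_2):.2f}")
```

Here the raters agree on 2 of 10 trials while chance alone would predict agreement on 5, so kappa is negative.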
Delfini
Comment on the Jadad Scale
We
have long had issues with the Jadad Scale.
- Points
can be awarded merely for reporting rather than giving appropriate
attention to methodological quality.
- For
randomization, the scale addresses explicitly the sequence generation,
but not concealment of allocation of the sequence.
- The
scale does not address intention-to-treat analysis among many
other considerations. Therefore, randomized trials with an appropriate
randomization sequence, but with no concealment of allocation,
with large numbers of dropouts that are well described, using
only a per-protocol analysis and having myriad other biases such
as differences between groups, may be scored as of the highest
methodological quality (five points).
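The scoring problem described in these points can be sketched in code. The function below follows the point allocation summarized earlier on this page (up to two points for randomization, two for double blinding, one for description of withdrawals and dropouts). The field names and the example trial are hypothetical, and the sketch assumes the trial also reports appropriate double blinding, which the scenario above does not state but which is needed to reach five points.

```python
from dataclasses import dataclass


@dataclass
class TrialReport:
    # Items the Jadad scale scores
    described_as_randomized: bool
    sequence_generation_appropriate: bool
    described_as_double_blind: bool
    blinding_method_appropriate: bool
    withdrawals_described: bool
    # Items the Jadad scale ignores (hypothetical field names)
    allocation_concealed: bool
    intention_to_treat_analysis: bool
    dropout_fraction: float


def jadad_score(trial: TrialReport) -> int:
    """Simplified Jadad score (0 to 5) using the allocation described above."""
    score = 0
    if trial.described_as_randomized:
        score += 1
        if trial.sequence_generation_appropriate:
            score += 1
    if trial.described_as_double_blind:
        score += 1
        if trial.blinding_method_appropriate:
            score += 1
    if trial.withdrawals_described:
        score += 1
    return score


# Hypothetical trial: good sequence generation and well-described dropouts,
# but no allocation concealment, 35% dropouts and a per-protocol analysis only.
flawed_trial = TrialReport(
    described_as_randomized=True,
    sequence_generation_appropriate=True,
    described_as_double_blind=True,
    blinding_method_appropriate=True,
    withdrawals_described=True,
    allocation_concealed=False,
    intention_to_treat_analysis=False,
    dropout_fraction=0.35,
)

print(jadad_score(flawed_trial))
```

The three unscored fields never enter the calculation, so the trial reaches the maximum score despite those threats to validity.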
After
years of doing critical appraisals involving thousands of studies,
we feel confident in stating that no scale can be developed that
is sufficient to evaluate studies. Instead an understanding of critical
appraisal concepts combined with critical and clinical thinking
is required.
Delfini
Approach to Secondary Studies
As to use of systematic reviews, Delfini frequently starts evidence-based reviews and quality improvement projects by checking Cochrane for recent systematic reviews. We have over the years noted a great deal of variation in the quality of the reviews (and have reported this to Cochrane). This report from the Nordic Cochrane Centre is further evidence that we should keep doing what we have been doing: auditing Cochrane reviews to be sure that the included studies are of high quality, and not just accepting the results and conclusions of all reviews as valid because they come from a “most-trusted source.”
This
is how we approach use of systematic reviews:
- The
critical appraisal process for secondary studies includes a number
of elements, including a review of the systematic search methods
employed and an assessment of whether the secondary source includes
only valid and clinically useful primary sources. To assess the
latter, the source is reviewed to attempt to determine whether
critical appraisal of the content from included studies was performed,
along with an attempt to assess the skill-level of the appraisers
and the robustness of their review. One or two of the included primary
studies considered to be of the highest quality are critically
appraised for validity and usefulness as an audit by Delfini.
If these studies pass the audit, one or two included primary studies
of the lowest quality are critically appraised as well. If these
lower quality studies also pass, it is assumed that the authors
have employed good critical appraisal techniques.
- If
the source passes an audit for validity and usefulness, the source’s
efficacy and safety conclusions are used in the Delfini evidence
synthesis and new research published following the date of the
source’s search strategy is sought.
- If
the source does not pass the audit for validity and usefulness,
but has utilized a sound search strategy and sound criteria for
excluding efficacy studies lacking relevance, validity or for
other problems, all the primary studies selected for inclusion
by the source are critically appraised, and valid, useful studies
form the basis of the Delfini review, which will then be updated
with any new valid and clinically useful primary studies published
since the date of the secondary source’s search.
We applaud the Nordic Cochrane Group for their efforts in helping to improve an important resource.
References
1. Lundh A, Gøtzsche PC. Recommendations by Cochrane Review Groups for assessment of the risk of bias in studies. BMC Med Res Methodol. 2008 Apr 21;8(1):22 [Epub ahead of print]. PMID: 18426565.
2. Jüni P, Witschi A, Bloch R, Egger M. The hazards of scoring the quality of clinical trials for meta-analysis. JAMA 1999;282:1054-60. PMID: 10493204.
|
Bell’s
Palsy Update
02/11/08
Things have changed. It now appears that early treatment with prednisolone most likely does benefit patients with Bell’s Palsy. In the past, we were critical of narrative reviews published in the BMJ and the NEJM because, as had been nicely pointed out by a Cochrane review, there were no high quality RCTs demonstrating improved outcomes with steroids in the treatment of acute Bell’s Palsy. (See our letters in the boxes below.)
Sullivan et al. (N Engl J Med 2007;357:1598-607.
PMID: 17942873) have now presented probably valid and clinically
useful data to change our conclusions about the use of steroids
in Bell’s Palsy. In a four-arm, 9-month study comparing 1) prednisolone
+ placebo versus 2) acyclovir + placebo versus 3) prednisolone +
acyclovir versus 4) two placebo capsules, the investigators report
rates of complete recovery of 94.4% for patients who received prednisolone
and 81.6% for those who did not, a difference of 12.8 percentage
points (95% CI, 7.2 to 18.4; P<0.001). They concluded that, “in
patients with Bell’s palsy, early treatment with prednisolone
significantly improves the chances of complete recovery at 3 and
9 months.”
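As a rough aid to interpreting the reported difference, the sketch below converts the absolute difference in recovery rates and its confidence limits into a number needed to treat (NNT = 100 divided by the absolute difference in percentage points). The NNT figures are our arithmetic on the numbers quoted above, not values reported by Sullivan et al., and they inherit any bias in the reported rates.

```python
def number_needed_to_treat(absolute_difference_points: float) -> float:
    """NNT from an absolute risk difference expressed in percentage points."""
    if absolute_difference_points <= 0:
        raise ValueError("difference must be positive to compute an NNT")
    return 100.0 / absolute_difference_points


# Figures quoted in the text above
recovery_prednisolone = 94.4      # % with complete recovery
recovery_no_prednisolone = 81.6   # % with complete recovery
ci_low, ci_high = 7.2, 18.4       # 95% CI of the difference, percentage points

difference = recovery_prednisolone - recovery_no_prednisolone
nnt_point = number_needed_to_treat(difference)
nnt_best = number_needed_to_treat(ci_high)   # larger difference, smaller NNT
nnt_worst = number_needed_to_treat(ci_low)   # smaller difference, larger NNT

print(f"difference = {difference:.1f} percentage points")
print(f"NNT = {nnt_point:.1f} (range {nnt_best:.1f} to {nnt_worst:.1f})")
```

On the reported figures, roughly 8 patients would need early prednisolone for one additional complete recovery, with a range of about 5 to 14.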
Critical Appraisal of the Sullivan Study
We critically appraised this study and found the following threats
to validity:
- No
details of sequence generation.
- No mention of co-interventions or allowed/disallowed concomitant Rx.
- 7
patients assigned to prednisolone received the wrong drug.
- 3
patients assigned to placebo received the wrong drug.
- In the prednisolone + placebo group, the loss to follow-up was 11/138 = 8.0%; in the double placebo group the loss was 19/141 = 13.5%. For the acyclovir + prednisolone group the dropout rate was 10/124 = 8%, and for the acyclovir + placebo group the dropout rate was 15/123 = 12%. Total loss = 55/526 = 10.5%.
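The loss-to-follow-up figures in the last point can be checked with a few lines of arithmetic. The counts are those quoted above, and the arm labels are just dictionary keys chosen for this sketch.

```python
# (lost to follow-up, randomized) per arm, as quoted in the appraisal above
ARMS = {
    "prednisolone + placebo": (11, 138),
    "placebo + placebo": (19, 141),
    "acyclovir + prednisolone": (10, 124),
    "acyclovir + placebo": (15, 123),
}


def loss_percent(lost: int, randomized: int) -> float:
    """Loss to follow-up as a percentage of patients randomized."""
    return 100.0 * lost / randomized


for arm, (lost, randomized) in ARMS.items():
    print(f"{arm}: {lost}/{randomized} = {loss_percent(lost, randomized):.1f}%")

total_lost = sum(lost for lost, _ in ARMS.values())
total_randomized = sum(randomized for _, randomized in ARMS.values())
total_loss_percent = loss_percent(total_lost, total_randomized)
print(f"total: {total_lost}/{total_randomized} = {total_loss_percent:.1f}%")
```

Losses are higher in the two arms without prednisolone, which is why the handling of missing patients matters for the comparison.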
A true ITT analysis was not done, and the authors did not perform an adequate sensitivity analysis. Therefore we “redid” the analysis using the following assumptions:
- No Prednisolone Group: We applied the percent-recovered rate in the no prednisolone group to those who were missing or who discontinued (which agreed with statistics on the natural history of the condition), except for those who sought active treatment, who were counted as treatment failures.
- Prednisolone Group: We failed all who were missing and those who sought active treatment.
Our reanalysis yielded a statistically significant
difference between those subjects receiving prednisolone and those
who did not, at a p-value of 0.0487.
Comment: Because of the authors’ analysis, reported efficacy results are likely to be inflated, but the evidence suggests that prednisolone is likely to be an effective agent for adults with acute onset of Bell’s palsy when administered within 72 hours of onset. Because of loss to follow-up, some uncertainty remains.
Grade
B-U: Possible to uncertain usefulness
- The
evidence might be sufficient to use in making health care decisions;
however, there remains sufficient uncertainty that the evidence
cannot fully reach a Grade B and the uncertainty is not great
enough to fully warrant a Grade U.
- Study
quality is such that it appears likely that the evidence is sufficient
to use in making health care decisions; however, there are some
study issues that raise continued uncertainty. Health care decision-makers
should be fully informed of the evidence quality.
|
| Delfini Letter to BMJ: Corticosteroids are Not Proven for Treatment of Bell's Palsy (see update here)
The cover feature of the September 4, 2004 BMJ is about improving outcomes in Bell's palsy. Yet they publish an invalid review: Holland JN, Weiner GM. Recent developments in Bell's palsy. BMJ 2004;329:553-557 (4 September).
Despite a good Cochrane review that exposes some fatal flaws in prior research, the above-referenced review not only includes a flawed study, but excludes a valid one. Here is our response to BMJ (reprinted below):
Corticosteroids
are Not Proven for Treatment of Bell's Palsy
Editor: Holland and Weiner support the use of corticosteroids for treatment of moderate to severe facial palsy based on two systematic reviews [1,2]. We believe that such a conclusion is not justified by the medical evidence presented by the authors and that the two systematic reviews are fatally flawed.
In the Ramsey review, three studies met the authors’ criteria for validity [3,4,5], but in Ramsey’s analysis the study by May et al. [3], a valid study, was excluded because it was an “outlier”, i.e., the results were not consistent with the other two studies included in the review. Excluding a study on the basis of results rather than methodology is inappropriate. Results from non-valid studies should not be utilized in decision-making.
May
et al. reported that corticosteroids resulted in poorer facial
recovery than placebo. It should be noted that the quality index
score of the excluded (May) study was better than one of the studies
included in Ramsey’s review. If the May et al study is not
excluded, the results do not support the authors’ conclusion
of benefit from corticosteroid treatment.
It
should also be noted that a Cochrane systematic review [6] included
the May study as a valid study, but excluded the Austin study
because 29 percent of subjects in the Austin study were lost to
follow-up. Cochrane also excluded the Shafshak study because it
was a non-randomized study.
As pointed out by the Cochrane group, the Grogan “practice parameter” is probably invalid because Grogan included the Shafshak and Austin studies; if these two trials were excluded from the pooled estimate, the results were no longer in favor of steroids for the treatment of Bell’s palsy.
1. Grogan PM, Gronseth GS. Practice parameter: steroids, acyclovir, and surgery for Bell's palsy (an evidence-based review): report of the Quality Standards Subcommittee of the American Academy of Neurology. Neurology 2001;56:830-6.
2. Ramsey MJ, DerSimonian R, Holtel MR, Burgess LP. Corticosteroid treatment for idiopathic facial nerve paralysis: a meta-analysis. Laryngoscope 2000;110:335-41.
3. May M, Wette R, Hardin WB Jr, Sullivan J. The use of steroids in Bell’s palsy: a prospective controlled study. Laryngoscope 1976;86:1111–1122.
4. Austin JR, Peskind SP, Austin SG, Rice DH. Idiopathic facial nerve paralysis: a randomized double blind controlled study of placebo versus prednisone. Laryngoscope 1993;103:1326–1333.
5. Shafshak TS, Essa AY, Bakey FA. The possible contributing factors for the success of steroid therapy in Bell’s palsy: a clinical and electrophysiological study. J Laryngol Otol 1994;108:940–943.
6. Salinas RA, Alvarez G, Alvarez MI, Ferreira J. Corticosteroids for Bell's palsy (idiopathic facial paralysis) [Review]. The Cochrane Database of Systematic Reviews, The Cochrane Library, Copyright 2003, The Cochrane Collaboration. Date of Most Recent Update: 26-November-2001. Date of Most Recent Substantive Update: 15-October-2001.
November
5, 2004 Update:
BMJ
Responds |
| Delfini Letter to NEJM: Bell's Palsy - REDUX! (see update here)
Is it something about the pull of the moon? No sooner do we write a letter to the BMJ about flaws in a recent article on Bell's Palsy than the New England Journal of Medicine publishes a review based on some of the same poor research: Gilden DJ. Clinical practice. Bell's Palsy. N Engl J Med. 2004 Sep 23;351(13):1323-31. PMID: 15385659
We disagree with Gilden, who concludes that “the data suggest
that glucocorticoids decrease the incidence of permanent facial
paralysis…” What does Gilden base this conclusion on?
Answer: An observational study, an RCT with 29 percent of subjects
lost to follow-up and the American Academy of Neurology’s
practice parameter stating that early treatment with corticosteroids
is “probably effective”. The Cochrane group has pointed
out that the American Academy of Neurology’s practice parameter
is probably invalid because the Academy's evidence synthesis included
a fatally flawed trial and a non-randomized trial. We concur with
the Cochrane group and believe corticosteroids have not been demonstrated
to be effective in Bell’s palsy. Cause and effect conclusions
cannot be reliably drawn from observational studies and results
from non-valid studies should not be utilized in drawing conclusions
regarding effectiveness and making treatment decisions.
This
is also another reminder about the problems with systematic reviews.
See our systematic review appraisal tool and our tips sheet: Systematic
Review Validity Tool &
The Problems
with Narrative Reviews |
| Substandard
Evidence: POEMS and Diabetes: Readers, Beware!
Allen Shaughnessy and David Slawson, EBM pioneers who in 1994 gave us POEMs (patient-oriented evidence that matters) and DOEs (disease-oriented outcomes), have published in the BMJ an article dealing with how experts represented the results of a clinically useful study, the UKPDS study. Full text is available at http://bmj.com/cgi/content/full/327/7409/266.
The reason the authors picked the UKPDS study is that it is a good example of patient-oriented evidence that matters. It evaluated the effect of intensive blood glucose control on various outcomes that matter:
- Tight
control did not prevent premature mortality.
- Metformin
decreased mortality and diabetes-related outcomes in overweight
patients.
- Tight
BP control decreased complications (greater effect than blood
glucose control).
- Quality
of life was not affected by tight glucose control.
Shaughnessy
and Slawson systematically reviewed reviews that met their inclusion
criteria (see http://www.delfini.org/Delfini_Tool_SR.doc).
They found that:
- Information that tight glucose control does not change overall mortality or diabetes-related mortality was mentioned in only 6/35 reviews.
- Only 14 of the reviews mentioned the effect of metformin on diabetes-related outcomes in overweight patients.
- 17 of the reviews did not mention the need for BP control in patients with diabetes.
- Only 5 reviewers reported that diabetic patients with hypertension benefit more from BP control than glucose control.
- Only 7 of the reviews reported that ACE inhibitors and beta blockers were equivalent as starting drugs for hypertension.
Implications
from this review: Important POEMs are missing from recent
diabetes review articles. Narrative reviews and expert reviews remain
problematic in that these reviews represent major vehicles for transmitting
research to clinicians, and often these reviews are of poor quality.
Clinicians may be getting from expert reviews what these authors
call PROSE—"Prescriptive recommendations based on substandard
evidence."
This is another reminder that systematic reviews are a good place to start when looking for valid, relevant information. Readers, beware of expert reviews, which continue to appear in the best journals.
|
| Quality
of Systematic Reviews: Misleading “POEM” on Hormone
Therapy
Here's
a letter Dr. Brian Alper and Delfini submitted to the American Family
Physician which was not accepted, but we think makes some very important
points:
Misleading
“POEMs and Tips” item on hormone therapy
TO
THE EDITOR: As “Tips from Other Journals” was found
to be one of your four most popular departments[1], it has a substantial
impact on disseminating research results.
We
would like to alert your readers to a recent “Tip” that
increases the dissemination of a flawed meta-analysis concluding
that hormone replacement therapy (HRT) reduced mortality in women
less than sixty years old [2,3]. This particular meta-analysis may
“get past” some critical appraisal screens because it
describes a comprehensive search and seemingly appropriate inclusion
and quality criteria. However, detailed appraisal finds four fatal
flaws in this meta-analysis.
First,
the subgroup analysis addressing younger women was based on trials
with a mean age less than sixty years (n = 4,141 women). The use
of mean age for each included trial rather than actual age of included
women has been previously reported as a significant flaw[4]. The
method of subgroup analysis used does not account for most of the
available data on women less than sixty years old. The Women’s
Health Initiative trial alone provided data on HRT in a population
that had a mean age of 63.3 years but included 5,522 women aged
50-59 years.
Second,
the authors of the meta-analysis evaluated the methodology of the
studies they included but did not fully consider the impact of study
methods on their results. For example, seven of the seventeen trials
in the subgroup analysis failed to meet the quality criteria of
being double-blinded. All seven of these trials had results that
tended to favor HRT, while only two of the ten double-blinded trials
had results that tended to favor HRT.
Third,
a meta-analysis must provide appropriate weights to individual studies
included in the analysis. This usually is done by providing greater
weights to larger studies, basing the weight on the denominator
or sample size. In this meta-analysis, weights were based on the
number of deaths reported. For example, a study with 406 patients
and 1 death contributed 1.9% of the meta-analysis summary statistic,
while a study with 130 patients and 73 deaths accounted for 10.3%
of the final results for trials with mean age less than sixty years.
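To show how far the reported weights depart from sample size, the sketch below compares each study's reported contribution with its share of enrolled patients. It assumes the 4,141 women cited above is the combined enrollment of the trials in this subgroup, and it uses a simple size-proportional share purely for comparison. Meta-analyses in practice commonly use other schemes, such as inverse-variance weights.

```python
# Combined enrollment of trials with mean age under sixty, as cited in the letter
SUBGROUP_ENROLLMENT = 4141

# (description, patients enrolled, reported % contribution to the summary statistic)
STUDIES = [
    ("406 patients, 1 death", 406, 1.9),
    ("130 patients, 73 deaths", 130, 10.3),
]


def size_share_percent(patients: int, total: int = SUBGROUP_ENROLLMENT) -> float:
    """A study's share of total enrollment, as a percentage."""
    return 100.0 * patients / total


size_shares = [size_share_percent(patients) for _, patients, _ in STUDIES]

for (label, _, reported), share in zip(STUDIES, size_shares):
    print(f"{label}: reported weight {reported:.1f}%, share of enrollment {share:.1f}%")
```

On a size-proportional basis the larger study would count roughly three times as much as the smaller one, whereas the reported weights give the smaller study more than five times the influence.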
Fourth,
the study accounting for 10.3% of this result [5] was a trial involving
postmenopausal women with ovarian cancer (mean age 51 years), excluding
those with low malignant potential. Including (and emphasizing)
such a highly select population invalidates the use of this analysis
for conclusions regarding the general population.
Any
one of these fatal flaws is sufficient to invalidate this conclusion.
The evidence does not support the concept of lower mortality from
HRT use in younger women.
Brian
S. Alper MD, MSPH
Editor-in-Chief, DynamicMedical.com
Research Assistant Professor in Family and Community Medicine, University
of Missouri-Columbia
Michael
Stuart, MD
President, Delfini Group, LLC
Clinical Assistant Professor, University of Washington School of
Medicine
Sheri
Ann Strite
Principal & Managing Partner, Delfini Group, LLC
1. Merriman JA. Reader surveys help determine AFP’s direction. Am Fam Physician 2005;71:1459.
2. Wellbery C. Does beginning HT earlier decrease mortality? Am
Fam Physician 2005;71:1594. (page number may be wrong, accessed
via web at http://www.aafp.org/afp/20050415/tips/7.html)
3. Salpeter SR, et al. Mortality associated with hormone replacement
therapy in younger and older women. J Gen Intern Med July 2004;19:791-804.
4. Furberg CD, Psaty BM. Review: hormone replacement therapy may
reduce the risk for death in younger but not older postmenopausal
women. ACP J Club 2005;142:1.
5. Guidozzi F, Daponte A. Estrogen replacement therapy for ovarian
carcinoma survivors: A randomized controlled trial. Cancer 1999;86:1013-8. |
|