More on The Problem with Drawing Cause-Effect Conclusions from Observational Studies
Our last teaching engagement was in Framingham, Massachusetts and reminds
us of the value of observational studies to assist us in developing
risk stratification models. The Framingham Study began in 1948 as
the first prospective study of cardiovascular disease and is important
because through observations it has identified cardiovascular disease
(CVD) risk factors which can be associated with morbidity and mortality.
But there is good evidence that drawing cause and effect conclusions
from observational studies is unreliable. Cause and effect conclusions
should be based on randomized controlled trials (RCTs) where bias,
confounding and chance have been ruled out as possible explanations
for the observed association between the intervention and the outcome.
Because there are so many observational studies published each week
and because we keep seeing health professionals inappropriately
basing treatment decisions on them, it is worthwhile summarizing
an excellent review of the literature on this topic.
The study and literature review can be found in the reference:
Deeks JJ, Dinnes J, D'Amico R, Sowden AJ, Sakarovitch C, Song F, Petticrew
M, Altman DG. Evaluating non-randomised intervention studies. Health
Technology Assessment 2003; Vol. 7: No. 27.
Some key points from this article:
- Comparison of results of randomized and non-randomized studies
across multiple interventions in multiple studies demonstrates
that, in the majority of cases, the results of observational studies
are not consistent with the results of RCTs
- This study, using meta-epidemiological techniques, demonstrates that:
  - None of the study results can be adequately adjusted for bias
  in observational studies using historic and concurrent controls
  - Logistic regression on average increases bias when applied
  to observational studies
Conclusions
- Non-randomized studies may still give seriously misleading results even when
the treated and control groups appear similar in key prognostic
factors
- Standard methods of case-mix adjustment do not guarantee removal
of bias
- Omission of important confounding factors can explain failure
of adjustment as a substitute for randomization
- There is no known method for reliably adjusting for confounding
factors in observational studies
Delfini Commentary
Extreme caution is urged when considering results of observational studies
of interventions for screening, prevention and therapy. Cause and
effect conclusions should only be drawn from RCTs.
One reason for this is that there may be major differences in the
characteristics (prognostic factors) of individuals who choose a therapy
compared to people who do not choose that therapy. A classic example is
hormone replacement therapy (HRT) after myocardial infarction (MI) in women.
Most observational studies reported that roughly twice as many women
who did not choose to take HRT had a recurrent MI compared to women
who chose to take HRT. This led people to believe, incorrectly, that
HRT caused this benefit. Later, well-done RCTs were conducted and no such
benefit was found. Why? The most likely reason is that the observational
studies were highly prone to biases resulting from differences between
the groups. These differences could not be eliminated even with statistical
adjustments in which researchers try to balance confounders between the
groups, such as adjusting for smoking.
Another reason is that in observational studies, investigators do not
"control" all elements of the study as they do in RCTs. The end result is
that in observational studies other aspects affecting the groups
are almost certain to be different in important ways which are likely
to explain or affect the study results.
Key Point
Any difference between groups, except for what is being studied
(e.g., HRT use), is a bias.
In the case of HRT after MI, selection bias was present in that women
who chose to take HRT were probably more likely to be "health-conscious,"
exercise, watch their diets, etc., making them different from the
women who did not take HRT. It is also likely that there were other
differences in how the two groups experienced their health care.
In observational studies there is no formal protocol, so the groups
will differ in many ways that could have affected observed
outcomes, such as other therapies used, how outcomes are assessed,
frequency of follow-up, and so on.
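The selection effect described above can be illustrated with a small simulation. This is a minimal sketch, not data from any study: the trait name `health_conscious`, the risks (5% vs. 15%) and the probabilities of choosing HRT (80% vs. 20%) are all made-up assumptions. HRT is given no effect at all on recurrent MI, yet with these numbers the observational comparison is expected to show a risk ratio of roughly 1.9 for non-users versus users, while random assignment is expected to show a ratio of about 1.0.

```python
import random


def simulate(n=200_000, seed=0):
    """Return (observational, randomized) risk ratios: non-users vs. HRT users."""
    rng = random.Random(seed)
    # For each comparison: {took_hrt: [recurrent_mi_count, women_count]}
    observational = {True: [0, 0], False: [0, 0]}
    randomized = {True: [0, 0], False: [0, 0]}

    for _ in range(n):
        # Hypothetical unmeasured trait that lowers MI risk (assumed values).
        health_conscious = rng.random() < 0.5
        # Recurrent MI depends ONLY on that trait; HRT has no effect here.
        mi = rng.random() < (0.05 if health_conscious else 0.15)

        # Observational setting: health-conscious women choose HRT more often.
        chose_hrt = rng.random() < (0.8 if health_conscious else 0.2)
        observational[chose_hrt][0] += mi
        observational[chose_hrt][1] += 1

        # Randomized setting: a coin flip decides, independent of any trait.
        assigned_hrt = rng.random() < 0.5
        randomized[assigned_hrt][0] += mi
        randomized[assigned_hrt][1] += 1

    def risk_ratio(counts):
        risk_without = counts[False][0] / counts[False][1]
        risk_with = counts[True][0] / counts[True][1]
        return risk_without / risk_with

    return risk_ratio(observational), risk_ratio(randomized)


obs_rr, rct_rr = simulate()
print(f"observational risk ratio (non-users vs. users): {obs_rr:.2f}")
print(f"randomized risk ratio (non-users vs. users):    {rct_rr:.2f}")
```

The only thing that differs between the two comparisons is how women end up in the HRT group, which is why the apparent benefit disappears under randomization.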
Even with statistical adjustments for differences between potential and
known prognostic characteristics of the groups, bias cannot be reliably
eliminated because whatever is actually responsible for the outcome
(i.e., the confounder) is what would have to be adjusted for. This would
entail having advance knowledge of cause and effect (but that is
why the study is being conducted). In addition, statistical adjustment has
limitations. How could every single factor that made the HRT users
different be adjusted for? Humans embody a practically unlimited number of
variables such as characteristics and exposures.
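The limits of adjustment can also be sketched in code. The example below is an illustration under stated assumptions, not a reanalysis of any study: the cohort is simulated, `smoker` stands in for a measured confounder and `exerciser` for an unmeasured one, all probabilities are invented, and HRT again has no effect. Adjustment is done by stratification with a Mantel-Haenszel pooled risk ratio, chosen here for simplicity. With these numbers, adjusting for smoking alone is expected to shrink the apparent benefit but not remove it. Only stratifying on the unmeasured factor as well brings the ratio back to about 1, and in a real study that factor would not be available.

```python
import random
from collections import defaultdict


def make_cohort(n=200_000, seed=1):
    """Simulated women after MI; every probability here is a made-up assumption."""
    rng = random.Random(seed)
    cohort = []
    for _ in range(n):
        smoker = rng.random() < 0.3       # measured confounder
        exerciser = rng.random() < 0.5    # unmeasured confounder
        # Risk of recurrent MI depends on smoking and exercise, never on HRT.
        risk = 0.06 * (2.0 if smoker else 1.0) * (0.5 if exerciser else 1.0)
        # Exercisers choose HRT more often; smokers choose it less often.
        p_hrt = 0.5 + (0.25 if exerciser else -0.25) - (0.15 if smoker else 0.0)
        cohort.append({
            "smoker": smoker,
            "exerciser": exerciser,
            "hrt": rng.random() < p_hrt,
            "mi": rng.random() < risk,
        })
    return cohort


def mh_risk_ratio(records, stratum_of):
    """Mantel-Haenszel risk ratio (non-users vs. users), pooled over strata."""
    cells = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for r in records:
        cell = cells[stratum_of(r)][r["hrt"]]
        cell[0] += r["mi"]   # events
        cell[1] += 1         # women
    num = den = 0.0
    for c in cells.values():
        (ev_u, n_u), (ev_t, n_t) = c[False], c[True]
        total = n_u + n_t
        num += ev_u * n_t / total
        den += ev_t * n_u / total
    return num / den


cohort = make_cohort()
crude = mh_risk_ratio(cohort, lambda r: 0)  # one stratum = no adjustment
smoking_adjusted = mh_risk_ratio(cohort, lambda r: r["smoker"])
fully_adjusted = mh_risk_ratio(cohort, lambda r: (r["smoker"], r["exerciser"]))
print(f"crude: {crude:.2f}")
print(f"adjusted for smoking only: {smoking_adjusted:.2f}")
print(f"adjusted for smoking and exercise: {fully_adjusted:.2f}")
```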
Comparisons of RCTs and observational studies of the same interventions have
repeatedly demonstrated that even with the most meticulous statistical
adjustments, bias cannot be reliably eliminated from observational
studies. The key message is that without randomization and assurance
that interventions and assessments are the same for both study and
comparison groups, one cannot reliably draw conclusions about cause
and effect relationships. Associations between interventions and
outcomes in observational studies are very likely to be due to bias
or confounding. Therefore, observational studies are useful only
for generating hypotheses when considering questions of preventive,
screening or therapeutic interventions.
Database Studies
Some groups have tried to demonstrate improved health outcomes (e.g.,
death, stroke, etc.) through studies of their databases. It should
be remembered that this type of study is an observational study
and prone to bias and confounding for the reasons explained above,
plus it is highly prone to chance findings of statistical significance.
Therefore, database studies may be useful for suggesting areas for
further study, but they should not be thought of as valid studies
from which cause and effect relationships can be concluded.
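One common reason database studies are prone to chance findings is that a database makes it easy to run many comparisons. As a rough sketch, assuming the comparisons are independent and each is tested at the conventional 0.05 significance level (both are assumptions for illustration, and the function name is hypothetical), the probability of at least one "significant" result when no real effect exists is 1 - (1 - alpha)^k:

```python
def prob_any_false_positive(k, alpha=0.05):
    """Chance of at least one false-positive result among k independent tests,
    each run at significance level alpha, when no true effect exists."""
    return 1 - (1 - alpha) ** k


for k in (1, 5, 20, 100):
    print(f"{k:>3} comparisons: {prob_any_false_positive(k):.2f}")
```

Under these assumptions, 20 comparisons already give about a 64% chance of at least one spurious "significant" finding.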