Evidence-based Medicine

Evidence-based medicine essentials: tips at a high level view



Video Tutorial: Measures of Outcomes (12 min)

WELCOME
Welcome to a brief (and hopefully fun) online tutorial on the basics of critical appraisal of therapeutic interventions. The emphasis is on evaluating efficacy claims of primary studies using superiority designs.

WHY THIS MATTERS
Here's the short story on the huge problem we are trying to help solve: Delfini Flier. For more details on why critical appraisal matters so much, go to our Critical Appraisal Facts page.

RESOURCES AT OUR SITE: ABOUT SCIENCE & MORE THAN SCIENCE
There is a wealth of freely available resources throughout our whole web site. The broad focus of our work is healthcare information, clinical improvement and patient communications. We specialize, however, in the evaluation of medical science as that area is particularly neglected and the need is so great. At this page you will find basic instruction in critical appraisal of the medical literature with an emphasis on efficacy of therapies as evaluated through primary superiority studies (original research). We strongly recommend you read this first:
Why Critical Appraisal Matters.

CAVEATS
There are exceptions to everything we say. Think of everything on our site as being a generality. There is no perfection in this work. Much judgment is required.

MORE ON RESOURCES
Many instructional materials that go more deeply can be found at the Delfini Library of Tools & Educational Materials. For a satellite view, see the Overview at Possibly Our Most Important Reading at the Library.

Time-saving tips are available at the Delfini Library of Tools & Educational Materials (see Pearls: Basics of Evaluating Evidence in Superiority Trials for Therapies).

We list and link relevant additional reading below specific sections. Sometimes we link the same document in several places to keep key reading with the right context.

Also, we link to PDFs here for easier online reading; however, for templates we have Word documents available at the Delfini Library to enable you to easily fill in forms as you like.

Let's Go On A Little Journey! Into the Land of Critical Appraisal of the Medical Literature We Go!


Delfini Definition of Evidence-based Medicine (EBM)
"Evidence-based medicine is the use of the scientific method and application of valid and useful science to inform health care provision, practice, evaluation and decisions."

Science helps us by improving our ability to predict probability of outcomes.

We believe it is fine to make decisions on factors other than science. Know the science first—and do not confuse opinion or other factors with science.

More on Evidence-based Medicine for consumers and their clinicians.

Delfini's Hallmarks of Evidence-based Practice

1) When seeking information on a topic, a systematic search is conducted for science and science-based information using evidence-based searching and filtering techniques.

2) All sources of information to guide medical decision-making are critically appraised, using science-based principles, for validity and usefulness.

3) Any conclusions drawn from the science are carefully crafted to be as valid as possible.

4) Methods used and reporting are transparent so that the work can be evaluated for quality, replicated and updated.

5) Clinical information sources are updated when significant new information becomes available and such information is periodically sought.

We then apply evidence-based medicine to clinical quality improvement. To learn more about the Delfini approach to clinical quality improvement & value, read about the Delfini Evidence- and Value-based Clinical Quality Improvement Model.

Key Terns (Oops! That would be Terms... Oops, this is a Gull...No! This is the Delfini Flier!!!)

Go to our page on Why Critical Appraisal Matters to see a short list of key terms (to the left on that page).

Five "A"s of Evidence-based Medicine: Process Steps

  1. Ask: Question framing drives your whole project. Consider PICO: Population/condition, intervention, comparator, outcome.
  2. Acquire: Use filtering techniques to narrow your yield. Match study type to clinical question. For efficacy of therapies, look for randomized controlled trials (RCTs). See the Searching & Sources Tool at Delfini Library of Tools & Educational Materials.
  3. Appraise: Critically appraise all scientific sources and references for validity (closeness to truth) and clinical usefulness (see Key Terms).
  4. Apply: See the Overview at Welcome. Other clinical quality improvement tools are available at the Delfini Library of Tools & Educational Materials.
  5. "A"s Again: Update! See the Searching & Sources Tool at Delfini Library of Tools & Educational Materials for advice on updating.

Modified from Leung 01 PMID: 12597509.

The 3 Steps of Critical Appraisal (To Avoid Getting Burnt!)

  1. Match study design to clinical question.
  2. Assess validity of the study.
  3. Assess usefulness of the results.
Step 1. Match Study Design to Clinical Question

The 2 General Study Types

ebm head bonkers

Basic Basics

  1. You need at least two groups to compare for differences.
  2. These groups should be from the same pool of people at the same time.
  3. The groups should be similar (except for what you are studying). What if these groups started out different from each other (as illustrated) and this was a study of headache? Ow! There's a little bias going on.

Medical research is about comparing the difference in outcomes between groups. For efficacy of therapies, look for valid randomized controlled trials (RCTs).

For Further Reading

From the Delfini Library of Tools & Educational Materials, a short primer, Problems with Case Series.


Research studies are observations or experiments.

  • In observations, you observe what happens naturally. Observations are highly prone to bias no matter what.
  • Experiments, of which randomized controlled trials (RCTs) are the ideal, are the best study type for studying efficacy of therapies, unless you have all-or-none results, which are very rare.

Observational studies have many opportunities for flaws that the RCT design corrects.


The Quick Method to Identify Experiments

Did the patient (or his or her physician) CHOOSE the treatment? If yes, this is an observational study. Choice often is associated with other factors which could affect the results.

For Further Reading

From the Delfini Library of Tools & Educational Materials—

  • A short primer to learn about some problems with Observational Studies. Observational studies have many important uses, but they are highly prone to bias in evaluations of therapies.
  • Also see our 1-pager on "Real World" Data


Randomization helps to create equal groups by distributing prognostic variables across groups. You want NO DIFFERENCE between groups except what you are studying.
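To make the idea concrete, here is a minimal sketch of simple randomization. Everything in it is an assumption for illustration: the participants are invented, "age" stands in for any prognostic variable, and the `randomize` helper is hypothetical rather than a tool from this site. Real trials use more careful methods (such as concealed allocation), so treat this only as a picture of why chance assignment tends to even out the groups.

```python
import random
from statistics import mean


def randomize(participants, seed=0):
    """Shuffle participants and split them into two arms of equal size."""
    rng = random.Random(seed)      # seeded only so the sketch is repeatable
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Invented participants: each number is an age, standing in for any
# prognostic variable (known or unknown) that could affect the outcome.
ages = list(range(30, 90)) * 10    # 600 hypothetical people
treatment, control = randomize(ages, seed=42)

print("group sizes:", len(treatment), len(control))
print("mean ages:", round(mean(treatment), 1), round(mean(control), 1))
```

With groups of this size the two mean ages should land close together. Nobody chose who went where, so nothing about the choice can be tied to the outcome.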

Step 2. Assess Validity of the Study

The 2 General Validity Types

  1. External validity is the truth of the study outside of the study context (i.e., "real life" application). Considerations include similarity of studied patients to real life patients and circumstances of care. If these are different, outcomes may no longer be predictable.
  2. Internal validity is the truth of the study within its own context. To assess this, we perform critical appraisal looking for anything which may invalidate the study.

Preparation: Internal Validity Section 1-Pager

This section is summarized in the Delfini Library of Tools & Educational Materials in Pearls: Basics of Evaluating Evidence in Superiority Trials for Therapies. You may wish to print it out now for notetaking as you read and for later review. We strongly recommend using this tool, or another similar tool that you like, to help you be mindful of what to look for in a study to assess its validity.

Internal Validity: Key Questions

  1. What can explain the results other than "truth?"
  2. We look for bias, confounding and chance as possible explanations for our results. If we can rule out these, then it may be reasonable to conclude we have found "truth." (In general, we assess the likelihood of chance outcomes as part of assessing results.)
  3. Another way to state this is that there are 4 possible explanations for an association with an outcome: bias, confounding, chance or cause and effect. Key point: not all associations are causal.
  4. We are alert to whether anything may favor the intervention.
  5. We need transparent details of what was done (such as details of randomization) to be able to evaluate likely success.
  6. Problems or no details go on a list of "threats to validity."
  7. One or more threats can render study results uncertain.
  8. We assign a grade to our evaluation. See Delfini Library of Tools & Educational Materials for more advice about grading and about wording conclusions from studies or see below at Further Reading for this section.

The 4 Phases of A Clinical Trial


There are four phases of a clinical trial.

  1. Selection: Who is studied, how were they selected, how were they assigned to their groups?
  2. Performance: What is being studied and what is it being compared to?
  3. Data Collection & Attrition: What information is collected, how is it collected and what is missing?
  4. Assessment: How is the difference in outcomes between the groups evaluated?

Bias

Bias is anything that "systematically" leads away from truth. This just means something led away from truth other than through random chance.

We look for bias in all 4 phases of a study. Bias may happen because of a problem affecting all groups equally, such as biased data collection. Or bias may occur because of a difference between groups other than what is being studied. Any difference between groups other than what is being studied is automatically a bias.

Bias in studies tends to favor the intervention under investigation. Certain kinds of bias have been shown to distort research results up to a relative 50 percent or more—for each flaw.

See Delfini Library of Tools & Educational Materials sections for critical appraisal tools and primers or see Further Reading below this section.

Confounding


Confounding is a special kind of bias in which we are confused into thinking that one variable is responsible for an outcome when it is actually another variable that is responsible.

People who make healthy choices might be more likely than others to take vitamins. Therefore, taking vitamins is "linked" to healthy lifestyle.

So are vitamins responsible for reduced risk of coronary heart disease? Or are other choices that go along with having a healthy lifestyle responsible?
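The vitamin example can be sketched with numbers. The counts below are entirely invented for illustration (they come from no study), and the helper names are hypothetical. They are arranged so that vitamins do nothing at all, yet vitamin takers still look better overall because more of them have a healthy lifestyle.

```python
# (lifestyle, takes_vitamins) -> (people, coronary heart disease events)
# All counts are made up to illustrate confounding.
counts = {
    ("healthy", True): (800, 40),
    ("healthy", False): (200, 10),
    ("less healthy", True): (200, 30),
    ("less healthy", False): (800, 120),
}


def risk(groups):
    """Events divided by people across one or more (people, events) pairs."""
    people = sum(n for n, _ in groups)
    events = sum(e for _, e in groups)
    return events / people


def crude_risk(takes_vitamins):
    """Risk ignoring lifestyle: what a naive comparison would report."""
    return risk([v for (_, vit), v in counts.items() if vit == takes_vitamins])


def stratum_risk(lifestyle, takes_vitamins):
    """Risk within one lifestyle group only."""
    return risk([counts[(lifestyle, takes_vitamins)]])


print("crude:", crude_risk(True), "vs", crude_risk(False))
for lifestyle in ("healthy", "less healthy"):
    print(lifestyle, stratum_risk(lifestyle, True), "vs", stratum_risk(lifestyle, False))
```

In these made-up numbers the crude comparison shows 7 percent versus 13 percent, but within each lifestyle group the risks are identical. The apparent benefit belongs to lifestyle, not to vitamins.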

For Further Reading

From the Delfini Library of Tools & Educational Materials, see

Step 3. Assess Usefulness of the Results

Chance

Are the results just due to tricksy random chance??? We look to p-values (probability values) to determine whether results are likely to be a chance effect.

(Technically, a p-value is the probability of seeing a difference in the study sample at least as large as the one observed, by chance alone, if there were truly no difference in the larger population. There are some complex issues here which you can dig into more deeply at our glossary entry for p-values if you wish, but application is not very realistic or maybe even possible. Consequently, we think of the p-value as an indicator.)

That said, keeping this practical and simple (though not exact), we are looking for statistically significant results which we use as a clue as to the likelihood of results being due to chance.

Some key points (oversimplified):

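  • A p-value of <0.05 roughly and practically can be thought of to mean that there is a less than 1-in-20 chance that the findings are due to chance. (More precisely, if there were truly no difference between the groups, a result at least this large would turn up less than 1 time in 20. When you think about it, that's pretty high! I'd like that lotto ticket!)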
  • P-values only have to do with measurements of chance. They cannot measure bias or tell you whether the results are true.
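  • Performing multiple assessments (multiple outcomes or multiple analyses) increases the likelihood of a chance effect, potentially to as high as the number of outcomes or analyses times the threshold: 20 outcomes each evaluated at 0.05 could mean as high as a 20-in-20 risk of at least one chance finding. This is 100 percent! (That is the upper limit; if the 20 outcomes are independent of one another, the risk is about 64 percent.)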
  • Stopping clinical trials early, regardless of utilizing stopping rules, puts results at very high risk of being due to chance.
  • We want certain things determined in advance of a study (a priori) to reduce the likelihood that findings are due to chance: research questions, populations for analysis and outcome measures.
  • Findings of non-significance could be chance effects—meaning there might have been too few people studied to realize an outcome. Look at the confidence intervals to see if there is a possibility of a meaningful result!
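The last point above, about non-significant findings in small studies, can be illustrated with a confidence interval for the difference in risk between two groups. This is a minimal sketch using the simple normal-approximation (Wald) interval, which is our choice for illustration and not necessarily what any given study used. The function name and the study sizes are hypothetical.

```python
import math


def risk_difference_ci(events_control, n_control, events_treated, n_treated, z=1.96):
    """Return (difference, lower, upper) for control risk minus treated risk.

    Uses the normal-approximation (Wald) interval; z=1.96 gives roughly 95%.
    """
    p_c = events_control / n_control
    p_t = events_treated / n_treated
    diff = p_c - p_t
    se = math.sqrt(p_c * (1 - p_c) / n_control + p_t * (1 - p_t) / n_treated)
    return diff, diff - z * se, diff + z * se


# Same observed risks (15 percent vs 10 percent), two invented study sizes.
small = risk_difference_ci(15, 100, 10, 100)
large = risk_difference_ci(300, 2000, 200, 2000)

for label, (diff, low, high) in (("100 per group", small), ("2000 per group", large)):
    print(f"{label}: difference {diff:.3f}, 95% CI {low:.3f} to {high:.3f}")
```

With 100 people per group the interval runs from roughly a 4 point harm to a 14 point benefit, so the result is "non-significant" yet a meaningful benefit has not been ruled out. With 2000 per group the same observed difference has an interval of roughly 3 to 7 points, which excludes zero.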

 

 

20-Sided Die

The 20-sided die is here as a reminder that, mathematically, we can ensure we get chance effects just by looking at enough variables. This is why database research is hypothesis-generating only.
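The arithmetic behind the die can be sketched in a few lines. The sketch assumes every test uses the same threshold and, for the first function, that the tests are independent of one another, which real outcomes often are not. The function names are hypothetical.

```python
def familywise_error_rate(num_tests, alpha=0.05):
    """Chance of at least one false-positive among independent tests."""
    return 1 - (1 - alpha) ** num_tests


def upper_bound(num_tests, alpha=0.05):
    """Worst case regardless of dependence (the Bonferroni-style bound)."""
    return min(1.0, num_tests * alpha)


for k in (1, 5, 20, 100):
    print(k, round(familywise_error_rate(k), 3), upper_bound(k))
```

For 20 tests this gives about 0.64 under independence and 1.0 as the worst-case bound. By 100 tests a chance "finding" is close to guaranteed either way.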

 


For Further Reading

From the Delfini Library of Tools & Educational Materials

From our Web Links, compute confidence intervals

Clinical Significance

Meaningful clinical benefit has two main considerations: outcome area + size of the outcome.

Meaningful areas of outcomes for study =

  1. Morbidity
  2. Mortality
  3. Symptom Relief
  4. Functioning (emotional or physical)
  5. Health-related Quality of Life

Outcomes that are not one of these areas (such as values from lab tests) are called intermediate or proxy markers and should only be considered if we have a strong causal chain of proof that affecting that marker leads to a beneficial effect in one of the five areas listed above.

A Little About Analysis

Analysis is a very big topic. And many people worry overly about needing to have a deep understanding of statistics in order to evaluate the reliability of studies. Frequently a general understanding of a few basic statistics as described below and an understanding of confidence intervals is sufficient, provided the study has been analyzed for potential distortion from bias, confounding or chance.

Missing data points are an important validity consideration. The real issue concerns whether those who provide study data and those who do not differ from the randomized population in prognostic variables which could affect study outcomes. If they do not differ in these ways, it is reasonable to conclude that the absence of data did not distort the study outcomes.

That said, there are many ways populations may be analyzed. We prefer to see several methods used because we think sensitivity analyses are helpful to provide clues to reliability. So it is useful to see completer analyses (meaning the population analyzed was restricted to only those who completed the study) or per protocol analyses (people for whom the protocol was strictly followed), but the method preferred by us and many others for evaluating efficacy of a therapy in a superiority trial is what is called intention-to-treat analysis (ITT).

ITT analysis requires that patients be analyzed in the group to which they were randomized, regardless of the actual intervention received and regardless of study completion. This in turn requires the use of some reasonable method for assigning missing data points (data imputation). You can read more about ITT analysis here.
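Here is a minimal sketch of how the analysis population changes the answer. The patient records are invented, the `Patient` class and function names are hypothetical, and the imputation rule (count every missing outcome as "not improved") is just one simple assumption chosen for illustration. It is not presented as the preferred imputation method.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Patient:
    randomized_to: str         # "treatment" or "control"
    received: str              # what the patient actually got
    improved: Optional[bool]   # None means the outcome is missing


def itt_rate(patients, arm):
    """Analyze everyone by randomized group; missing counts as not improved."""
    group = [p for p in patients if p.randomized_to == arm]
    return sum(1 for p in group if p.improved is True) / len(group)


def per_protocol_rate(patients, arm):
    """Analyze only those who got their assigned arm and have an outcome."""
    group = [
        p for p in patients
        if p.randomized_to == arm and p.received == arm and p.improved is not None
    ]
    return sum(1 for p in group if p.improved) / len(group)


# Invented data: 10 patients randomized to each arm.
patients = (
    [Patient("treatment", "treatment", True)] * 6
    + [Patient("treatment", "treatment", False)]
    + [Patient("treatment", "treatment", None)] * 2   # dropouts
    + [Patient("treatment", "control", False)]        # crossed over
    + [Patient("control", "control", True)] * 4
    + [Patient("control", "control", False)] * 5
    + [Patient("control", "control", None)]           # dropout
)

for arm in ("treatment", "control"):
    print(arm, "ITT:", itt_rate(patients, arm),
          "per protocol:", round(per_protocol_rate(patients, arm), 3))
```

In this made-up data the treatment arm shows 6 of 10 improved under ITT but 6 of 7 under per-protocol analysis. Seeing both side by side is the kind of sensitivity check described above.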

ITT is not appropriate for non-inferiority and equivalence trials if the data imputation methods favor no difference between groups.

Safety populations for study should be restricted to only those who actually received the study medication, and they should be analyzed by the treatment actually received rather than by randomized group. Including patients who do not receive the drug can inappropriately understate adverse effects.

For Further Reading

From the Delfini Library of Tools & Educational Materials

Size of Results

I repeat: Meaningful clinical benefit has two main considerations: outcome area + size of the outcome.

Two key issues include—

  1. What is the kind of measure used to assess the size of the difference in outcomes between groups? (Remember! This is the whole point of the research study! The difference in the outcomes between the groups!) We will address the kind of measure forthwith after explaining that another key issue is:
  2. What is the associated time period within which the outcome occurred? This is the time period of the study. A key point about this is that the outcome occurred at some point "within" the study time period unless the study has been able to pinpoint the timing more directly.


“Size matters,” according to a palate-weary Sunset panel, after blind-tasting more than 50 chickens, which I learn in my search for the perfect roast chicken. So too in research results.

Measures of Outcomes

This is a sparkly and fancy term (as many terms in medical science unnecessarily are) for measures used to assess the size of the difference between the groups studied. Terms which express this difference include estimates of effect and point estimates. Here are some ways in which these are expressed. (More, such as Odds Ratios, are detailed at Delfini Library of Tools & Educational Materials.)

For Further Reading

From the Delfini Library of Tools & Educational Materials


Keep reading OR sit back and watch and listen to our video tutorial on Measures of Outcomes (12 min) or do both!

Risk With & Without Treatment (My Personal Favorite!)


 

Study outcomes at their most basic. Simple! What happened to everyone! (Not always easy to find in trial information, shockingly, but what I, as a critical appraiser—and as a patient—most want to know.) Don't give me that fancy stuff.

Absolute & Relative Risk Reduction


  • Absolute measures are the percentage point differences between the percent outcomes in each group. In other words, if 15 percent of people die in the control group and 10 percent of people die in the study group, the Absolute Risk Reduction (ARR) is 5 percentage points.
  • Relative measures, such as Relative Risk Reduction (RRR), reflect the relative difference in size between groups. 10 is one third smaller than 15, so the RRR is 33 percent. Now I can sell more of everything because 33 sounds bigger than 5 even though it is the same result. Relative measures can (rarely) equal absolute measures, but they are never smaller. They are almost always bigger. Very bigger!
  • Relative begs the question, "Relative to what?" Only knowing the relative measure is like learning there is a 90 percent-off sale and not being told 90 percent off of what. You know you need to know the base price. So too in medical research.
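The bullets above can be worked through in a few lines. The 15 percent and 10 percent figures come from the example in the text. The second pair of risks (1.5 percent and 1.0 percent) is invented to show how the same relative number can sit on top of a very different absolute one. The function names are hypothetical.

```python
def absolute_risk_reduction(control_risk, treated_risk):
    """ARR: the plain difference between the two risks."""
    return control_risk - treated_risk


def relative_risk_reduction(control_risk, treated_risk):
    """RRR: the same difference expressed as a fraction of the control risk."""
    return (control_risk - treated_risk) / control_risk


# (control risk, treated risk): the text's example, then an invented rarer outcome.
for control, treated in ((0.15, 0.10), (0.015, 0.010)):
    arr = absolute_risk_reduction(control, treated)
    rrr = relative_risk_reduction(control, treated)
    print(f"control {control:.1%}, treated {treated:.1%}: "
          f"ARR {arr * 100:.1f} points, RRR {rrr:.0%}")
```

Both rows report an RRR of 33 percent, yet the ARR is 5 percentage points in one and half a point in the other. That is the "relative to what?" problem in miniature.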

Number-Needed-to-Treat & Number-in-100

Number-needed-to-treat (NNT) is the reciprocal of the ARR. You take the ARR out of being a percentage: in the example above, 5 goes into 100, 20 times. You need to treat 20 people to benefit 1 person.

This is potentially confusing to patients (and probably to all of us) since we may be biased toward the larger number.

We like the approach (thank you, Dr. Tim Young of St. John Medical Center & OMNI Medical Group in Tulsa, Oklahoma) of simply taking the ARR number and expressing that as the number-out-of-a-hundred: in the above example, 5 out of 100 patients.

We like simplified things such as natural frequencies.
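As a small sketch of the two presentations, the functions below take the ARR in percentage points (5 in the running example) and return the NNT and the number-in-100 wording. The function names and the exact wording of the output sentence are our own assumptions for illustration.

```python
def number_needed_to_treat(arr_percentage_points):
    """NNT: how many times the ARR (in percentage points) goes into 100."""
    if arr_percentage_points <= 0:
        raise ValueError("NNT is only defined here for a positive ARR")
    return 100 / arr_percentage_points


def number_in_100(arr_percentage_points):
    """Express the same ARR as a natural frequency."""
    return f"{arr_percentage_points:g} out of 100 patients benefit"


arr = 5  # 15 percent (control) minus 10 percent (treated), from the text
print(number_needed_to_treat(arr))
print(number_in_100(arr))
```

Both lines describe the same result: treat 20 people to benefit 1, or 5 out of 100 patients benefit. As with any of these measures, the figure applies within the time period of the study.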

For Further Reading

See Delfini Library of Tools & Educational Materials sections for critical appraisal tools and primers including 1-pagers on Analyzing Results including guidance on Confidence Intervals, Intention-to-Treat analysis, Time-to-Event analysis and more such as cross-over design and oncology outcomes, evaluating registries, critical appraisal of screening and diagnostic tests and more. Much more including help regarding evidence synthesis, suggestions for decision support and communication aids, health care economic analysis tools, evidence-based QI tools, performance measurement tools and more...

Critical Appraisal Example
An example of a critical appraisal report The DelfiniClick™:
Radiofrequency for the Treatment of Gastro-esophageal Reflux Disease
Example: full story; appraisal only

Safety

It is challenging to assess safety for a number of reasons:

  • Outcomes are infrequent, hard to find and frequently due to chance.
  • Reporting may be selective.
  • Duration of follow-up may be too short.
  • We frequently need to resort to weaker evidence, but should be cautious when so doing.
  • BEWARE of statements of non-significance when it may be an issue of too few people studied: review the confidence intervals!

For Further Reading

From the Delfini Library of Tools & Educational Materials, our 1-pager on Safety.

Grading

Evidence grading is an assignment of a summary code or statement to represent the reliability and/or clinical usefulness of—well—some evidence! This may be an outcome, a conclusion, a study, a recommendation—whatever.

Here are some key points and tips.

  • There are a trillion systems (or somewhat fewer, but many!). In choosing a system, consider complexity and criteria.
  • In seeing a grade, look up the criteria. Many criteria are flawed and allow overstatement of quality.
  • Our system is simple and quick. A is outstanding. We celebrate B's—about 3 percent of the time. We don't want a C because that is passing, so we quickly leap downward into the bowels of the alphabet and grasp at U (for uncertain) about 90 percent of the time. About 7 percent of the time, we are perched at the border of B and U for a grade of BU.

See Delfini Library of Tools & Educational Materials for more information on Evidence Grading.

For Further Reading

From the Delfini Library of Tools & Educational Materials, see

Secondary Studies & Secondary Sources

Garbage in, garbage out, as they say.

  1. The biggest problem with most (and we do mean the majority!) of secondary studies and sources is that invalid studies are used!
  2. A second big problem is that a systematic approach is not used (see above EBM Hallmarks).
  3. A third big problem is lack of transparency.

Buyer beware cannot be stated biggly enough!


For Further Reading

From the Delfini Library of Tools & Educational Materials, see

  • Secondary Studies (1-pager)
  • Secondary Studies (long tool): another important tool—our critical appraisal checklist for secondary studies (overviews/narrative reviews, systematic reviews and meta-analyses) including examples of good answers and examples of poor answers
  • Secondary Sources (1-pager)
  • Secondary Sources (long tool): another important tool—our critical appraisal checklist for secondary studies (overviews/narrative reviews, systematic reviews and meta-analyses) including examples of good answers and examples of poor answers
  • Audit advice for secondary studies and secondary sources (1-pager)

A Few Bottom Lines

  • One bottom line is that we like to see several valid studies done in slightly different ways by groups with different interests (potential conflicts) with consistent results to feel we have arrived at "truth!"
  • A very big bottom line is USE TOOLS! Find tools you like and that will not lead you into trouble. For example, avoid Jadad scoring at all costs! Use our tools at the Delfini Library if you like them. If not, find something you like that is likely to lead you to a valid assessment of a study.
  • We emphasize use of tools because they help remind you of what to assess and help remind you of what is a threat to validity because of what is missing.

Contact Delfini

You can reach us at delfini (at) delfini.org. Write us for our DelfiniGram for bi-monthly announcements of updated tools and information, contact us with questions or just to say hello.

Warm best, Sheri & Mike



Use of our website implies agreement to our Notices.


"I have been meaning to thank you and Mike for all of the training and insight that you have given me. It really has made a huge difference in my professional life. For example, this summer I tackled an important safety debate. It took me many hours, but at the end of the day, I concluded that we, as a company, needed to make new recommendations. Without my Delfini training, I would have never had the confidence, nor been able to develop a convincing case. I know sometimes the jaded medical world may feel that evidence-based medicine and critical evaluations are nice academic exercises that rarely, if ever, influence member safety. We know that this is not the case. Keep carrying the banner of enlightenment in 2011! Your workshops help re-ignite for many the real reason they got involved in pharmacy and medicine in the first place. We definitely can make a difference..." Name Held in Confidence


Delfini SERVICES
We offer a wide range of supportive services with the ultimate goal of increasing health care organizations’ and clinicians’ abilities to provide high quality, evidence-based care and to improve medical decision-making for organizations, leaders, teams, clinicians and patients.


© 2002-2012 Delfini Group, LLC. All Rights Reserved Worldwide. Dolphin photo for Dolphins poolside attributed to RevolverOcelo. Delfini Boat thanks to Dr. Bat Shunatona. Thanks too to contributors to Openclipart.org and Wiki Media.