Freshman Statistics Seminar

Week 4: Levels of Evidence: Observational Studies versus Experiments

Marta Shore

Objective: The overall goal is to understand that there are different levels of evidence, and that each successive level provides stronger evidence than the one before it.

Article Summary: Wall 2006 “The Question of Causation”

The handout is adapted from one given by Melanie Wall, professor of biostatistics in the department of public health at the University of Minnesota. In the handout, we define a concept called the “counterfactual”. In essence, the idea is that we can only prove that an action produces a desired result if (a) the result happens when the action is taken, and (b) the result would not have happened had the action not been taken; this second, unobserved scenario is the counterfactual. Since we can’t go back in time to observe the counterfactual, we have to approximate it with a proxy, and the levels of evidence differ in how well they do so. The two supporting articles are mentioned in the handout and simply provide examples of an observational study and a randomized controlled trial (i.e., an experiment), which are two of the levels.

Suggested Lesson Structure: The idea is to go through the handout in class, starting with the first section.

The idea behind the first section of this handout is to get the students to think of how to prove a certain treatment produces a certain result.

  • Give them a chance to come up with the best way to prove a diet program works.
  • Ultimately, the only way to definitively prove that a treatment produces a result is to try the treatment, get the result, then go back in time to the exact point of treatment (i.e., have the exact same conditions), withhold the treatment, and show that the result doesn’t happen.
  • We obviously can’t do that.  So what can we do?
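
For instructors who want something concrete to project, the “go back in time” idea can be sketched in a few lines of Python. This is only an illustration: the five people, the weight changes, and the 3 kg diet effect are all invented assumptions, not figures from the handout or the articles. The point it shows is that every person has two possible outcomes, but we only ever get to see one of them.

```python
import random

random.seed(0)

# Hypothetical numbers for illustration only: each person has two potential
# 12-week weight changes in kg, one with the diet and one without it.
people = []
for _ in range(5):
    without_diet = random.gauss(-1.0, 1.0)  # change if they do NOT diet
    with_diet = without_diet - 3.0          # assumed: the diet removes 3 kg more
    people.append((with_diet, without_diet))

# With a time machine we could see both outcomes for everyone.
true_effects = [w - wo for w, wo in people]
average_true_effect = sum(true_effects) / len(true_effects)

# In reality each person takes one path, so one outcome is always missing.
observed = []
for with_diet, without_diet in people:
    on_diet = random.random() < 0.5
    observed.append((on_diet, with_diet if on_diet else without_diet))

for on_diet, change in observed:
    label = "diet   " if on_diet else "no diet"
    print(f"{label}  observed change: {change:+.1f} kg  (other outcome unseen)")
print(f"Average true effect (needs both outcomes): {average_true_effect:+.1f} kg")
```

The last line can only be computed because the simulation knows both outcomes for each person. Each level of evidence in the handout can be read as a different attempt to fill in the missing column.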

The first level of evidence is the anecdotal evidence level. It’s better than having no evidence that a diet program works, but there are lots of ways to refute this evidence. The class should try to come up with some. Here are some questions to get the discussion started:
Discussion Points:

  • How many people had this level of success? How many people didn’t? Can you tell this from the  anecdotal evidence?
  • What level of success did the typical person experience?
  • How many people could stick with the program?
  • Are there other factors, such as exercise, that could’ve contributed to the result?
  • How do we know the “before” and “after” are really “before” and “after”? Do we even trust the evidence we have?
  • What is the counterfactual?

The second level of evidence is the observational study, as illustrated by the article from Nutrition Today.
Discussion Points:

  • How is this “better” evidence than the anecdotal evidence? In particular, do the authors of this study have a vested interest in demonstrating the effectiveness of this program? Do you get a sense of what the average lifetime member’s results were?
  • Why do you think they included other information, such as age?
  • Do you think there could be any bias in this study? What about the people who sign up for “lifetime membership” in Weight Watchers? Are people who are willing to pay for weight loss more motivated to lose weight?
  • What is our negative control/counterfactual?

The third level of evidence is the experiment, as illustrated by the JAMA article. Here, people were randomly assigned to either a treatment or a control group, and they were tracked over time whether or not they stuck with the recommended diet plan.

Discussion Points:

  • How is this “better” evidence than the observational study?  In particular, do you have a sense of what the results are for the average Weight Watcher as opposed to the lifetime member?
  • Why, in Figure 3, do you think they included people who rarely or never attended Weight Watchers? Why did they retain this data? How does this make the results more applicable to the average person?
  • What do you think of the negative control/counterfactual?  In particular, do you think it was good to have a negative control where there was actually some weight loss information given to the patients?  How does that strengthen the evidence for this program as opposed to other programs?
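
If the class is comfortable with a little code, the contrast between self-selection and random assignment can be simulated. The sketch below is a toy model under stated assumptions, not a reanalysis of either article: the “motivation” score, the 3 kg program effect, and the rule that motivated people lose more weight on their own are all hypothetical choices made for illustration.

```python
import random
from statistics import mean

random.seed(1)

TRUE_EFFECT = -3.0  # assumed: the program causes 3 kg of extra loss


def weight_change(motivation, in_program):
    # Assumed: motivated people lose more weight on their own, program or not.
    change = -4.0 * motivation + random.gauss(0.0, 1.0)
    return change + (TRUE_EFFECT if in_program else 0.0)


def estimate(assign):
    """Difference in mean weight change, program group minus comparison group."""
    treated, control = [], []
    for _ in range(20000):
        motivation = random.random()  # 0 = unmotivated, 1 = very motivated
        in_program = assign(motivation)
        group = treated if in_program else control
        group.append(weight_change(motivation, in_program))
    return mean(treated) - mean(control)


# Observational study: more motivated people are more likely to sign up.
observational = estimate(lambda m: random.random() < m)
# Experiment: a coin flip decides who joins, ignoring motivation.
randomized = estimate(lambda m: random.random() < 0.5)

print(f"Assumed true effect:      {TRUE_EFFECT:+.2f} kg")
print(f"Self-selected comparison: {observational:+.2f} kg")
print(f"Randomized comparison:    {randomized:+.2f} kg")
```

Under these assumptions the self-selected comparison overstates the benefit, because the people who join were already more likely to lose weight, while the coin-flip comparison lands close to the assumed effect. That is the sense in which a randomized control group is a better proxy for the counterfactual.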

Final Discussion Point:
Finally, barring omniscience, which method is the most convincing?  Why?

