Section before scrollytell

Including post-treatment variables can drastically change the results of an experiment. The data for this example were simulated, so we know that, on average, the pest-control caused plants to grow 1.52 inches taller than they would have grown without it. The analysis without the post-treatment variable bugs is very close to the true treatment effect. The analysis that includes the post-treatment variable bugs is far from the true treatment effect and would lead to an incorrect assessment of the non-toxic, environmentally friendly pest-control!
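The contrast between the two analyses can be reproduced with a small simulation. This is a minimal sketch under an assumed data-generating process, not the simulation behind the example: the coefficients, sample size, and the helper z_coefficient are hypothetical, chosen only so that the true effect works out to 1.52 inches and flows entirely through bugs.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical data-generating process (an assumption, not the original simulation):
# the pest-control (z = 1) removes bugs, and every bug costs the plant some height.
z = rng.integers(0, 2, size=n)                 # randomized treatment
bugs = 10 - 4 * z + rng.normal(0, 2, size=n)   # post-treatment variable
height = 20 - 0.38 * bugs + rng.normal(0, 1, size=n)
# Total effect of z on height: (-0.38) * (-4) = 1.52 inches, all of it through bugs.


def z_coefficient(covariates, outcome):
    """Least-squares coefficient on the first covariate, fit with an intercept."""
    X = np.column_stack([np.ones(len(outcome))] + covariates)
    coefs, *_ = np.linalg.lstsq(X, outcome, rcond=None)
    return coefs[1]


effect_without_bugs = z_coefficient([z], height)       # regress height on z only
effect_with_bugs = z_coefficient([z, bugs], height)    # also adjust for bugs

print(f"without bugs: {effect_without_bugs:.2f}")
print(f"with bugs:    {effect_with_bugs:.2f}")
```

In this setup, adjusting for bugs holds fixed the very thing the treatment changes, so the coefficient on z no longer measures the effect of the pest-control on height.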

To demonstrate how adjusting for post-treatment variables introduces bias, consider another hypothetical example: does an exercise program increase muscle strength? The treatment variable z records whether or not an individual participated in the exercise program, the pre-treatment variable. Midway through the study, each individual's weekly trips to the gym were recorded in the post-treatment variable gym.

Post-treatment variables, like gym, have different potential outcomes that depend on the treatment variable. This is the primary problem with controlling for post-treatment variables. The table below shows the potential outcomes and observed outcomes of gym for 2 individuals from our exercise study. In real contexts, we would not have access to both potential outcomes, but we can imagine them for the purposes of this example. Notice that the two individuals differ in z but have the same observed value of gym despite having different potential outcomes. Controlling for gym is not a fair comparison of strength because we are not accounting for how z changes the value of gym. This is not a problem for pre-treatment variables because their values cannot be changed or influenced by z.
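The relationship between potential and observed values of gym can be written out for two individuals. This is a sketch with made-up numbers: the Person class, the field names gym_0 and gym_1, and every value are hypothetical and only illustrate how two people can share an observed gym while differing in their potential outcomes.

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    z: int      # 1 = in the exercise program, 0 = not
    gym_0: int  # weekly gym trips if z = 0 (potential outcome)
    gym_1: int  # weekly gym trips if z = 1 (potential outcome)

    @property
    def gym(self) -> int:
        """Observed gym trips: only the potential outcome matching z is revealed."""
        return self.gym_1 if self.z == 1 else self.gym_0


# Hypothetical values chosen for illustration only.
a = Person("A", z=0, gym_0=3, gym_1=5)
b = Person("B", z=1, gym_0=1, gym_1=3)

for p in (a, b):
    print(p.name, p.z, p.gym_0, p.gym_1, p.gym)

same_observed_gym = a.gym == b.gym
same_potential_outcomes = (a.gym_0, a.gym_1) == (b.gym_0, b.gym_1)
print(same_observed_gym, same_potential_outcomes)
```

Matching A and B on observed gym pairs a person who goes 3 times a week without the program with a person who only reaches 3 times a week because of it, which is why the comparison is not fair.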

Notes

This is mostly just a UX experiment; causal definitions are loosely interpreted

Add single tree and true response curve?

State 1: Observations

We can plot the observed (factual) outcomes of all 9 runners in our study. Runners 1, 4, 7, and 9 wore standard shoes while runners 2, 3, 5, 6, and 8 wore HyperShoes.

State 2: Mean diff

Your study had an ATE of ____. Because we have access to our potential outcomes time-machine, we can also plot each runner's unobserved (counterfactual) outcomes. The counterfactual outcomes show the running times runners 2, 3, 5, 6, and 8 would have had if they were wearing standard shoes, which fills in the missing y0s, and the running times runners 1, 4, 7, and 9 would have had if they were wearing HyperShoes, which fills in the missing y1s.
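The difference between the study's estimate and what the time-machine reveals can be sketched numerically. The shoe assignment below follows the study description; the running times y0 and y1 are hypothetical placeholders, not the values used in the plots.

```python
import numpy as np

runners = np.arange(1, 10)
# True = wore HyperShoes (runners 2, 3, 5, 6, and 8), False = standard shoes.
hyper = np.isin(runners, [2, 3, 5, 6, 8])

# Hypothetical running times in minutes; having both columns requires the time-machine.
y0 = np.array([30.0, 28.0, 33.0, 27.0, 31.0, 29.0, 32.0, 26.0, 30.0])  # standard shoes
y1 = y0 - np.array([1.0, 2.0, 1.5, 0.5, 2.0, 1.0, 1.5, 2.5, 1.0])       # HyperShoes

# Each runner reveals only the outcome for the shoes they actually wore.
y_obs = np.where(hyper, y1, y0)

# What the study can compute: difference in observed group means.
diff_in_means = y_obs[hyper].mean() - y_obs[~hyper].mean()

# What the time-machine allows: the average of every runner's individual effect.
true_ate = (y1 - y0).mean()

print(f"difference in means: {diff_in_means:.2f}")
print(f"average of y1 - y0:  {true_ate:.2f}")
```

With only 9 runners the two numbers need not agree, since the difference in means compares two different groups of runners rather than each runner with themselves.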

State 3: Linear regression

State 4: Regression tree

A single regression tree more closely recaptures the true response surface. But ...

State 5: BART

Next section after scrollytell

Functional form

BART performs well across various functional forms. Drag the handles to adjust the treatment group's functional form and then refit the models.

Generate points

Choose model

Add div here showing summary stats?

BART posterior

New version

Replace the above animation with a plot similar to the one in the slides and book chapter: the predicted treatment effect plot with vertical intervals

Why are the intervals inconsistent?

Common support

The previous plot could easily be extended to illustrate overlap by allowing the user to shift the distributions left and right.

Two connected plots showing the scatter and the posterior of the treatment effect. A slider goes from "perfect-overlap" to "half-overlap" in 10 pre-generated increments
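One way the 10 slider states might be pre-generated is sketched below. Everything here is an assumption: the uniform covariate distributions, the sample size, the frames structure, and the reading of "half-overlap" as half of the treated units falling inside the control group's covariate range.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2_000
shifts = np.linspace(0.0, 0.5, 10)  # one entry per slider position

frames = []
for shift in shifts:
    # Assumed setup: control covariate on [0, 1), treated covariate shifted right.
    x_control = rng.uniform(0.0, 1.0, size=n)
    x_treated = rng.uniform(shift, 1.0 + shift, size=n)
    # Share of treated units whose covariate lies inside the control range.
    overlap_share = float(np.mean(x_treated <= 1.0))
    frames.append(
        {
            "shift": float(shift),
            "x_control": x_control,
            "x_treated": x_treated,
            "overlap_share": overlap_share,
        }
    )

for frame in frames:
    print(f"shift={frame['shift']:.3f}  overlap={frame['overlap_share']:.2f}")
```

Each frame could then be paired with a model fit so the slider only swaps pre-computed results instead of refitting in the browser.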