Section before scrollytell
Including post-treatment variables can drastically change the results of an experiment. The data for this example were simulated, so we know that, on average, the pest control caused plants to grow 1.52 inches taller than they would have grown without it. The analysis without the post-treatment variable bugs is very close to the true treatment effect. The analysis that includes the post-treatment variable bugs is far off from the true treatment effect and would lead to an incorrect assessment of the non-toxic, environmentally friendly pest control!
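The gap between the two analyses can be reproduced with a small simulation. The sketch below is not the simulation behind the plots above; it assumes a hypothetical data-generating process in which the pest control (z) reduces bugs and fewer bugs lets plants grow taller, with coefficients chosen so that the total effect of z on height is 1.52. The variable name height and every coefficient are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical data-generating process (all coefficients are assumptions).
z = rng.integers(0, 2, size=n)                 # 1 = received pest control
bugs = 10.0 - 4.0 * z + rng.normal(0, 1, n)    # post-treatment: z reduces bugs
height = 5.0 + 0.32 * z - 0.30 * bugs + rng.normal(0, 1, n)

# Total effect of z on height: 0.32 + (-0.30) * (-4.0) = 1.52
true_effect = 0.32 + (-0.30) * (-4.0)


def ols_coef_on_z(columns):
    """Least-squares fit of height on an intercept plus `columns`;
    returns the coefficient on the first column (z)."""
    X = np.column_stack([np.ones(n)] + columns)
    beta, *_ = np.linalg.lstsq(X, height, rcond=None)
    return beta[1]


est_without_bugs = ols_coef_on_z([z])        # close to the total effect
est_with_bugs = ols_coef_on_z([z, bugs])     # only the part not passing through bugs

print(f"true effect:       {true_effect:.2f}")
print(f"without bugs:      {est_without_bugs:.2f}")
print(f"with bugs (biased): {est_with_bugs:.2f}")
```

Under these assumptions, adjusting for bugs removes the part of the effect that operates through bugs, so the coefficient on z lands near 0.32 rather than 1.52.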
To demonstrate how adjusting for post-treatment variables introduces bias, consider another hypothetical example: does an exercise program increase muscle strength? The treatment variable z records whether or not an individual participated in the exercise program, the pre-treatment variable. Midway through the study, the number of weekly trips to the gym was recorded for each individual as the post-treatment variable gym.
Post-treatment variables, like gym, have different potential outcomes that depend on the treatment variable. This is the primary problem with controlling for post-treatment variables. The table below shows the potential outcomes and observed outcomes of gym for 2 individuals from our exercise study. In real contexts, we would not have access to both potential outcomes, but we can imagine them for the purposes of this example. Notice that the two individuals differ in z but have the same observed value of gym despite having different potential outcomes. Controlling for gym does not give a fair comparison of strength because it does not account for how z changes the value of gym. This is not a problem for pre-treatment variables because their values cannot be changed or influenced by z.
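The same point can be made concrete in a few lines of code. The sketch below uses two made-up individuals whose numbers are assumptions for illustration, not values from the study's table: each has a potential value of gym under control and under treatment, and only the one matching their actual z is observed.

```python
# Hypothetical potential outcomes for gym (weekly gym trips); values are made up.
people = [
    {"id": "A", "z": 1, "gym_if_control": 1, "gym_if_treated": 3},
    {"id": "B", "z": 0, "gym_if_control": 3, "gym_if_treated": 5},
]


def observed_gym(person):
    """Return the potential outcome of gym that is actually observed given z."""
    return person["gym_if_treated"] if person["z"] == 1 else person["gym_if_control"]


for p in people:
    print(p["id"], "z =", p["z"], "observed gym =", observed_gym(p))

# Both individuals show the same observed gym, so conditioning on gym
# groups them together even though their potential outcomes differ.
same_observed = observed_gym(people[0]) == observed_gym(people[1])
same_potential = (
    people[0]["gym_if_control"] == people[1]["gym_if_control"]
    and people[0]["gym_if_treated"] == people[1]["gym_if_treated"]
)
```

Here the treated individual A and the untreated individual B look identical on observed gym, yet B would go to the gym more often than A under either condition, which is why comparing their strength within the same value of gym is not a like-for-like comparison.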
Notes
This is mostly just a UX experiment; causal definitions are loosely interpreted
Add single tree and true response curve?
Next section after scrollytell
Functional form
BART performs well across various functional forms. Drag the handles to adjust the treatment group's functional form and then refit the models.
Add div here showing summary stats?
BART posterior
New version
Replace the above animation with a plot similar to the slides and book chapter: the predicted treatment effect plot with vertical intervals
Why are the intervals inconsistent?
Common support
The previous plot could easily be extended to illustrate overlap by allowing the user to shift the distributions left and right.
Two linked plots showing the scatter and the posterior of the treatment effect. A slider goes from "perfect-overlap" to "half-overlap" in 10 pre-generated increments
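One possible way to pre-generate the 10 slider steps is sketched below. It assumes that both groups' covariate is normal with unit variance and that "half-overlap" means the two densities share half of their area; neither assumption comes from these notes, and the names frames and shifts are hypothetical.

```python
from statistics import NormalDist

import numpy as np

rng = np.random.default_rng(1)
n_steps = 10
n_per_group = 200

# Target overlap coefficients, from perfect (1.0) down to half (0.5).
target_overlaps = np.linspace(1.0, 0.5, n_steps)

# For two unit-variance normals whose means differ by d, the shared area is
# 2 * Phi(-d / 2), so d = -2 * Phi^{-1}(overlap / 2).
shifts = [-2.0 * NormalDist().inv_cdf(ov / 2.0) for ov in target_overlaps]

# One pre-generated dataset per slider position.
frames = []
for shift in shifts:
    frames.append(
        {
            "shift": shift,
            "x_control": rng.normal(0.0, 1.0, n_per_group),
            "x_treated": rng.normal(shift, 1.0, n_per_group),
        }
    )

# Check the last step against the stdlib overlap computation.
final_overlap = NormalDist(0, 1).overlap(NormalDist(shifts[-1], 1))
```

Each entry in frames could then be fit ahead of time so the slider only swaps between stored results.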