Average Treatment Effects

Let's revisit the HyperShoe example from the potential outcomes module: a new high-performance running shoe called the HyperShoe is released with the claim that wearing it causes faster marathon running times than standard running shoes. We have collected data from 10 runners: 5 wore HyperShoes and 5 wore standard shoes.

What might we want to learn about the effect of the shoes on these runners? Ideally we'd like to know, for each runner, whether the HyperShoe would reduce their running time in a given race. But we have shown [[link to potential outcomes module]] that this is an extremely difficult task, since for each person we only observe their running time either with or without the HyperShoe. What if we instead considered the effect of the HyperShoe across all ten runners?

The average treatment effect (ATE) is the average of the individual-level causal effects of HyperShoes across all 10 runners in our sample. We saw in the potential outcomes module that the causal effect of HyperShoes can differ from runner to runner: some runners had a huge benefit from HyperShoes while others had no benefit at all. The average treatment effect masks these individual-level differences and instead gives us an idea of the general trend. The average treatment effect still relies on within-person comparisons, which makes it different from simply comparing the average running times of the observed treatment group and the observed control group. We'll walk through calculating the ATE to make these ideas more concrete.




Factuals

We'll start by recording the observed (factual) running times of all 10 runners in our study.

As shown, 5 of the runners wore HyperShoes while the other 5 runners wore standard shoes.











Counterfactuals

Calculating the ATE also requires the unobserved counterfactual running times of all 10 runners. The counterfactual running time is the time each runner would have had if they had worn the other shoe.

This part may appear confusing: how can we use counterfactual outcomes if they aren't observed? We'll cover this later on, but for now, imagine we had the impossible power to observe factual and counterfactual outcomes at the same time.










Individual Causal Effects (ICE)

When we have both potential outcomes (a y1 and a y0), we can calculate a runner's individual causal effect by taking the difference between that runner's y1 and y0. The individual causal effect tells us how much faster or slower the HyperShoes caused the runner's finishing time to be. A negative value means the runner finished faster with HyperShoes.

You can hover over a point on the plot to see the Individual Causal Effect of the selected runner:

Runner 1 has an ICE of -1.69.
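As a minimal sketch of this subtraction, the snippet below computes y1 - y0 for three runners. The times (in hours) and the function name are invented for illustration and are not the module's data, so the results will not match the values shown on the plot.

```python
# Hypothetical potential outcomes (marathon times in hours) for three runners.
# These values are invented for illustration; they are not the module's data.
y1 = [2.85, 2.80, 3.00]  # time if the runner wears HyperShoes
y0 = [2.90, 2.95, 3.00]  # time if the runner wears standard shoes


def individual_causal_effects(y1, y0):
    """Return y1 - y0 for each runner (negative means faster with HyperShoes)."""
    return [round(a - b, 2) for a, b in zip(y1, y0)]


ice = individual_causal_effects(y1, y0)
for runner, effect in enumerate(ice, start=1):
    print(f"Runner {runner}: ICE = {effect:+.2f}")
```

In this made-up data the third runner has an ICE of zero: the shoes make no difference for that runner, even though they help the other two.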










The ATE is the Average of Individual Causal Effects

One way of calculating the ATE is by taking the average of all 10 runners' Individual Causal Effects. After averaging all 10 Individual Causal Effects we see that the ATE is -1.33.

You can hover over any point to explore a runner's Individual Causal Effect and how it relates to the ATE.
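The averaging step can be sketched in a few lines. The ten ICE values below are hypothetical (in hours) and are not the values behind the plot, so the resulting ATE differs from the figure quoted above. The loop shows how far each runner's effect sits from the average.

```python
from statistics import mean

# Hypothetical individual causal effects (hours) for 10 runners.
# Invented for illustration; not the module's data.
ice = [-0.05, -0.15, -0.35, -0.30, -0.25, -0.10, -0.10, -0.10, -0.05, -0.05]

# The ATE is the plain average of the individual causal effects.
ate = mean(ice)
print(f"ATE = {ate:.2f}")

# Compare each runner's effect with the average.
for runner, effect in enumerate(ice, start=1):
    gap = effect - ate
    print(f"Runner {runner}: ICE = {effect:+.2f} ({gap:+.2f} relative to the ATE)")
```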










The ATE is the average y1 - the average y0

Another way of calculating the ATE is by taking the difference between the average of all y1s and the average of all y0s.

It is important to note that y1 and y0 both include factual and counterfactual observations. The values in y1 are the finishing times of all 10 runners if they wore HyperShoes, and the values in y0 are the finishing times of all 10 runners if they wore standard shoes.

Notice the ATE is the same as when we averaged Individual Causal Effects of all 10 runners.
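The two routes agree because averaging is linear: the mean of the differences equals the difference of the means whenever both averages are taken over the same runners. A short check, using hypothetical times in hours (not the module's data):

```python
import math
from statistics import mean

# Hypothetical potential outcomes (hours) for 10 runners; illustrative only.
y1 = [2.85, 2.80, 2.85, 2.70, 2.80, 3.00, 3.20, 3.40, 3.35, 3.55]
y0 = [2.90, 2.95, 3.20, 3.00, 3.05, 3.10, 3.30, 3.50, 3.40, 3.60]

# Route 1: average the individual causal effects.
ate_from_ice = mean([a - b for a, b in zip(y1, y0)])

# Route 2: average each potential outcome, then subtract.
ate_from_means = mean(y1) - mean(y0)

print(f"average of ICEs:     {ate_from_ice:.2f}")
print(f"avg y1 minus avg y0: {ate_from_means:.2f}")
print("same result:", math.isclose(ate_from_ice, ate_from_means, abs_tol=1e-9))
```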










Potential Outcomes Table

We can also think of the ATE within a potential outcomes table. You can hover over the graph or the table for a closer look! A potential outcomes table has the benefit of displaying the data in a way that may be more conducive to making calculations by hand.

The average treatment effect is a summary of the entire sample's within-person comparisons. Remember that you can calculate the ATE by taking the average of all 10 runners' Individual Causal Effects or by taking the difference between the average y1 and the average y0.

Remember that calculating the ATE requires knowing each runner's factual and counterfactual observations. For now, keep imagining that you have access to both potential outcomes (both y1 and y0) for all 10 runners.
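A potential outcomes table can also be sketched in code. In the example below, z marks which shoe a runner actually wore (1 = HyperShoes, 0 = standard shoes) and y is the factual time. All values are hypothetical and in hours: rows 1 to 5 reuse the times from this module's ATT worked example, paired by position (an assumption), and rows 6 to 10 are invented. The last lines contrast the ATE with a naive comparison of the two groups' observed averages.

```python
from statistics import mean

# Hypothetical potential outcomes table (times in hours).
# Columns: runner, z (1 = wore HyperShoes, 0 = standard shoes), y0, y1.
table = [
    (1, 1, 2.90, 2.85),
    (2, 1, 2.95, 2.80),
    (3, 1, 3.20, 2.85),
    (4, 1, 3.00, 2.70),
    (5, 1, 3.05, 2.80),
    (6, 0, 3.10, 3.00),
    (7, 0, 3.30, 3.20),
    (8, 0, 3.50, 3.40),
    (9, 0, 3.40, 3.35),
    (10, 0, 3.60, 3.55),
]

print(f"{'runner':>6} {'z':>2} {'y0':>5} {'y1':>5} {'y':>5} {'ICE':>6}")
for runner, z, y0, y1 in table:
    y = y1 if z == 1 else y0  # the factual (observed) time
    print(f"{runner:>6} {z:>2} {y0:>5.2f} {y1:>5.2f} {y:>5.2f} {y1 - y0:>+6.2f}")

# ATE: a within-person comparison using both potential outcomes of every runner.
ate = mean([y1 - y0 for _, _, y0, y1 in table])

# Naive comparison: uses only the factual times of each group.
naive = (mean([y1 for _, z, _, y1 in table if z == 1])
         - mean([y0 for _, z, y0, _ in table if z == 0]))

print(f"ATE = {ate:.2f}, naive difference = {naive:.2f}")
```

In this made-up data the naive difference (-0.58) is much larger in magnitude than the ATE (-0.15), because the hypothetical control runners are slower in either shoe.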















Subsetting by the Treatment Group

Sometimes researchers are only interested in summarizing causal effects for a particular set of observations. In observational studies, where the treatment is not randomly assigned, individuals who received the treatment are often very different from individuals who did not. It may make sense to focus on the group that received the treatment.

Let’s return to our original sample of 10 runners from the HyperShoe study. In our study, 5 runners received the treatment and wore HyperShoes while 5 did not receive the treatment and wore standard shoes. Suppose that all 5 runners who wore HyperShoes are semi-professional runners, while only 2 of the 5 runners wearing standard shoes are semi-professionals and the remaining 3 are amateur runners.

The average treatment effect on the treated (ATT) is the average causal effect for the part of the sample that received the treatment (where z = 1 and y = y1 in potential outcomes notation). In our example, the ATT is the average causal effect for the runners who wore HyperShoes for their factual outcome.

We can calculate the ATT by averaging the individual causal effects of the 5 runners that received the treatment (z = 1) or by taking the difference between the average y1 and the average y0 for the group of runners that received the treatment (z = 1).


Here is a worked-out example of how the ATT was calculated:

average y0 = (2.9 + 2.95 + 3.2 + 3.0 + 3.05)/5 = 3.02

average y1 = (2.85 + 2.8 + 2.85 + 2.7 + 2.8)/5 = 2.8

ATT = average y1 - average y0 = 2.8 - 3.02 = -.22



Alternatively, you could take the average of the individual causal effects for the 5 runners that wore HyperShoes (z = 1):

ATT = average ICE = (-.05 + -.15 + -.35 + -.3 + -.25)/5 = -.22

Remember that calculating the ATT requires both the factual and counterfactual outcomes for all 5 of these runners! In practice you will never be able to see counterfactual observations, but for now we'll imagine that you have this ability.

Note in our example we divide by 5 because our sample has 5 runners with a z value of 1!
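The ATT arithmetic can be checked with a short sketch. It uses the y0 and y1 values listed in the worked example and assumes the two lists are in the same runner order, so that position i in each list belongs to the same runner. The variable names are illustrative.

```python
from statistics import mean

# Potential outcomes (hours) for the 5 treated runners (z = 1), taken from
# the worked example. Assumption: both lists share the same runner order.
y0_treated = [2.9, 2.95, 3.2, 3.0, 3.05]
y1_treated = [2.85, 2.8, 2.85, 2.7, 2.8]

# Route 1: difference between the average y1 and the average y0.
att_from_means = mean(y1_treated) - mean(y0_treated)

# Route 2: average of the individual causal effects of the treated runners.
ice_treated = [a - b for a, b in zip(y1_treated, y0_treated)]
att_from_ice = mean(ice_treated)

print(f"ATT from means: {att_from_means:.2f}")
print(f"ATT from ICEs:  {att_from_ice:.2f}")
```

Both routes divide by 5, the number of runners with z = 1, and both give the same ATT because they average over the same runners.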

For our sample of 10 runners, the ATT of -.22 is different from the ATE of -.15. This is because the ATE and ATT summarize different parts of our sample. The ATE summarizes the treatment effect across every runner in the sample, while the ATT summarizes only the 5 runners who wore HyperShoes for their factual outcome.


< INSERT QUIZ >


Subsetting by the Control Group

In our last example we were interested in a causal question that only pertained to the treated group. We can ask similar questions about the control group. Imagine a different scenario: the governing body that organizes marathon races is concerned that runners without HyperShoes are put at a disadvantage. How would the governing body know whether runners without HyperShoes would have run faster had they worn HyperShoes?

The average treatment effect on the control (ATC) is the average causal effect for the part of the sample that did not receive the treatment (where z = 0). Calculating the ATC would tell us whether the runners without HyperShoes were at a disadvantage.

Why do we only care about comparisons within the control group? Here we want to know whether the runners without HyperShoes are disadvantaged by not wearing them. It is possible that HyperShoes only have an effect on the runners who happened to be in the treated group (z = 1). To see whether the runners without HyperShoes were disadvantaged, we need to compare the factual and counterfactual outcomes (y0 and y1) for the runners without HyperShoes (z = 0).
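The ATC uses the same averaging as the ATT, restricted to the runners with z = 0. The sketch below uses a hypothetical helper, average_effect, and hypothetical times in hours. The five treated rows reuse the values from the ATT worked example, paired by position (an assumption), and the five control rows are invented for illustration.

```python
from statistics import mean

# (z, y0, y1) in hours; z = 1 wore HyperShoes, z = 0 wore standard shoes.
# Control rows (z = 0) are invented for illustration.
runners = [
    (1, 2.90, 2.85), (1, 2.95, 2.80), (1, 3.20, 2.85),
    (1, 3.00, 2.70), (1, 3.05, 2.80),
    (0, 3.10, 3.00), (0, 3.30, 3.20), (0, 3.50, 3.40),
    (0, 3.40, 3.35), (0, 3.60, 3.55),
]


def average_effect(runners, group=None):
    """Average of y1 - y0 over runners with z == group (all runners if None)."""
    effects = [y1 - y0 for z, y0, y1 in runners if group is None or z == group]
    return mean(effects)


atc = average_effect(runners, group=0)  # runners in standard shoes
att = average_effect(runners, group=1)  # runners in HyperShoes
ate = average_effect(runners)           # everyone

print(f"ATC = {atc:.2f}, ATT = {att:.2f}, ATE = {ate:.2f}")
```

In this made-up data the ATC (-0.08) is smaller in magnitude than the ATT (-0.22), so the control runners would have gained less from HyperShoes than the treated runners did. With 5 runners in each group, the ATE is the equally weighted average of the ATT and the ATC.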


Closing paragraph