PROPENSITY-SCORE MATCHING
To apply the potential-outcome framework and obtain causal estimates that do not depend too strongly on untestable assumptions, we first need to make sure that the covariate distributions of the treatment and control groups are balanced. In other words, we need to make sure that we are comparing apples with apples. To do so, we match units that receive the treatment with units that do not, using a set of covariates (X). Going back to our example in Chapter 21, we need to find households that are identical in all possible pre-treatment aspects (income, education, health, number of siblings, geographical region of origin, etc.) but that differ in their migratory experience. This procedure yields a smaller dataset containing only the matched households. Once we accomplish this, we just need to estimate the average difference in means (E(Yγ − Yγ′) = E(Yγ) − E(Yγ′)) to find the impact of migration on children's emotional state.
The life of an applied researcher, however, is not that easy. Including a large number of covariates (X), such as income, education, health, number of siblings, and geographical region, makes it very difficult to match treated and control households. For example, if we match two households on income, we will probably mismatch them on another dimension, such as number of siblings. Matching on a large number of covariates therefore creates a high-dimensionality problem (Dehejia 2004).
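The high-dimensionality problem can be made concrete with a small simulation. The sketch below is only an illustration under strong assumptions: the households are simulated, every covariate is binary and drawn independently of treatment status, and the function names (simulate_households, exact_match_share) are hypothetical and do not come from any matching library. It counts how many treated households have at least one control household that is identical on the first k covariates.

```python
import random


def simulate_households(n, n_covariates, rng):
    """Return n simulated households, each a tuple of binary pre-treatment covariates."""
    return [
        tuple(rng.randint(0, 1) for _ in range(n_covariates))
        for _ in range(n)
    ]


def exact_match_share(treated, control, k):
    """Share of treated households with at least one control household
    that is identical on the first k covariates."""
    # Each distinct combination of the first k covariates is one "cell".
    control_cells = {household[:k] for household in control}
    matched = sum(1 for household in treated if household[:k] in control_cells)
    return matched / len(treated)


# Assumption: 200 treated and 200 control households, 16 binary covariates.
rng = random.Random(0)
MAX_COVARIATES = 16
treated = simulate_households(200, MAX_COVARIATES, rng)
control = simulate_households(200, MAX_COVARIATES, rng)

for k in (1, 2, 4, 8, 12, 16):
    share = exact_match_share(treated, control, k)
    print(f"{k:2d} covariates: {share:.1%} of treated households have an exact match")
```

Because a household matched on k + 1 covariates is necessarily matched on the first k, the share of matched households can only stay the same or fall as covariates are added. With 16 binary covariates there are already 2^16 = 65,536 possible cells for only 200 control households, so most cells are empty. Continuous covariates such as income make exact matching harder still.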