# Causal Inference in Quasi- or Non-Experimental Studies: Heckman Two-Step Procedure

In the late 1970s, economist and Nobel laureate James Heckman pioneered a family of econometric methods to address selection effects in observational studies. His key contribution was to treat selection as an omitted variable that may influence outcomes. In his two-step correction procedure, as in propensity score matching, the researcher first models the selection process using observed variables, typically with a probit model, which yields a predicted probability of selection for each individual (Step 1). The error term of this selection equation captures both random error and the effect of unobserved variables driving selection; a larger error term may therefore indicate that variables omitted from the selection equation play a larger role in predicting participation in an intervention. A transformation of that error term, namely its expected value conditional on selection (the inverse Mills ratio, computed from the fitted selection model), is then included as a regressor in the outcome equation as a proxy for the omitted variable (Step 2).

The procedure has several limitations. Standard errors may be inflated because the correction term is often highly collinear with the other regressors, particularly when no exclusion restriction is used and the model is identified through functional form alone. The method also depends crucially on the assumption that the errors of the selection and outcome equations are jointly normal; if this assumption fails, the estimators are inconsistent and inference can be misleading, especially in small samples. Finally, Heckman models are sensitive to specification, especially the choice of functional form and covariates. (Adapted from Hill, Rosenman, Tennekoon & Mandal, unpublished manuscript.)
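
The two steps can be written out in common textbook notation. This is a sketch, not the notation of the source manuscript: the symbols $z_i$, $x_i$, $\gamma$, $\beta$, $\rho$, $\sigma_\varepsilon$ and $\lambda$ are introduced here for illustration, and the derivation assumes a probit selection equation with jointly normal errors.

```latex
\begin{align*}
  % Selection equation: a latent index decides who is selected (s_i = 1)
  s_i^{*} &= z_i'\gamma + u_i, \qquad s_i = \mathbf{1}\{\, s_i^{*} > 0 \,\} \\
  % Outcome equation: y_i is observed only for selected units
  y_i &= x_i'\beta + \varepsilon_i \quad \text{(observed only if } s_i = 1\text{)} \\
  % Joint normality assumption on the two error terms
  \begin{pmatrix} u_i \\ \varepsilon_i \end{pmatrix}
    &\sim \mathcal{N}\!\left(
      \begin{pmatrix} 0 \\ 0 \end{pmatrix},
      \begin{pmatrix} 1 & \rho\,\sigma_\varepsilon \\
                      \rho\,\sigma_\varepsilon & \sigma_\varepsilon^{2} \end{pmatrix}
    \right) \\
  % Conditional mean of the outcome in the selected sample
  \mathbb{E}[\, y_i \mid x_i, z_i, s_i = 1 \,]
    &= x_i'\beta + \rho\,\sigma_\varepsilon\,\lambda(z_i'\gamma),
    \qquad \lambda(c) = \frac{\phi(c)}{\Phi(c)} \\
  % Step 1: probit of s_i on z_i gives \hat\gamma and the estimated correction term
  \hat\lambda_i &= \lambda(z_i'\hat\gamma) \\
  % Step 2: OLS on the selected sample with \hat\lambda_i as an extra regressor
  y_i &= x_i'\beta + \beta_\lambda\,\hat\lambda_i + e_i,
    \qquad \beta_\lambda \text{ estimates } \rho\,\sigma_\varepsilon
\end{align*}
```

Here $\phi$ and $\Phi$ are the standard normal density and distribution function, and $\lambda$ is the inverse Mills ratio. In this notation, an exclusion restriction means that $z_i$ contains at least one variable that does not appear in $x_i$. Without one, $\hat\lambda_i$ varies only through a nonlinear function of the same covariates used in the outcome equation, which is the source of the collinearity problem. An estimate of $\beta_\lambda$ close to zero suggests little selection on unobservables under these assumptions.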