library(grf)

## Conditional means and average treatment estimates

grf computes average treatments effects based on a double robust correction (giving rise to an augmented inverse-propensity weighted average treatment effect). For a causal forest with binary treatment the exact expression is (equation (8) of Athey and Wager, 2019):

$$\hat \Gamma_i = \hat \tau^{(-i)}(X_i) + \frac{W_i - \hat e^{(-i)}(X_i)}{\hat e^{(-i)}(X_i)[1 - \hat e^{(-i)}(X_i)]} [Y_i - \hat \mu^{(-i)}(X_i, W_i)]$$

where:

1. $$\hat \tau^{(-i)}(X_i)$$ is the treatment effect estimate (returned by grf’s predict method). Using the potential outcomes notation this is by definition $$E[Y_i(1) - Y_i(0) | X_i] = \mu(X_i, 0) - \mu(X_i, 1)$$.

2. $$W_i$$ is the binary treatment indicator for subject $$i$$.

3. $$\hat e^{(-i)}(X_i)$$ is the propensity score for subject $$i$$ (this is by default estimated by a regression forest on the treatment assignment).

4. $$Y_i$$ is the realized outcome for subject $$i$$.

5. $$\hat \mu^{(-i)}(X_i, W_i) = \hat m^{(-i)}(X_i) + [W_i - \hat e^{(-i)}(X_i)] \hat \tau^{(-i)}(X_i)$$ is an estimate of the realized conditional mean for subject $$i$$, where $$m(X_i) = E[Y | X_i]$$ (this is by default estimated using a regression forest, marginalizing over treatment).

The superscript $$-i$$ indicates cross-fitting, i.e. that the estimate is computed by leaving observation $$i$$ out. This holds by construction for out-of-bag forest estimates.

To see how the last expression arises, note that we have:

1. $$m(X_i)$$

$$= E[Y | X_i]$$

$$= E[Y_i(0) | X_i] + E[W_i [Y_i(1) - Y_i(0)] | X_i]$$

$$= \mu(X_i, 0) + e(X_i) \tau(X_i)$$

where there last line is due to unconfoundedness (conditioning on the set of covariates $$X_i$$, the potential outcomes are independent of the treatment).

We then obtain the following expressions for the counterfactual response surfaces:

1. $$\mu(X_i, 0) = m(X_i) - e(X_i) \tau(X_i)$$

2. $$\mu(X_i, 1) = \tau(X_i) + \mu(X_i, 0) = m(X_i) + [1 - e(X_i)] \tau(X_i)$$

These objects are all computed by the built-in ATE functions. The following code snippet illustrates how to manually obtain estimates of the conditional means $$E[Y | X_i, W_i]$$:

n <- 250
p <- 5
X <- matrix(rnorm(n * p), n, p)
W <- rbinom(n, 1, 0.5)
Y <- pmax(X[, 1], 0) * W + X[, 2] + pmin(X[, 3], 0) + rnorm(n)

# These are estimates of m(X) = E[Y | X]
forest.Y <- regression_forest(X, Y)
Y.hat <- predict(forest.Y)$predictions # These are estimates of the propensity score E[W | X] forest.W <- regression_forest(X, W) W.hat <- predict(forest.W)$predictions

c.forest <- causal_forest(X, Y, W, Y.hat, W.hat)
tau.hat <- predict(c.forest)\$predictions

# E[Y | X, W = 0]
mu.hat.0 <- Y.hat - W.hat * tau.hat
# E[Y | X, W = 1]
mu.hat.1 <- Y.hat + (1 - W.hat) * tau.hat

## References

Athey, Susan and Stefan Wager. Estimating Treatment Effects with Causal Forests: An Application. Observational Studies, 5, 2019. (arxiv)