| Title: | Outcome Weights of Treatment Effect Estimators |
|---|---|
| Description: | Many treatment effect estimators can be written as weighted outcomes. These weights have established use cases like checking covariate balancing via packages like 'cobalt'. This package takes the original estimator objects and outputs these outcome weights. It builds on the general framework of Knaus (2024) <doi:10.48550/arXiv.2411.11559>. This version is compatible with the 'grf' package and provides an internal implementation of Double Machine Learning. |
| Authors: | Michael C. Knaus [aut, cre] (ORCID: <https://orcid.org/0000-0002-7328-1363>), Henri Pfleiderer [ctb] |
| Maintainer: | Michael C. Knaus <[email protected]> |
| License: | GPL-3 |
| Version: | 0.2.0.9000 |
| Built: | 2026-05-28 06:35:59 UTC |
| Source: | https://github.com/mcknaus/outcomeweights |
Existing Double ML implementations are too general to easily extract smoother matrices required to be compatible with the get_forest_weights() method. This motivates yet another Double ML implementation.
dml_with_smoother(
  Y,
  D,
  X,
  Z = NULL,
  estimators = c("PLR", "PLR_IV", "AIPW_ATE", "Wald_AIPW", "AIPW_ATT", "AIPW_ATU"),
  smoother = "honest_forest",
  n_cf_folds = 5,
  n_reps = 1,
  ...
)
Y |
Numeric vector containing the outcome variable. |
D |
Binary treatment variable. |
X |
Covariate matrix with N rows and p columns. |
Z |
Optional binary instrumental variable. |
estimators |
String (vector) indicating which estimators should be run. Current menu: c("PLR","PLR_IV","AIPW_ATE","Wald_AIPW","AIPW_ATT","AIPW_ATU") |
smoother |
Indicate which smoother to be used for nuisance parameter estimation.
Currently only available option |
n_cf_folds |
Number of cross-fitting folds. Default is 5. |
n_reps |
Number of repetitions of cross-fitting. Default is 1. |
... |
Options to be passed to smoothers. |
A list including the following entries:
results: a list storing the results, influence functions, and score functions of each estimator
NuPa.hat: a list storing the estimated nuisance parameters and the outcome smoother matrices
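The role of the outcome smoother matrix can be sketched in base R for the partially linear regression (PLR) case. This is a minimal illustration under simplifying assumptions, not the package's implementation: the OLS hat matrix stands in for the honest forest smoother, no cross-fitting is used, and all object names are hypothetical.

```r
set.seed(1)
n <- 200
X <- cbind(1, rnorm(n))
D <- rbinom(n, 1, 0.5)
Y <- X[, 2] + 0.5 * D + rnorm(n)

# Stand-in smoother (assumption): OLS hat matrix, so that Y.hat = S %*% Y
S <- X %*% solve(crossprod(X), t(X))
Y.hat <- as.numeric(S %*% Y)
D.hat <- as.numeric(S %*% D)
D.res <- D - D.hat

# PLR point estimate from residual-on-residual regression
tau_plr <- sum(D.res * (Y - Y.hat)) / sum(D.res^2)

# Because Y - Y.hat = (I - S) Y, the estimate is linear in Y
omega_plr <- as.numeric(crossprod(D.res, diag(n) - S)) / sum(D.res^2)

# The weights reproduce the point estimate
all.equal(sum(omega_plr * Y), tau_plr)
```

The sketch shows why a smoother matrix is needed: without S, the outcome predictions cannot be written as a linear function of Y and the weights are not available in closed form.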
Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W., & Robins, J. (2018). Double/debiased machine learning for treatment and structural parameters. The Econometrics Journal, 21(1), C1-C68.
Knaus, M. C. (2024). Treatment effect estimators as weighted outcomes, https://arxiv.org/abs/2411.11559.
# Sample from DGP borrowed from grf documentation
n = 200
p = 5
X = matrix(rbinom(n * p, 1, 0.5), n, p)
Z = rbinom(n, 1, 0.5)
Q = rbinom(n, 1, 0.5)
W = Q * Z
tau = X[, 1] / 2
Y = rowSums(X[, 1:3]) + tau * W + Q + rnorm(n)

# Run DML and look at results
dml = dml_with_smoother(Y, W, X, Z)
results_dml = summary(dml)
plot(dml)

# Get weights
omega_dml = get_outcome_weights(dml)

# Observe that they perfectly replicate the original estimates
all.equal(as.numeric(omega_dml$omega %*% Y), as.numeric(results_dml[, 1]))

# The weights can then be passed to the cobalt package, for example.
This is a generic method for getting outcome weights. It calculates the outcome weights for objects created by other packages.
get_outcome_weights(object, ...)
object |
An estimated object. See get_outcome_weights.<compatible_fct> in the package documentation for compatible object classes. |
... |
Additional arguments specific to object class implementations. See the documentation for which objects require which additional arguments. |
A list of at least these components:
omega: matrix (number of point estimates x number of estimation units) of outcome weights
treat: the treatment indicator to make it compatible with the cobalt package
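As a minimal illustration of what such an object encodes, the simple difference in means can be written as a weighted sum of outcomes. The sketch below uses base R only and hypothetical object names; it is not produced by get_outcome_weights().

```r
set.seed(1)
n <- 100
D <- rbinom(n, 1, 0.5)
Y <- 1 + 2 * D + rnorm(n)

# Difference in means as weighted outcomes:
# treated units receive 1 / n1, control units receive -1 / n0
omega <- matrix(D / sum(D) - (1 - D) / sum(1 - D), nrow = 1)

# One row per point estimate, one column per estimation unit
dim(omega)

# The weights reproduce the estimator
est_weights <- as.numeric(omega %*% Y)
est_direct <- mean(Y[D == 1]) - mean(Y[D == 0])
all.equal(est_weights, est_direct)
```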
Knaus, M. C. (2024). Treatment effect estimators as weighted outcomes, https://arxiv.org/abs/2411.11559.
Post-estimation command to extract outcome weights for a causal forest implemented via the causal_forest function from the grf package.
## S3 method for class 'causal_forest'
get_outcome_weights(
  object,
  ...,
  S,
  newdata = NULL,
  S.tau = NULL,
  target = "CATE",
  checks = TRUE
)
object |
An object of class |
... |
Pass potentially generic get_outcome_weights options. |
S |
A smoother matrix reproducing the outcome predictions used in building the |
newdata |
Corresponds to |
S.tau |
Required if |
target |
Target parameter for which outcome weights should be extracted. Either |
checks |
Default |
get_outcome_weights object with omega containing weights and treat the treatment
Athey, S., Tibshirani, J., & Wager, S. (2019). Generalized random forests. The Annals of Statistics, 47(2), 1148-1178.
Knaus, M. C. (2024). Treatment effect estimators as weighted outcomes, https://arxiv.org/abs/2411.11559.
# Sample from DGP borrowed from grf documentation
n = 500
p = 10
X = matrix(rnorm(n * p), n, p)
W = rbinom(n, 1, 0.5)
Y = pmax(X[, 1], 0) * W + X[, 2] + pmin(X[, 3], 0) + rnorm(n)

# Run outcome regression and extract smoother matrix
forest.Y = grf::regression_forest(X, Y)
Y.hat = predict(forest.Y)$predictions
outcome_smoother = grf::get_forest_weights(forest.Y)

# Run causal forest with external Y.hats
c.forest = grf::causal_forest(X, Y, W, Y.hat = Y.hat)

# Predict on out-of-bag training samples.
cate.oob = predict(c.forest)$predictions

# Predict using the forest.
X.test = matrix(0, 101, p)
X.test[, 1] = seq(-2, 2, length.out = 101)
cate.test = predict(c.forest, X.test)$predictions

# Calculate outcome weights
omega_oob = get_outcome_weights(c.forest, S = outcome_smoother)
omega_test = get_outcome_weights(c.forest, S = outcome_smoother, newdata = X.test)

# Observe that they perfectly replicate the original CATEs
all.equal(as.numeric(omega_oob$omega %*% Y), as.numeric(cate.oob))
all.equal(as.numeric(omega_test$omega %*% Y), as.numeric(cate.test))

# Also the ATE estimates are perfectly replicated
omega_ate = get_outcome_weights(c.forest, target = "all", S = outcome_smoother, S.tau = omega_oob$omega)
all.equal(as.numeric(omega_ate$omega %*% Y),
          as.numeric(grf::average_treatment_effect(c.forest, target.sample = "all")[1]))

# The omega weights can be plugged into balancing packages like cobalt
Post-estimation command to extract outcome weights for double ML run with an outcome smoother.
## S3 method for class 'dml_with_smoother'
get_outcome_weights(object, ..., all_reps = FALSE)
object |
An object of class |
... |
Pass potentially generic get_outcome_weights options. |
all_reps |
If |
If all_reps == FALSE: get_outcome_weights object
If all_reps == TRUE: additionally list omega_all_reps:
A list containing the outcome weights of each repetition.
Knaus, M. C. (2024). Treatment effect estimators as weighted outcomes, https://arxiv.org/abs/2411.11559.
Post-estimation command to extract outcome weights for a DoubleML object from the DoubleML package.
## S3 method for class 'DoubleML'
get_outcome_weights(object, dml_data, ...)
object |
A |
dml_data |
The |
... |
Pass potentially generic get_outcome_weights options. |
Outcome weights are available for DoubleMLPLR, DoubleMLPLIV,
DoubleMLIIVM, and DoubleMLIRM models.
An object of class get_outcome_weights containing:
Numeric vector of outcome weights of length .
Numeric vector of treatment assignments.
Knaus, M. C. (2024). Treatment effect estimators as weighted outcomes, https://arxiv.org/abs/2411.11559.
Bach P, Kurz MS, Chernozhukov V, Spindler M, Klaassen S (2024). “DoubleML: An Object-Oriented Implementation of Double Machine Learning in R.” Journal of Statistical Software, 108(3), 1-56, https://doi.org/10.18637/jss.v108.i03.
## Not run:
# lrn() comes from mlr3; the "regr.xgboost" learner is provided by mlr3learners
library(mlr3)
library(mlr3learners)

set.seed(123)

# Set the parameters
params <- list(alpha = 0, subsample = 1, max_delta_step = 0, base_score = 0)

# Define DML objects
ml_regr_xgb <- do.call(lrn, c(list("regr.xgboost"), params))
ml_l_plr <- ml_regr_xgb$clone()
ml_m_plr <- ml_regr_xgb$clone()
plr_data <- DoubleML::make_plr_CCDDHNR2018(500, dim_x = 10)
plr_obj <- DoubleML::DoubleMLPLR$new(plr_data, ml_l_plr, ml_m_plr, n_folds = 3)

# Fit the model
plr_obj$fit(store_models = TRUE, store_predictions = TRUE)

# Calculate the outcome weights
omega_plr <- get_outcome_weights(plr_obj, plr_data)

# Check equivalence
all.equal(omega_plr$omega %*% plr_data$data$y, plr_obj$all_coef)
## End(Not run)
Post-estimation command to extract outcome weights for an instrumental forest implemented via the instrumental_forest function from the grf package.
## S3 method for class 'instrumental_forest'
get_outcome_weights(
  object,
  ...,
  S,
  S.tau = NULL,
  newdata = NULL,
  target = "CLATE",
  compliance.score = NULL,
  checks = TRUE
)
object |
An object of class |
... |
Pass potentially generic get_outcome_weights options. |
S |
A smoother matrix reproducing the outcome predictions used in building the |
S.tau |
Required if |
newdata |
Corresponds to |
target |
Target parameter for which outcome weights should be extracted |
compliance.score |
Only relevant if |
checks |
Default |
get_outcome_weights object with omega containing weights and treat the treatment
Athey, S., Tibshirani, J., & Wager, S. (2019). Generalized random forests. The Annals of Statistics, 47(2), 1148-1178.
Knaus, M. C. (2024). Treatment effect estimators as weighted outcomes, https://arxiv.org/abs/2411.11559.
# Sample from DGP borrowed from grf documentation
n = 2000
p = 5
X = matrix(rbinom(n * p, 1, 0.5), n, p)
Z = rbinom(n, 1, 0.5)
Q = rbinom(n, 1, 0.5)
W = Q * Z
tau = X[, 1] / 2
Y = rowSums(X[, 1:3]) + tau * W + Q + rnorm(n)

# Run outcome regression and extract smoother matrix
forest.Y = grf::regression_forest(X, Y)
Y.hat = predict(forest.Y)$predictions
outcome_smoother = grf::get_forest_weights(forest.Y)

# Run instrumental forest with external Y.hats
iv.forest = grf::instrumental_forest(X, Y, W, Z, Y.hat = Y.hat)

# Predict on out-of-bag training samples.
iv.pred = predict(iv.forest)$predictions
omega_if = get_outcome_weights(iv.forest, S = outcome_smoother)

# Observe that they perfectly replicate the original CLATEs
all.equal(as.numeric(omega_if$omega %*% Y), as.numeric(iv.pred))
Post-estimation command to calculate the outcome weights for
Instrumental-Variable Regression implemented via the ivreg command.
Automatically takes sampling weights into account if used.
## S3 method for class 'ivreg'
get_outcome_weights(object, treat_pos = 2, ...)
object |
An object of class |
treat_pos |
The position of the treatment indicator in the model matrix. By default 2, which is appropriate for the case with an intercept and the (endogenous) treatment indicator as the first explanatory variable. |
... |
Pass potentially generic get_outcome_weights options. |
A vector of implied weights.
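The idea behind these weights can be sketched with the textbook two-stage least squares formula in base R. This is an illustration under simplifying assumptions (no sampling weights, hypothetical object names), not the package's code: the 2SLS coefficients are (Xhat'X)^{-1} Xhat'Y, so the row of (Xhat'X)^{-1} Xhat' at the treatment position holds the outcome weights.

```r
set.seed(1)
n <- 500
X1 <- rnorm(n)
Z <- rbinom(n, 1, 0.5)
U <- rnorm(n)
D <- as.numeric(0.8 * Z + 0.5 * U + rnorm(n) > 0.5)
Y <- 1 + 0.5 * D + X1 + U + rnorm(n)

Xmat <- cbind(1, D, X1)  # regressors, treatment in position 2
Zmat <- cbind(1, Z, X1)  # instruments
treat_pos <- 2

# First-stage fitted regressors: projection of Xmat on Zmat
Xhat <- Zmat %*% solve(crossprod(Zmat), crossprod(Zmat, Xmat))

# Row treat_pos of (Xhat'X)^{-1} Xhat' contains the outcome weights
omega_iv <- solve(crossprod(Xhat, Xmat), t(Xhat))[treat_pos, , drop = FALSE]

# Compare with a manual second-stage regression
beta_2sls <- unname(coef(lm(Y ~ Xhat[, 2] + X1))[2])
all.equal(as.numeric(omega_iv %*% Y), beta_2sls)
```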
Post-estimation command to calculate the outcome weights for linear regression implemented via the lm command.
## S3 method for class 'lm'
get_outcome_weights(object, treat_pos = 2, ...)
object |
An object of class |
treat_pos |
The position of the treatment indicator in the model matrix. By default 2, which is appropriate for the case with an intercept and the treatment indicator as the first explanatory variable. |
... |
Pass potentially generic get_outcome_weights options. |
get_outcome_weights object with omega containing weights and treat the treatment
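For intuition, the outcome weights of OLS follow directly from the closed-form solution. The base R sketch below (hypothetical object names, no sampling weights, not the package's code) takes the row of (M'M)^{-1} M' that corresponds to treat_pos and checks that it reproduces the treatment coefficient.

```r
set.seed(1)
n <- 200
X1 <- rnorm(n)
D <- rbinom(n, 1, plogis(X1))
Y <- 1 + 0.5 * D + X1 + rnorm(n)

fit <- lm(Y ~ D + X1)
M <- model.matrix(fit)  # columns: (Intercept), D, X1
treat_pos <- 2          # position of D in the model matrix

# OLS coefficients are (M'M)^{-1} M'Y, so one row of (M'M)^{-1} M' gives the weights
omega_lm <- solve(crossprod(M), t(M))[treat_pos, , drop = FALSE]

# The weights reproduce the treatment coefficient
all.equal(as.numeric(omega_lm %*% Y), unname(coef(fit)["D"]))
```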
Post-estimation command to calculate the outcome weights for linear regression implemented via the lm_robust command.
## S3 method for class 'lm_robust'
get_outcome_weights(object, treat_pos = 2, ...)
object |
An object of class |
treat_pos |
The position of the treatment indicator in the model matrix. By default 2, which is appropriate for the case with an intercept and the treatment indicator as the first explanatory variable. |
... |
Pass potentially generic get_outcome_weights options. |
get_outcome_weights object with omega containing weights and treat the treatment
This is a generic method for getting smoother weights. It calculates the smoother weights for objects created by other packages. See get_smoother_weights.<compatible_class> in the package documentation for compatible functions.
get_smoother_weights(object, ...)
object |
An object obtained from other packages. |
... |
Additional arguments specific to object class implementations. See the documentation for which objects require which additional arguments. |
A smoother matrix.
Post-estimation function to extract smoother weights for a post-Lasso regression estimated via the cv.plasso function from the plasso package.
## S3 method for class 'cv.plasso'
get_smoother_weights(object, Xnew = NULL, ...)
object |
An object of class |
Xnew |
Optional covariate matrix of the test sample. If |
... |
Additional arguments passed to get_smoother_weights. |
A smoother matrix.
Knaus, M. C. (2024). Treatment effect estimators as weighted outcomes, https://arxiv.org/abs/2411.11559.
## Not run:
set.seed(123)
n <- 100
X <- matrix(rnorm(n * 3), ncol = 3)
Y <- 2 * X[, 1] + 3 * X[, 2] - 1 * X[, 3] + rnorm(n)

# Fit post-lasso model
object <- plasso::cv.plasso(x = X, y = Y)

# Get predictions
preds <- predict(object, newx = X, type = "response", s = "optimal", se_rule = 0)$plasso

# Get smoother matrix
S <- get_smoother_weights(object, Xnew = X)

# Check equivalence
all.equal(as.numeric(S %*% Y), as.numeric(preds))
## End(Not run)
Post-estimation function to extract smoother weights for a distributional random forest model estimated via the drf function from the drf package.
## S3 method for class 'drf'
get_smoother_weights(object, Xnew = NULL, ...)
object |
An object of class |
Xnew |
Optional covariate matrix of the test sample. If |
... |
Additional arguments passed to get_smoother_weights. |
A smoother matrix.
## Not run:
set.seed(123)
n <- 100
X <- matrix(rnorm(n * 3), ncol = 3)
Y <- 2 * X[, 1] + 3 * X[, 2] - 1 * X[, 3] + rnorm(n)

# Fit distributional forest
object <- drf::drf(X = X, Y = Y)

# Get predictions
preds <- predict(object, newdata = X, functional = "mean")[["mean"]]

# Get smoother matrix
S <- get_smoother_weights(object)

# Check equivalence
all.equal(as.numeric(S %*% Y), as.numeric(preds))
## End(Not run)
Post-estimation function to extract smoother weights for a linear regression estimated via the lm function from the stats package.
## S3 method for class 'lm'
get_smoother_weights(object, X, Xnew, ...)
object |
An object of class |
X |
Covariate matrix of the training sample. |
Xnew |
Optional covariate matrix of the test sample. If |
... |
Additional arguments passed to get_smoother_weights. |
A smoother matrix.
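For a linear regression, the smoother matrix has a familiar closed form, S = Xnew (X'X)^{-1} X', which reduces to the hat matrix when Xnew equals X. The following base R sketch (hypothetical object names, independent of the package) checks this on a separate test sample.

```r
set.seed(1)
n <- 100
X <- matrix(rnorm(n * 3), ncol = 3)
Y <- 2 * X[, 1] + 3 * X[, 2] - X[, 3] + rnorm(n)
Xnew <- matrix(rnorm(5 * 3), ncol = 3)

fit <- lm(Y ~ 0 + X)

# Smoother matrix: rows index test units, columns index training units
S_new <- Xnew %*% solve(crossprod(X), t(X))

# Predictions for the test sample from the fitted coefficients
preds_new <- as.numeric(Xnew %*% coef(fit))

# The smoother matrix reproduces the predictions as weighted training outcomes
all.equal(as.numeric(S_new %*% Y), preds_new)
```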
## Not run:
set.seed(123)
n <- 100
X <- matrix(rnorm(n * 3), ncol = 3)
Y <- 2 * X[, 1] + 3 * X[, 2] - 1 * X[, 3] + rnorm(n)

# Fit linear model
object <- lm(Y ~ 0 + X)

# Get fitted values
preds <- object$fitted.values

# Get smoother matrix
S <- get_smoother_weights(object, X = X, Xnew = X)

# Check equivalence
all.equal(as.numeric(S %*% Y), as.numeric(preds))
## End(Not run)
Post-estimation function to extract smoother weights for a linear regression estimated via the lm_robust function from the estimatr package.
## S3 method for class 'lm_robust'
get_smoother_weights(object, X, Xnew, ...)
object |
An object of class |
X |
Covariate matrix of the training sample. |
Xnew |
Optional covariate matrix of the test sample. If |
... |
Additional arguments passed to get_smoother_weights. |
A smoother matrix.
## Not run:
set.seed(123)
n <- 100
X <- matrix(rnorm(n * 3), ncol = 3)
Y <- 2 * X[, 1] + 3 * X[, 2] - 1 * X[, 3] + rnorm(n)

# Fit linear model
object <- estimatr::lm_robust(Y ~ 0 + X)

# Get fitted values
preds <- object$fitted.values

# Get smoother matrix
S <- get_smoother_weights(object, X = X, Xnew = X)

# Check equivalence
all.equal(as.numeric(S %*% Y), as.numeric(preds))
## End(Not run)
Extract smoother matrices for nuisance parameter outcome models from a
NuisanceParameters object created by nuisance_parameters.
## S3 method for class 'NuisanceParameters'
get_smoother_weights(object, NuPa, quiet = TRUE, subset = NULL, ...)
object |
A |
NuPa |
Character vector specifying which nuisance parameters to extract
smoothers for. Available outcome nuisance parameters
are |
quiet |
If |
subset |
Logical vector indicating which observations to use when determining ensemble weights. If not provided, all observations are used. |
... |
Additional arguments. |
A list of matrices of dimension nrow(X) × nrow(X) containing ensemble smoother
matrices
Post-estimation function to extract smoother weights for a random forest model estimated via the ranger function from the ranger package.
## S3 method for class 'ranger'
get_smoother_weights(object, X = NULL, Xnew = NULL, ...)
object |
An object of class |
X |
Optional covariate matrix of the training sample; used if not stored in the fitted object. |
Xnew |
Covariate matrix of test sample. If not provided, prediction is done for the training sample. |
... |
Additional arguments passed to get_smoother_weights. |
A smoother matrix.
## Not run:
set.seed(123)
n <- 100
X <- matrix(rnorm(n * 3), ncol = 3, dimnames = list(NULL, 1:3))
Y <- 2 * X[, 1] + 3 * X[, 2] - 1 * X[, 3] + rnorm(n)

# Fit random forest
object <- ranger::ranger(x = X, y = Y, keep.inbag = TRUE)

# Get predictions
preds <- predict(object, data = X)$predictions

# Get smoother matrix
S <- get_smoother_weights(object, X = X)

# Check equivalence
all.equal(as.numeric(S %*% Y), as.numeric(preds))
## End(Not run)
Post-estimation function to extract smoother weights for a random forest model estimated via the regression_forest function from the grf package.
## S3 method for class 'regression_forest'
get_smoother_weights(object, Xnew = NULL, ...)
object |
An object of class |
Xnew |
Optional covariate matrix of the test sample. If |
... |
Additional arguments passed to get_smoother_weights. |
A smoother matrix.
Athey, S., Tibshirani, J., & Wager, S. (2019). Generalized random forests. The Annals of Statistics, 47(2), 1148-1178.
Knaus, M. C. (2024). Treatment effect estimators as weighted outcomes, https://arxiv.org/abs/2411.11559.
## Not run:
set.seed(123)
n <- 100
X <- matrix(rnorm(n * 3), ncol = 3)
Y <- 2 * X[, 1] + 3 * X[, 2] - 1 * X[, 3] + rnorm(n)

# Fit regression forest
object <- grf::regression_forest(X = X, Y = Y)

# Get predictions
preds <- predict(object, newdata = X)$predictions

# Get smoother matrix
S <- get_smoother_weights(object)

# Check equivalence
all.equal(as.numeric(S %*% Y), as.numeric(preds))
## End(Not run)
Post-estimation function to extract smoother weights for a post-Lasso regression estimated via the rlasso function from the hdm package.
## S3 method for class 'rlasso'
get_smoother_weights(object, Xnew = NULL, ...)
object |
An object of class |
Xnew |
Optional covariate matrix of the test sample. If |
... |
Additional arguments passed to get_smoother_weights. |
A smoother matrix.
Knaus, M. C. (2024). Treatment effect estimators as weighted outcomes, https://arxiv.org/abs/2411.11559.
## Not run:
set.seed(123)
n <- 100
X <- matrix(rnorm(n * 3), ncol = 3)
Y <- 2 * X[, 1] + 3 * X[, 2] - 1 * X[, 3] + rnorm(n)

# Fit post-lasso model
object <- hdm::rlasso(x = X, y = Y, post = TRUE)

# Get predictions
preds <- predict(object, newdata = X)

# Get smoother matrix
S <- get_smoother_weights(object, Xnew = X)

# Check equivalence
all.equal(as.numeric(S %*% Y), as.numeric(preds))
## End(Not run)
Post-estimation function to extract smoother weights for an XGBoost model estimated via the xgb.train function from the xgboost package.
## S3 method for class 'xgb.Booster'
get_smoother_weights(object, X = NULL, Y = NULL, Xnew = NULL, ...)
object |
An object of class |
X |
Covariate matrix of the training sample. |
Y |
Vector of outcomes of the training sample. |
Xnew |
Optional covariate matrix of the test sample. If |
... |
Additional arguments passed to get_smoother_weights. |
A smoother matrix.
For smoother extraction, the XGBoost model must be trained with:
alpha = 0, subsample = 1, max_delta_step = 0.
## Not run:
set.seed(123)
n <- 100
X <- matrix(rnorm(n * 3), ncol = 3)
Y <- 2 * X[, 1] + 3 * X[, 2] - 1 * X[, 3] + rnorm(n)

# Set the parameters
nrounds <- 100
params <- list(alpha = 0, subsample = 1, max_delta_step = 0)

# Prepare the data
d_train <- xgboost::xgb.DMatrix(data = as.matrix(X), label = Y)
d_test <- xgboost::xgb.DMatrix(data = as.matrix(X))

# Fit XGBoost
object <- xgboost::xgb.train(data = d_train, nrounds = nrounds, params = params)

# Get predictions
preds <- predict(object, newdata = d_test)

# Get smoother matrix
S <- get_smoother_weights(object, X, Y)

# Check equivalence
all.equal(as.numeric(S %*% Y), as.numeric(preds))
## End(Not run)
This function estimates different nuisance parameters using the honest random forest implementation of the 'grf' package
NuPa_honest_forest(
  NuPa = c("Y.hat", "Y.hat.d", "Y.hat.z", "D.hat", "D.hat.z", "Z.hat"),
  X,
  Y = NULL,
  D = NULL,
  Z = NULL,
  n_cf_folds = 5,
  n_reps = 1,
  cluster = NULL,
  progress = FALSE,
  ...
)
NuPa |
String vector specifying the nuisance parameters to be estimated.
Currently supported: |
X |
Covariate matrix with N rows and p columns. |
Y |
Optional numeric vector containing the outcome variable. |
D |
Optional binary treatment variable. |
Z |
Optional binary instrumental variable. |
n_cf_folds |
Number of cross-fitting folds. Default is 5. |
n_reps |
Number of repetitions of cross-fitting. Default is 1. |
cluster |
Optional vector of cluster variable if cross-fitting should account for clusters. |
progress |
If TRUE, progress of nuisance parameter estimation is reported. |
... |
Options passed to the |
A list with the following entries:
predictions contains the requested nuisance parameters
smoothers contains the smoother matrices of requested outcome nuisance parameters
cf_mat Array of dimension n_reps x N x n_cf_folds storing indicators representing the folds used in estimation.
Wager, S., & Athey, S. (2018). Estimation and inference of heterogeneous treatment effects using random forests. Journal of the American Statistical Association, 113(523), 1228-1242.
This is a generic function taking pseudo-instrument, pseudo-treatment and the transformation matrix as inputs and returning outcome weights
pive_weight_maker(Z.tilde, D.tilde, T_mat)
Z.tilde |
Numeric vector of pseudo-instrument outcomes. |
D.tilde |
Numeric vector of pseudo-treatment. |
T_mat |
Transformation matrix |
A vector of outcome weights.
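A possible reading of these inputs, assuming the estimator has the pseudo-IV form tau = (Z.tilde' D.tilde)^{-1} Z.tilde' T_mat Y, is sketched below in base R. The helper pive_weights_sketch is hypothetical and is not the package's pive_weight_maker; the Wald estimator serves as a test case, with demeaned instrument and treatment and a demeaning transformation matrix.

```r
# Hypothetical helper: weights implied by the assumed pseudo-IV form
pive_weights_sketch <- function(Z.tilde, D.tilde, T_mat) {
  as.numeric(crossprod(Z.tilde, T_mat)) / sum(Z.tilde * D.tilde)
}

set.seed(1)
n <- 300
Z <- rbinom(n, 1, 0.5)
D <- rbinom(n, 1, 0.2 + 0.6 * Z)
Y <- 1 + D + rnorm(n)

# Wald estimator: pseudo-variables are demeaned, T_mat demeans the outcome
T_mat <- diag(n) - matrix(1 / n, n, n)
Z.tilde <- Z - mean(Z)
D.tilde <- D - mean(D)
omega <- pive_weights_sketch(Z.tilde, D.tilde, T_mat)

# The weights reproduce the ratio of covariances
wald <- cov(Z, Y) / cov(Z, D)
all.equal(sum(omega * Y), wald)
```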
Knaus, M. C. (2024). Treatment effect estimators as weighted outcomes, https://arxiv.org/abs/2411.11559.
plot method for class dml_with_smoother
## S3 method for class 'dml_with_smoother'
plot(x, ..., alpha = 0.05, contrast = FALSE)
x |
Object of class |
... |
Pass generic |
alpha |
Significance level for confidence intervals (default 0.05). |
contrast |
Shows the differences between the coefficients. |
ggplot with point estimates and confidence intervals.
Creates a matrix of binary cross-fitting fold indicators (N x number of cross-fitting folds).
prep_cf_mat(n, cf, w_mat = NULL, cl = NULL)
n |
Number of observations. |
cf |
Number of cross-fitting folds. |
w_mat |
Optional logical matrix of treatment indicators (N x T+1). If specified, cross-fitting folds will preserve the treatment ratios from the full sample. |
cl |
Optional vector containing the cluster variable, if cross-fitting should account for clusters. |
Logical matrix of cross-fitting folds (N x number of folds).
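To make the shape of the returned object concrete, the following base R sketch builds a logical N x folds indicator matrix by random assignment. It is a simplified illustration under stated assumptions: the helper `cf_mat_sketch` is hypothetical, and it ignores the `w_mat` (treatment-ratio preservation) and `cl` (cluster) options described above.

```r
# Hypothetical helper for illustration only; no stratification or clustering.
cf_mat_sketch <- function(n, cf) {
  # Assign fold labels 1..cf as evenly as possible, then shuffle them.
  fold_id <- sample(rep_len(seq_len(cf), n))
  # One column per fold; TRUE marks the observations belonging to that fold.
  outer(fold_id, seq_len(cf), "==")
}

set.seed(42)
cf_mat <- cf_mat_sketch(n = 10, cf = 3)

dim(cf_mat)      # 10 rows (observations), 3 columns (folds)
rowSums(cf_mat)  # every observation belongs to exactly one fold
```

Each row contains exactly one TRUE, so the columns partition the sample into folds of nearly equal size.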
Calculates standardized mean differences between treated and controls, and towards target means, for an outcome weights matrix with potentially many rows, as is the case for CATEs.
standardized_mean_differences(X, treat, omega, target = NULL)
X |
Covariate matrix with N rows and p columns. |
treat |
Binary treatment variable. |
omega |
Outcome weights matrix of dimension (number of weight vectors for which balancing should be checked) x (number of training units). |
target |
Optional matrix of dimension (number of weight vectors for which balancing should be checked) x p indicating the target values the covariates should be balanced towards. If NULL, the average of X is used as the target, which corresponds to the ATE. |
3D-array of dimension p x 6 x number of weight vectors for which balancing should be checked where the second dimension provides the following quantities:
"Mean 0": The weighted control mean
"Mean 1": The weighted treated mean
"SMD balancing": Standardized mean differences for covariate balancing (Mean 1 - Mean 0) / sd(X)
"SMD targeting 0": Standardized mean difference to assess targeting of control (Mean 0 - target) / sd(X)
"SMD targeting 1": Standardized mean difference to assess targeting of treated (Mean 1 - target) / sd(X)
Rosenbaum, P. R., & Rubin, D. B. (1984). Reducing bias in observational studies using subclassification on the propensity score. Journal of the American Statistical Association, 79(387), 516-524.
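The quantities listed above can be reproduced for a single weight vector with a short base R sketch. It rests on assumptions that are not confirmed by this page: treated units carry positive weights and controls negative weights (so the control mean flips the sign), and `sd(X)` is the full-sample standard deviation of each covariate. The function `smd_sketch` is hypothetical and only illustrates the formulas.

```r
# Hypothetical illustration of the listed formulas for ONE weight vector.
# Assumptions: controls enter with negative outcome weights, and sd(X) is the
# full-sample standard deviation of each covariate.
smd_sketch <- function(X, treat, omega, target = colMeans(X)) {
  mean_1 <- as.numeric(crossprod(omega * (treat == 1), X))
  mean_0 <- -as.numeric(crossprod(omega * (treat == 0), X))
  sd_x <- apply(X, 2, sd)
  cbind("Mean 0" = mean_0,
        "Mean 1" = mean_1,
        "SMD balancing" = (mean_1 - mean_0) / sd_x,
        "SMD targeting 0" = (mean_0 - target) / sd_x,
        "SMD targeting 1" = (mean_1 - target) / sd_x)
}

set.seed(123)
n <- 200
X <- cbind(x1 = rnorm(n), x2 = runif(n))
treat <- rbinom(n, 1, 0.4)

# Difference-in-means weights: 1/n1 for treated, -1/n0 for controls.
omega <- treat / sum(treat) - (1 - treat) / sum(1 - treat)

smd <- smd_sketch(X, treat, omega)
```

With difference-in-means weights, "Mean 1" and "Mean 0" are simply the unweighted treated and control covariate means, which makes the output easy to verify by hand.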
summary method for class dml_with_smoother
## S3 method for class 'dml_with_smoother'
summary(object, contrast = FALSE, quiet = FALSE, ...)
object |
Object of class |
contrast |
If TRUE, tests the differences between the coefficients. Default is FALSE. |
quiet |
If TRUE, results are passed but not printed. |
... |
further arguments passed to |
Invisible matrix with estimator(s) in the rows and c("Estimate","SE","t","p") in the columns.
summary method for class get_outcome_weights
Calculates several summary measures of potentially many outcome weights.
## S3 method for class 'get_outcome_weights'
summary(object, quiet = FALSE, digits = 4, epsilon = 1e-04, ...)
object |
get_outcome_weights object. |
quiet |
If TRUE, results are passed but not printed. |
digits |
Number of digits to be displayed. Default 4. |
epsilon |
Threshold for display: non-zero values whose absolute value is below this threshold are displayed as < ... |
... |
further arguments passed to |
3D-array of dimension
c("Control","Treated") x
number of point estimates x
c("Minimum weight","Maximum weight","% Negative","Sum largest 10%","Sum of weights","Sum of absolute weights")
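For intuition, most of the listed measures can be computed for a single weight vector with a few lines of base R. This is a hedged sketch, not the package's method: the helper `weight_summary_sketch` is hypothetical, weights are summarized with their original sign (no sign flip for controls is assumed), and "Sum largest 10%" is omitted because its exact definition is not stated on this page.

```r
# Hypothetical illustration for ONE weight vector; weights keep their sign.
weight_summary_sketch <- function(omega, treat) {
  one_group <- function(w) {
    c("Minimum weight" = min(w),
      "Maximum weight" = max(w),
      "% Negative" = 100 * mean(w < 0),
      "Sum of weights" = sum(w),
      "Sum of absolute weights" = sum(abs(w)))
  }
  rbind(Control = one_group(omega[treat == 0]),
        Treated = one_group(omega[treat == 1]))
}

# Difference-in-means weights with 4 treated and 6 control units.
treat <- c(1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
omega <- ifelse(treat == 1, 1 / sum(treat == 1), -1 / sum(treat == 0))

weight_summary <- weight_summary_sketch(omega, treat)
```

For these weights the treated weights sum to 1 with no negative entries, while the control weights sum to -1 and are all negative under the sign convention assumed here.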
summary method for class standardized_mean_differences
Calls a C++ function to quickly summarize potentially many standardized mean differences.
## S3 method for class 'standardized_mean_differences'
summary(object, ...)
object |
Object of class |
... |
further arguments passed to summary method. |
3D-array of dimension
c("Maximum absolute SMD","Mean absolute SMD","Median absolute SMD","# / % of absolute SMD > 20","# / % of absolute SMD > 10","# / % of absolute SMD > 5") x
c("Balancing","Targeting") x
number of weight vectors for which balancing should be checked
Rosenbaum, P. R., & Rubin, D. B. (1984). Reducing bias in observational studies using subclassification on the propensity score. Journal of the American Statistical Association, 79(387), 516-524.