Title: | R Package for Designing and Analyzing Randomized Experiments |
---|---|
Description: | Provides various statistical methods for designing and analyzing randomized experiments. One functionality of the package is the implementation of randomized-block and matched-pair designs based on possibly multivariate pre-treatment covariates. The package also provides the tools to analyze various randomized experiments including cluster randomized experiments, two-stage randomized experiments, randomized experiments with noncompliance, and randomized experiments with missing data. |
Authors: | Kosuke Imai [aut, cre], Zhichao Jiang [aut] |
Maintainer: | Kosuke Imai <[email protected]> |
License: | GPL (>=2) |
Version: | 1.2.1 |
Built: | 2024-12-22 03:32:06 UTC |
Source: | https://github.com/kosukeimai/experiment |
This function computes the sharp bounds on the average treatment effect when some of the outcome data are missing. The confidence intervals for the bounds are also computed.
ATEbounds( formula, data = parent.frame(), maxY = NULL, minY = NULL, alpha = 0.05, n.reps = 0, strata = NULL, ratio = NULL, survey = NULL, ... )
formula |
A formula of the form |
data |
A data frame containing the relevant variables. |
maxY |
A scalar. The maximum value of the outcome variable. The default is the maximum sample value. |
minY |
A scalar. The minimum value of the outcome variable. The default is the minimum sample value. |
alpha |
A positive scalar that is less than or equal to 0.5. This will
determine the (1-alpha) level of the confidence intervals. The default is 0.05. |
n.reps |
A non-negative integer. The number of bootstrap replicates used to construct confidence intervals via the B-method of Beran (1988). If it equals zero, the confidence intervals will not be constructed. |
strata |
The variable name indicating strata. If this is specified, the
quantities of interest will first be calculated within each stratum and then
aggregated. The default is NULL. |
ratio |
A |
survey |
The variable name for survey weights. The default is NULL. |
... |
The arguments passed to other functions. |
For the details of the method implemented by this function, see the references.
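As a rough illustration of the idea behind worst-case bounds, the base R sketch below replaces every missing outcome with the smallest or largest admissible value and takes differences of the resulting arm means. It is not the package's implementation: the helper name `manski_bounds` and the toy data are hypothetical, the treatment is assumed to be binary, and strata, survey weights, and confidence intervals are ignored.

```r
# Hypothetical helper: worst-case bounds on the ATE for a binary treatment
# when some outcomes are missing (coded as NA) and Y is known to lie in
# [minY, maxY]. Illustrative only; ATEbounds() itself may differ in detail.
manski_bounds <- function(Y, D, minY, maxY) {
  # Bounds on the mean outcome in one arm: replace every NA by an extreme value
  arm_bounds <- function(y) {
    c(lower = mean(ifelse(is.na(y), minY, y)),
      upper = mean(ifelse(is.na(y), maxY, y)))
  }
  b1 <- arm_bounds(Y[D == 1])  # treated arm
  b0 <- arm_bounds(Y[D == 0])  # control arm
  # Smallest effect: low treated mean minus high control mean, and vice versa
  c(lower = unname(b1["lower"] - b0["upper"]),
    upper = unname(b1["upper"] - b0["lower"]))
}

# Toy data: a binary outcome with one missing value in each arm
Y <- c(1, NA, 1, 0, 0, NA, 1, 0)
D <- c(1, 1, 1, 1, 0, 0, 0, 0)
bounds <- manski_bounds(Y, D, minY = 0, maxY = 1)
print(bounds)
```

The width of the interval grows with the share of missing outcomes and with the range `maxY - minY`, which is why supplying the known limits of the outcome matters.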
A list of class ATEbounds
which contains the following items:
call |
The matched call. |
Y |
The outcome variable. |
D |
The treatment variable. |
bounds |
The point estimates of the sharp bounds on the average treatment effect. |
bounds.Y |
The point estimates of the sharp bounds on the outcome variable within each treatment/control group. |
bmethod.ci |
The B-method confidence interval of the bounds on the average treatment effect. |
bonf.ci |
The Bonferroni confidence interval of the bounds on the average treatment effect. |
bonf.ci.Y |
The Bonferroni confidence interval of the bounds on the outcome variable within each treatment/control group. |
bmethod.ci.Y |
The B-method confidence interval of the bounds on the outcome variable within each treatment/control group. |
maxY |
The maximum value of the outcome variable used in the computation. |
minY |
The minimum value of the outcome variable used in the computation. |
nobs |
The number of observations. |
nobs.Y |
The number of observations within each treatment/control group. |
ratio |
The probability of treatment assignment (within each strata if
|
Kosuke Imai, Department of Government and Department of Statistics, Harvard University [email protected], https://imai.fas.harvard.edu;
Horowitz, Joel L. and Charles F. Manski. (1998). “Censoring of Outcomes and Regressors due to Survey Nonresponse: Identification and Estimation Using Weights and Imputations.” Journal of Econometrics, Vol. 84, pp.37-58.
Horowitz, Joel L. and Charles F. Manski. (2000). “Nonparametric Analysis of Randomized Experiments With Missing Covariate and Outcome Data.” Journal of the American Statistical Association, Vol. 95, No. 449, pp.77-84.
Harris-Lacewell, Melissa, Kosuke Imai, and Teppei Yamamoto. (2007). “Racial Gaps in the Responses to Hurricane Katrina: An Experimental Study”, Technical Report. Department of Politics, Princeton University.
This function estimates various average treatment effects in cluster-randomized experiments without using pre-treatment covariates. The treatment variable is assumed to be binary. Currently, only the matched-pair design is allowed. The details of the methods for this design are given in Imai, King, and Nall (2007).
ATEcluster( Y, Z, grp, data = parent.frame(), match = NULL, weights = NULL, fpc = TRUE )
Y |
The outcome variable of interest. |
Z |
The (randomized) cluster-level treatment variable. This variable should be binary. Two units in the same cluster should have the same value. |
grp |
A variable indicating clusters of units. Two units in the same cluster should have the same value. |
data |
A data frame containing the relevant variables. |
match |
A variable indicating matched-pairs of clusters. Two units in
the same matched-pair of clusters should have the same value. The default is
NULL. |
weights |
A variable indicating the population cluster sizes, which
will be used to construct weights for each pair of clusters. Two units in
the same cluster should have the same value. The default is NULL. |
fpc |
A logical variable indicating whether or not finite population
correction should be used for estimating the lower bound of CACE variance.
This is relevant only when |
A list of class ATEcluster
which contains the following
items:
call |
The matched call. |
n |
The total number of units. |
n1 |
The total number of units in the treatment group. |
n0 |
The total number of units in the control group. |
Y |
The outcome variable. |
Y1bar |
The cluster-specific (unweighted) average value of the observed outcome for the treatment group. |
Y0bar |
The cluster-specific (unweighted) average value of the observed outcome for the control group. |
Y1var |
The cluster-specific sample variance of the observed outcome for the treatment group. |
Y0var |
The cluster-specific sample variance of the observed outcome for the control group. |
Z |
The treatment variable. |
grp |
The cluster-indicator variable. |
match |
The matched-pair indicator variable. |
weights |
The weight variable in its original form. |
est |
The estimated average treatment effect based on the arithmetic mean weights. |
var |
The estimated variance of the average treatment effect estimator based on the arithmetic mean weights. This uses the variance formula provided in Imai, King, and Nall (2007). |
var.lb |
The estimated sharp lower bound of the variance of the cluster average treatment effect estimator using the arithmetic mean weights. |
est.dk |
The estimated average treatment effect based on the harmonic mean weights. |
var.dk |
The estimated variance of the average treatment effect estimator based on the harmonic mean weights. This uses the variance formula provided in Donner and Klar (1993). |
dkvar |
The estimated variance of the average treatment effect estimator based on the harmonic mean weights. This uses the variance formula provided in Imai, King, and Nall (2007). |
eff |
The estimated relative efficiency of the matched-pair design over the completely randomized design (the ratio of two estimated variances). |
m |
The number of pairs in the matched-pair design. |
N1 |
The population cluster sizes for the treatment group. |
N0 |
The population cluster sizes for the control group. |
w1 |
Cluster-specific weights for the treatment group. |
w0 |
Cluster-specific weights for the control group. |
w |
Pair-specific
normalized arithmetic mean weights. These weights sum up to the total number
of units in the sample, i.e., |
w.dk |
Pair-specific
normalized harmonic mean weights. These weights sum up to the total number
of units in the sample, i.e., |
diff |
Within-pair
differences if the matched-pair design is analyzed. This equals the
difference between |
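To make the distinction between arithmetic and harmonic mean weights concrete, here is a small base R sketch. The cluster sizes and within-pair differences are made up, and the weight definitions (the arithmetic or harmonic mean of the two cluster sizes in a pair, rescaled to sum to the total number of units) are an assumption for illustration; the exact formulas used by `ATEcluster` are given in the references.

```r
# Hypothetical population cluster sizes for m = 3 matched pairs of clusters
N1 <- c(10, 40, 25)  # treated cluster in each pair
N0 <- c(30, 40, 5)   # control cluster in each pair
n  <- sum(N1 + N0)   # total number of units

# Pair-level weights, rescaled so that they sum to n
normalize <- function(w) n * w / sum(w)
w_arith <- normalize((N1 + N0) / 2)            # arithmetic mean of the two sizes
w_harm  <- normalize(2 * N1 * N0 / (N1 + N0))  # harmonic mean of the two sizes

# Weighted averages of hypothetical within-pair differences in cluster means
d <- c(1.0, 0.5, 2.0)
est_arith <- sum(w_arith * d) / sum(w_arith)
est_harm  <- sum(w_harm * d) / sum(w_harm)
print(c(arithmetic = est_arith, harmonic = est_harm))
```

The harmonic mean down-weights pairs whose two clusters differ greatly in size (the third pair here), so the two estimates can diverge when matching on cluster size is poor.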
Kosuke Imai, Department of Government and Department of Statistics, Harvard University [email protected], https://imai.fas.harvard.edu;
Donner, A. and N. Klar (1993). “Confidence interval construction for effect measures arising from cluster randomized trials.” Journal of Clinical Epidemiology. Vol. 46, No. 2, pp. 123-131.
Imai, Kosuke, Gary King, and Clayton Nall (2007). “The Essential Role of Pair Matching in Cluster-Randomized Experiments, with Application to the Mexican Universal Health Insurance Evaluation”, Technical Report. Department of Politics, Princeton University.
This function computes the standard “difference-in-means” estimate of the average treatment effect in randomized experiments without using pre-treatment covariates. The treatment variable is assumed to be binary. Currently, two designs are supported: the completely randomized design and the matched-pair design.
ATEnocov(Y, Z, data = parent.frame(), match = NULL)
Y |
The outcome variable of interest. |
Z |
The (randomized) treatment variable. This variable should be binary. |
data |
A data frame containing the relevant variables. |
match |
A variable indicating matched-pairs. The two units in the same matched-pair should have the same value. |
A list of class ATEnocov
which contains the following items:
call |
The matched call. |
Y |
The outcome variable. |
Z |
The treatment variable. |
match |
The matched-pair indicator variable. |
ATEest |
The estimated average treatment effect. |
ATE.var |
The estimated variance of the average treatment effect estimator. |
diff |
Within-pair differences if the matched-pair design is analyzed. |
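The two designs can be sketched in a few lines of base R. The helpers `diff_in_means` and `paired_diff` are hypothetical names, the data are made up, and the variance formulas are the textbook ones (the Neyman estimator under complete randomization and the sample variance of within-pair differences divided by the number of pairs); `ATEnocov` may compute its variance differently.

```r
# Hypothetical helper: difference in means under complete randomization
diff_in_means <- function(Y, Z) {
  y1 <- Y[Z == 1]
  y0 <- Y[Z == 0]
  list(est = mean(y1) - mean(y0),
       var = var(y1) / length(y1) + var(y0) / length(y0))
}

# Hypothetical helper: matched-pair design, averaging the within-pair
# differences (treated minus control)
paired_diff <- function(Y, Z, match) {
  d <- tapply(seq_along(Y), match,
              function(i) Y[i][Z[i] == 1] - Y[i][Z[i] == 0])
  d <- as.numeric(d)
  list(est = mean(d), var = var(d) / length(d), diff = d)
}

# Toy data: three matched pairs, one treated unit per pair
Y     <- c(5, 3, 6, 2, 7, 4)
Z     <- c(1, 0, 1, 0, 1, 0)
match <- c(1, 1, 2, 2, 3, 3)

cr <- diff_in_means(Y, Z)
mp <- paired_diff(Y, Z, match)
print(c(cr$est, mp$est))
```

Both designs give the same point estimate here; only the variance estimate changes, which is the quantity the choice of design affects.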
Kosuke Imai, Department of Government and Department of Statistics, Harvard University [email protected], https://imai.fas.harvard.edu;
Imai, Kosuke. (2008). “Randomization-based Inference and Efficiency Analysis in Experiments under the Matched-Pair Design”, Statistics in Medicine.
This function computes the no-assumption bounds on the average treatment effect among always-observed pairs (ATOP) when some of the outcome data are missing. The confidence intervals for the ATOP are also computed.
ATOPnoassumption(Ya, Yb, Ra, Rb, Ta, Tb, l, u, alpha, rep)
Ya |
A vector of the outcomes of the first unit in the matched pairs. The missing values for |
Yb |
A vector of the outcomes of the second unit in the matched pairs. The missing values for |
Ra |
A vector of the missing data indicators of the first unit in the matched pairs. |
Rb |
A vector of the missing data indicators of the second unit in the matched pairs. |
Ta |
A vector of the treatment conditions of the first unit in the matched pairs. |
Tb |
A vector of the treatment conditions of the second unit in the matched pairs. |
l |
The lower limit of the outcome. |
u |
The upper limit of the outcome. |
alpha |
A positive scalar that is less than or equal to 0.5. This will
determine the (1-alpha) level of the confidence intervals. |
rep |
The number of repetitions for bootstrapping. |
For the details of the method implemented by this function, see the references.
A list of class ATOPnoassumption
which contains the following items:
LB |
The lower bound for the ATOP. |
UB |
The upper bound for the ATOP. |
LB.CI |
The lower limit of the confidence interval for the ATOP. |
UB.CI |
The upper limit of the confidence interval for the ATOP. |
Kosuke Imai, Department of Government and Department of Statistics, Harvard University [email protected], https://imai.fas.harvard.edu; Zhichao Jiang, Department of Politics, Princeton University [email protected].
Kosuke Imai and Zhichao Jiang (2018). “A Sensitivity Analysis for Missing Outcomes Due to Truncation-by-Death under the Matched-Pairs Design”, Technical Report. Department of Politics, Princeton University.
data(seguro)
attach(seguro)
ATOPnoassumption(Ya, Yb, Ra, Rb, Ta, Tb, l = 0, u = 1, alpha = 0.05, rep = 100)
This function computes the bounds on the average treatment effect among always-observed pairs (ATOP) with pre-specified sensitivity parameters when some of the outcome data are missing. The sensitivity parameters characterize the degree of the within-pair similarity and the dependence between the potential missing indicators and the treatment. The confidence intervals for the ATOP are also computed.
ATOPobs(Ya, Yb, Ra, Rb, Ta, Tb, gamma, kappa1, kappa0, l, u, alpha, rep)
Ya |
A vector of the outcomes of the first unit in the matched pairs. The missing values for |
Yb |
A vector of the outcomes of the second unit in the matched pairs. The missing values for |
Ra |
A vector of the missing data indicators of the first unit in the matched pairs. |
Rb |
A vector of the missing data indicators of the second unit in the matched pairs. |
Ta |
A vector of the treatment conditions of the first unit in the matched pairs. |
Tb |
A vector of the treatment conditions of the second unit in the matched pairs. |
gamma |
The sensitivity parameter which characterizes the degree of the within-pair similarity. |
kappa1 |
The sensitivity parameter which characterizes the dependence between |
kappa0 |
The sensitivity parameter which characterizes the dependence between |
l |
The lower limit of the outcome. |
u |
The upper limit of the outcome. |
alpha |
A positive scalar that is less than or equal to 0.5. This will
determine the (1-alpha) level of the confidence intervals. |
rep |
The number of repetitions for bootstrapping. |
For the details of the method implemented by this function, see the references.
A list of class ATOPsens
which contains the following items:
LB |
The lower bound for the ATOP. |
UB |
The upper bound for the ATOP. |
LB.CI |
The lower limit of the confidence interval for the ATOP. |
UB.CI |
The upper limit of the confidence interval for the ATOP. |
Kosuke Imai, Department of Government and Department of Statistics, Harvard University [email protected], https://imai.fas.harvard.edu; Zhichao Jiang, Department of Politics, Princeton University [email protected].
Kosuke Imai and Zhichao Jiang (2018). “A Sensitivity Analysis for Missing Outcomes Due to Truncation-by-Death under the Matched-Pairs Design”, Statistics in Medicine.
data(seguro)
attach(seguro)
ATOPobs(Ya, Yb, Ra, Rb, Ta, Tb, gamma = 0.95, kappa1 = 1, kappa0 = 1, l = 0, u = 1, alpha = 0.05, rep = 100)
This function computes the bounds on the average treatment effect among always-observed pairs (ATOP) with a pre-specified sensitivity parameter when some of the outcome data are missing. The sensitivity parameter characterizes the degree of the within-pair similarity. The confidence intervals for the ATOP are also computed.
ATOPsens(Ya, Yb, Ra, Rb, Ta, Tb, gamma, l, u, alpha, rep)
Ya |
A vector of the outcomes of the first unit in the matched pairs. The missing values for |
Yb |
A vector of the outcomes of the second unit in the matched pairs. The missing values for |
Ra |
A vector of the missing data indicators of the first unit in the matched pairs. |
Rb |
A vector of the missing data indicators of the second unit in the matched pairs. |
Ta |
A vector of the treatment conditions of the first unit in the matched pairs. |
Tb |
A vector of the treatment conditions of the second unit in the matched pairs. |
gamma |
The sensitivity parameter which characterizes the degree of the within-pair similarity. |
l |
The lower limit of the outcome. |
u |
The upper limit of the outcome. |
alpha |
A positive scalar that is less than or equal to 0.5. This will
determine the (1-alpha) level of the confidence intervals. |
rep |
The number of repetitions for bootstrapping. |
For the details of the method implemented by this function, see the references.
A list of class ATOPsens
which contains the following items:
LB |
The lower bound for the ATOP. |
UB |
The upper bound for the ATOP. |
LB.CI |
The lower limit of the confidence interval for the ATOP. |
UB.CI |
The upper limit of the confidence interval for the ATOP. |
Kosuke Imai, Department of Government and Department of Statistics, Harvard University [email protected], https://imai.fas.harvard.edu; Zhichao Jiang, Department of Politics, Princeton University [email protected].
Kosuke Imai and Zhichao Jiang (2018). “A Sensitivity Analysis for Missing Outcomes Due to Truncation-by-Death under the Matched-Pairs Design”, Statistics in Medicine.
data(seguro)
attach(seguro)
ATOPsens(Ya, Yb, Ra, Rb, Ta, Tb, gamma = 0.95, l = 0, u = 1, alpha = 0.05, rep = 100)
This function estimates the Area Under Prescription Evaluation Curve (AUPEC). The details of the method are given in Imai and Li (2019).
AUPEC(T, tau, Y)
T |
The unit-level binary treatment receipt variable. |
tau |
The unit-level continuous score for treatment assignment. Units with tau < 0 are assumed not to receive the treatment. The conditional average treatment effect is one possible choice of score. |
Y |
The outcome variable of interest. |
A list that contains the following items:
aupec |
The estimated Area Under Prescription Evaluation Curve |
sd |
The estimated standard deviation of AUPEC. |
Michael Lingzhi Li, Operations Research Center, Massachusetts Institute of Technology [email protected], http://mlli.mit.edu;
Imai and Li (2019). “Experimental Evaluation of Individualized Treatment Rules”,
This function estimates various complier average causal effects in cluster-randomized experiments without using pre-treatment covariates when unit-level noncompliance exists. Both the encouragement and treatment variables are assumed to be binary. Currently, only the matched-pair design is allowed. The details of the methods for this design are given in Imai, King, and Nall (2007).
CACEcluster( Y, D, Z, grp, data = parent.frame(), match = NULL, weights = NULL, ... )
Y |
The outcome variable of interest. |
D |
The unit-level treatment receipt variable. This variable should be binary but can differ across units within each cluster. |
Z |
The (randomized) cluster-level encouragement variable. This variable should be binary. Two units in the same cluster should have the same value. |
grp |
A variable indicating clusters of units. Two units in the same cluster should have the same value. |
data |
A data frame containing the relevant variables. |
match |
A variable indicating matched-pairs of clusters. Two units in
the same matched-pair of clusters should have the same value. The default is
NULL. |
weights |
A variable indicating the population cluster sizes, which
will be used to construct weights for each pair of clusters. Two units in
the same cluster should have the same value. The default is NULL. |
... |
Optional arguments passed to |
A list of class CACEcluster
which contains the following
items:
call |
The matched call. |
ITTY |
The output object from
|
ITTD |
The output object
from |
n1 |
The total number of units in the treatment group. |
n0 |
The total number of units in the control group. |
Z |
The treatment variable. |
est |
The estimated complier average causal effect. |
var |
The estimated variance of the complier average causal effect estimator. |
cov |
The estimated covariance between the two ITT estimators. |
m |
The number of pairs in the matched-pair design. |
N1 |
The population cluster sizes for the treatment group. |
N0 |
The population cluster sizes for the control group. |
w |
Pair-specific normalized
arithmetic mean weights. These weights sum up to the total number of units
in the sample, i.e., |
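The role of the two ITT estimates and their covariance can be seen in a short base R sketch of the ratio (Wald-type) estimator. All numbers below are hypothetical, and the variance uses the generic first-order delta-method approximation for a ratio, which is an assumption for illustration; the variance formula actually used by `CACEcluster` is described in the reference.

```r
# Hypothetical ITT estimates, e.g. from two separate ITT analyses of the
# outcome Y and of the treatment receipt D
itt_y  <- 0.12;  var_y <- 0.0025  # ITT effect on the outcome and its variance
itt_d  <- 0.60;  var_d <- 0.0016  # ITT effect on treatment receipt and its variance
cov_yd <- 0.0005                  # covariance between the two ITT estimators

# Ratio estimator of the complier average causal effect
cace <- itt_y / itt_d

# First-order (delta-method) approximation to the variance of the ratio
cace_var <- (var_y + cace^2 * var_d - 2 * cace * cov_yd) / itt_d^2

print(c(est = cace, se = sqrt(cace_var)))
```

A small `itt_d` (low compliance) inflates both the estimate and its variance, since it enters the denominator of each expression.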
Kosuke Imai, Department of Government and Department of Statistics, Harvard University [email protected], https://imai.fas.harvard.edu;
Imai, Kosuke, Gary King, and Clayton Nall (2007). “The Essential Role of Pair Matching in Cluster-Randomized Experiments, with Application to the Mexican Universal Health Insurance Evaluation”, Technical Report. Department of Politics, Princeton University.
This function computes the point estimates and variance estimates of the complier average direct effect (CADE) and the complier average spillover effect (CASE). The estimators calculated using this function are either individual-weighted or cluster-weighted. The point estimates and variances of ITT effects are also included.
CADErand(data, individual = 1)
data |
A data frame containing the relevant variables. The names for the variables should be: “Z” for the treatment assignment, “D” for the actual received treatment, “Y” for the outcome, “A” for the treatment assignment mechanism and “id” for the cluster ID. The variable for the cluster id should be a factor. |
individual |
A binary variable with TRUE for individual-weighted estimators and FALSE for cluster-weighted estimators. |
For the details of the method implemented by this function, see the references.
A list of class CADErand
which contains the following items:
CADE1 |
The point estimate of CADE(1). |
CADE0 |
The point estimate of CADE(0). |
CASE1 |
The point estimate of CASE(1). |
CASE0 |
The point estimate of CASE(0). |
var.CADE1 |
The variance estimate of CADE(1). |
var.CADE0 |
The variance estimate of CADE(0). |
var.CASE1 |
The variance estimate of CASE(1). |
var.CASE0 |
The variance estimate of CASE(0). |
DEY1 |
The point estimate of DEY(1). |
DEY0 |
The point estimate of DEY(0). |
DED1 |
The point estimate of DED(1). |
DED0 |
The point estimate of DED(0). |
var.DEY1 |
The variance estimate of DEY(1). |
var.DEY0 |
The variance estimate of DEY(0). |
var.DED1 |
The variance estimate of DED(1). |
var.DED0 |
The variance estimate of DED(0). |
SEY1 |
The point estimate of SEY(1). |
SEY0 |
The point estimate of SEY(0). |
SED1 |
The point estimate of SED(1). |
SED0 |
The point estimate of SED(0). |
var.SEY1 |
The variance estimate of SEY(1). |
var.SEY0 |
The variance estimate of SEY(0). |
var.SED1 |
The variance estimate of SED(1). |
var.SED0 |
The variance estimate of SED(0). |
Kosuke Imai, Department of Government and Department of Statistics, Harvard University [email protected], https://imai.fas.harvard.edu; Zhichao Jiang, Department of Politics, Princeton University [email protected].
Kosuke Imai, Zhichao Jiang and Anup Malani (2018). “Causal Inference with Interference and Noncompliance in the Two-Stage Randomized Experiments”, Technical Report. Department of Politics, Princeton University.
This function computes the point estimates of the complier average direct effect (CADE) and four
different variance estimates: the HC2 variance, the cluster-robust variance, the cluster-robust HC2
variance, and the variance proposed in the reference. The estimators calculated using this function
are cluster-weighted, i.e., the weights are equal for each cluster. To obtain the individual-weighted
estimators, multiply the received treatment and the outcome by n_j J / N, where n_j is the number of
individuals in cluster j, J is the number of clusters, and N is the total number of individuals.
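The rescaling described above can be sketched in base R as follows. The data frame is a made-up example that shows only the columns being rescaled (`id`, `D`, `Y`); a real input would also need the `Z` and `A` columns, and the object name `dat_ind` is hypothetical.

```r
# Hypothetical two-stage data: id = cluster, D = received treatment, Y = outcome
dat <- data.frame(
  id = factor(c("a", "a", "a", "a", "b", "b", "c", "c", "c")),
  D  = c(1, 0, 1, 1, 0, 1, 0, 0, 1),
  Y  = c(2.0, 1.0, 3.0, 2.5, 0.5, 1.5, 1.0, 0.0, 2.0)
)

n_j <- table(dat$id)    # cluster sizes n_j
J   <- nlevels(dat$id)  # number of clusters
N   <- nrow(dat)        # total number of individuals

# Row-level factor n_j * J / N, looked up by each row's cluster
scale_j <- as.numeric(n_j[as.character(dat$id)]) * J / N

# Rescaled copy of the data for individual-weighted estimation
dat_ind <- transform(dat, D = D * scale_j, Y = Y * scale_j)
print(head(dat_ind))
```

The factor averages to one across clusters, so clusters larger than average are scaled up and smaller ones are scaled down.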
CADEreg(data)
data |
A data frame containing the relevant variables. The names for the variables should be: “Z” for the treatment assignment, “D” for the actual received treatment, “Y” for the outcome, “A” for the treatment assignment mechanism and “id” for the cluster ID. The variable for the cluster id should be a factor. |
For the details of the method implemented by this function, see the references.
A list of class CADEreg
which contains the following items:
CADE1 |
The point estimate of CADE(1). |
CADE0 |
The point estimate of CADE(0). |
var1.clu |
The cluster-robust variance of CADE(1). |
var0.clu |
The cluster-robust variance of CADE(0). |
var1.clu.hc2 |
The cluster-robust HC2 variance of CADE(1). |
var0.clu.hc2 |
The cluster-robust HC2 variance of CADE(0). |
var1.hc2 |
The HC2 variance of CADE(1). |
var0.hc2 |
The HC2 variance of CADE(0). |
var1.ind |
The individual-robust variance of CADE(1). |
var0.ind |
The individual-robust variance of CADE(0). |
var1.reg |
The proposed variance of CADE(1). |
var0.reg |
The proposed variance of CADE(0). |
Kosuke Imai, Department of Government and Department of Statistics, Harvard University [email protected], https://imai.fas.harvard.edu; Zhichao Jiang, Department of Politics, Princeton University [email protected].
Kosuke Imai, Zhichao Jiang and Anup Malani (2021). “Causal Inference with Interference and Noncompliance in the Two-Stage Randomized Experiments”, Journal of the American Statistical Association.
This function estimates the average causal effects for randomized experiments with noncompliance and missing outcomes under the assumption of latent ignorability (Frangakis and Rubin, 1999). The models are based on Bayesian generalized linear models and are fitted using the Markov chain Monte Carlo algorithms. Various types of the outcome variables can be analyzed to estimate the Intention-to-Treat effect and Complier Average Causal Effect.
NoncompLI( formulae, Z, D, data = parent.frame(), n.draws = 5000, param = TRUE, in.sample = FALSE, model.c = "probit", model.o = "probit", model.r = "probit", tune.c = 0.01, tune.o = 0.01, tune.r = 0.01, tune.v = 0.01, p.mean.c = 0, p.mean.o = 0, p.mean.r = 0, p.prec.c = 0.001, p.prec.o = 0.001, p.prec.r = 0.001, p.df.o = 10, p.scale.o = 1, p.shape.o = 1, mda.probit = TRUE, coef.start.c = 0, coef.start.o = 0, tau.start.o = NULL, coef.start.r = 0, var.start.o = 1, burnin = 0, thin = 0, verbose = TRUE )
formulae |
A list of formulae where the first formula specifies the
(pre-treatment) covariates in the outcome model (the latent compliance
covariate will be added automatically), the second formula specifies the
compliance model, and the third formula defines the covariate specification
for the model for missing-data mechanism (the latent compliance covariate
will be added automatically). For the outcome model, the formula should take
the two-sided standard R |
Z |
A randomized encouragement variable, which should be a binary variable in the specified data frame. |
D |
A treatment variable, which should be a binary variable in the specified data frame. |
data |
A data frame which contains the variables that appear in the
model formulae ( |
n.draws |
The number of MCMC draws. The default is 5000. |
param |
A logical variable indicating whether the Monte Carlo draws of
the model parameters should be saved in the output object. The default is
TRUE. |
in.sample |
A logical variable indicating whether or not the sample
average causal effect should be calculated using the observed potential
outcome for each unit. If it is set to |
model.c |
The model for compliance. Either |
model.o |
The model for outcome. The following five models are allowed:
|
model.r |
The model for (non)response. Either |
tune.c |
Tuning constants for fitting the compliance model. These
positive constants are used to tune the (random-walk) Metropolis-Hastings
algorithm to fit the logit model. Use either a scalar or a vector of
constants whose length equals that of the coefficient vector. The default is
0.01. |
tune.o |
Tuning constants for fitting the outcome model. These positive
constants are used to tune the (random-walk) Metropolis-Hastings algorithm
to fit logit, ordered probit, and negative binomial models. Use either a
scalar or a vector of constants whose length equals that of the coefficient
vector for logit and negative binomial models. For the ordered probit model,
use either a scalar or a vector of constants whose length equals the number of
cut-point parameters to be estimated. The default is 0.01. |
tune.r |
Tuning constants for fitting the (non)response model. These
positive constants are used to tune the (random-walk) Metropolis-Hastings
algorithm to fit the logit model. Use either a scalar or a vector of
constants whose length equals that of the coefficient vector. The default is
0.01. |
tune.v |
A scalar tuning constant for fitting the variance component of
the negative binomial (outcome) model. The default is 0.01. |
p.mean.c |
Prior mean for the compliance model. It should be either a
scalar or a vector of appropriate length. The default is 0. |
p.mean.o |
Prior mean for the outcome model. It should be either a
scalar or a vector of appropriate length. The default is 0. |
p.mean.r |
Prior mean for the (non)response model. It should be either
a scalar or a vector of appropriate length. The default is 0. |
p.prec.c |
Prior precision for the compliance model. It should be
either a positive scalar or a positive semi-definite matrix of appropriate
size. The default is 0.001. |
p.prec.o |
Prior precision for the outcome model. It should be either a
positive scalar or a positive semi-definite matrix of appropriate size. The
default is 0.001. |
p.prec.r |
Prior precision for the (non)response model. It should be
either a positive scalar or a positive semi-definite matrix of appropriate
size. The default is 0.001. |
p.df.o |
A positive integer. Prior degrees of freedom parameter for the
inverse chi-square distribution in the gaussian and twopart (outcome) models.
The default is 10. |
p.scale.o |
A positive scalar. Prior scale parameter for the inverse
chi-square distribution (for the variance) in the gaussian and twopart
(outcome) models. For the negative binomial (outcome) model, this is used
for the scale parameter of the inverse gamma distribution. The default is
1. |
p.shape.o |
A positive scalar. Prior shape for the inverse gamma
distribution in the negative binomial (outcome) model. The default is
1. |
mda.probit |
A logical variable indicating whether to use marginal data
augmentation for probit models. The default is TRUE. |
coef.start.c |
Starting values for coefficients of the compliance
model. It should be either a scalar or a vector of appropriate length. The
default is 0. |
coef.start.o |
Starting values for coefficients of the outcome model.
It should be either a scalar or a vector of appropriate length. The default
is 0. |
tau.start.o |
Starting values for thresholds of the ordered probit
(outcome) model. If it is set to |
coef.start.r |
Starting values for coefficients of the (non)response
model. It should be either a scalar or a vector of appropriate length. The
default is 0. |
var.start.o |
A positive scalar starting value for the variance of the
gaussian, negative binomial, and twopart (outcome) models. The default is
1. |
burnin |
The number of initial burn-in draws for the Markov chain. The
default is 0. |
thin |
The size of the thinning interval for the Markov chain. The default
is 0. |
verbose |
A logical variable indicating whether additional progress
reports should be printed while running the code. The default is TRUE. |
For the details of the model being fitted, see the references. Note that when always-takers exist we fit either two logistic or two probit models by first modeling whether a unit is a complier or a noncomplier, and then modeling whether a unit is an always-taker or a never-taker for those who are classified as non-compliers.
An object of class NoncompLI
which contains the following
elements as a list:
call |
The matched call. |
Y |
The outcome variable. |
D |
The treatment variable. |
Z |
The (randomized) encouragement variable. |
R |
The response indicator variable for
|
A |
The indicator variable for (known) always-takers, i.e., the control units who received the treatment. |
C |
The indicator variable for (known) compliers, i.e., the encouraged units who received the treatment when there are no always-takers. |
Xo |
The matrix of covariates used for the outcome model. |
Xc |
The matrix of covariates used for the compliance model. |
Xr |
The matrix of covariates used for the (non)response model. |
n.draws |
The number of MCMC draws. |
QoI |
The Monte Carlo draws of quantities of interest from their
posterior distributions. Quantities of interest include |
If param is set to TRUE, the following elements are also included:
coefO |
The Monte Carlo draws of coefficients of the outcome model from their posterior distribution. |
coefO1 |
If |
coefC |
The Monte Carlo draws of coefficients of the compliance model from their posterior distribution. |
coefA |
If always-takers exist, then this element contains the Monte Carlo draws of coefficients of the compliance model for always-takers from their posterior distribution. |
coefR |
The Monte Carlo draws of coefficients of the (non)response model from their posterior distribution. |
sig2 |
The Monte Carlo draws of the variance parameter for the gaussian, negative binomial, and twopart (outcome) models. |
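The known always-takers (A) and known compliers (C) described above can be read off the observed encouragement and treatment variables. The base R sketch below uses made-up vectors Z, D, and D_one_sided; it only illustrates the stated definitions and is not the package's internal code.

```r
# Hypothetical data: Z = randomized encouragement, D = treatment received
Z <- c(1, 1, 0, 0, 1, 0)
D <- c(1, 0, 0, 1, 1, 0)

# Known always-takers: control (non-encouraged) units who received the treatment
A <- as.integer(Z == 0 & D == 1)

# A second hypothetical treatment vector in which no control unit is treated,
# so there are no always-takers
D_one_sided <- c(1, 0, 0, 0, 1, 0)
no_always_takers <- !any(Z == 0 & D_one_sided == 1)

# Known compliers: encouraged units who received the treatment; this
# classification is only possible when there are no always-takers
C <- if (no_always_takers) as.integer(Z == 1 & D_one_sided == 1) else NULL
```

When always-takers exist, an encouraged unit who takes the treatment could be either a complier or an always-taker, which is why the model described above classifies units in two stages.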
Kosuke Imai, Department of Government and Department of Statistics, Harvard University [email protected], https://imai.fas.harvard.edu;
Frangakis, Constantine E. and Donald B. Rubin. (1999). “Addressing Complications of Intention-to-Treat Analysis in the Combined Presence of All-or-None Treatment Noncompliance and Subsequent Missing Outcomes.” Biometrika, Vol. 86, No. 2, pp. 365-379.
Hirano, Keisuke, Guido W. Imbens, Donald B. Rubin, and Xiao-Hua Zhou. (2000). “Assessing the Effect of an Influenza Vaccine in an Encouragement Design.” Biostatistics, Vol. 1, No. 1, pp. 69-88.
Barnard, John, Constantine E. Frangakis, Jennifer L. Hill, and Donald B. Rubin. (2003). “Principal Stratification Approach to Broken Randomized Experiments: A Case Study of School Choice Vouchers in New York (with Discussion).” Journal of the American Statistical Association, Vol. 98, No. 462, pp. 299-311.
Horiuchi, Yusaku, Kosuke Imai, and Naoko Taniguchi (2007). “Designing and Analyzing Randomized Experiments: Application to a Japanese Election Survey Experiment.” American Journal of Political Science, Vol. 51, No. 3 (July), pp. 669-687.
This function estimates the Population Average Prescription Difference with a budget constraint. The details of the methods for this design are given in Imai and Li (2019).
PAPD(T, Thatfp, Thatgp, Y, plim)
T |
The unit-level binary treatment receipt variable. |
Thatfp |
The unit-level binary treatment that would have been assigned by the first individualized treatment rule. |
Thatgp |
The unit-level binary treatment that would have been assigned by the second individualized treatment rule. |
Y |
The outcome variable of interest. |
plim |
The maximum proportion of the population that can be treated under the budget constraint. Should be a decimal between 0 and 1. |
A list that contains the following items:
papd |
The estimated Population Average Prescription Difference. |
sd |
The estimated standard deviation of PAPD. |
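A call might look like the following sketch, which compares two hypothetical scoring rules on simulated data. The data-generating process, the helper top_fraction(), and the choice to have each rule treat exactly the top plim fraction of its scores are assumptions for illustration; only the argument names and the returned papd and sd items come from the documentation above. The call is guarded so that the simulation still runs when the package is not installed.

```r
set.seed(2019)
n     <- 200
plim  <- 0.25                                # budget: treat at most 25% of units
treat <- rbinom(n, 1, 0.5)                   # randomized binary treatment
x     <- rnorm(n)                            # hypothetical covariate
y     <- 1 + 0.5 * x * treat + rnorm(n)      # hypothetical outcome

# Hypothetical helper: assign treatment to the top p fraction of a score
top_fraction <- function(score, p) {
  as.integer(rank(-score, ties.method = "first") <= round(p * length(score)))
}
That_f <- top_fraction(x, plim)              # first rule scores units on x
That_g <- top_fraction(x + rnorm(n), plim)   # second rule uses a noisier score

if (requireNamespace("experiment", quietly = TRUE)) {
  fit <- experiment::PAPD(T = treat, Thatfp = That_f, Thatgp = That_g,
                          Y = y, plim = plim)
  print(fit$papd)                            # estimated PAPD
  print(fit$sd)                              # estimated standard deviation
}
```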
Michael Lingzhi Li, Operations Research Center, Massachusetts Institute of Technology [email protected], http://mlli.mit.edu;
Imai and Li (2019). “Experimental Evaluation of Individualized Treatment Rules”,
This function estimates the Population Average Prescription Effect with and without a budget constraint. The details of the methods for this design are given in Imai and Li (2019).
PAPE(T, That, Y, plim = NA)
T |
The unit-level binary treatment receipt variable. |
That |
The unit-level binary treatment that would have been assigned by the individualized treatment rule. |
Y |
The outcome variable of interest. |
plim |
The maximum proportion of the population that can be treated under the budget constraint. Should be a decimal between 0 and 1. The default is NA, which assumes no budget constraint. |
A list that contains the following items:
pape |
The estimated Population Average Prescription Effect. |
sd |
The estimated standard deviation of PAPE. |
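The two modes of the function, with and without a budget constraint, might be exercised as in the sketch below. The simulated data and both treatment rules (That_free and That_budget) are hypothetical; only the argument names and the returned pape and sd items come from the documentation above. The calls are guarded so that the simulation still runs when the package is not installed.

```r
set.seed(42)
n     <- 200
treat <- rbinom(n, 1, 0.5)                   # randomized binary treatment
x     <- rnorm(n)                            # hypothetical covariate
y     <- 1 + 0.5 * x * treat + rnorm(n)      # hypothetical outcome

# Rule without a budget: treat every unit with a positive covariate value
That_free <- as.integer(x > 0)

# Rule under a budget: treat only the top plim fraction of units ranked by x
plim <- 0.2
That_budget <- as.integer(rank(-x, ties.method = "first") <= round(plim * n))

if (requireNamespace("experiment", quietly = TRUE)) {
  fit_free   <- experiment::PAPE(T = treat, That = That_free, Y = y)
  fit_budget <- experiment::PAPE(T = treat, That = That_budget, Y = y,
                                 plim = plim)
  print(fit_free$pape);   print(fit_free$sd)
  print(fit_budget$pape); print(fit_budget$sd)
}
```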
Michael Lingzhi Li, Operations Research Center, Massachusetts Institute of Technology [email protected], http://mlli.mit.edu;
Imai and Li (2019). “Experimental Evaluation of Individualized Treatment Rules”,
This function randomizes the treatment assignment for randomized experiments. In addition to complete randomization, it implements randomized-block and matched-pair designs.
randomize( data, group = c("Treat", "Control"), ratio = NULL, indx = NULL, block = NULL, n.block = NULL, match = NULL, complete = TRUE )
data |
A data frame containing the observations to which the treatments are randomly assigned. |
group |
A numerical or character vector indicating the treatment/control groups. The length of the vector equals the total number of such groups. The default specifies two groups called “Treat” and “Control”. |
ratio |
An optional numerical vector which specifies the proportion of the treatment/control groups within the sample. The length of the vector should equal the number of groups. The default is equal allocation. |
indx |
An optional variable name in the data frame to be used as the names of the observations. If not specified, the row names of the data frame will be used so long as they are available. If the row names are not available, the integer sequence starting from 1 will be used. |
block |
An optional variable name in the data frame or a formula to be
used as the blocking variables for randomized-block designs. If a variable
name is specified, then the unique values of that variable will form blocks
unless |
n.block |
An optional scalar specifying the number of blocks to be
created for randomized block designs. If unspecified, the unique values of
the blocking variable will define blocks. If specified, blocks of
roughly equal size will be created based on the |
match |
An optional variable name in the data frame or a formula to be
used as the matching variables for matched-pair designs. This input is
applicable only to the case where there are two groups. Pairs of
observations will be formed based on the similar values of the matching
variable. If a formula is specified, the |
complete |
logical. If it equals |
Randomized-block designs refer to the complete randomization of the treatment within pre-specified blocks that contain multiple observations. Matched-pair designs refer to the randomization of the binary treatment variable within pre-specified pairs of observations.
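The two designs can be sketched in base R as follows. This is a minimal illustration on a made-up data frame, not the package's implementation: the column names are hypothetical, and pairs are formed here by simply sorting on a single covariate and grouping adjacent observations, which may differ from the matching procedure the package uses.

```r
set.seed(1)
dat <- data.frame(id    = 1:12,
                  block = rep(c("a", "b", "c"), each = 4),  # hypothetical blocks
                  x     = rnorm(12))                        # hypothetical covariate

# Randomized-block design: complete randomization within each block
dat$treat_block <- NA_character_
for (b in unique(dat$block)) {
  rows <- which(dat$block == b)
  labels <- rep(c("Treat", "Control"), length.out = length(rows))
  dat$treat_block[rows] <- sample(labels)    # random permutation of the labels
}

# Matched-pair design: pair adjacent observations after sorting on x,
# then randomize the binary treatment within each pair
ord <- order(dat$x)
dat$pair <- NA_integer_
dat$pair[ord] <- rep(seq_len(nrow(dat) / 2), each = 2)
dat$treat_pair <- NA_character_
for (p in unique(dat$pair)) {
  rows <- which(dat$pair == p)
  dat$treat_pair[rows] <- sample(c("Treat", "Control"))
}

table(dat$block, dat$treat_block)            # two treated, two control per block
```

By construction, every block contains equal numbers of treated and control units, and every pair contains exactly one of each.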
A list of class randomize, which contains the following items:
call |
The matched call. |
treatment |
The vector of randomized treatments. |
data |
The data frame that was used to conduct the randomization. |
block |
The blocking variable that was used to implement randomized-block designs. |
match |
The matching variable that was used to implement matched-pair designs. |
block.id |
The variable indicating which observations belong to which blocks in randomized-block designs. |
match.id |
The variable indicating which observations belong to which pairs in matched-pair designs. |
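A basic call for complete randomization might look like the sketch below. The data frame and its columns are made up for illustration; the argument names and the treatment item of the returned list come from the documentation above. The call is guarded so that the example still runs when the package is not installed.

```r
set.seed(7)
# Hypothetical sample of 20 units with two pre-treatment covariates
dat <- data.frame(age    = round(runif(20, 18, 65)),
                  female = rbinom(20, 1, 0.5))

if (requireNamespace("experiment", quietly = TRUE)) {
  # Complete randomization into the two default groups with equal allocation
  res <- experiment::randomize(data = dat, group = c("Treat", "Control"))
  print(table(res$treatment))                # group sizes after randomization
}
```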
Kosuke Imai, Department of Government and Department of Statistics, Harvard University [email protected], https://imai.fas.harvard.edu;
This data set contains the outcome, the missing indicator, and the treatment for the application in Kosuke Imai and Zhichao Jiang (2018).
seguro
A data frame with 14,902 rows and 6 variables:
Satisfaction for the first unit in the matched pairs
Satisfaction for the second unit in the matched pairs
Missing indicator for the first unit in the matched pairs
Missing indicator for the second unit in the matched pairs
Treatment assignment for the first unit in the matched pairs
Treatment assignment for the second unit in the matched pairs
data(seguro)