---
title: |
  | Taking Stock \&
  | Linear Regression in Experiments and Observational Studies
date: '`r format(Sys.Date(), "%B %d, %Y")`'
author: |
  | ICPSR 2023 Session 1
  | Jake Bowers \& Tom Leavitt
bibliography:
  - BIB/MasterBibliography.bib
fontsize: 10pt
geometry: margin=1in
graphics: yes
biblio-style: authoryear-comp
output:
  beamer_presentation:
    slide_level: 2
    keep_tex: true
    latex_engine: xelatex
    citation_package: biblatex
    template: icpsr.beamer
    incremental: true
    includes:
      in_header:
        - defs-all.sty
    md_extensions: +raw_attribute-tex_math_single_backslash+autolink_bare_uris+ascii_identifiers+tex_math_dollars
---
<!-- To show notes -->
<!-- https://stackoverflow.com/questions/44906264/add-speaker-notes-to-beamer-presentations-using-rmarkdown -->
```{r setup1_env, echo=FALSE, include=FALSE}
library(here)
source(here::here("rmd_setup.R"))
```
```{r setup2_loadlibs, echo=FALSE, include=FALSE}
## Load all of the libraries that we will use when we compile this file
## We are using the renv system. So these will all be loaded from a local library directory
library(dplyr)
library(ggplot2)
library(estimatr)
library(coin)
library(DeclareDesign)
```
## Today
1. Agenda: Overview and context for where we are in the course, Modes of
Statistical Inference for Counterfactual Causal Inference, Linear
regression for estimation in randomized experiments, Linear regression for
covariance adjustment in randomized experiments
2. Maybe some open time to work on HW 1 with the TAs and me around.
3. Questions arising from the reading or assignments or life.
# An overview of approaches to statistical inference for causal quantities
## Three General Approaches To Learning About The Unobserved Using Data
\centering
\includegraphics[width=.5\textwidth]{images/MorrowPlots.jpg}
\hfill
\includegraphics[width=.5\textwidth]{images/aces-research-history-morrow-plots--uw.jpg}
## Potential Outcomes
\includegraphics[width=.65\textwidth]{images/cartoonNeymanBayesFisherCropped.pdf}
Imagine we would observe so many bushels of corn, $y$, if plot $i$ were randomly assigned to new fertilizer, $y_{i,Z_i=1}$ (where $Z_i=1$ means "assigned to new fertilizer" and $Z_i=0$ means "assigned status quo fertilizer") and another amount of corn, $y_{i,Z_i=0}$, if the same plot were assigned the status quo fertilizer condition.
These $y$ are *potential* or *partially observed* outcomes.
## Notation
- *Treatment* $Z_i=1$ for treatment and $Z_i=0$ for control for unit $i$
- In a two arm experiment each unit has at least a pair of *potential outcomes* $(y_{i,Z_i=1},y_{i,Z_i=0})$
(also written $(y_{i,1},y_{i,0})$ to indicate that $y_{1,Z_1=1,Z_2=1} = y_{1,Z_1=1,Z_2=0}$ --- unit 1's outcome depends only on treatment assignment to unit 1 and not to unit 2.)
- *Causal Effect* for unit $i$ is $\tau_i$, $\tau_i =
f(y_{i,1},y_{i,0})$. For example, $\tau_i = y_{i,1} - y_{i,0}$.
- *Fundamental Problem of (Counterfactual) Causality* We only see one potential outcome per unit in our observed outcome, $Y_i = Z_i y_{i,1} + (1-Z_i) y_{i,0}$: treatment assignment reveals one potential outcome to us in a simple randomized experiment.
\bigskip
So how do we learn about $\tau_i$ if we cannot directly see it?
## Design Based 1: Compare Models of Potential Outcomes to Data
1. Make a guess about (or model of) $\tau_i = f(y_{i,1},y_{i,0})$. For example
$H_0: y_{i,1}=y_{i,0}+\tau_{i}$ and $\tau_i=0$ is the sharp null hypothesis
of no effects.
2. Measure consistency of the data with this model given the research design and choice of test statistic (summarizing the treatment-to-outcome relationship).
\centering
\includegraphics[width=.65\textwidth]{images/cartoonFisherQuestionMarks.pdf}
## Design Based 1: Compare Models of Potential Outcomes to Data
1. Make a guess about (or model of) $\tau_i$.
2. Measure consistency of data with this model given the design and test statistic.
\centering
\includegraphics[width=.65\textwidth]{images/cartoonFisher.pdf}
## Design Based 1: Compare Models of Potential Outcomes to Data
Comparing the model, $H_0: \tau_i = 0 \implies y_{i,1} = y_{i,0}$ to data:
\centering
\includegraphics[width=\textwidth]{images/cartoonFisherNew1.pdf}
## Design Based 1: Compare Models of Potential Outcomes to Data
\centering
\includegraphics[width=.9\textwidth]{images/cartoonFisherNew2.pdf}
## Design Based 1: Compare Models of Potential Outcomes to Data
\framesubtitle{Testing Models of No-Effects.}
Here is some fake data from a tiny experiment with weird outcomes.
```{r makesmdat, echo=FALSE}
smdat <- data.frame(Z=c(0,1,0,1),y0=c(16,22,7,3990),y1=c(16,24,10,4000))
smdat$Y <- with(smdat,Z*y1 + (1-Z)*y0)
smdat$zF <- factor(smdat$Z)
smdat$rY <- rank(smdat$Y)
print(smdat)
```
Next, define a function that compares treated to control outcomes:
```{r teststats_and_fns, echo=TRUE}
## A mean difference test statistic
tz_mean_diff <- function(z, y) {
  mean(y[z == 1]) - mean(y[z == 0])
}
## A mean difference of ranks test statistic
tz_mean_rank_diff <- function(z, y) {
  ry <- rank(y)
  mean(ry[z == 1]) - mean(ry[z == 0])
}
```
And define a function to repeat the experimental randomization:
```{r echo=TRUE}
## Function to repeat the experimental randomization
newexp <- function(z) {
  sample(z)
}
```
```{r echo=FALSE}
set.seed(12345)
```
## Design Based 1: Compare Models of Potential Outcomes to Data
\framesubtitle{Testing Models of No-Effects.}
Here we **approximate the randomization distribution**:
```{r repexp, echo=TRUE, cache=TRUE}
set.seed(12345)
rand_dist_md <- with(smdat,replicate(1000,tz_mean_diff(z=newexp(Z),y=Y)))
rand_dist_rank_md <- with(smdat,replicate(1000,tz_mean_rank_diff(z=newexp(Z),y=Y)))
obs_md <- with(smdat,tz_mean_diff(z=Z,y=Y))
obs_rank_md <- with(smdat,tz_mean_rank_diff(z=Z,y=Y))
c(observed_mean_diff=obs_md,observed_mean_rank_diff=obs_rank_md)
table(rand_dist_md)/1000 ## Probability Distributions Under the Null of No Effects
table(rand_dist_rank_md)/1000
p_md <- mean(rand_dist_md >= obs_md) ## P-Values
p_rank_md <- mean(rand_dist_rank_md >= obs_rank_md)
c(mean_diff_p=p_md, mean_rank_diff_p=p_rank_md)
## Now using R directly
library(coin)
coin_test1 <- oneway_test(Y~zF,data=smdat,distribution=approximate(nresample=1000),alternative="less")
coin_test2 <- oneway_test(rY~zF,data=smdat,distribution=approximate(nresample=1000),alternative="less")
pvalue(coin_test1)
pvalue(coin_test2)
```
## Design Based 1: Compare Models of Potential Outcomes to Data
\framesubtitle{Testing Models of Effects.}
To learn whether the data are consistent with $\tau_i=100$ for all $i$, notice how treatment assignment reveals part of the unobserved outcomes: $Y_{i} = Z_i y_{i,1} + (1-Z_i) y_{i,0}$, so if $H_0: \tau_i=100$, i.e. $H_0: y_{i,1}=y_{i,0}+100$, then:
\begin{align}
Y_{i} & = Z_i ( y_{i,0} + 100 ) + (1-Z_i) y_{i,0} \\
& = Z_i y_{i,0} + Z_i 100 + y_{i,0} - Z_i y_{i,0} \\
& = Z_i 100 + y_{i,0} \\
y_{i,0} & = Y_{i} - Z_i 100
\end{align}
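For example, with the toy data from the earlier slides, subtracting $Z_i \cdot 100$ from the observed outcomes recovers the hypothesized $y_{i,0}$ for every unit (a quick check of the algebra above, not a new estimator):
```{r adjcheck, echo=TRUE}
## Under H0: y_{i,1} = y_{i,0} + 100, the hypothesized y_{i,0} is Y - Z*100
smdat$y0_hyp <- with(smdat, Y - Z * 100)
smdat[, c("Z", "Y", "y0_hyp")]
```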
## Design Based 1: Compare Models of Potential Outcomes to Data
\framesubtitle{Testing Models of Effects.}
To test a *model of causal effects* we adjust the observed outcomes to be consistent
with our hypothesis about unobserved outcomes and then repeat the experiment:
```{r model_effects, echo=TRUE}
tz_mean_diff_effects <- function(z, y, tauvec) {
  ## Adjust the observed outcomes to the hypothesized y_{i,0}, then compare mean ranks
  adjy <- y - z * tauvec
  radjy <- rank(adjy)
  mean(radjy[z == 1]) - mean(radjy[z == 0])
}
rand_dist_md_tau_cae <- with(smdat,replicate(1000,tz_mean_diff_effects(z=newexp(Z),y=Y,tauvec=c(100,100,100,100))))
obs_md_tau_cae <- with(smdat,tz_mean_diff_effects(z=Z,y=Y,tauvec=c(100,100,100,100)))
mean(rand_dist_md_tau_cae >= obs_md_tau_cae)
```
## Design Based 1: Compare Models of Potential Outcomes to Data
\framesubtitle{Testing Models of Effects.}
Now let's test a model with different effects for each unit --- $H_0: \symbf{\tau} = \{0,2,3,10\}$
\smallskip
```{r model_effects2, echo=TRUE}
rand_dist_md_taux <- with(smdat, replicate(1000,
  tz_mean_diff_effects(z = newexp(Z), y = Y, tauvec = c(0, 2, 3, 10))))
obs_md_taux <- with(smdat,tz_mean_diff_effects(z=Z,y=Y,tauvec=c(0,2,3,10)))
mean(rand_dist_md_taux >= obs_md_taux)
```
So: We can learn about claims about unobserved potential outcomes by gauging the
consistency of our observed results with the distributions implied by the
claims + test statistics + research design (including sample size and randomization
scheme).
\bigskip
Questions about this first approach to using statistical inference to do counterfactual causal inference?
## Design Based 2: Estimate Averages of Potential Outcomes
1. Notice that the observed $Y_i$ are a sample from the (small, finite) population of unobserved potential outcomes $(y_{i,1},y_{i,0})$.
2. Decide to focus on the average, $\bar{\tau}$, because sample averages, $\hat{\bar{\tau}}$, are unbiased and consistent estimators of population averages.
3. Estimate $\bar{\tau}$ with the observed difference in means as $\hat{\bar{\tau}}$.
\centering
\includegraphics[width=.8\textwidth]{images/cartoonNeyman.pdf}
## Design Based 2: Estimate Averages of Potential Outcomes
\centering
\includegraphics[width=.95\textwidth]{images/cartoonNeyman.pdf}
## Design Based 2: Estimate Averages of Potential Outcomes
Here we use Neyman's standard errors (the same as the HC2 SEs) and Central Limit Theorem based $p$-values and 95% confidence intervals.
Notice that OLS/linear regression is just a difference-of-means calculator here.
\smallskip
```{r dim, echo=TRUE}
with(smdat,mean(Y[Z==1]) - mean(Y[Z==0]))
est_se_est_ate <- with(smdat,sqrt( var(Y[Z==1])/sum(Z) + var(Y[Z==0])/(sum(1-Z)) ))
est_se_est_ate
est1 <- difference_in_means(Y~Z,data=smdat)
tidy(est1)
lm1 <- lm_robust(Y~Z,data=smdat)
tidy(lm1)
```
Those $p$-values raise the next question: Where does the randomization
distribution for these tests come from?
## Design Based 2: Test hypotheses about averages.
A focus on the difference of average potential outcomes, on an
average causal effect, also allows for testing hypotheses about these average
causal effects. This is called a test of the "weak null" hypothesis.
The challenge:
\medskip
A hypothesis like $H_0: \bar{\tau}=\tau_0$ is compatible with many different sharp
null hypotheses: for example, $\tau=\{-5,0,0,5\}$ and $\tau=\{0,0,0,0\}$
are both compatible with $\bar{\tau}=0$.
\medskip
And a hypothesis test requires that we compare what we observe with what the
hypothesis implies that we would observe. But a hypothesis about an average
implies many different probability distributions: one for each sharp
hypothesis.
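Here is a small sketch of that point using the tiny dataset and functions from above: the sharp nulls $\tau=\{-5,0,0,5\}$ and $\tau=\{0,0,0,0\}$ share $\bar{\tau}=0$ but imply different reference distributions for the mean-difference statistic (which units get which $\tau_i$ is arbitrary here; this is only an illustration).
```{r weaknull_sketch, echo=TRUE}
## Adjust the observed outcomes under each sharp null, then re-randomize
adj_a <- with(smdat, Y - Z * c(-5, 0, 0, 5))
adj_b <- with(smdat, Y - Z * c(0, 0, 0, 0))
dist_a <- with(smdat, replicate(1000, tz_mean_diff(z = newexp(Z), y = adj_a)))
dist_b <- with(smdat, replicate(1000, tz_mean_diff(z = newexp(Z), y = adj_b)))
## Same hypothesized average effect, different reference distributions:
rbind(tau_a = summary(dist_a), tau_b = summary(dist_b))
```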
## Hypothesis tests of the weak null
\begin{itemize}
\item The finite population CLT tells us that
\begin{align*}
\cfrac{\hat{\tau}\left(\mathbf{Z}, \mathbf{Y}\right) - \E\left[\hat{\tau}\left(\mathbf{Z}, \mathbf{Y}\right)\right]}{\sqrt{\Var\left[\hat{\tau}\left(\mathbf{Z}, \mathbf{Y}\right)\right]}} & \overset{d}{\to} \mathcal{N}\left(0, 1\right)
\end{align*}
\item Diff-in-Means is unbiased, so write
\begin{align*}
\cfrac{\hat{\tau}\left(\mathbf{Z}, \mathbf{Y}\right) - \tau}{\sqrt{\Var\left[\hat{\tau}\left(\mathbf{Z}, \mathbf{Y}\right)\right]}} & \overset{d}{\to} \mathcal{N}\left(0, 1\right)
\end{align*}
\item The CLT is an asymptotic result as $N \to \infty$
\item But we can bound the error of the Normal approximation for fixed $N$
\item Thus, with experiments of at least moderate size and outcomes that are not too skewed and have no extreme outliers,
\begin{align*}
\cfrac{\hat{\tau}\left(\mathbf{Z}, \mathbf{Y}\right) - \tau}{\sqrt{\Var\left[\hat{\tau}\left(\mathbf{Z}, \mathbf{Y}\right)\right]}} & \overset{\text{approx.}}{\sim} \mathcal{N}\left(0, 1\right)
\end{align*}
\item This justifies using the standard Normal distribution for hypothesis tests
\end{itemize}
```{r, warning=FALSE}
y_C <- c(10, 15, 20, 20, 10, 15, 15)
y_T <- c(15, 15, 30, 15, 20, 15, 30)
tau <- y_T - y_C
ATE <- mean(tau)
## Imagine one Z:
Z <- c(1,1,0,0,0,0,0)
Y <- Z*y_T + (1-Z) * y_C
est_ATE <- mean(Y[Z==1]) - mean(Y[Z==0])
asymp_ests_exact <- function(.y_C,
.y_T,
.prop_T,
.h){
y_C = c(.y_C, rep(x = .y_C, times = .h - 1))
y_T = c(.y_T, rep(x = .y_T, times = .h - 1))
Omega = apply(X = combn(x = length(y_C),
m = length(y_C) * .prop_T,
simplify = TRUE),
MARGIN = 2,
FUN = function(x) as.integer(1:(length(y_C)) %in% x))
obs_pot_outs = sapply(X = 1:ncol(Omega),
FUN = function(x) { y_T * Omega[,x] + y_C * (1 - Omega[,x]) })
diff_means_ests = sapply(X = 1:ncol(Omega),
FUN = function(x) { mean(obs_pot_outs[,x][Omega[,x] == 1]) -
mean(obs_pot_outs[,x][Omega[,x] == 0]) })
return(diff_means_ests)
}
## Show asymptotic properties
asymp_ests_approx <- function(.y_C,
.y_T,
.prop_T,
.h){
y_C = c(.y_C, rep(x = .y_C, times = .h - 1))
y_T = c(.y_T, rep(x = .y_T, times = .h - 1))
true_EV_diff_means_est = mean(y_T) - mean(y_C)
true_var_diff_means_est = diff_means_var(.n = length(y_C),
.n_t = length(y_C) * .prop_T,
.y_C = y_C,
.y_T = y_T)
return(rnorm(n = 10^3,
mean = true_EV_diff_means_est,
sd = sqrt(true_var_diff_means_est)))
}
diff_means_var <- function(.n,
.n_t,
.y_C,
.y_T) {
var_y_C = mean((.y_C - mean(.y_C))^2)
var_y_T = mean((.y_T - mean(.y_T))^2)
cov_y_C_y_T = mean((.y_C - mean(.y_C)) * (.y_T - mean(.y_T)))
var = (1/(.n - 1)) * ((.n_t * var_y_C) / (.n - .n_t) +
(((.n - .n_t) * var_y_T) / .n_t) +
(2 * cov_y_C_y_T))
return(var)
}
diff_means_var_est <- function(.n,
.n_1,
.z,
.y) {
obs_y_C = .y[which(.z == 0)]
obs_y_T = .y[which(.z == 1)]
est_mean_y_C = mean(obs_y_C)
est_mean_y_T = mean(obs_y_T)
est_var_y_C = ((.n - 1)/(.n * ((.n - .n_1) - 1))) * sum((obs_y_C - est_mean_y_C)^2)
est_var_y_T = ((.n - 1)/(.n * (.n_1 - 1))) * sum((obs_y_T - est_mean_y_T)^2)
return((.n/(.n - 1)) * ( (est_var_y_C/(.n - .n_1)) + (est_var_y_T/.n_1)))
}
stand_ests_plot_data <- data.frame(
  est = c((asymp_ests_exact(.y_C = y_C, .y_T = y_T, .prop_T = 2/7, .h = 1) - 5) / sqrt(21.19048),
          (asymp_ests_exact(.y_C = y_C, .y_T = y_T, .prop_T = 2/7, .h = 3) - 5) / sqrt(6.357143),
          (asymp_ests_approx(.y_C = y_C, .y_T = y_T, .prop_T = 2/7, .h = 15) - 5) / sqrt(0.4049136),
          (asymp_ests_approx(.y_C = y_C, .y_T = y_T, .prop_T = 2/7, .h = 150) - 5) / sqrt(0.002690911)),
  ## .h = 15 and .h = 150 copies of the 7 units give N = 105 and N = 1050
  N = c(rep(x = "N = 7", times = 21),
        rep(x = "N = 21", times = choose(21, 6)),
        rep(x = "N = 105", times = 10^3),
        rep(x = "N = 1050", times = 10^3)))
stand_ests_plot_data$N <- factor(x = stand_ests_plot_data$N, levels = c("N = 7", "N = 21", "N = 105", "N = 1050"))
##devtools::install_github("teunbrand/ggh4x")
library(ggh4x)
asymp_stand_ests_plot <- ggplot(data = stand_ests_plot_data,
mapping = aes(x = est)) +
geom_histogram(aes(y = after_stat(density))) +
stat_theodensity(colour = "red") +
xlab(label = "Standardized Diff-in-Means estimates") +
ylab(label = "Probability") +
theme_bw() +
facet_wrap(facets = .~ N,
nrow = 1,
ncol = 4,
scales = "free") +
scale_x_continuous(labels = 0, breaks = 1) +
theme(axis.ticks.y = element_blank(),
axis.text.y = element_blank())
ggsave(plot = asymp_stand_ests_plot,
file = "asymp_stand_ests_plot.pdf",
width = 6,
height = 4,
units = "in",
dpi = 600)
```
## Hypothesis tests of the weak null
\begin{itemize}
\item In the ``village heads'' example we calculate an estimate for each of the
possible ways to randomize and subtract off the expected value (``5'') so
that the distribution is now centered at 0 (a hypothesized value).
\begin{figure}[H]
\includegraphics[width=\linewidth]{asymp_stand_ests_plot.pdf}
\end{figure}
\end{itemize}
## Hypothesis tests of the weak null
\begin{itemize}
\item To test the null hypothesis against an alternative
\begin{align*}
H_0: & \tau = \tau_0 \text{ versus either } \\
H_A: & \tau > \tau_0, \, H_A: \tau < \tau_0 \text{ or } H_A: \left\lvert \tau \right\rvert > \left\lvert \tau_0 \right \rvert
\end{align*}
\item Calculate the upper ($u$), lower ($l$), or two-sided ($t$) p-value (see the R sketch after this list) as
\begin{align*}
p_u & = 1 - \Phi\left(\frac{\hat{\tau}\left(\mathbf{Z}, \mathbf{Y}\right) - \tau_0}{\sqrt{\widehat{\Var}\left[\hat{\tau}\left(\mathbf{Z}, \mathbf{Y}\right)\right]}}\right) \\
p_l & = \Phi\left(\frac{\hat{\tau}\left(\mathbf{Z}, \mathbf{Y}\right) - \tau_0}{\sqrt{\widehat{\Var}\left[\hat{\tau}\left(\mathbf{Z}, \mathbf{Y}\right)\right]}}\right) \\
p_t & = 2\left(1 - \Phi\left(\frac{\left\lvert\hat{\tau}\left(\mathbf{Z}, \mathbf{Y}\right) - \tau_0\right\rvert}{\sqrt{\widehat{\Var}\left[\hat{\tau}\left(\mathbf{Z}, \mathbf{Y}\right)\right]}}\right)\right)
\end{align*}
\item If the p-value is less than the size ($\alpha$-level) of the test, reject. Otherwise, don't
\item Note that, since we don't know $\Var\left[\hat{\tau}\left(\mathbf{Z}, \mathbf{Y}\right)\right]$, \\ we have used its conservative estimator instead, $\widehat{\Var}\left[\hat{\tau}\left(\mathbf{Z}, \mathbf{Y}\right)\right]$
\end{itemize}
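As a minimal sketch of this arithmetic, reusing the tiny `lm1` fit from the small example above (with $N=4$ the Normal approximation is surely not trustworthy; this only shows the mechanics):
```{r normal_pvals, echo=TRUE}
est <- coef(lm1)[["Z"]]
se <- lm1$std.error[["Z"]]
tau0 <- 0 ## the hypothesized average effect
zstat <- (est - tau0) / se
c(p_upper = 1 - pnorm(zstat),
  p_lower = pnorm(zstat),
  p_two = 2 * (1 - pnorm(abs(zstat))))
```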
## Hypothesis tests of the weak null
\bh{Hypothesis tests are susceptible to two errors}:
\begin{itemize}
\item Type I error: Rejecting null hypothesis when it is true
\item Type II error: \textit{Failing} to reject null hypothesis when it is false
\end{itemize}
\bh{A good test} controls these errors (see the simulation sketch after this list):
\begin{enumerate}
\item Type I error probability is less than or equal to size ($\alpha$-level) of test
\item Power (1 - type II error probability) is at least as great as $\alpha$-level
\item Power tends to $1$ as $N \to \infty$
\end{enumerate}
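A simulation sketch of property (1) using the village-heads outcomes defined above: under a true sharp null of no effects the outcomes do not respond to assignment, so the rejection rate across re-randomizations estimates the size of the test (with $N=7$ we should not be surprised by deviations from $\alpha$, per the next slide):
```{r type1_sketch, echo=TRUE, cache=TRUE}
sim_pvals <- replicate(2000, {
  z_new <- sample(c(1, 1, 0, 0, 0, 0, 0)) ## re-randomize 2 of 7 to treatment
  tidy(lm_robust(y_C ~ z_new))$p.value[2] ## y_C is the outcome under no effects
})
mean(sim_pvals <= 0.05) ## should be at or below about 0.05
```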
## Hypothesis tests of the weak null
\begin{itemize}
\item We can prove that tests of the weak null satisfy (1) -- (3) as $N \to \infty$ \vfill
\item Thus, when experiments are large, we can often safely use such tests \vfill
\item But (1) -- (3) may not always be satisfied when experiments are small, have skewed outcome distributions, or have extreme outliers \vfill
\end{itemize}
## Confidence intervals
\begin{itemize}
\item Equivalence between hypothesis testing and confidence intervals
\item Confidence interval is set of null hypotheses we fail to reject
\end{itemize}
Consider two-sided confidence interval, $\mathcal{C}_t$:
\footnotesize
\begin{equation*}
\begin{split}
\mathcal{C}_t & = \left\{\tau_0 : \left\lvert \cfrac{\hat{\tau}\left(\mathbf{Z}, \mathbf{Y}\right)- \tau_0}{\sqrt{\widehat{\Var}\left[\hat{\tau}\left(\mathbf{Z}, \mathbf{Y}\right)\right]}} \right\rvert \leq z_{1 - \alpha/2}\right\} \\
& = \left\{\tau_0 : - z_{1 - \alpha/2} \leq \cfrac{\hat{\tau}\left(\mathbf{Z}, \mathbf{Y}\right)- \tau_0}{\sqrt{\widehat{\Var}\left[\hat{\tau}\left(\mathbf{Z}, \mathbf{Y}\right)\right]}} \leq z_{1 - \alpha/2} \right\} \\
& = \left\{\tau_0 : - z_{1 - \alpha/2} \sqrt{\widehat{\Var}\left[\hat{\tau}\left(\mathbf{Z}, \mathbf{Y}\right)\right] }\leq \hat{\tau}\left(\mathbf{Z}, \mathbf{Y}\right)- \tau_0 \leq z_{1 - \alpha/2} \sqrt{\widehat{\Var}\left[\hat{\tau}\left(\mathbf{Z}, \mathbf{Y}\right)\right] }\right\} \\
& = \left\{\tau_0 : -\hat{\tau}\left(\mathbf{Z}, \mathbf{Y}\right)- z_{1 - \alpha/2} \sqrt{\widehat{\Var}\left[\hat{\tau}\left(\mathbf{Z}, \mathbf{Y}\right)\right] }\leq - \tau_0 \leq - \hat{\tau}\left(\mathbf{Z}, \mathbf{Y}\right)+ z_{1 - \alpha/2} \sqrt{\widehat{\Var}\left[\hat{\tau}\left(\mathbf{Z}, \mathbf{Y}\right)\right] }\right\} \\
& = \left\{\tau_0 : \hat{\tau}\left(\mathbf{Z}, \mathbf{Y}\right)+ z_{1 - \alpha/2} \sqrt{\widehat{\Var}\left[\hat{\tau}\left(\mathbf{Z}, \mathbf{Y}\right)\right] }\geq \tau_0 \geq \hat{\tau}\left(\mathbf{Z}, \mathbf{Y}\right)- z_{1 - \alpha/2} \sqrt{\widehat{\Var}\left[\hat{\tau}\left(\mathbf{Z}, \mathbf{Y}\right)\right] }\right\} \\
& = \left\{\tau_0 : \hat{\tau}\left(\mathbf{Z}, \mathbf{Y}\right)- z_{1 - \alpha/2} \sqrt{\widehat{\Var}\left[\hat{\tau}\left(\mathbf{Z}, \mathbf{Y}\right)\right] }\leq \tau_0 \leq \hat{\tau}\left(\mathbf{Z}, \mathbf{Y}\right)+ z_{1 - \alpha/2} \sqrt{\widehat{\Var}\left[\hat{\tau}\left(\mathbf{Z}, \mathbf{Y}\right)\right] }\right\}
\end{split}
\end{equation*}
\normalsize
## Confidence intervals {.allowframebreaks}
Here is an example showing how a confidence interval is a collection of
not-rejected hypotheses (here using the weak null hypothesis):
```{r, echo=TRUE, warning=FALSE}
vh_dat <- data.frame(Y = Y, Z = Z, y_C = y_C, y_T = y_T)
p_val_fun <- function(h, thealpha = .05, return_p = FALSE) {
  ## Return a p-value for each weak null hypothesis
  test_res <- tidy(lm_robust(I(Y - h * Z) ~ Z, data = vh_dat))
  the_p <- test_res %>% filter(term == "Z") %>% select(p.value)
  if (return_p) {
    return(the_p[[1]])
  } else {
    return(thealpha - the_p[[1]])
  }
}
## First just test many hypotheses in a row. This is a "grid search".
the_hyps <- seq(-20, 20, .1)
res <- sapply(the_hyps, function(h) {
  p_val_fun(h = h, return_p = TRUE)
})
hyp_res <- data.frame(h = the_hyps, p = res)
## Keep the largest and smallest hypotheses for which p >= .05
hyp_res %>% filter(abs(p) >= .05) %>% reframe(range(h))
## Second, directly search for the hypotheses where alpha - p = 0.
upper_lim <- uniroot(p_val_fun, interval = c(0, 20), extendInt = "no", trace = 2)
lower_lim <- uniroot(p_val_fun, interval = c(-20, 0), extendInt = "no", trace = 2)
ci2 <- c(lower_lim$root, upper_lim$root)
## Third, use the analytic formula with standard errors etc.
lm2 <- lm_robust(Y ~ Z, data = vh_dat)
lm2_df <- tidy(lm2)
lm2_df %>% mutate(ll = estimate - qt(p = .975, df = 5) * std.error,
                  ul = estimate + qt(p = .975, df = 5) * std.error)
confint(lm2, parm = "Z")
```
## Model Based 1: Predict Distributions of Potential Outcomes
\smallskip
\centering
\includegraphics[width=.95\textwidth]{images/cartoonBayesNew.pdf}
## Model Based 1: Predict Distributions of $(y_{i,1},y_{i,0})$
> 1. Given a model of $Y_i$ (see [this website](https://mc-stan.org/users/documentation/case-studies/model-based_causal_inference_for_RCT.html)):
\begin{equation}
\mathrm{Pr}(Y_{i}^{obs} \vert \mathrm{Z}, \theta) \sim \mathsf{Normal}(Z_{i} \cdot \mu_{1} + (1 - Z_{i}) \cdot \mu_{0} , Z_{i} \sigma_{1}^{2} + (1 - Z_{i}) \cdot \sigma_{0}^{2})
\end{equation}
where $\mu_{0}=\alpha$ and $\mu_{1}=\alpha + \tau$.
> 2. And a model of the pair $\{y_{i,0},y_{i,1}\} \equiv \{Y_{i}(0), Y_{i}(1)\}$, now random rather than fixed as before (and so written in upper case):
\begin{equation}
\begin{pmatrix} Y_{i}(0) \\ Y_{i}(1) \end{pmatrix} \biggm\vert \theta
\sim
\mathsf{Normal}
\begin{pmatrix}
\begin{pmatrix} \mu_{0} \\ \mu_{1} \end{pmatrix},
\begin{pmatrix} \sigma_{0}^{2} & \rho \sigma_{0} \sigma_{1} \\ \rho \sigma_{0} \sigma_{1} & \sigma_{1}^{2} \end{pmatrix}
\end{pmatrix}
\end{equation}
> 3. And a model of $Z_i$ that is known because of randomization, so we can write: $\mathrm{Pr}(\mathrm{Z}|\mathrm{Y}(0), \mathrm{Y}(1)) = \mathrm{Pr}(\mathrm{Z})$
> 4. And given priors on $\theta = \{\alpha, \tau, \sigma_c, \sigma_t\}$ (here we make them all independent Normal(0,5)).
We can generate the posterior distribution of $\alpha$, $\tau$, $\sigma_c$, and $\sigma_t$ and thus can impute $\{Y_{i}(0),Y_{i}(1)\}$ to generate a distribution for $\tau_i$.
## Model Based 1: Predict Distributions of Potential Outcomes
A snippet of `rctbayes.stan`:
```
model {
  // PRIORS
  alpha ~ normal(0, 5);
  tau ~ normal(0, 5);
  sigma_c ~ normal(0, 5);
  sigma_t ~ normal(0, 5);
  // LIKELIHOOD
  y ~ normal(alpha + tau*w, sigma_t*w + sigma_c*(1 - w));
}
```
## Model Based 1: Predict Distributions of Potential Outcomes
```{r loadstan, echo=FALSE, message=FALSE,results='hide'}
## Not putting rstan in the top because installation can be involved and we don't plan to use it otherwise in this course.
library(rstan)
rstan_options(auto_write = TRUE)
```
```{r stanbayes, echo=TRUE, cache=TRUE, results="hide",warning=FALSE,message=FALSE}
## rho is correlation between the potential outcomes
stan_data <- list(N = 4, y = smdat$Y, w = smdat$Z, rho = 0)
# Compile and run the stan model
fit_simdat <- stan(file = "rctbayes.stan", data = stan_data, iter = 5000, warmup=2500,chains = 4,control=list(adapt_delta=.99))
res <- as.matrix(fit_simdat)
```
```{r rctbayes2, echo=TRUE, results="markup"}
## Summary of the posterior draws of the treatment effects for units 1 and 4
t(apply(res[,c("tau_unit[1]","tau_unit[4]")],2,summary))
## Probability that effect on unit 1 is greater than 0
mean(res[,"tau_unit[1]"]>0)
## Overall mean of the effects:
mean_tau <- rowMeans(res[,c("tau_unit[1]","tau_unit[2]","tau_unit[3]","tau_unit[4]")])
summary(mean_tau)
```
## Summary: Modes of Statistical Inference for Causal Effects
We can infer about unobserved counterfactuals by:
1. assessing claims or models or hypotheses about relationships between
unobserved potential outcomes (Fisher's sharp null testing approach via
Rosenbaum) (this includes testing hypotheses of "no effects at all", "the
same effect for all", different effects for each unit, or anything in
between).
2. estimating averages (or other summaries) of unobserved potential outcomes
(Neyman's estimation approach) (unbiased estimators of effects,
conservative variance estimators, and the Central Limit Theorem also allow
us to use Normal approximations with the difference of means as a test
statistic to test weak null hypotheses).
3. predicting individual level outcomes based on probability models of
outcomes, interventions, etc. (Bayes's predictive approach via Rubin)
We can go from tests to intervals (even to quite narrow intervals) because a
confidence interval is a collection of not-rejected hypothesis tests.
## Summary: Modes of Statistical Inference for Causal Effects
Statistical inferences --- formalized reasoning about "what if" statements
("What if I had randomly assigned other plots to treatment?") --- and their properties (like bias, error rates, precision) arise from:
1. Repeating the design and using the hypothesis and test statistics to
generate a reference distribution that describes the variation in the hypothetical world. Compare the observed to the hypothesized to measure consistency between hypothesis, or model, and observed outcomes (*Fisher and Rosenbaum's
randomization-based inference for individual causal effects*).
2. Repeating the design and the estimation such that standard errors, $p$-values,
and confidence intervals reflect design-based variability. Probability distributions (like the Normal or t-distribution) arise from Limit Theorems in large samples.
(*Neyman's randomization-based inference for average causal effects*).
3. Repeatedly drawing from the probability distributions that generate the
observed data (that represent the design) --- the likelihood and the
priors --- to describe a posterior distribution for unit-level causal
effects. Calculate posterior distributions for aggregated causal effects (like averages of individual level
effects). (*Bayes and Rubin's predictive model-based causal inference*).
## Summary: Applications of the Model-Based Prediction Approach
Examples of use of the model-based prediction approach:
- Estimating causal effects when we need to model processes of missing outcomes, missing treatment indicators, or complex non-compliance with treatment \parencite{barnard2003psa}
- Searching for heterogeneity (subgroup differences) in how units react to treatment (e.g., \parencite{hahn2020bayesian}, but see also the literature on BART and Bayesian machine learning as applied to causal inference questions).
## Summary: Applications of the Testing Approach
Examples of use of the testing approach:
- Assessing evidence of Pareto-optimal effects or of no aberrant effect (i.e., that no
unit was made worse off by the treatment) \parencite{caughey2016beyond, rosenbaum2008aberrant}.
- Assessing evidence that the treatment group was made better off than the control
group, while remaining agnostic about the precise nature of the difference (e.g.,
$p>.2$ with a difference of means but $p<.001$ with a difference of ranks in
an Office of Evaluation Sciences study of General Services Administration
auctions).
- Focusing on detection rather than on estimation (for example to identify
promising sites for future research in experiments with many blocks or
strata) (Bowers and Chen 2020 working paper).
- Assessing hypotheses of no effects in small samples, with rare outcomes,
cluster randomization, or other designs where reference distributions may
not be Normal (see for example, \parencite{gerber2012field}).
- Assessing structural models of causal effects (for example models of
treatment effect propagation across networks)
\parencite{bowers2016research,bowers2013sutva,bowers2018models}.
# Linear Regression to Enhance Precision in Experiments
## What does linear regression do in an experiment?
Let's make a slightly bigger dataset where the potential outcomes depend on a background covariate:
```{r newdat, echo=TRUE}
set.seed(12345) ## seed first so the simulated data are reproducible
N <- 100
tau <- .2
dat <- data.frame(id = 1:N,
                  x1 = rpois(n = N, lambda = 10))
dat <- mutate(dat,
              y0 = .5 * sd(x1) * x1 + runif(n = N, min = -2 * sd(x1), max = 2 * sd(x1)),
              y1 = y0 + tau * sd(y0) #+ runif(n = N,min=-.5*sd(y0),max=.5*sd(y0)),
)
dat$Z <- complete_ra(N = N, m = floor(N / 2))
dat <- mutate(dat, Y = Z * y1 + (1 - Z) * y0)
dat$tau <- with(dat, y1 - y0)
head(dat)
## summary(lm(y0~x1,data=dat))$r.squared
## blah <- lm_robust(Y~Z,data=dat); blah$p.value["Z"]
## blah2 <- lm_robust(Y~Z+x1,data=dat); blah2$p.value["Z"]
```
## What does linear regression do in an experiment?
Notice that the potential outcomes are now a function of a background covariate:
```{r echo=TRUE,out.width="50%"}
with(dat,cor(x1,y0))
summary(lm(y0~x1,data=dat))$r.squared
with(dat,plot(x1,y0))
```
## What does linear regression do in an experiment?
First, how does linear regression "adjust" for `x1`?
```{r simpcovadj, echo=TRUE}
unadj_est <- lm_robust(Y~Z,data=dat)
adj_est <- lm_robust(Y~Z+x1,data=dat)
c(coef(unadj_est)[2],coef(adj_est)[2])
```
## What does linear regression do in an experiment?
First, how does linear regression "adjust" for x1? (Answer: by removing linear relationships between `x1` and `Z` and between `x1` and `Y`):
```{r covadj2, echo=TRUE}
lm_Y_x1 <- lm(Y~x1,data=dat)
lm_Z_x1 <- lm(Z~x1,data=dat)
dat$resid_Y_x1 <- resid(lm_Y_x1)
dat$resid_Z_x1 <- resid(lm_Z_x1)
lm_resid_Y_x1 <- lm(resid_Y_x1~x1,data=dat)
lm_resid_Z_x1 <- lm(resid_Z_x1~x1,data=dat)
```
## What does linear regression do in an experiment?
First, how does linear regression "adjust" for x1? (Answer: by removing linear relationships between `x1` and `Z` and between `x1` and `Y`):
```{r plotresids, echo=FALSE,out.width=".8\\textwidth"}
par(mfrow=c(2,2),mar=c(2,3,1,0),mgp=c(1.25,.5,0),oma=rep(0,4))
with(dat,plot(x1,Y,col=c("black","blue")[dat$Z+1],pch=c(1,19)[dat$Z+1]))
abline(lm_Y_x1)
with(dat,plot(x1,resid_Y_x1,ylab="Y - b*x1 or Y without linear relation with x1",col=c("black","blue")[dat$Z+1],pch=c(1,19)[dat$Z+1]))
abline(lm_resid_Y_x1)
# with(dat,plot(x1,jitter(Z,factor=.1)))
with(dat,plot(x1,Z,col=c("black","blue")[dat$Z+1],pch=c(1,19)[dat$Z+1]))
abline(lm_Z_x1)
with(dat,plot(x1,resid_Z_x1,ylab="Z - b*x1 or Z without linear relation with x1",col=c("black","blue")[dat$Z+1],pch=c(1,19)[dat$Z+1]))
abline(lm_resid_Z_x1)
```
## What does linear regression do in an experiment?
Linear regression "adjusts" the $Z \rightarrow Y$ relationship for a background covariate $x$ by removing linear relationships: For example, here we show how it works after removing the linear relationships between `x1` and `Z` and between `x1` and `Y`):
```{r residlm, echo=TRUE}
coef(adj_est)[2]
adj_est2 <- lm(resid_Y_x1 ~ resid_Z_x1,data=dat)
coef(adj_est2)[2]
stopifnot(all.equal(coef(adj_est)[[2]],coef(adj_est2)[[2]]))
```
## What does linear regression do in an experiment?
Linear regression "adjusts" the $Z \rightarrow Y$ relationship for a background covariate $x$ by removing linear relationships: For example, here we show how it works after removing the linear relationships between `x1` and `Z` and between `x1` and `Y`):
```{r plot2, out.width=".8\\textwidth"}
dat$ZF <- factor(dat$Z)
with(dat,plot(x1,Y,col=c("black","blue")[dat$Z+1],pch=c(1,19)[dat$Z+1]))
preddat <- expand.grid(Z=c(0,1),x1=range(dat$x1))
preddat$fit <- predict(adj_est,newdata=preddat)
with(preddat[preddat$Z==0,],lines(x1,fit))
with(preddat[preddat$Z==1,],lines(x1,fit,col="blue",lwd=2))
```
## What does linear regression do in an experiment?
Why might we adjust in an experiment? The idea is to increase precision. Notice the smaller standard errors, which arise because we have removed noise that is independent of treatment from the outcome (recall that background covariates are independent of treatment assignment).
```{r expadj, echo=TRUE}
unadj_est$std.error["Z"]
adj_est$std.error["Z"]
```
But what about bias? The point estimates differ here (because in any finite
sample the realized correlation between $Z$ and a covariate is not exactly zero, even though randomization makes them independent on average):
```{r bias1, echo=TRUE}
coef(unadj_est)[["Z"]]
coef(adj_est)[["Z"]]
```
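A quick check of that claim: in the one randomization we actually ran, the sample correlation between `Z` and `x1` is not exactly zero.
```{r zx1cor, echo=TRUE}
with(dat, cor(Z, x1))
```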
## How would we know whether the bias is too big? {.allowframebreaks}
Different may not mean worse. Let's see how these (and some other) estimators work. First, just specify a bunch of different estimators:
```{r biassimsetup, echo=TRUE}
## The truth:
trueATE_estimand <- with(dat, mean(y1 - y0))
## A function to make a new experiment.
new_exp <- function(Z) {
  ## This next shuffles Z
  newZ <- sample(Z)
  return(newZ)
}
## The est1..est6 functions are estimators of the trueATE, each returning a single number --- the estimated ATE
est1 <- function(Z, Y, x = NULL) {
  mean(Y[Z == 1]) - mean(Y[Z == 0])
}
est2 <- function(Z, Y, x = NULL) {
  coef(lm(Y ~ Z))[["Z"]]
}
est3 <- function(Z, Y, x) {
  ## "controlling for" but not really because this is an experiment.
  coef(lm(Y ~ Z + x))[["Z"]]
}
est4 <- function(Z, Y, x) {
  quantY <- quantile(Y, .9)
  ## Recode the highest 10% of outcomes to the 90th percentile to limit the influence of outliers.
  Y[Y > quantY] <- quantY
  coef(lm(Y ~ Z + x))[["Z"]]
}
est5 <- function(Z, Y, x) {
  ## Use the Lin method of covariance adjustment to minimize bias
  thedat <- data.frame(Z = Z, Y = Y, x = x)
  lm_lin_mod <- lm_lin(Y ~ Z, covariates = ~x, data = thedat)
  coef(lm_lin_mod)[["Z"]]
}
est6 <- function(Z, Y, x) {
  ## Use the Rosenbaum method of covariance adjustment to increase precision
  resids1 <- resid(lm(Y ~ x))
  coef(lm(resids1 ~ Z))[["Z"]]
}
## A function to use them all on a new experiment --- after repeating it and re-revealing potential outcomes.
compare_ests <- function(Z, y1, y0, x) {
  newZ <- new_exp(Z)
  newY <- newZ * y1 + (1 - newZ) * y0
  est1_coef <- est1(Z = newZ, Y = newY)
  est2_coef <- est2(Z = newZ, Y = newY)
  est3_coef <- est3(Z = newZ, Y = newY, x = x)
  est4_coef <- est4(Z = newZ, Y = newY, x = x)
  est5_coef <- est5(Z = newZ, Y = newY, x = x)
  est6_coef <- est6(Z = newZ, Y = newY, x = x)
  return(c(est1 = est1_coef, est2 = est2_coef, est3 = est3_coef,
           est4 = est4_coef, est5 = est5_coef, est6 = est6_coef))
}
```
## How would we know whether the bias is too big? {.allowframebreaks}
Different may not mean worse. Next, repeat the experiment many times, applying each estimator to each re-randomization:
```{r dobiassims, echo=TRUE, cache=TRUE}
set.seed(1235)
nsims <- 5000
dist_ests <- with(dat,
                  replicate(nsims, compare_ests(Z = Z, y1 = y1, y0 = y0, x = x1)))
apply(dist_ests, 1, summary)
## First row: true SEs (the SD of each estimator); second row: bias (estimand minus mean estimate)
rbind(apply(dist_ests, 1, sd),
      apply(dist_ests, 1, function(x){ trueATE_estimand - mean(x) }))
```
## How would we know whether the bias is too big?
Different may not mean worse. Let's see how these estimators work (the black
line is the true ATE):
```{r out.width=".9\\textwidth"}
estdat <- data.frame(tauhat = c(dist_ests[1,],dist_ests[2,],dist_ests[3,],dist_ests[4,],dist_ests[5,],dist_ests[6,]),
est=rep(c("mean diff","lm","cov adj","topcoded+adj","lin adj","rose adj"),each=nsims))
estmeans <- estdat %>% group_by(est) %>% summarize(mnest=mean(tauhat))
# http://www.cookbook-r.com/Graphs/Colors_(ggplot2)/#a-colorblind-friendly-palette
# The palette with black:
cbbPalette <- c("#000000", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7")
# To use for fills, add
# scale_fill_manual(values=cbPalette)
# To use for line and point colors, add scale_colour_manual(values=cbbPalette)
g1 <- ggplot(data=estdat,aes(x=tauhat,group=est,color=est))+
geom_density() +
geom_point(data=estmeans,aes(y=.1,x=mnest,group=est,color=est)) +
geom_vline(xintercept=trueATE_estimand) +
scale_colour_manual(values=cbbPalette) +
theme_bw()
print(g1)
```
## More on the Rosenbaum 2002 approach:
The first approach worked because the treatment was independent of the covariate. Yet, in a finite sample, there is still some relationship --- that was why the plot of `Z` by `x1` and the plot of `resid_Z_x1` by `x1` were not the same.
```{r echo=TRUE}
summary(lm(Z~x1,dat))$r.squared
```
Another approach (from @rosenbaum2002) does not adjust `Z` at all --- after all, we know that it is randomized. We have no need to adjust `Z`; we just want to remove non-treatment-related noise from our outcome in order to increase precision. Notice that we also gain precision here by not reducing the variation in `Z`.
```{r echo=TRUE}
rose_mod <- lm_robust(resid_Y_x1~Z,data=dat)
rose_mod$p.value["Z"]
unadj_est$p.value["Z"]
adj_est$p.value["Z"]
```
## More on the Lin 2013 approach:
@freedman2008a showed the bias in simple covariance-adjusted estimates of the ATE. @lin2013 showed how to substantially reduce this bias.
```{r linapproach, echo=TRUE}
## Center the covariate (subtract its mean)
dat$x1_c <- with(dat, x1 - mean(x1))
## Fit a model interacting the centered covariate with the treatment
lm_lin1 <- lm_robust(Y ~ Z + Z * x1_c, data = dat)
coef(lm_lin1)[["Z"]]
lm_lin2 <- lm_lin(Y ~ Z, covariates = ~x1, data = dat)
coef(lm_lin2)[["Z"]]
c(lm_lin2$std.error["Z"],
  lm_lin1$std.error["Z"],
  adj_est$std.error["Z"],
  rose_mod$std.error["Z"],
  unadj_est$std.error["Z"])
```
## Linear Regression for Covariance Adjustment in a RCT
Covariance adjustment is the removal of linear relationships. We can think of
"removing a linear relationship" as "residualization".
- In randomized experiments:
- Direct covariance adjustment can increase precision in the estimation of average
causal effects.
- The precision comes with some bias (perhaps a very small amount; the bias-versus-variance tradeoff may be worth making in some cases but not in others).
- We can reduce the bias following @lin2013.
- More precision can arise if we don't adjust the treatment following @rosenbaum2002 (could involve some bias here too).
\medskip
We don't say "controlling for" when we adjust for covariates in an experiment
because we are not removing their relationship with $Z$ (randomization is
supposed to have already done that: if asked "What is the correlation
between $X$ and $Z$?", the answer is "On average, 0.").
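A quick simulation of that "on average, 0" answer, using the simulated experiment from above:
```{r avg_cor_zero, echo=TRUE}
## Average the realized cor(x1, Z) over many re-randomizations of this design
mean(replicate(2000, with(dat, cor(x1, sample(Z)))))
```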
## Summary of the Day
- There is more than one way to use what we observe to reason about potential
outcomes: testing, estimating, predicting.
- Linear regression is the most common way to "adjust for" background
covariates in non-experimental studies. We learned what it means to
adjust for covariates in experimental studies so that we can better
evaluate linear regression in observational studies (tomorrow).
- Structure of the course. Where we are now. Where we are going: away from
randomized experiments toward observational studies, but with the
statistical basis developed from our understanding of randomized
experiments.
## References