-
Notifications
You must be signed in to change notification settings - Fork 1
/
oneblast_analysis.Rmd
511 lines (395 loc) · 22.1 KB
/
oneblast_analysis.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
---
title: "DoD Military Community and Family Policy Military OneSource eNewsletter"
author: "Paul Testa and Jake Bowers"
date: '`r format(Sys.Date(), "%B %d, %Y")`'
output:
html_document:
toc: TRUE
---
```{r init,echo=F, results=F}
## Easy way to look for and install missing packages and load them
if (!require("pacman")){ install.packages("pacman") }
pacman::p_load("knitr","openssl","mosaic","plyr","coin","ggplot2")
opts_chunk$set(tidy=TRUE,echo=TRUE,results='markup',strip.white=TRUE,cache=FALSE,highlight=TRUE,width.cutoff=132,size='footnotesize',message=FALSE,warning=TRUE,comment=NA)
options(digits=4,width=100,scipen=8)
```
\tableofcontents
# Setup
- Download "wrkdat.csv" onto local machine in the data subdirectory
- Load files and libaries
```{r setup}
wrkdat<-read.csv("data/wrkdat.csv",as.is=TRUE)
```
The outcome is whether or not a member of the experimental pool subscribed to the newsletter. We currently assume that all members of the experimental pool who have a 0 for subscribed did not subscribe under a different name and email address. We return to scrutinize the effect of this assumption later.
```{r dv}
# Subscribed is the primary outcome
table(wrkdat$subscribed,exclude=c())
## Make some variables into factors for contingency table analysis
wrkdat$batchday2F<-factor(wrkdat$batchday2)
wrkdat$treatmentF<-factor(wrkdat$treatment)
wrkdat$subscribedF<-factor(wrkdat$subscribed)
```
Overall, we the active choices were better than the opt in:
```{r}
wrkdat$active<-ifelse(wrkdat$treatment %in% c("F","D","B"),1,0)
prop.table(with(wrkdat,table(active,subscribed)),margin=1)
wrkdat$short<-ifelse(wrkdat$treatment %in% c("F","E"),1,0)
prop.table(with(wrkdat,table(short,subscribed)),margin=1)
```
# Test for any difference across treatments
Because the experiment was randomized by batch, and because of trouble matching subscribers uniquely to the experimental pool, we analyze the data by a collapsed version of the 82 batch indicators that we call `batchday`. Nearly all batches were sent 2 per day and the subscriber data only has day of subscription, not hour or minute. This choice, to make new experimental blocks from the original ones will decrease statistical power, but, because the blocks were basically the same size, and because we take blocking into account when we calculate estimates, our estimates will be unbiased.
Do we have enough information to reject the claim that there are no differences between the treatments? We use the generalized Cochran-Mantel-Haensel test to address this question. We are using the coin library because it allows for easy assessment of this test without large sample assumptions by simulating from the reference distribution implied by the experiment itself.
```{r cmh}
library(coin)
cmh1approx<-cmh_test(subscribedF~treatmentF|batchday2F,data=wrkdat,distribution=approximate(B=1000))
cmh1<-cmh_test(subscribedF~treatmentF|batchday2F,data=wrkdat)
bigtab<-with(wrkdat,table(treatmentF,subscribed,batchday2F))
cmh2 <- mantelhaen.test(bigtab)
c(pvalue(cmh1),pvalue(cmh1approx)[[1]],cmh2$p.value)
```
Those $p$-values indicate that there is a lot of evidence against the idea that all six emails had the same effect and, despite the rareness of the outcome, the substantive interpretation of the test is the same whether or not we rely on large sample assumptions or sample directly from the randomization distribution.
# Estimation and Inference for Difference of Proportions
Now we'll start to draw comparisons between specific treatments and groups of treatments.
In the figure below we show the proportions subscribing in each treatment group with 95-percent confidence intervals. Treatments B and F appear to be the most successful at encouraging subscription to the newsletter. Because treatment was assigne by block, our overall estimates of proportions are weighted by the size of the block. Here, we follow @lin2013agnostic and the [Green Lab Standard Operating Procedure](https://github.com/acoppock/Green-Lab-SOP) calculating these quantities.
Block randomized experiments require that the estimand be defined with respect
to a scheme for weighting the contributions of the different block-specific
estimates, and an estimator which is unbiased (or at least consistent) for that
estimand. In this case, we have two possible estimands: When block sizes vary
and especially when the proportion treated per block varies greatly, one can
substantially increase precision by estimating using what are known as harmonic
weights. In this case, with more or less equal sized blocks (40 out of 41 are
equal), and more or less equal proportions, we use the simple block size
weighting because the proportions themselves are easier to interpret in such a case.
```{r eval=FALSE, echo=FALSE, include=FALSE}
## Harmonic Mean weighted estimates of the proportions. Not included in the rendered version
lm1<-lm(subscribed~treatment+batchday2F,data=wrkdat)
coef(lm1)[1:10]
propsw<- c(treatmentA=coef(lm1)[[1]],coef(lm1)[1]+coef(lm1)[2:6])
propsw
```
We will want to produce confidence intervals and test hypotheses about the
difference proportions, and Lin (2011,2013) shows that one can use a least squares model to estimate block size weighted differences of proportions and also that the design-based standard errors (and p-values and confidence intervals) can be directly calculated using the HC2 standard errors (heteroskedasticity consistent type 2 standard errors).
Not shown: Test that we can recover block size weighted quantites from least squares using a subset of the data.
```{r testbatchsizeweighting, echo=FALSE, results=FALSE}
## Batch size weighting
### Verifying code using only 3 strata
wrkdatsmall<-droplevels(wrkdat[wrkdat$batchday2 %in% c(1,2,3),c("subscribed","treatmentF","batchday2F")])
tausbyhand<-function(dat,block,treatment,outcome){
##block is a factor indicating block membership
##treatment is factor indicating treatment assignment
##outcome is numeric (could be binary or continuous)
##dat is a data.frame or matrix
prop_b<-table(dat[,block])/nrow(dat)
tau_b<-sapply(split(dat,dat[,block]),function(blockdat){
sapply(levels(blockdat[,treatment]),function(i){
mean(blockdat[,outcome][blockdat[,treatment]==i]) }) })
tau_bw<-apply(tau_b[,names(prop_b)],1,function(x){ x[names(prop_b)]*prop_b})
stopifnot(all.equal(tau_b[,"3"]*prop_b["3"], tau_bw["3",]))
taus<-colSums(tau_bw)
return(taus)
}
taus<-tausbyhand(dat=wrkdatsmall,block="batchday2F",treatment="treatmentF",outcome="subscribed")
## Now try this using lm()
prop_b<-table(wrkdatsmall$batchday2F)/nrow(wrkdatsmall)
B <- model.matrix(~batchday2F+0,data=wrkdatsmall)
Bmd <- sapply(1:ncol(B),function(i){ B[,i] - prop_b[i] }) ## subtract off proportion in B
colnames(Bmd)<-colnames(B)
wrkdatsmall[,colnames(Bmd)]<-Bmd
bigfmla<-as.formula(paste("subscribed~treatmentF*(",paste(colnames(Bmd)[-1],collapse="+"),")"))
lm2<-lm(bigfmla,data=wrkdatsmall)
coef(lm2)[1:10]
propswInter<- c(treatmentA=coef(lm2)[[1]],coef(lm2)[1]+coef(lm2)[2:6])
propswInter
taus
stopifnot(all.equal(taus,propswInter,check.attributes=FALSE))
## THe mean of the batch factor indicators is 0 by construction, this is why we can interpret the coefs on treatmentF as telling us about an average across the batches. This next shows this.
newdat<- expand.grid(treatmentF=levels(wrkdatsmall$treatmentF), batchday2F2=c(0), batchday2F3=c(0))
propsInter2<-predict(lm2,newdata=newdat)
stopifnot(all(propsInter2-propswInter==0))
stopifnot( all(abs(taus - propsInter2) < 1e-10) ) ## funny floating point differences
rm(B,Bmd,bigfmla,propswInter,newdat,propsInter2)
```
```{r lmbatch, cache=TRUE}
## Batch size weighting
## First calculate using a function that does the weighting directly
realtaus<-tausbyhand(dat=wrkdat,block="batchday2F",treatment="treatmentF",outcome="subscribed")
## Now try this using lm()
prop_b<-table(wrkdat$batchday2F)/nrow(wrkdat)
B <- model.matrix(~batchday2F+0,data=wrkdat)
Bmd <- sapply(1:ncol(B),function(i){ B[,i] - prop_b[i] })
colnames(Bmd)<-colnames(B)
wrkdat[,colnames(Bmd)]<-Bmd
bigfmla<-as.formula(paste("subscribed~treatmentF*(",paste(colnames(Bmd)[-1],collapse="+"),")"))
lm2<-lm(bigfmla,data=wrkdat)
coef(lm2)[1:10]
estprops<- c(treatmentA=coef(lm2)[[1]],coef(lm2)[1]+coef(lm2)[2:6])
stopifnot(all.equal(realtaus,estprops,check.attributes=FALSE))
## This next shows that we can get the same answers as suggested by Lin if we don't exclude a level of treatment.
lm3<-lm(update(bigfmla,.~+batchday2F1-1+.),data=wrkdat)
lm3props<-coef(lm3)[grep("treatment.[A-F]$",names(coef(lm3)))]
stopifnot(all(abs(lm3props-estprops)<1e-10))
lm3props
```
Under the assumption that all unmatchable subscribers were not a part of the experiment, we would say that Treatment F was the most powerful, although, the estimated proportions of subscribers are quite low (on the order of 1 to 2 percentage points). Can we distinguish these proportions from 0?
We get the CIs using the HC2 based standard errors following the development in Lin 2011 etc.
```{r}
library(sandwich)
library(lmtest)
source("confintHC.R") ## our own CI maker with HC SEs
theCIs<-confint(lm3,parm=names(lm3props),vcov=vcovHC(lm3,type="HC2"))
wrkdat$activeF <- factor(wrkdat$active)
wrkdat$shortF <- factor(wrkdat$short)
activefmla<-as.formula(paste("subscribed~activeF*(",paste(colnames(Bmd)[-1],collapse="+"),")"))
shortfmla<-as.formula(paste("subscribed~shortF*(",paste(colnames(Bmd)[-1],collapse="+"),")"))
lmActive <- lm(activefmla,data=wrkdat)
lmShort <- lm(shortfmla,data=wrkdat)
lmActiveprops <- coef(lmActive)[1:2]
lmShortprops <- coef(lmShort)[1:2]
activeCIs<-confint(lmActive,parm=names(lmActiveprops),vcov=vcovHC(lmActive,type="HC2"))
shortCIs<-confint(lmShort,parm=names(lmShortprops),vcov=vcovHC(lmShort,type="HC2"))
c(lmActiveprops,activeCIs[2,])
c(lmShortprops,shortCIs[2,])
```
We show these results in the following graph:
```{r fig1, cache=TRUE, tidy=FALSE, echo=FALSE, results=TRUE}
theprops<-data.frame(estprops,theCIs)
# Add names and Treatment indicator
colnames(theprops)<-c("pbar","ll","ul")
row.names(theprops)<- c('A','B','C','D','E','F')
theprops$Treatment<-c("Opt-in; List",
"Active; List",
"Opt-in; Quiz",
"Active; Quiz",
"Opt-in",
"Active"
)
theprops$Treatment <- factor(theprops$Treatment, levels=theprops$Treatment[order(theprops$pbar)])
# Plot with ggplot2
## library(ggplot2)
dodge<-position_dodge(width = 0.9)
p.props<-ggplot(theprops,aes(Treatment,pbar,fill=Treatment))+
geom_bar(stat = "identity", position = dodge)+
geom_errorbar(aes(ymin=ll,ymax=ul),position = dodge,width=.25)+
labs(list(y="Proportion Subscribing to Newsletter",
title="Proportion Subscribing by Treatment Group"))+
theme(axis.text.x = element_blank(),axis.ticks.x = element_blank())
p.props
```
# Test pairwise differences of treatments
This study is clearly big enough to distinguish the effects of any one of the treatments from 0. How do the treatments compare to each other? Next we provide a more formal assessment of the 15 pairwise comparisons by conducting a Tukey Honest Signficant Differences test on an ANOVA model weighted by the size of each batch, with family-wise confidence intervals that reflect the fact that we are making multiple comparisons.
```{r tukeyHSD, warning=FALSE, message=FALSE, cache=TRUE}
# Anova weighted by size of batch
a1<-aov(formula(lm3),wrkdat)
coef(a1)[names(lm3props)]
thsda1<-TukeyHSD(a1,which="treatmentF")
```
We see from the next figure that `r sum(thsda1$treatmentF[,4]<0.05)` out of the `r choose(6,2)` pairwise comparisons are statistically distinguishable from 0 ($p<0.05$). Specifically, the effects for Treatment F are significantly larger than any other treatment except Treatment B --- and there the 95% confidence interval just barely includes 0. The effect for Treatment B, although larger than the remaining Treatments (A,C,D and E) is only distinguishable from Treatment C.
```{r figTukey, cache=FALSE, results=TRUE}
# Get estimates and intervals
tukey<-data.frame(thsda1$treatmentF)
##plot(TukeyHSD(a1))
tukey$Comparison<-factor(rownames(tukey),levels=rev(rownames(tukey)))
tukey$Comparison<-factor(tukey$Comparison,levels=tukey$Comparison[order(tukey$diff)])
p.tukey<-ggplot(tukey,aes(y=Comparison,x=diff))+
geom_vline(xintercept=0,linetype = 2,col="red")+
geom_errorbarh(aes(xmin=lwr,xmax=upr),height=.25,col="grey")+
geom_point()+
labs(x="Difference in proportions",title="95% family-wise confidence level")
p.tukey
```
# Test differences in the prespecified subgroups of treatments
This section makes the following comparisons:
- A,C,E versus B,D,F (a test of opt-in versus enhanced active).
- A,B,C,D versus E,F (a test of Block of 10 present versus absent)
- A,B versus C,D (a test of list versus quiz format)
To make life easier, we'll create indicators of these groupings, write a function to conduct generalized Cochran-Mantel-Haensel tests of these differences, accounting for batch^[We also need to fix some code in the mantelhaen.test to tell R to treat integers as floating point values]
```{r subgroup}
# A,C,E versus B,D,F (a test of opt-in versus enhanced active).
wrkdat$treat_bdf_01<-as.numeric(grepl("B|D|F",wrkdat$treatment))
wrkdat$treat_bdf<-factor(ifelse(grepl("B|D|F",wrkdat$treatment),"Enhanced Active","Opt-In"))
table(wrkdat$treat_bdf)
# A,B,C,D versus E,F (a test of Block of 10 present versus active)
wrkdat$treat_abcd_01<-as.numeric(grepl("A|B|C|D",wrkdat$treatment))
wrkdat$treat_abcd<-ifelse(grepl("A|B|C|D",wrkdat$treatment),"Present","Absent")
wrkdat$treat_abcd<-factor(wrkdat$treat_abcd,levels=c("Present","Absent"))
# A,B versus C,D (a test of list versus quiz format)
wrkdat$treat_ab_01<-as.numeric(grepl("A|B",wrkdat$treatment))
wrkdat$treat_ab_01[!grepl("A|B|C|D",wrkdat$treatment)]<-NA
wrkdat$treat_ab<-factor(ifelse(wrkdat$treat_ab_01==1,"List","Quiz"))
```
```{r echo=FALSE}
# Use an updated mantel-haenszel test so it can calculate CIs for odds ratio
source("mymantelhaentest.R")
# Load a function to make nice tables
source("tablemaker.R")
```
## Overall table
The following table repeats and collects the results from above.
```{r overalltab, results="asis",cache=TRUE}
taball<-table.maker(wrkdat$treatmentF,xnm="treatmentF",ynm="subscribed",reflevel="F")
kable(taball,caption="All Treatments")
```
## A,C,E versus B,D,F (a test of opt-in versus enhanced active).
The proportion subscribing among the enhanced active treatments was higher than the proportion among the opt-in treatments (not surprising from the perspective of past literature).
```{r tabbdf,results="asis",cache=TRUE}
tab_bdf<-table.maker(wrkdat$treat_bdf,xnm="treat_bdf",ynm="subscribed")
kable(tab_bdf,caption="Enhanced Active versus Single Opt-In")
```
## A,B,C,D versus E,F (a test of Block of 10 present versus active)
Treatments E and F encouraged more subscriptions than A,B,C, or D as a group.
```{r tababcd,results="asis",cache=TRUE}
## A,B,C,D versus E,F (a test of Block of 10 present versus active)
tab_abcd<-table.maker(wrkdat$treat_abcd,xnm="treat_abcd",ynm="subscribed")
kable(tab_abcd,caption="Block of 10 Present or Absent")
```
## A,B versus C,D (a test of list versus quiz format)
The list format appears more effective than the quiz format.
```{r tabab,results="asis",cache=TRUE}
# A,B versus C,D (a test of list versus quiz format)
tab_ab<-table.maker(wrkdat$treat_ab,xnm="treat_ab",ynm="subscribed")
kable(tab_ab, caption="List versus Quiz Format")
```
# Assess sensitivity of the results to different ways of distributing unmatched cases
The table below shows the number of unmatched cases by batch day in the list of subscribers (i.e. people shown as suscribing who we were unable to match to the experimental pool by either email, full name, or last name).
```{r unmatch}
# List of unmatchables by batch day
# The count comes from a table that we created using the subscribers data and posted to github
unmatchables<-data.frame(batchday2=sort(unique(wrkdat$batchday2)),
count=c(24,6,14,33,20,24,19,38,8,18,8,99,14,23,20,48,16,12,22,44,16,53,12,26,40,13,64,23,54,24,29,15,52,10,11,65,22,25,40,10,81))
tab_unmatch<-unmatchables
colnames(tab_unmatch)<-c("Batch","Unmatched Cases")
kable(tab_unmatch,caption="Unmatched cases by batch")
```
It is possible that these unmatched cases represent successful interventions
(i.e. people who subscribed to the newsletter because of the emails but who did
not provide the same name or email address), it is also possible that they
reflect subscriptions by people who were not a part of the experimental pool
(and so should be ignored, in essence, we are assuming that this is the case in
the preceding analyses). In this section, we assess the consequences of that
assumption through a series of simulations that ask what would our results look
like had we been able to match these subscribers to members of the experimental
pool.
To set bounds on our results we examine the results from assigning all 1,195
unmatched cases to each of the six treatment conditions, while respecting the
distribution of unmatched cases by batch. (Code not shown.)
```{r ext, echo=FALSE, results=FALSE}
## There were a total of 81980 people assigned to treatment "A", so the, at maximum, we could increase those in treatment "A" by .015 or 1.5% (this is if all of the people who were missing were actually supposed to be in treatment A.
1195/sum(wrkdat$treatment=="A")
## What if all 1195 unmatched subscribers were assigned to treatment A?
## That would mean that an additional 24 people in batchday2==1 in treatment group A should be counted as subscribers, and 6 in batch 2, etc..
wrkdat$yallA<-wrkdat$subscribed
### For Batch 1
## Assigning the first 24 to 1 since we have no other covariate information here
wrkdat$yallA[which(wrkdat$batchday2==1 & wrkdat$treatmentF=="A" & wrkdat$subscribed==0)[1:24]]<-1
with(wrkdat[wrkdat$batchday2==1,],ftable(subscribed,yallA,treatmentF))
with(wrkdat[wrkdat$batchday2==1,],ftable(subscribed,treatmentF))
wrkdat$yallA<-NULL
## For all Batches
wrkdat$yallA<-wrkdat$subscribed
for(i in 1:nrow(unmatchables)){
batch<-unmatchables$batchday2[i]
num<-unmatchables$count[i]
valuestoreplace<- which(wrkdat$treatmentF=="A" & wrkdat$batchday2==batch & wrkdat$subscribed==0)
stopifnot(num<=valuestoreplace)
wrkdat$yallA[valuestoreplace[1:num]]<-1
}
testB1<-with(wrkdat[wrkdat$batchday2==1,],ftable(subscribed,yallA,treatmentF))
testB5<-with(wrkdat[wrkdat$batchday2==5,],ftable(subscribed,yallA,treatmentF))
stopifnot(testB1[2,1]==24)
stopifnot(testB5[2,1]==20)
## Turn this into a function:
boundedoutcome<-function(trtvar,trtval){
yall<-wrkdat$subscribed
for(i in 1:nrow(unmatchables)){
## message(i)
batch<-unmatchables$batchday2[i]
num<-unmatchables$count[i]
valuestoreplace<- which(wrkdat[[trtvar]]==trtval & wrkdat$batchday2==batch & wrkdat$subscribed==0)
stopifnot(num<=length(valuestoreplace))
yall[valuestoreplace[1:num]]<-1
}
return(yall)
}
wrkdat$tmp<-boundedoutcome(trtvar="treatmentF",trtval="A")
stopifnot(all.equal(wrkdat$tmp,wrkdat$yallA))
wrkdat$tmp<-NULL
wrkdat$yallA<-NULL
wrkdat$yallA<-boundedoutcome(trtvar="treatmentF",trtval="A")
wrkdat$yallB<-boundedoutcome(trtvar="treatmentF",trtval="B")
wrkdat$yallC<-boundedoutcome(trtvar="treatmentF",trtval="C")
wrkdat$yallD<-boundedoutcome(trtvar="treatmentF",trtval="D")
wrkdat$yallE<-boundedoutcome(trtvar="treatmentF",trtval="E")
wrkdat$yallF<-boundedoutcome(trtvar="treatmentF",trtval="F")
extremeoutcomes<-sort(grep("yall",names(wrkdat),value=TRUE))
## Make factors for cmh
for(i in extremeoutcomes){
wrkdat[[paste(i,"F",sep="")]]<-factor(wrkdat[[i]])
}
```
Now, we re-calculate the p-values for the overall test distinguishing the
treatments from each other. And we also present the different estimates of
proportions and their associated CIs.
First, we see that even under the extreme scenarios that we consider (all
unmatched subscribers assigned to only one treatment group), we can still
distinguish the effects of the different treatment groups.
```{r boundedcmh, cache=TRUE}
extremeoutcomesF<-paste(extremeoutcomes,"F",sep="")
cmhps<-sapply(extremeoutcomesF,function(y){
f <- reformulate(paste("treatmentF","|batchday2F"),response=y)
# CMH Test, use asymptotic results for now for now
cmh<-cmh_test(f,data=wrkdat)
pvalue(cmh)
})
cmhps
```
These tests show that, across the ways that subscription are asssigned (all to Treatment A, all to B, etc...), we can always distinguish the effects of the treatments from each other.
## The most effective treatment depends on our assumption of no non-pool subscribers
Below we show that, if all unmatched subscribers came from any one treatment arm, that our results would change: whichever arm receives those subscribers is the most effective treatment. We are sure that the actual results, should we be able to know which subscribers came from outside the experiment and which subscribers matched to which row in the experimental pool, would be less extreme than this. However, we do not know how much less extreme.
Now we show how the proportions subscribing in each treatment group might differ.
```{r results="asis",cache=TRUE}
tabAllA<-table.maker(wrkdat$treatmentF,xnm="treatmentF",ynm="yallA",reflevel="F")
kable(tabAllA, caption="If all the missing subcribers were in treatment A")
```
```{r results="asis",cache=TRUE}
tabAllB<-table.maker(wrkdat$treatmentF,xnm="treatmentF",ynm="yallB",reflevel="F")
kable(tabAllB, caption="If all the missing subcribers were in treatment B")
```
```{r results="asis",cache=TRUE}
tabAllC<-table.maker(wrkdat$treatmentF,xnm="treatmentF",ynm="yallC",reflevel="F")
kable(tabAllC, caption="If all the missing subcribers were in treatment C")
```
```{r results="asis",cache=TRUE}
tabAllD<-table.maker(wrkdat$treatmentF,xnm="treatmentF",ynm="yallD",reflevel="F")
kable(tabAllD, caption="If all the missing subcribers were in treatment D")
```
```{r results="asis",cache=TRUE}
tabAllE<-table.maker(wrkdat$treatmentF,xnm="treatmentF",ynm="yallE",reflevel="F")
kable(tabAllE, caption="If all the missing subcribers were in treatment E")
```
```{r results="asis",cache=TRUE}
tabAllF<-table.maker(wrkdat$treatmentF,xnm="treatmentF",ynm="yallF",reflevel="F")
kable(tabAllF, caption="If all the missing subcribers were in treatment F")
```
# Code Appendix
```{r appendix, eval=F,echo=T}
<<setup>>
<<dv>>
<<cmh>>
<<fig1>>
<<TukeyHSD>>
<<figTukey>>
<<subgroup>>
<<tabbdf>>
<<tababcd>>
<<tabab>>
<<unmatch>>
<<sensitivity>>
<<examp>>
<<cmhSim>>
<<figCMHsim>>
<<tukeySim>>
<<tukeytab>>
<<figTukeypval>>
<<figTukeycoef>>
<<ext>>
```