---
title: Power Analysis by Simulation
author: JM and JB^[This is a work in progress. Feel free to ask Jake or Juan Manuel for help.]
date: '`r format(Sys.Date(), "%B %d, %Y")`'
---
```{r include=FALSE, cache=FALSE}
# Some customization. You can alter or delete as desired (if you know what you are doing).
# knitr settings to control how R chunks work.
## To make the html file do
## render("powersimtemplate.Rmd",output_format=html_document(fig_retina=FALSE))
## To make the pdf file do
## render("powersimtemplate.Rmd",output_format=pdf_document())
require(knitr)
## Set the defaults for the display of the R code in this document.
opts_chunk$set(tidy=FALSE,     # display code as typed
  size='footnotesize',         # use a small font for the code
  echo=TRUE,                   # show the code
  results='markup',            # format the output nicely
  strip.white=TRUE,            # get rid of extra white space
  cache=FALSE,                 # don't save computations by default
  highlight=TRUE,              # use color and bold etc.
  width.cutoff=132,            # enable a fairly wide display line
  out.width='.9\\textwidth',   # by default any graphics will take up most of the page
  message=FALSE,               # don't print out messages (add warning=FALSE to suppress warnings too)
  comment=NA,                  # don't prefix output lines with ##
  fig.retina=FALSE
  )
```
# Overview
How large an effect can we detect given our design? In this handout, we demonstrate a simulation-based approach to assessing the power of an experimental design to reject a false null hypothesis.
For a nice overview of the concept of statistical power, see <http://egap.org/methods-guides/10-things-you-need-know-about-statistical-power>. That page includes a link to a power calculator that can provide quick and reasonable answers to questions about statistical power when you have a simple design and a large sample, and it also shows a simulation approach to power analysis.^[The direct link to the code for the simulation-based approach from that page is <http://egap.org/content/power-analysis-simulations-r>.]
Right now, this document does not cover power analysis for a cluster-randomized design; we plan to add that. Meanwhile, you can consult the method guide at <http://egap.org/methods-guides/10-things-you-need-know-about-cluster-randomization>.
# The Ingredients
Here we talk about each of the different ingredients in a power analysis. The basic idea is to create a fake study that represents your guesses about how the actual study could turn out. Later we wrap these ingredients into one function that you can use to assess different scenarios.
Often you will have data that you can use here: for example, you may know something about the characteristics of units that will receive treatment, or you may already know the subgroups within which you want to ensure equal numbers of treated and control units. You will need to copy this document and edit it in that case. For now, and to keep this document self-contained, we generate fake data.
## Specify Blocks/Design
How will treatment(s) be assigned? Here, just for example, I imagine that we have two groups of individuals, say, men and women, and that we will assign treatment within each group. We will eventually wrap all of these steps together into a function that we can use to assess power and minimum detectable effects given changes in the design (such as changes in $N$, changes in the blocking structure, etc.).
```{r}
N <- 100
## About a third men ("M") and the rest women ("F")
female <- rep(c("M","F"), c(ceiling(N/3), N - ceiling(N/3)))
stopifnot(length(female)==N) ## we should have N units total
stopifnot(all((table(female) %% 2)==0)) ## each block should have an even number of units
```
Within each block of our imaginary design, we plan to assign half of the observations to treatment and half to control with equal probability.
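To make the idea concrete before setting up the machinery below, here is a minimal sketch (the block of 10 units is made up purely for illustration): `sample(rep(c(1,0), n/2))` assigns exactly half of a block of `n` units to treatment.
```{r}
## Minimal sketch: assign exactly half of one block of 10 units to treatment
sample(rep(c(1, 0), 5))
```
The next chunk installs the development version of RItools, which provides the `xBalance` function used for testing below.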
```{r}
dir.create("libraries", showWarnings=FALSE)
.libPaths("libraries") ## make the default place for libraries the local one
## install.packages(c("hexbin","xtable","svd","SparseM","abind"))
## download.file("https://github.com/markmfredrickson/RItools/releases/download/rand-dist-v1.0/RItools_0.1-12.tar.gz",destfile="RItools_0.1-12.tar.gz")
## install.packages("RItools_0.1-12.tar.gz", repos=NULL)
library(devtools)
install_github("markmfredrickson/RItools@randomization-distribution") ## use @clusters if cluster randomized trial
```
```{r}
library(RItools)
## This is a function-maker that will allow us to repeat the experiment.
blockRandomSampler <- function(z, b){
  function(samples){ ## this function assigns treatment within blocks
    zs <- replicate(samples,
                    unsplit(lapply(split(z, b), function(theb){
                      sample(rep(c(1, 0), round(length(theb)/2)))
                    }), b))
    weight <- 1 ## this is only used later
    return(list(weight = weight, samples = zs))
  }
}
## Set up the treatment assignment function using one assignment vector (Z)
set.seed(12345)
Z <- unsplit(lapply(split(rep(NA, N), female), function(b){
  sample(rep(c(1, 0), round(length(b)/2)))
}), female)
treatmentAssigner <- blockRandomSampler(z=Z, b=female)
## Produce 10 different treatment assignments. Each should have the same
## numbers of men, women, treated, and controls.
Zs <- treatmentAssigner(10)$samples
tmptab <- table(Z, female)
print(tmptab)
## Each column of tmp is a flattened 2x2 table of assignment by block
tmp <- apply(Zs, 2, function(z){ table(z, female) })
stopifnot(all(tmp[4, ] == tmptab[1, 2])) ## treated men == control men in every assignment
stopifnot(all(tmp[1, ] == tmptab[1, 1])) ## control women constant across assignments
```
## Specify Outcomes
What kind of outcomes do you plan to measure? If you have a version of the outcomes already measured, load them here. The key is to describe the outcome variable as it would look in the control group. Here I'm using a continuous outcome that is correlated with gender.
```{r}
## A binary outcome correlated with gender would look like this:
### y0 <- rbinom(N, prob=c(.4,.8)[as.numeric(female=="F")+1], size=1)
## Here we use a continuous outcome correlated with gender:
y0 <- numeric(N)
y0[female=="F"] <- runif(sum(female=="F"), min=0, max=10)
y0[female!="F"] <- runif(sum(female!="F"), min=2, max=12)
tapply(y0,female,mean)
```
## What kind of effect do you anticipate?
Here is a simple effect in which the treatment raises every control value by 2:
```{r}
y1<-y0+2
```
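A constant additive effect is only one possibility. As a sketch of an alternative you might anticipate (purely illustrative, and not used below; the block-specific effect sizes are made up), the effect could differ by block:
```{r}
## Sketch of a heterogeneous alternative: a larger effect for women than for men
## (illustration only; the analysis below uses the constant-effect y1)
y1het <- y0 + ifelse(female=="F", 3, 1)
```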
What outcome do we see? The treatment assignment determines whether we see the potential outcome under control, $y_0$, or under treatment, $y_1$.
```{r}
Y<- Z*y1 + (1-Z)*y0
```
Create a new dataset.
```{r}
femaleF<-factor(female)
dat<-data.frame(Z=Z,Y=Y,femaleF=femaleF)
```
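As a quick sanity check on the fake data, the treated-minus-control gap in mean outcomes within each block should be near the constant effect of 2 that we built in:
```{r}
## Mean outcome by treatment arm and block; within-block gaps should be near 2
with(dat, tapply(Y, list(Z=Z, block=femaleF), mean))
```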
## Specify the outcome analysis
Will you estimate an average treatment effect? Or test a sharp null hypothesis of no effects? Or something else? Do it here, and make sure it makes sense.
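For example (a sketch of one possibility among many), a least-squares regression of the outcome on treatment and the blocking factor estimates an average treatment effect:
```{r}
## One possible estimator: least squares adjusting for the blocking factor
## (a sketch; other estimators and standard errors are defensible too)
lm1 <- lm(Y ~ Z + femaleF, data=dat)
coef(summary(lm1))["Z", ]
```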
The idea of statistical power requires testing, by the way. So, even if your focus is on estimation, you'll need some procedure for producing a confidence interval or $p$-value from a hypothesis test in order to talk about power or minimum detectable effects.
### Testing
A minimum detectable effect (MDE) is the smallest treatment effect that would be detectable with a statistical power of .8 at some $\alpha$ (usually $\alpha=.05$). How can we assess this? We can assess power by recording how often a test rejects a false null hypothesis. Usually, we set the truth to be zero and test hypotheses that diverge from zero. If the truth is zero, then a well-operating test should reject the truth in no more than a proportion $\alpha$ of the simulations. And a powerful test should reject false hypotheses often --- and it should reject them more and more frequently as they diverge from the truth.
```{r}
nsims <- 1000
newZs <- treatmentAssigner(nsims)$samples
## Here using a large-sample hypothesis testing approach.
## This is a very simple function: no covariates, assuming a constant additive effect.
testH <- function(H, Z, y0, b=NULL){
  y1 <- y0 + H
  Y <- Z*y1 + (1-Z)*y0
  if(is.null(b)){
    xb1 <- xBalance(Z~Y, data=data.frame(Z=Z, Y=Y), report="chisquare.test")
    return(xb1$overall["Unstrat","p.value"])
  } else {
    xb1 <- xBalance(Z~Y+strata(b), data=data.frame(Z=Z, Y=Y, b=b), report="chisquare.test")
    return(xb1$overall["b","p.value"])
  }
}
testH(H=1, Z=newZs[,1], y0=y0, b=dat$femaleF)
## testH(H=1, Z=newZs[,1], y0=y0)
alts <- sort(unique(c(0, seq(-1.5, 1.5, length=100))))
## A non-parallel approach:
## res <- sapply(alts, function(a){
##   resa <- sapply(1:nsims, function(i){
##     testH(H=a, Z=newZs[,i], y0=y0, b=dat$femaleF) })
##   mean(resa <= .05) })
## names(res) <- alts
library(parallel)
## For each alternative effect, record the proportion of simulated
## assignments in which the test rejects at alpha = .05
res <- simplify2array(mclapply(alts, function(a){
  resa <- sapply(1:nsims, function(i){
    testH(H=a, Z=newZs[,i], y0=y0, b=dat$femaleF) })
  mean(resa <= .05)
}))
names(res) <- alts
res
```
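One way to summarize `res` (a sketch; the .8 threshold is just the conventional power target) is to plot the power curve and read off the smallest absolute effect whose simulated power reaches .8:
```{r}
## Plot the simulated power curve with the conventional .8 target line
plot(alts, res, type="l",
     xlab="Hypothesized treatment effect",
     ylab="Proportion of simulations rejecting at alpha=.05")
abline(h=.8, lty=2)
## Smallest absolute effect with simulated power >= .8 (a rough MDE);
## if no alternative reaches .8, widen the grid of alternatives
min(abs(alts[res >= .8]))
```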