Trying to get HTML output because of problems with the projector
jwbowers committed Aug 5, 2019
1 parent a317f30 commit 22534f0
Showing 21 changed files with 192 additions and 59 deletions.
5 changes: 3 additions & 2 deletions day-plan.md
@@ -5,10 +5,10 @@ author: |
| ICPSR 2019 Session 2
| Jake Bowers, Ben Hansen, Tom Leavitt
bibliography:
- refs.bib
- BIB/refs.bib
- BIB/master.bib
- BIB/misc.bib
- ci.bib
- BIB/ci.bib
fontsize: 10pt
geometry: margin=1in
graphics: yes
@@ -182,6 +182,7 @@ Balance tests and the Sequential Intersection Union Principle
- Sensitivity analysis I (Rosenbaum Style)

### Readings on sensitivity analysis
- @rosenbaum2017observation, Chap 9
- @rosenbaum10book, Chap 3
- @hhh2010
- @rosenbaumtwo
2 changes: 1 addition & 1 deletion day10-Scores.Rmd
@@ -1,7 +1,7 @@
---
title: Statistical Adjustment and Assessment of Adjustment in Observational Studies --- Matched Stratification for One and Multiple Variables.
date: '`r format(Sys.Date(), "%B %d, %Y")`'
author: ICPSR 2018 Session 2
author: ICPSR 2019 Session 2
bibliography:
- BIB/abbrev-long.bib
- BIB/refs.bib
121 changes: 88 additions & 33 deletions day11-MVMatching.Rmd
@@ -1,17 +1,17 @@
---
title: Matching on more than one covariate
date: '`r format(Sys.Date(), "%B %d, %Y")`'
author: ICPSR 2018 Session 2
author: ICPSR 2019 Session 2
bibliography:
- refs.bib
- BIB/refs.bib
- BIB/master.bib
- BIB/misc.bib
fontsize: 10pt
geometry: margin=1in
graphics: yes
biblio-style: authoryear-comp
biblatexoptions:
- natbib=true
biblio-style: authoryear-comp
output:
beamer_presentation:
slide_level: 2
@@ -22,6 +22,16 @@ output:
includes:
in_header:
- defs-all.sty
pandoc_args: [ "--csl", "chicago-author-date.csl" ]
revealjs::revealjs_presentation:
slide_level: 2
incremental: true
transition: none
fig_caption: true
self_contained: false
reveal_plugins: ["notes","search","zoom","chalkboard"]
pandoc_args: [ "--csl", "chicago-author-date.csl" ]
css: icpsr-revealjs.css
---

<!-- Make this document using library(rmarkdown); render("day11-MVMatching.Rmd") -->
@@ -37,16 +47,18 @@ require(knitr)
## This plus size="\\scriptsize" from https://stackoverflow.com/questions/26372138/beamer-presentation-rstudio-change-font-size-for-chunk
knitr::knit_hooks$set(mysize = function(before, options, envir) {
if (before)
return(options$size)
if (before){
return(options$size)
} else {
return("\\normalsize")}
})
knit_hooks$set(plotdefault = function(before, options, envir) {
if (before) par(mar = c(3, 3, .1, .1),oma=rep(0,4),mgp=c(1.5,.5,0))
})
opts_chunk$set(
tidy=FALSE, # display code as typed
  tidy='styler', # restyle code with the styler package
echo=TRUE,
results='markup',
strip.white=TRUE,
@@ -86,16 +98,62 @@ library(chemometrics) ## for drawMahal
library(mvtnorm)
```

<!--- For HTML Only --->
`r if (!knitr:::is_latex_output()) '
$\\newcommand{\\Pdistsym}{P}$
$\\newcommand{\\Pdistsymn}{P_n}$
$\\newcommand{\\Qdistsym}{Q}$
$\\newcommand{\\Qdistsymn}{Q_n}$
$\\newcommand{\\Qdistsymni}{Q_{n_i}}$
$\\newcommand{\\Qdistsymt}{Q[t]}$
$\\newcommand{\\dQdP}{\\ensuremath{\\frac{dQ}{dP}}}$
$\\newcommand{\\dQdPn}{\\ensuremath{\\frac{dQ_{n}}{dP_{n}}}}$
$\\newcommand{\\EE}{\\ensuremath{\\mathbf{E}}}$
$\\newcommand{\\EEp}{\\ensuremath{\\mathbf{E}_{P}}}$
$\\newcommand{\\EEpn}{\\ensuremath{\\mathbf{E}_{P_{n}}}}$
$\\newcommand{\\EEq}{\\ensuremath{\\mathbf{E}_{Q}}}$
$\\newcommand{\\EEqn}{\\ensuremath{\\mathbf{E}_{Q_{n}}}}$
$\\newcommand{\\EEqni}{\\ensuremath{\\mathbf{E}_{Q_{n[i]}}}}$
$\\newcommand{\\EEqt}{\\ensuremath{\\mathbf{E}_{Q[t]}}}$
$\\newcommand{\\PP}{\\ensuremath{\\mathbf{Pr}}}$
$\\newcommand{\\PPp}{\\ensuremath{\\mathbf{Pr}_{P}}}$
$\\newcommand{\\PPpn}{\\ensuremath{\\mathbf{Pr}_{P_{n}}}}$
$\\newcommand{\\PPq}{\\ensuremath{\\mathbf{Pr}_{Q}}}$
$\\newcommand{\\PPqn}{\\ensuremath{\\mathbf{Pr}_{Q_{n}}}}$
$\\newcommand{\\PPqt}{\\ensuremath{\\mathbf{Pr}_{Q[t]}}}$
$\\newcommand{\\var}{\\ensuremath{\\mathbf{V}}}$
$\\newcommand{\\varp}{\\ensuremath{\\mathbf{V}_{P}}}$
$\\newcommand{\\varpn}{\\ensuremath{\\mathbf{V}_{P_{n}}}}$
$\\newcommand{\\varq}{\\ensuremath{\\mathbf{V}_{Q}}}$
$\\newcommand{\\cov}{\\ensuremath{\\mathbf{Cov}}}$
$\\newcommand{\\covp}{\\ensuremath{\\mathbf{Cov}_{P}}}$
$\\newcommand{\\covpn}{\\ensuremath{\\mathbf{Cov}_{P_{n}}}}$
$\\newcommand{\\covq}{\\ensuremath{\\mathbf{Cov}_{Q}}}$
$\\newcommand{\\hatvar}{\\ensuremath{\\widehat{\\mathrm{Var}}}}$
$\\newcommand{\\hatcov}{\\ensuremath{\\widehat{\\mathrm{Cov}}}}$
$\\newcommand{\\sehat}{\\ensuremath{\\widehat{\\mathrm{se}}}}$
$\\newcommand{\\combdiff}[1]{\\ensuremath{\\Delta_{{z}}[#1]}}$
$\\newcommand{\\Combdiff}[1]{\\ensuremath{\\Delta_{{Z}}[#1]}}$
$\\newcommand{\\psvec}{\\ensuremath{\\varphi}}$
$\\newcommand{\\psvecgc}{\\ensuremath{\\tilde{\\varphi}}}$
$\\newcommand{\\atob}[2]{\\ensuremath{#1\\!\\! :\\!\\! #2}}$
$\\newcommand{\\stratA}{\\ensuremath{\\mathbf{S}}}$
$\\newcommand{\\stratAnumstrat}{\\ensuremath{S}}$
$\\newcommand{\\sAsi}{\\ensuremath{s}}$
$\\newcommand{\\permsd}{\\ensuremath{\\sigma_{\\Pdistsym}}}$
$\\newcommand{\\dZ}[1]{\\ensuremath{d_{Z}[{#1}]}}$
$\\newcommand{\\tz}[1]{\\ensuremath{t_{{z}}[#1]}}$
$\\newcommand{\\tZ}[1]{\\ensuremath{t_{{Z}}[#1]}} $
'`

## Today

\begin{enumerate}
\item Agenda: Need for adjustment $\rightarrow$ "fair comparison" $\rightarrow$
1. Agenda: Need for adjustment $\rightarrow$ "fair comparison" $\rightarrow$
stratification $\rightarrow$ Evaluation/Assessment of the stratification;
How to do this with one variable. How to do this with more than one
variable.
\item Reading for tomorrow and next week: DOS 8--9, 13 and \cite[\S~9.5]{gelman2006dau}, and \cite{hans04} \cite{ho:etal:07}
\item Questions arising from the reading or assignments or life?
\end{enumerate}
2. Reading for tomorrow and next week: DOS 8--9, 13 and \cite[\S~9.5]{gelman2006dau}, and \cite{hans:04} \cite{ho:etal:07}
3. Questions arising from the reading or assignments or life?

# But first, review:

@@ -107,32 +165,29 @@
inference about causal effects).
  - The testing-based approach (Fisher, Rosenbaum)
  - The estimation-based approach (Neyman, Robins, many others)
  - Skipped: The prediction-based approach (Bayes, Rubin)
  - Skipped: The prediction-based approach (Bayes, Rubin); estimation of structural models (Pearl)
- Causal inference from simple randomized experiments.
- Causal inference from randomized experiments with not-randomized doses
(i.e. Instrumental Variables approaches to causal inference)
- Briefly: How to know that our statistical procedures are doing what we want
them to do: The operating characteristics of statistical procedures for
  estimation (bias, consistency) and testing (coverage / false-positive error
  rate, power).
- Making the case for adequate adjustment in observational studies
- The difficulties of making this case using the linear model
- The potential for making this case for one or more variables using
stratification.
- Using optimal, full matching technology to make and evaluate
stratifications.
- Using the linear model
- Using stratification (via optimal, full matching) and the experimental standard

```{r echo=FALSE, cache=TRUE}
load(url("http://jakebowers.org/Data/meddat.rda"))
meddat <- mutate(meddat,
HomRate03=(HomCount2003/Pop2003)*1000,
HomRate08=(HomCount2008/Pop2008)*1000)
row.names(meddat) <- meddat$nh
```
# Matching on Many Covariates: Using Mahalanobis Distance


## Dimension reduction using the Mahalanobis Distance
## Mahalanobis Distance

The general idea is dimension reduction: by converting many columns into a single column of distances, we reduce the dataset to one dimension for the purposes of matching.

@@ -143,7 +198,7 @@ plot(meddat$nhAboveHS,meddat$nhPopD,xlim=c(-.3,.6),ylim=c(50,700))
```


## Dimension reduction using the Mahalanobis Distance
## Mahalanobis Distance

First, let's look at Euclidean distance: $\sqrt{ (x_1 - x_2)^2 + (y_1 - y_2)^2 }$

@@ -155,7 +210,7 @@ arrows(mean(X[,1]),mean(X[,2]),X["407",1],X["407",2])
text(.4,200,label=round(dist(rbind(colMeans(X),X["407",])),2))
```

## Dimension reduction using the Mahalanobis Distance
## Mahalanobis Distance

Problems with the Euclidean distance ($\sqrt{ (x_1 - x_2)^2 + (y_1 - y_2)^2
}$): it over- or under-emphasizes variables depending on their scales, and it ignores correlation between them.
@@ -174,12 +229,12 @@ tmp
sqrt( (tmp[1,1] - tmp[2,1])^2 + (tmp[1,2]-tmp[2,2])^2 )
```

## Dimension reduction using the Mahalanobis Distance
## Mahalanobis Distance

For the scaling problem, standardize the Euclidean distance: $\sqrt{ (x_1/sd(x_1) - x_2/sd(x_2))^2 + (y_1/sd(y_1) - y_2/sd(y_2))^2 }$. (Here we also center the variables, since we only care about distances.)

```{r echo=FALSE}
Xsd <- scale(X)
apply(Xsd,2,sd)
zapsmall(apply(Xsd,2,mean))
```
@@ -193,7 +248,7 @@ text(2,-1.2,label=round(dist(rbind(colMeans(Xsd),Xsd["407",])),2))
```


## Dimension reduction using the Mahalanobis distance
## Mahalanobis distance



@@ -211,7 +266,7 @@ row.names(newX) <- row.names(X)
cor(newX)
```

```{r echo=FALSE, out.width=".5\\textwidth"}
```{r echo=FALSE, out.width=".4\\textwidth"}
mhnew <- mahalanobis(newX,center=colMeans(newX),cov=cov(newX))
drawMahal(newX,center=colMeans(newX),covariance=cov(newX),
quantile = c(0.975, 0.75, 0.5, 0.25))
@@ -228,9 +283,9 @@ text(newX[newpts,1]-.02,newX[newpts,2]-10,
## Dimension reduction using the Mahalanobis distance

The contour lines show points with the same
Mahalanobis distance and the numbers are Euclidean distance.
Mahalanobis distance, while the numbers are Euclidean distances. Notice that the point with Euclidean distance 161 is, in Mahalanobis terms, farther away than the point labeled 250.

```{r echo=FALSE, out.width=".8\\textwidth"}
```{r echo=FALSE, results="hide", out.width=".75\\textwidth"}
mhnew <- mahalanobis(newX,center=colMeans(newX),cov=cov(newX))
drawMahal(newX,center=colMeans(newX),covariance=cov(newX),
quantile = c(0.975, 0.75, 0.5, 0.25))
@@ -434,7 +489,7 @@ plot(xb4)

## Calipers

The optmatch package allows calipers (which disallow certain pairs from being matched).^[You can implement penalties by hand.] Here, for example, we disallow comparisons which differ by more than 2 standard deviations on the propensity score.
The optmatch package allows calipers (which forbid certain pairs from being matched).^[You can implement penalties by hand.] Here, for example, we forbid comparisons that differ by more than 2 standard deviations on the propensity score.

```{r}
quantile(as.vector(psdist),seq(0,1,.1))
Expand All @@ -443,7 +498,7 @@ as.matrix(psdistCal)[5:10,5:10]
```
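The caliper idea above — ruling out any pair whose distance exceeds a chosen threshold — can be sketched independently of optmatch. This is a hypothetical helper in Python with made-up distances, not the optmatch implementation; it just shows the mechanic of setting over-threshold distances to infinity so the matcher never uses them:

```python
import math

def apply_caliper(distances, width):
    """Return a copy of a pairwise-distance dict in which any pair whose
    distance exceeds `width` is set to infinity, i.e. forbidden as a match."""
    return {pair: (d if d <= width else math.inf)
            for pair, d in distances.items()}

# Hypothetical (treated, control) propensity-score distances.
psdist = {("t1", "c1"): 0.4, ("t1", "c2"): 2.7, ("t2", "c1"): 1.9}

# A caliper of 2 sd on the score forbids the ("t1", "c2") comparison.
ok = apply_caliper(psdist, width=2)
```

In optmatch itself this is what adding `caliper(psdist, 2)` to the distance accomplishes before `fullmatch` runs.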
## Calipers

The optmatch package allows calipers (which disallow certain pairs from being matched).^[You can implement penalties by hand.] Here, for example, we disallow comparisons which differ by more than 2 standard deviations on the propensity score.
The optmatch package allows calipers (which forbid certain pairs from being matched).^[You can implement penalties by hand.] Here, for example, we forbid comparisons which differ by more than 2 standard deviations on the propensity score.

```{r}
fmCal1 <- fullmatch(psdist+caliper(psdist,2),data=meddat,tol=.00001)
@@ -551,8 +606,8 @@ strata or sets or selecting subsets.
- The work on cardinality matching and fine balance
<http://jrzubizarreta.com/>
<https://cran.rstudio.com/web/packages/designmatch/> )
- The work on speedier approximate full matching <http://fredriksavje.com/>
<https://github.com/fsavje/quickmatch>
- The work on speedier approximate full matching with more data <http://fredriksavje.com/>
<https://github.com/fsavje/quickmatch>, <https://cran.r-project.org/web/packages/rcbalance/index.html>.
- The work on using genetic algorithms to (1) find approximate strata
with-replacement <http://sekhon.berkeley.edu/matching/> and (2) to find
an approximation to a completely randomized study (i.e. best subset
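The slides above contrast the Euclidean, standardized Euclidean, and Mahalanobis distances. As a minimal sketch (in Python rather than the slides' R, with made-up data), here is the two-variable Mahalanobis distance computed from the sample covariance; when the two columns are uncorrelated it reduces to the standardized (centered and scaled) Euclidean distance:

```python
import math

def mahalanobis2(point, xs, ys):
    """Mahalanobis distance from `point` to the mean of two columns (xs, ys),
    using the sample covariance and the closed-form 2x2 matrix inverse."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - mx, point[1] - my
    # quadratic form (dx, dy) S^{-1} (dx, dy)^T via the 2x2 inverse
    q = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(q)

# Made-up, uncorrelated columns (sample covariance is zero):
xs = [1, 2, 3, 4, 5]   # mean 3, sample variance 2.5
ys = [2, 1, 3, 1, 2]   # mean 1.8, uncorrelated with xs
# With dy = 0 and no correlation, the distance is just dx / sd(xs).
d = mahalanobis2((5, 1.8), xs, ys)
```

This mirrors what R's `mahalanobis()` returns (squared, there) and why correlated columns bend the contour lines in the plots above.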
25 changes: 17 additions & 8 deletions day12-MatchingTools.Rmd
@@ -1,9 +1,9 @@
---
title: Matching Tools and Approaches
date: '`r format(Sys.Date(), "%B %d, %Y")`'
author: ICPSR 2018 Session 2
author: ICPSR 2019 Session 2
bibliography:
- refs.bib
- BIB/refs.bib
- BIB/master.bib
- BIB/misc.bib
fontsize: 10pt
@@ -22,6 +22,15 @@ output:
includes:
in_header:
- defs-all.sty
pandoc_args: [ "--csl", "chicago-author-date.csl" ]
revealjs::revealjs_presentation:
slide_level: 2
incremental: true
transition: none
fig_caption: true
self_contained: false
reveal_plugins: ["notes","search","zoom","chalkboard"]
pandoc_args: [ "--csl", "chicago-author-date.csl" ]
---

<!-- Make this document using library(rmarkdown); render("day12.Rmd") -->
@@ -37,8 +46,10 @@ require(knitr)
## This plus size="\\scriptsize" from https://stackoverflow.com/questions/26372138/beamer-presentation-rstudio-change-font-size-for-chunk
knitr::knit_hooks$set(mysize = function(before, options, envir) {
if (before)
if (before){
return(options$size)
} else {
return("\\normalsize")}
})
knit_hooks$set(plotdefault = function(before, options, envir) {
@@ -88,11 +99,9 @@ library(mvtnorm)

## Today

\begin{enumerate}
\item Agenda:
\item Reading for tomorrow and next week: DOS 8--9, 13 and \cite[\S~9.5]{gelman2006dau}, and \cite{hans:04} \cite{ho:etal:07}
\item Questions arising from the reading or assignments or life?
\end{enumerate}
1. Agenda:
2. Reading for tomorrow and next week: DOS 8--9, 13 and \cite[\S~9.5]{gelman2006dau}, and \cite{hans:04} \cite{ho:etal:07}
3. Questions arising from the reading or assignments or life?

# But first, review:

2 changes: 1 addition & 1 deletion day12.Rmd
@@ -1,7 +1,7 @@
---
title: The problems of covariance adjustment for bias; Simple stratification based approaches
date: '`r format(Sys.Date(), "%B %d, %Y")`'
author: ICPSR 2018 Session 2
author: ICPSR 2019 Session 2
bibliography:
- refs.bib
- BIB/master.bib
2 changes: 1 addition & 1 deletion day13-MatchingInformation.Rmd
@@ -1,7 +1,7 @@
---
title: Matching Tools and Information in a Block-Randomized Experiment
date: '`r format(Sys.Date(), "%B %d, %Y")`'
author: ICPSR 2018 Session 2
author: ICPSR 2019 Session 2
bibliography:
- refs.bib
- BIB/master.bib
2 changes: 1 addition & 1 deletion day13.Rmd
@@ -1,7 +1,7 @@
---
title: Matching on more than one covariate
date: '`r format(Sys.Date(), "%B %d, %Y")`'
author: ICPSR 2018 Session 2
author: ICPSR 2019 Session 2
bibliography:
- refs.bib
- BIB/master.bib
2 changes: 1 addition & 1 deletion day14-MatchedEstimationTesting.Rmd
@@ -1,7 +1,7 @@
---
title: Estimation and Testing for Stratified Matched Designs
date: '`r format(Sys.Date(), "%B %d, %Y")`'
author: ICPSR 2018 Session 2
author: ICPSR 2019 Session 2
bibliography:
- refs.bib
- BIB/master.bib
2 changes: 1 addition & 1 deletion day14.Rmd
@@ -1,7 +1,7 @@
---
title: Strategies for making matched designs
date: '`r format(Sys.Date(), "%B %d, %Y")`'
author: ICPSR 2018 Session 2
author: ICPSR 2019 Session 2
bibliography:
- refs.bib
- BIB/master.bib
2 changes: 1 addition & 1 deletion day15.Rmd
@@ -1,7 +1,7 @@
---
title: Information, Balance, Estimation, Testing
date: '`r format(Sys.Date(), "%B %d, %Y")`'
author: ICPSR 2018 Session 2
author: ICPSR 2019 Session 2
bibliography:
- refs.bib
- BIB/master.bib