Skip to content

Commit

Permalink
add SMMAT workflow
Browse files Browse the repository at this point in the history
  • Loading branch information
Liu committed May 27, 2021
1 parent 0464017 commit a3c1281
Showing 1 changed file with 109 additions and 1 deletion.
110 changes: 109 additions & 1 deletion GWAS/LMM.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -1334,7 +1334,7 @@
"kernel": "SoS"
},
"source": [
"### Step 3: Perform single variant score test"
"### Step 3: Perform single variant score test for common variants"
]
},
{
Expand Down Expand Up @@ -1367,6 +1367,114 @@
" gzip ${cwd}/${_input:bn}.${phenoFile:bn}.gmmat.score.txt"
]
},
{
"cell_type": "markdown",
"metadata": {
"kernel": "SoS"
},
"source": [
"## SMMAT workflow implementation\n",
"Documentation can be found [here](https://github.com/hanchenphd/GMMAT/blob/master/inst/doc/GMMAT.pdf)"
]
},
{
"cell_type": "markdown",
"metadata": {
"kernel": "SoS"
},
"source": [
"### Step 1: Creation of the GRM "
]
},
{
"cell_type": "markdown",
"metadata": {
"kernel": "SoS"
},
"source": [
"### Step 2: Fitting the null"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"kernel": "SoS"
},
"outputs": [],
"source": [
"# Run SMMAT step1: fit a GLMM with covariate adjustment and random effects to account for population structure and family or cryptic relatedness\n",
"[SMMAT_null]\n",
"#use the standardized GRM file generated in gemma steps\n",
"parameter: grmFile = path(f'{cwd:d}/{bfile:bn}.sXX.txt')\n",
"#a colum in the data frame data, indicaing e id of samples\n",
"parameter: phenoCol = 'AD'\n",
"parameter: idCol = 'IID'\n",
"input: phenoFile, f'{bfile:n}.fam', grmFile\n",
"output: f'{cwd}/{bfile:bn}.{phenoFile:bn}.GMMAT.rds'\n",
"R: container='/mnt/mfs/statgen/containers/lmm.sif',expand='${ }', stderr = f'{_output[0]:n}.stderr', stdout = f'{_output[0]:n}.stdout'\n",
" library('GMMAT')\n",
" library('data.table')\n",
" library('dplyr')\n",
" \n",
" #Prepare phenotype and covariates in an R data frame\n",
" pheno = fread(${_input[0]:r}, header = TRUE)\n",
" #Pheno are currently coded as 1, 2, and -9, need to recoded as 0,1,and NA\n",
" pheno$${phenoCol}= recode(pheno$${phenoCol}, `2` = 1, `1` = 0, `-9` = NULL)\n",
" #Prepare GRM file in a R data frame\n",
" GRM = as.matrix(fread(${_input[2]:r}, header = FALSE))\n",
" #Extract IIDs from .fam file\n",
" id_vector = fread(${_input[1]:r}, header = FALSE)[,2]\n",
" #make the GRM colnames and rownames using the actual IID\n",
" colnames(GRM) = t(id_vector)\n",
" rownames(GRM) = t(id_vector)\n",
"\n",
" fit_null = glmmkin(${phenoCol} ~ ${\"+\".join(covarCol + qCovarCol)} , \n",
" data = pheno, \n",
" kins = GRM, \n",
" id = '${idCol}',\n",
" family = binomial(link = \"logit\"))\n",
" saveRDS(fit_null, ${_output:r})"
]
},
{
"cell_type": "markdown",
"metadata": {
"kernel": "SoS"
},
"source": [
"### Step 3: Perform variant set burden test for rare variants"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"kernel": "SoS"
},
"outputs": [],
"source": [
"# run SMMAT step 2: variant set burden tests (based on the null model built above)\n",
"[SMMAT]\n",
"input: f'{bfile:n}.bed', f'{bfile:n}.fam',f'{bfile:n}.bim' group_by = 1, groupFile\n",
"depends: f'{cwd}/{bfile:bn}.{phenoFile:bn}.GMMAT.rds'\n",
"output: f'{cwd}/{_input[0]:bn}.{phenoFile:bn}.smmat.burden.txt.gz'\n",
"R: container='/mnt/mfs/statgen/containers/lmm.sif', expand='${ }', stderr = f'{_output[0]:n}.stderr', stdout = f'{_output[0]:n}.stdout'\n",
" library('GMMAT')\n",
" library('SNPRelate')\n",
" snpgdsBED2GDS(${_input[0]}, ${_input[1]}, ${_input[2]}, f'{cwd}/{_input[0]:bn}.gds')\n",
" null_model = readRDS(${_depends:r})\n",
" burden.test = SMMAT(null_model,\n",
" group.file = ${_input[4]},\n",
" MAF.range = c(1e-7, ${maf_max_filter}),\n",
" miss.cutoff = ${geno_filter},\n",
" method = 'davies',\n",
" tests = 'B')\n",
" write.table(burden.test,f'{cwd}/{_input[0]:bn}.{phenoFile:bn}.smmat.burden.txt', sep = '\\t', quote = F, col.names = T, row.names = F)\n",
" bash: container=container_lmm,expand='${ }', stderr = f'{_output[0]:n}.stderr', stdout = f'{_output[0]:n}.stdout'\n",
" gzip ${cwd}/${_input[0]:bn}.${phenoFile:bn}.smmat.score.txt"
]
},
{
"cell_type": "markdown",
"metadata": {
Expand Down

0 comments on commit a3c1281

Please sign in to comment.