Feature roadmap #36

danielnachun · 2023-12-15T06:12:55Z

As our work on this package progresses, this issue can help us enumerate possible future features of the package, depending on the time and interests of contributors. Some features will be needed for the manuscript submission, and others will make more sense to consider for future releases.

TWAS

Individual

Summary

normal prior with SuSiE
mr.ash
Bayesian alphabet from qbayes
Rcpp reimplementation of existing summary-based PRS-cs (see links above)
Rcpp wrapper for summary-based Dirichlet process regression (manuscript, GitHub)
lassosum for LASSO and elastic net (manuscript, GitHub)

Longer term

Determine if the continuous shrinkage prior in PRS-cs (manuscript, GitHub) can be extended from summary statistics to individual-level data, and implement in Rcpp.
explore feasibility of ncvreg, L0Learn, and BGLR for summary data - might be a lot of work for little gain if mr.ash generalizes all of these
Extension to genome-wide TWAS (this will be a separate manuscript) - see discussion about genome-wide extension for MR and polygenic risk scores.
Extend mr.ash to work with other ebnm priors - deconvolveR is most interesting because it is a smooth approximation of NPMLE instead of a scale mixture of normals

Mendelian randomization

Egger regression as an additional horizontal pleiotropy test to complement heterogeneity tests - only useful with enough independent instruments
EDIT: verify that this is not already how we are doing MR. "Omnigenic model" that incorporates all variants as instruments (this will be a separate manuscript, possibly in combination with the trans-QTL extension) - inspired by [OMR](https://academic.oup.com/bib/article/22/6/bbab322/6347949), and could exploit the fact that SuSiE gives us posterior effect sizes and standard errors, unlike most other fine mapping methods.
- ~~How will this method handle weak instrument bias without removing variants - consider debiasing estimators like dIVW and pIVW - does OMR have this issue too?~~
- ~~Should show that SuSiE does a comparable job of adjusting for LD as LD scores (used by OMR and MRAID) and the variant selection methods used by MR.LDP, MR-Corr2 and MR-CUE.~~
- ~~Are the heterogeneity tests and Egger regression still valid for testing for horizontal pleiotropy in the presence of so many weak instruments?~~
Extension to genome-wide analysis with trans-QTLs (this will definitely be a separate manuscript):
- Proper handling of correlated horizontal pleiotropy (CHP) is critical. The most conservative existing approach is to just remove pleiotropic variants - other solutions are provided by cause, MRAID, MR-Corr2, MR-CUE and MRcML. See also this review, which does not include some of the more recent methods but does discuss CHP.
- Can we estimate CHP in trans-QTLs by looking at the effect of the same variant across all tested molecular traits?
- Existing methods for handling CHP do not seem to use empirical Bayes methods - can we use SuSiE to help us do this?
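
For the Egger regression item above, a minimal sketch of what the test involves may help: regress the variant-outcome effects on the variant-exposure effects with an intercept, weighting by the inverse variance of the outcome effects. A non-zero intercept suggests directional horizontal pleiotropy. The function name `mr_egger` and its inputs are hypothetical and not part of pecotmr. Flooring the residual dispersion at 1 is a convention assumed here, and the instruments are assumed to be independent.

```python
import math


def mr_egger(beta_exposure, beta_outcome, se_outcome):
    """Weighted least squares of outcome effects on exposure effects, with intercept.

    Hypothetical helper for illustration only; assumes independent instruments.
    """
    n = len(beta_exposure)
    if n < 3:
        raise ValueError("Egger regression needs at least 3 instruments")

    # Orient every variant so that its exposure effect is non-negative.
    signs = [1.0 if bx >= 0 else -1.0 for bx in beta_exposure]
    x = [s * bx for s, bx in zip(signs, beta_exposure)]
    y = [s * by for s, by in zip(signs, beta_outcome)]
    w = [1.0 / se ** 2 for se in se_outcome]  # inverse-variance weights

    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))

    slope = sxy / sxx                  # causal effect estimate
    intercept = y_bar - slope * x_bar  # average directional pleiotropy

    # Residual dispersion, floored at 1 (an assumed convention).
    rss = sum(wi * (yi - intercept - slope * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    phi = max(1.0, rss / (n - 2))

    return {
        "slope": slope,
        "slope_se": math.sqrt(phi / sxx),
        "intercept": intercept,
        "intercept_se": math.sqrt(phi * (1.0 / sw + x_bar ** 2 / sxx)),
    }
```

The intercept divided by its standard error gives the pleiotropy test statistic. As noted above, this is only informative when there are enough independent instruments.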

Colocalization

Other model of colocalization (this will definitely be a separate manuscript) - can we treat gene-level colocalization as a Kullback-Leibler divergence between two multivariate normal distributions? Can we penalize this divergence for LD using the entropy of the distribution of the (top or all?) eigenvalues of the LD matrix? How does this compare to correlating PIPs?
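
For reference, the Kullback-Leibler divergence between two $k$-dimensional multivariate normals has a closed form. The symbols below are assumptions for illustration: $\mathcal{N}_0 = \mathcal{N}(\mu_0, \Sigma_0)$ and $\mathcal{N}_1 = \mathcal{N}(\mu_1, \Sigma_1)$ could stand for, say, the effect size distributions of the two traits over the $k$ variants in a region, but how those distributions would be defined is left open here.

$$
D_{\mathrm{KL}}(\mathcal{N}_0 \,\|\, \mathcal{N}_1) = \frac{1}{2}\left[ \operatorname{tr}\left(\Sigma_1^{-1}\Sigma_0\right) + (\mu_1 - \mu_0)^{\top} \Sigma_1^{-1} (\mu_1 - \mu_0) - k + \ln\frac{\det \Sigma_1}{\det \Sigma_0} \right]
$$

The divergence is not symmetric in its two arguments, so a gene-level score would need either a chosen direction or a symmetrized version, and both covariance matrices must be invertible.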

Polygenic molecular risk scores (PMRS)

SuSiE model is ready made for prediction of molecular trait - make it easy to do predictions from new genotype data
- Genome-wide prediction will have the same concerns about CHP - this doesn't matter for predicting traits from PMRS, but does matter for model interpretation
- Could also use SuSiE to predict traits from PMRS - similar idea to CTWAS, definitely a separate manuscript and probably a separate package, would want to extend to survival models.
mr.ash and other penalized regression methods can be used for prediction for genome-wide TWAS but not MR, because penalized regression doesn't produce valid standard errors
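
A minimal sketch of the prediction step described above, assuming a fitted SuSiE model is available as two L x p tables: `alpha` (posterior inclusion probabilities per single effect) and `mu` (posterior mean effect conditional on inclusion). The posterior mean effect of variant j is the sum over single effects of `alpha[l][j] * mu[l][j]`. The function names are hypothetical, and the sketch assumes that the effects are on the same scale as the new genotypes and that alleles are already aligned.

```python
def susie_posterior_mean(alpha, mu):
    """Posterior mean effect per variant: sum over single effects of alpha * mu."""
    n_variants = len(alpha[0])
    return [
        sum(alpha_l[j] * mu_l[j] for alpha_l, mu_l in zip(alpha, mu))
        for j in range(n_variants)
    ]


def predict_molecular_trait(genotypes, coef, intercept=0.0):
    """Predict the trait for each sample (row) of a new genotype dosage matrix."""
    return [
        intercept + sum(g * b for g, b in zip(sample, coef))
        for sample in genotypes
    ]


# Toy example: two single effects over three variants.
alpha = [[1.0, 0.0, 0.0], [0.0, 0.5, 0.5]]
mu = [[0.2, 0.0, 0.0], [0.0, 0.4, -0.4]]
coef = susie_posterior_mean(alpha, mu)
predicted = predict_molecular_trait([[0, 1, 2], [2, 0, 0]], coef)
```

If the model was fitted on standardized genotypes, the coefficients would need to be rescaled before being applied to raw dosages.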

Interfaces with other packages

mvsusier/mvsusiF
- Straightforward to integrate with TWAS and MR - we use just the posterior effect size estimates as we normally do
- Challenging for colocalization - colocBoost is the current solution, hopefully we can figure something out here later
susiF
vignette for INTACT
vignette for CTWAS - currently challenging to run CTWAS

Other

Easy approach to adjust fine mapping to remove variants that were not tested in the GWAS but were tested in the QTL - this doesn't work for TWAS with penalized regression!
Vignette on imputing GWAS summary statistics (and QTL summary statistics if not using individual level QTL data) - this would ideally be tied to future efforts to improve this approach methodologically.
Data package for LD blocks for GWAS fine mapping
- What about windows for QTL summary stats?
- Could pre-computed LD windows be stored on queryable server? Alternatively, could download 1000 Genomes population as a reference, and compute LD matrix for user?
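
As a rough sketch of the "compute LD matrix for user" option, the LD matrix for a window can be taken as the Pearson correlation between variant dosage columns of a reference panel. The function name `ld_matrix` and the input layout (one row per sample, one column per variant) are assumptions for illustration. A real implementation would also need to handle missing genotypes, allele alignment and panels too large to hold in memory.

```python
import math


def ld_matrix(genotypes):
    """Pairwise Pearson correlation (r) between variant columns of a dosage matrix."""
    n_samples = len(genotypes)
    n_variants = len(genotypes[0])

    # Center each variant column on its mean dosage.
    centered = []
    for j in range(n_variants):
        column = [row[j] for row in genotypes]
        mean = sum(column) / n_samples
        centered.append([value - mean for value in column])
    norms = [math.sqrt(sum(v * v for v in column)) for column in centered]

    r = [[0.0] * n_variants for _ in range(n_variants)]
    for i in range(n_variants):
        for j in range(i, n_variants):
            if norms[i] == 0.0 or norms[j] == 0.0:
                value = float("nan")  # monomorphic variant: correlation undefined
            else:
                dot = sum(a * b for a, b in zip(centered[i], centered[j]))
                value = dot / (norms[i] * norms[j])
            r[i][j] = r[j][i] = value
    return r


# Toy panel: 4 samples x 3 variants.
toy_ld = ld_matrix([[0, 0, 2], [1, 1, 1], [2, 2, 0], [1, 1, 1]])
```

In the toy panel the first two variants are identical and the third is their mirror image, so the off-diagonal correlations are 1 and -1.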


gaow · 2024-01-04T03:22:21Z

mr.ash priors

Not sure if the mr.ash package itself accepts these other models. We might have to implement them ourselves if we want them. I would suggest we stick to the originally published version.

vignette for CTWAS - currently challenging to run CTWAS

As of now the main branch for cTWAS is broken. I've talked to people in Xin's group -- it should be possible to involve them in the xQTL project focused on delivering this part. I'll talk to Xin more formally in the following couple of weeks.

gaow · 2024-03-10T03:40:01Z

Updates on the original post

cTWAS -- my team is working actively with Xin's to refactor the package but still WIP
mr_ash_rss Rcpp is roughly done: we still need to compare it against the individual-level version, add additional parameters to be consistent with it, and also add OpenMP to the loops to parallelize them, just like what we do for other Rcpp applications under src
prs_cs Rcpp is roughly done. It is hard to compare numerically with the original implementation (MCMC by nature), but we need another pair of eyes to compare the two manually, line by line, to check that we did the right thing. I have done that myself and I think it seems fine.

danielnachun · 2024-09-12T03:51:05Z

I recently reread the SuSiE-inf paper (https://pubmed.ncbi.nlm.nih.gov/38036779/) and now understand that this model is really just extending SuSiE to do variance components estimation to handle stratification instead of residualizing ancestry PCs. I have some ideas on how to do this in a way that is more suitable for QTLs and could be treated as a 'pre-processing' step rather than having to modify how SuSiE itself runs. This approach could also be made to work with mr.ash or other penalized regression methods. Importantly, nothing in pecotmr would need to be modified for any of this to work, including the original implementation of SuSiE-inf. Consequently, I've cleaned up and corrected various mistakes in my original post to incorporate this knowledge along with some other topics as well.
