
Feature roadmap #36

Open
13 of 22 tasks
danielnachun opened this issue Dec 15, 2023 · 3 comments

Comments

@danielnachun
Collaborator

danielnachun commented Dec 15, 2023

As our work on this package progresses, this issue can help us enumerate possible future features of the package depending on the time and interests of contributors. Some features will be needed for the manuscript submission, and others will make more sense to consider for future releases.

TWAS

Individual

  • normal prior with SuSiE
  • mr.ash
  • elastic net with glmnet
  • LASSO with glmnet
  • Bayesian alphabet from qbayes
  • Rcpp wrapper for Dirichlet process regression (manuscript, GitHub)
  • MCP and SCAD from ncvreg
  • L0Learn from L0Learn
  • BayesB and Bayesian Lasso from BGLR

Summary

  • normal prior with SuSiE
  • mr.ash
  • Bayesian alphabet from qbayes
  • Rcpp reimplementation of existing summary-based PRS-cs (see links above)
  • Rcpp wrapper for summary-based Dirichlet process regression (manuscript, GitHub)
  • lassosum for LASSO and elastic net (manuscript, GitHub)

Longer term

  • Determine if the continuous shrinkage prior in PRS-cs (manuscript, GitHub) can be extended from summary statistics to individual-level data, and implement in Rcpp.
  • explore feasibility of ncvreg, L0Learn, and BGLR for summary data - might be a lot of work for little gain if mr.ash generalizes all of these
  • Extension to genome-wide TWAS (this will be a separate manuscript) - see discussion about genome-wide extension for MR and polygenic risk scores.
  • Extend mr.ash to work with other ebnm priors - deconvolveR is most interesting because it is a smooth approximation of NPMLE instead of a scale mixture of normals

Mendelian randomization

  • Egger regression as an additional horizontal pleiotropy test to complement heterogeneity tests - only useful with enough independent instruments
  • EDIT: verify that this is not already how we are doing MR. "Omnigenic model" that incorporates all variants as instruments (this will be a separate manuscript, possibly in combination with the trans-QTL extension) - inspired by OMR (https://academic.oup.com/bib/article/22/6/bbab322/6347949), and could exploit the fact that SuSiE gives us posterior effect sizes and standard errors, unlike most other fine mapping methods.
    • How will this method handle weak instrument bias without removing variants - consider debiasing estimators like dIVW and pIVW - does OMR have this issue too?
    • Should show that SuSiE does a job of adjusting for LD comparable to LD scores (used by OMR and MRAID) and to the variant selection methods used by MR.LDP, MR-Corr2 and MR-CUE.
    • Are the heterogeneity tests and Egger regression still valid for testing for horizontal pleiotropy in the presence of so many weak instruments?
  • Extension to genome-wide analysis with trans-QTLs (this will definitely be a separate manuscript):
    • Proper handling of correlated horizontal pleiotropy (CHP) is critical. The most conservative existing approach is to just remove pleiotropic variants - other solutions are provided by cause, MRAID, MR-Corr2, MR-CUE and MRcML. See also this review, which does not include some of the more recent methods but does discuss CHP.
    • Can we estimate CHP in trans-QTLs by looking at the effect of the same variant across all tested molecular traits?
    • Existing methods for handling CHP do not seem to use empirical Bayes methods - can we use SuSiE to help us do this?
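As a rough illustration of the Egger regression and heterogeneity tests mentioned in the list above, here is a minimal pure-Python sketch. It is not pecotmr code: the function names and the inputs `bx` (variant effects on the exposure), `by` (variant effects on the outcome) and `se_by` (standard errors of `by`) are hypothetical, and the variants are assumed to be independent. It returns point estimates only; an actual test of the intercept would also need its standard error.

```python
def ivw_estimate(bx, by, se_by):
    """Inverse-variance weighted slope of by on bx, forced through the origin."""
    w = [1.0 / s ** 2 for s in se_by]
    num = sum(wi * x * y for wi, x, y in zip(w, bx, by))
    den = sum(wi * x * x for wi, x in zip(w, bx))
    return num / den


def cochran_q(bx, by, se_by):
    """Cochran's Q heterogeneity statistic around the IVW slope."""
    beta = ivw_estimate(bx, by, se_by)
    return sum((y - beta * x) ** 2 / s ** 2 for x, y, s in zip(bx, by, se_by))


def egger_regression(bx, by, se_by):
    """Weighted least squares of by on bx with an intercept.

    Variants are first oriented so that the exposure effect is non-negative.
    Returns (intercept, slope); a non-zero intercept suggests directional
    horizontal pleiotropy.
    """
    sign = [1.0 if x >= 0 else -1.0 for x in bx]
    x = [s * v for s, v in zip(sign, bx)]
    y = [s * v for s, v in zip(sign, by)]
    w = [1.0 / s ** 2 for s in se_by]
    sw = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * sxx - sx ** 2  # zero if all oriented exposure effects are equal
    slope = (sw * sxy - sx * sy) / det
    intercept = (sy - slope * sx) / sw
    return intercept, slope
```

With only a handful of instruments the intercept is poorly determined, which is consistent with the note above that Egger regression is only useful with enough independent instruments.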

Colocalization

  • Other model of colocalization (this will definitely be a separate manuscript) - can we treat gene-level colocalization as a Kullback-Leibler divergence between two multivariate normal distributions? Can we penalize this divergence for LD using the entropy of the distribution of the (top or all?) eigenvalues of the LD matrix? How does this compare to correlating PIPs?
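For reference, the Kullback-Leibler divergence between two k-dimensional multivariate normals has a closed form. The symbols below are assumptions for illustration: N_0 = N(μ_0, Σ_0) and N_1 = N(μ_1, Σ_1) would stand for the effect-size distributions of the two traits over the same k variants, and λ_1, ..., λ_k for the eigenvalues of the LD matrix.

$$
D_{\mathrm{KL}}(\mathcal{N}_0 \,\|\, \mathcal{N}_1)
= \frac{1}{2}\left[
\operatorname{tr}\!\left(\Sigma_1^{-1}\Sigma_0\right)
+ (\mu_1 - \mu_0)^{\top}\Sigma_1^{-1}(\mu_1 - \mu_0)
- k
+ \ln\frac{\det\Sigma_1}{\det\Sigma_0}
\right]
$$

One possible reading of "entropy of the distribution of eigenvalues" is the Shannon entropy of the normalized spectrum:

$$
p_i = \frac{\lambda_i}{\sum_{j=1}^{k}\lambda_j},
\qquad
H = -\sum_{i=1}^{k} p_i \ln p_i
$$

Under this reading H is near 0 when one eigenvalue dominates (strong LD) and reaches ln k when all eigenvalues are equal (no LD). Note that the divergence is asymmetric, so a symmetrized version may be preferable when neither trait is the natural reference.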

Polygenic molecular risk scores (PMRS)

  • SuSiE model is ready made for prediction of molecular trait - make it easy to do predictions from new genotype data
    • Genome-wide prediction will have the same concerns about CHP - this doesn't matter for predicting traits from PMRS, but does matter for model interpretation
    • Could also use SuSiE to predict traits from PMRS - similar idea to CTWAS, definitely a separate manuscript and probably a separate package, would want to extend to survival models.
  • mr.ash and other penalized regression methods can be used for prediction for genome-wide TWAS but not MR, because penalized regression doesn't produce valid standard errors
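To make the prediction idea above concrete, here is a minimal pure-Python sketch of predicting a molecular trait from new genotypes with a fitted SuSiE model. The function names are hypothetical and this is not the pecotmr or susieR interface. It assumes `alpha[l][j]` is the posterior inclusion probability of variant j in single effect l, and `mu[l][j]` is the corresponding conditional posterior mean effect, already expressed on the scale of the raw genotype dosages.

```python
def susie_posterior_mean(alpha, mu):
    """Posterior mean effect per variant: b[j] = sum over effects l of alpha[l][j] * mu[l][j]."""
    n_variants = len(alpha[0])
    return [sum(a[j] * m[j] for a, m in zip(alpha, mu)) for j in range(n_variants)]


def predict_trait(genotypes, coef, intercept=0.0):
    """Linear predictor for each sample (row of dosages) in `genotypes`."""
    return [intercept + sum(g * b for g, b in zip(row, coef)) for row in genotypes]
```

A usage example with two single effects and three variants:

```python
alpha = [[0.9, 0.1, 0.0], [0.0, 0.2, 0.8]]
mu = [[1.0, 0.5, 0.0], [0.0, -1.0, 2.0]]
coef = susie_posterior_mean(alpha, mu)
predicted = predict_trait([[0, 1, 2], [2, 0, 0]], coef)
```

The new genotypes must use the same variants, allele coding and column order as the training data, otherwise the coefficients do not line up.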

Interfaces with other packages

  • mvsusier/mvsusiF
    • Straightforward for integrating with TWAS and MR - we use just the posterior effect size estimates as we normally do
    • Challenging for colocalization - colocBoost is the current solution, hopefully we can figure something out here later
  • susiF
  • vignette for INTACT
  • vignette for CTWAS - currently challenging to run CTWAS

Other

  • Easy approach to adjust fine mapping to remove variants that were not tested in the GWAS but were tested in the QTL - this doesn't work for TWAS with penalized regression!
  • Vignette on imputing GWAS summary statistics (and QTL summary statistics if not using individual level QTL data) - this would ideally be tied to future efforts to improve this approach methodologically.
  • Data package for LD blocks for GWAS fine mapping
    • What about windows for QTL summary stats?
    • Could pre-computed LD windows be stored on queryable server? Alternatively, could download 1000 Genomes population as a reference, and compute LD matrix for user?
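For the option of computing the LD matrix for the user from a downloaded reference panel, a minimal pure-Python sketch is shown below. It assumes LD is measured as the Pearson correlation between genotype dosages, that `genotypes` is a list of samples with one dosage per variant, and that there is no missing data. The function name is hypothetical, and a real implementation would read the reference panel from disk and work on one window at a time.

```python
import math


def ld_matrix(genotypes):
    """Pearson correlation between variant columns of a samples-by-variants dosage table."""
    n = len(genotypes)
    p = len(genotypes[0])
    cols = [[row[j] for row in genotypes] for j in range(p)]
    means = [sum(c) / n for c in cols]
    centered = [[v - m for v in c] for c, m in zip(cols, means)]
    norms = [math.sqrt(sum(v * v for v in c)) for c in centered]
    r = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i, p):
            if norms[i] == 0.0 or norms[j] == 0.0:
                # Monomorphic variant: correlation is undefined.
                val = float("nan")
            else:
                dot = sum(a * b for a, b in zip(centered[i], centered[j]))
                val = dot / (norms[i] * norms[j])
            r[i][j] = val
            r[j][i] = val
    return r
```

Monomorphic variants are returned as NaN rather than silently set to 0, so the caller can decide whether to drop them before fine mapping.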
@gaow
Contributor

gaow commented Jan 4, 2024

 mr.ash priors

Not sure if the mr.ash package itself accepts these other models. We might have to implement them ourselves if we want them. I would suggest we stick to the originally published model.

 vignette for CTWAS - currently challenging to run CTWAS

As of now the main branch for cTWAS is broken. I've talked to people in Xin's group -- it should be possible to involve them in the xQTL project focused on delivering this part. I'll talk to Xin more formally in the following couple of weeks.

@gaow
Contributor

gaow commented Mar 10, 2024

Updates on the original post

  1. cTWAS -- my team is working actively with Xin's to refactor the package but still WIP
  2. mr_ash_rss Rcpp is roughly done: we still need to compare it against the individual-level version, add additional parameters to be consistent with it, and add some OpenMP (omp) parallelization to the loops, just like what we do for other Rcpp applications under src
  3. prs_cs Rcpp is roughly done. It is hard to compare numerically with the original implementation (MCMC by nature), so we need another pair of eyes to compare the two manually, line by line, and check that we did the right thing. I have done that myself and I think it seems fine.

@danielnachun
Collaborator Author

danielnachun commented Sep 12, 2024

I recently reread the SuSiE-inf paper (https://pubmed.ncbi.nlm.nih.gov/38036779/) and now understand that this model is really just extending SuSiE to do variance components estimation to handle stratification instead of residualizing ancestry PCs. I have some ideas on how to do this in a way that is more suitable for QTLs and could be treated as a 'pre-processing' step rather than having to modify how SuSiE itself runs. This approach could also be made to work with mr.ash or other penalized regression methods. Importantly, nothing in pecotmr would need to be modified for any of this to work, including the original implementation of SuSiE-inf. Consequently, I've cleaned up and corrected various mistakes in my original post to incorporate this knowledge along with some other topics as well.
