Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Residual hook causes NAs #191

Open
JThomasWatson opened this issue Oct 14, 2024 · 2 comments
Open

Residual hook causes NAs #191

JThomasWatson opened this issue Oct 14, 2024 · 2 comments

Comments

@JThomasWatson
Copy link

If you have a question about how to use MAST, or how to interpret the output please submit a question to the bioconductor support
site
so that others can
benefit from the discussion. Please do not use github issues.

Otherwise
Please briefly describe your problem and what output you expect.

Please include a minimal reproducible example (AKA a reprex). If you've never heard of a reprex before, start by reading https://www.tidyverse.org/help/#reprex.

Brief description of the problem

I'd like to use MAST with sample ID as a random effect. When I include anything in the hook parameter as in section 4.2 of the MAIT tutorial, the ZlmFit object becomes full of NAs and no residuals are accessible. In the summary table created by msummary <- summary(zlmResidDE, doLRT = lrt_name, parallel = TRUE), all fold change values are NaN while everything else is NA. Below is the code I use to attempt a zlm model with residual hook.

zlm_group <- zlm(~ group + cdr_centered + age_centered + pool + sex + (1 | sample_id), 
                       sca, ebayes = FALSE, method = "glmer", hook = combined_residuals_hook,
                       parallel = TRUE, fitArgsD = list(nAGQ = 0))
      
lrt_name <- paste0("group", ident.2)
msummary <- summary(zlm_group, doLRT = lrt_name, parallel = TRUE)

And the produced ZlmFit object looks like this:

image

Where most values are NA, and the hookOut slot is NULL for each gene.

When I don't include hook, the ZlmFit object is populated with data, and the summary table builds properly, but no residuals are stored. The code used to build the model without residual hook is:

zlm_group <- zlm(~ group + cdr_centered + age_centered + pool + sex + (1 | sample_id), 
                       sca, ebayes = FALSE, method = "glmer",
                       parallel = TRUE, fitArgsD = list(nAGQ = 0))
      
lrt_name <- paste0("group", ident.2)
msummary <- summary(zlm_group, doLRT = lrt_name, parallel = TRUE)

Viewing the ZlmFit object created by that code shows this:

image

I attempted to run the residual hook function line by line to better understand what's happening, and I noticed warnings after the commands class(x@fitC) <- c("glm","lm") and class(x@fitD) <- c("bayesglm","glm","lm") that said these objects would no longer be S4 objects. I wondered if that class change was breaking a subsequent operation that expected an S4 object, so I attempted to run a modified version by cloning this repository, building and loading the package locally, and redefining combined_residuals_hook() without those two class change commands, but that doesn't appear to have solved it, as including the hook parameter still introduces NAs and doesn't store residuals. Below is my output from sessionInfo():

R version 4.4.0 (2024-04-24)
Platform: x86_64-pc-linux-gnu
Running under: Red Hat Enterprise Linux Server 7.9 (Maipo)

Matrix products: default
BLAS:   /hpc/software/spack_v20d1/spack/opt/spack/linux-rhel7-x86_64/gcc-12.3.0/r-4.4.0-aaqsqjfnzw5p5c4twtddj4sxqi23pyfw/rlib/R/lib/libRblas.so 
LAPACK: /hpc/software/spack_v20d1/spack/opt/spack/linux-rhel7-x86_64/gcc-12.3.0/r-4.4.0-aaqsqjfnzw5p5c4twtddj4sxqi23pyfw/rlib/R/lib/libRlapack.so;  LAPACK version 3.12.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: America/Chicago
tzcode source: system (glibc)

attached base packages:
 [1] grid      stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] NMF_0.28                    cluster_2.1.6               rngtools_1.5.2              registry_0.5-1             
 [5] MAST_1.27.1                 SingleCellExperiment_1.24.0 gridExtra_2.3               fastDummies_1.7.4          
 [9] ppcor_1.1                   MASS_7.3-61                 DescTools_0.99.57           knitr_1.48                 
[13] aricode_1.0.3               e1071_1.7-16                car_3.1-3                   carData_3.0-5              
[17] viridis_0.6.5               viridisLite_0.4.2           ggdark_0.2.1                ggsci_3.2.0                
[21] cowplot_1.1.3               ggthemes_5.1.0              plotly_4.10.4               rsvd_1.0.5                 
[25] ggrepel_0.9.6               ComplexHeatmap_2.21.1       SeuratData_0.2.2.9001       sparseMatrixStats_1.16.0   
[29] edgeR_4.2.1                 limma_3.60.6                DESeq2_1.44.0               SummarizedExperiment_1.34.0
[33] Biobase_2.64.0              MatrixGenerics_1.16.0       matrixStats_1.4.1           GenomicRanges_1.56.1       
[37] GenomeInfoDb_1.40.1         IRanges_2.38.1              S4Vectors_0.42.1            BiocGenerics_0.50.0        
[41] DElegate_1.2.1              UpSetR_1.4.0                doMC_1.3.8                  iterators_1.0.14           
[45] foreach_1.5.2               rlist_0.4.6.2               Seurat_5.1.0                SeuratObject_5.0.2         
[49] sp_2.1-4                    lubridate_1.9.3             forcats_1.0.0               stringr_1.5.1              
[53] dplyr_1.1.4                 purrr_1.0.2                 readr_2.1.5                 tidyr_1.3.1                
[57] tibble_3.2.1                ggplot2_3.5.1               tidyverse_2.0.0             data.table_1.16.0          
[61] plyr_1.8.9                 

loaded via a namespace (and not attached):
  [1] spatstat.sparse_3.1-0   httr_1.4.7              RColorBrewer_1.1-3      doParallel_1.0.17       tools_4.4.0            
  [6] sctransform_0.4.1       utf8_1.2.4              R6_2.5.1                lazyeval_0.2.2          uwot_0.2.2             
 [11] GetoptLong_1.0.5        withr_3.0.1             prettyunits_1.2.0       progressr_0.14.0        cli_3.6.3              
 [16] spatstat.explore_3.3-2  labeling_0.4.3          mvtnorm_1.3-1           spatstat.data_3.1-2     proxy_0.4-27           
 [21] ggridges_0.5.6          pbapply_1.7-2           parallelly_1.38.0       readxl_1.4.3            rstudioapi_0.16.0      
 [26] generics_0.1.3          shape_1.4.6.1           ica_1.0-3               spatstat.random_3.3-2   Matrix_1.7-0           
 [31] fansi_1.0.6             abind_1.4-8             lifecycle_1.0.4         SparseArray_1.4.8       Rtsne_0.17             
 [36] promises_1.3.0          crayon_1.5.3            miniUI_0.1.1.1          lattice_0.22-6          pillar_1.9.0           
 [41] rjson_0.2.23            boot_1.3-31             gld_2.6.6               future.apply_1.11.2     codetools_0.2-20       
 [46] leiden_0.4.3.1          glue_1.8.0              spatstat.univar_3.0-1   vctrs_0.6.5             png_0.1-8              
 [51] spam_2.11-0             cellranger_1.1.0        gtable_0.3.5            xfun_0.48               S4Arrays_1.4.1         
 [56] mime_0.12               survival_3.7-0          statmod_1.5.0           fitdistrplus_1.2-1      ROCR_1.0-11            
 [61] nlme_3.1-166            progress_1.2.3          RcppAnnoy_0.0.22        irlba_2.3.5.1           KernSmooth_2.23-24     
 [66] colorspace_2.1-1        Exact_3.3               tidyselect_1.2.1        compiler_4.4.0          expm_1.0-0             
 [71] DelayedArray_0.30.1     scales_1.3.0            lmtest_0.9-40           rappdirs_0.3.3          digest_0.6.37          
 [76] goftest_1.2-3           spatstat.utils_3.1-0    minqa_1.2.8             XVector_0.44.0          htmltools_0.5.8.1      
 [81] pkgconfig_2.0.3         lme4_1.1-35.5           fastmap_1.2.0           rlang_1.1.4             GlobalOptions_0.1.2    
 [86] htmlwidgets_1.6.4       UCSC.utils_1.0.0        shiny_1.9.1             farver_2.1.2            zoo_1.8-12             
 [91] jsonlite_1.8.9          BiocParallel_1.38.0     magrittr_2.0.3          Formula_1.2-5           GenomeInfoDbData_1.2.12
 [96] dotCall64_1.2           patchwork_1.3.0         munsell_0.5.1           Rcpp_1.0.13             reticulate_1.39.0      
[101] stringi_1.8.4           rootSolve_1.8.2.4       zlibbioc_1.50.0         listenv_0.9.1           lmom_3.2               
[106] deldir_2.0-4            splines_4.4.0           tensor_1.5              hms_1.1.3               circlize_0.4.16        
[111] locfit_1.5-9.10         igraph_2.0.3            spatstat.geom_3.3-3     RcppHNSW_0.6.0          reshape2_1.4.4         
[116] BiocManager_1.30.25     nloptr_2.1.1            tzdb_0.4.0              httpuv_1.6.15           RANN_2.6.2             
[121] polyclip_1.10-7         future_1.34.0           clue_0.3-65             scattermore_1.2         gridBase_0.4-7         
[126] xtable_1.8-4            RSpectra_0.16-2         later_1.3.2             class_7.3-22            timechange_0.3.0       
[131] globals_0.16.3
@JThomasWatson
Copy link
Author

@gfinak @amcdavid Have you seen this behavior before?

@amcdavid
Copy link
Member

Looking at the source of

combined_residuals_hook<- function(x){
it looks like it wasn't written to work with lme4 objects. You might be able to adapt that code if you replace various calls to coef with calls to fixef.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants