Different result from available results #546

rfeng2023 · 2023-03-10T19:22:23Z

I am reproducing the MASH pipeline and found that my mashr result differs from the available results (produced by Hao).
The code for this step is:

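library(mashr)
# Load the input data (random.b: effect estimates, random.s: standard errors).
dat = readRDS('/mnt/vast/hpc/csg/rf2872/Work/MASH_csg.q/output/Ast_Exc_Inh_Mic_OPC_Oli.rds')
# Fix the RNG seed, then draw at most 6000 random row indices.
set.seed(1)
random.subset = sample(1:nrow(dat$random.b), min(6000, nrow(dat$random.b)))
# Build the mash data object from the sampled rows (this overwrites the index vector).
random.subset = mash_set_data(dat$random.b[random.subset,], dat$random.s[random.subset,], alpha=1, zero_Bhat_Shat_reset = 1E3)
# Estimate the residual correlation by EM (at most 6 iterations), using the prior read from the .rds file.
vhatprior = mash_estimate_corr_em(random.subset, readRDS('/mnt/vast/hpc/csg/rf2872/Work/MASH_csg.q/MASH_6_celltypes2/Ast_Exc_Inh_Mic_OPC_Oli.EZ.prior.rds'), max_iter = 6)
vhat = vhatprior$V
saveRDS(vhat, '/mnt/vast/hpc/csg/rf2872/Work/MASH_csg.q/MASH_6_celltypes/Ast_Exc_Inh_Mic_OPC_Oli.EZ.V_mle.rds')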

Here is the result of my job,

and below is the result of Hao's job

The random subset may cause tiny differences between runs, but some of the values have reversed signs. Is that normal?


gaow · 2023-03-14T21:44:52Z

@rfeng2023 The result should be identical if you set the seed. Did you use the same container? It could make a difference if a major R version change also changed the behavior of the seed, so that set.seed(1) does not give the same draws across different versions of R or related libraries.
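
One way to check the version hypothesis above is to compare the R version and RNG settings of the two environments. The sketch below uses only base R and needs R 3.6.0 or later. It relies on the fact that R 3.6.0 changed the default `sample.kind` from "Rounding" to "Rejection", so `set.seed(1)` followed by `sample()` can select different rows on either side of that release. Whether this is what happened in the two runs here is an assumption, not something confirmed in this thread, and `n_rows` is a hypothetical stand-in for `nrow(dat$random.b)`.

```r
# Record the environment so two runs can be compared side by side.
cat(R.version.string, "\n")
print(RNGkind())  # generator kind, normal kind, sample kind

# Hypothetical row count standing in for nrow(dat$random.b).
n_rows <- 100000

# Draw with the sampler that has been the default since R 3.6.0.
RNGkind(sample.kind = "Rejection")
set.seed(1)
idx_rejection <- sample(1:n_rows, min(6000, n_rows))

# Draw with the pre-3.6.0 sampler (R warns that it is non-uniform).
suppressWarnings({
  RNGkind(sample.kind = "Rounding")
  set.seed(1)
})
idx_rounding <- sample(1:n_rows, min(6000, n_rows))

# Restore the modern default before doing anything else.
RNGkind(sample.kind = "Rejection")

# Same seed, different sampler: compare the selected rows.
cat("identical subsets:", identical(idx_rejection, idx_rounding), "\n")
cat("rows in common:", length(intersect(idx_rejection, idx_rounding)), "\n")
```

If the two environments report different `sample.kind` values or different R versions, the 6000-row subsets are not expected to match even with the same seed.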

To solve the problem: since you are worried about the small differences, particularly the sign differences, my suggestion is to rerun with a much larger random.b sample and see how it works. For example, take 4 random SNPs per gene so that you get many such SNPs.
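
The per-gene sampling suggested above could be sketched as follows in base R. The data frame `snps` and its columns `gene` and `snp` are hypothetical stand-ins: the thread does not show how genes are recorded in `dat`, so mapping the selected rows back to `dat$random.b` and `dat$random.s` is left as an assumption.

```r
set.seed(1)

# Toy table with hypothetical gene/SNP labels, one row per SNP-gene pair.
snps <- data.frame(
  gene = rep(c("geneA", "geneB", "geneC"), times = c(10, 3, 1)),
  snp  = paste0("snp", 1:14),
  stringsAsFactors = FALSE
)

snps_per_gene <- 4

# Group row indices by gene, then keep at most snps_per_gene rows from each.
rows_by_gene <- split(seq_len(nrow(snps)), snps$gene)
picked <- unlist(lapply(rows_by_gene, function(idx) {
  k <- min(snps_per_gene, length(idx))
  # sample.int on the group size avoids sample()'s special case for a
  # single number, where sample(5, 1) draws from 1:5 instead of returning 5.
  idx[sample.int(length(idx), k)]
}), use.names = FALSE)

subset_snps <- snps[sort(picked), ]
print(table(subset_snps$gene))
```

Genes with fewer than 4 SNPs contribute all of their rows, so the total grows with the number of genes rather than being capped at a fixed 6000.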

gaow · 2023-03-14T21:53:51Z

> To solve the problem: since you are worried about the small differences, particularly the sign differences, my suggestion is to rerun with a much larger random.b sample and see how it works. For example, take 4 random SNPs per gene so that you get many such SNPs.

We can hold off on this, because picking these SNPs should be built into the updated pipeline with fine-mapping CS in mind. We can revisit it at that point. But it is good (and important) that you keep documenting these problems. @rfeng2023
