Mash fix #263

rfeng2023 · 2024-10-02T17:53:27Z

No description provided.

gaow · 2024-10-02T19:23:49Z

R/mash_wrapper.R

+ # remove the data driven U in U list, and recalculate w
+ U[phenos_out] <- NULL
+ w <- w[!names(w) %in% phenos_out]
+ sum_w = sum(w)


sum_w has to be computed before removing any elements from w

ie, before the for loop where w gets changed.

I think this is related to the below comments, if we removed more context by doing your suggestion the sum of w won't be 1

sum of w don't have to be 1. but it is important to maintain the same summed value as original in your output w. Suppose here is the input:

U = [U1, U2,U3] w = [0.1, 0.2,0.1]

also suppose U3 is removed, so the updated is

U = [U1, U2]

define sum_w = 0.1 + 0.2 + 0.1 = 0.4. In this case you can understand it as the remaining 0.6 are the NULL

When U3 is removed, the sum of the remaining weights becomes:

sum_w_new = 0.1 + 0.2 = 0.3

To maintain the original total weight sum of 0.4, adjust the weights proportionally:

w1 = (0.1 / 0.3) * 0.4 = 0.1333
w2 = (0.2 / 0.3) * 0.4 = 0.2667

So the updated weights are:

w = [0.1333, 0.2667]

My initial suggestion on slack was not correct eitehr so here i flush out it more refined fashion.

Yes, I can switch to this version, but just a reminder: the sum_w is close to 1. So, even though we normalize with (w1 / sum_w_new) * sum_w, it essentially simplifies to w1 / sum_w_new, giving us the same value as the current version.

yes i understand. my proposal is for when null components are also considered, as a more generic case and I can anticipate that usage.

gaow · 2024-10-02T19:26:48Z

R/mash_wrapper.R

- message(paste(length(U), "components of matrices remained after filtering. "))
- return(list(U = U, w = w))
+ # remove the data driven U in U list, and recalculate w
+ U[phenos_out] <- NULL


this line does not look correct to me. After subsetting U (line 58 to 65), if the remaining elements in U are all zeros, as in the case of a singleton matrix removed from the mixture, then we should already set those matrices NULL.

The original U is like below:

'XtX''tFLASH_default''FLASH_default_1''FLASH_default_2''FLASH_default_3''FLASH_default_4''FLASH_default_5''FLASH_default_6''FLASH_default_7''FLASH_default_8''FLASH_default_9''FLASH_default_10''FLASH_default_11''tFLASH_nonneg''FLASH_nonneg_1''FLASH_nonneg_2''FLASH_nonneg_3''FLASH_nonneg_4''FLASH_nonneg_5''FLASH_nonneg_6''FLASH_nonneg_7''FLASH_nonneg_8''FLASH_nonneg_9''FLASH_nonneg_10''FLASH_nonneg_11''FLASH_nonneg_12''PCA_1''PCA_2''tPCA''identity''Knight_eQTL_brain''MiGA_GFM_eQTL''MiGA_GTS_eQTL''MiGA_SVZ_eQTL''MiGA_THA_eQTL''BM_10_MSBB_eQTL''BM_22_MSBB_eQTL''BM_36_MSBB_eQTL''BM_44_MSBB_eQTL''monocyte_ROSMAP_eQTL''Mic_DeJager_eQTL''Ast_DeJager_eQTL''Oli_DeJager_eQTL''Exc_DeJager_eQTL''Inh_DeJager_eQTL''DLPFC_DeJager_eQTL''PCC_DeJager_eQTL''AC_DeJager_eQTL''Mic_Kellis_eQTL''Ast_Kellis_eQTL''Oli_Kellis_eQTL''OPC_Kellis_eQTL''Exc_Kellis_eQTL''Inh_Kellis_eQTL''Ast_mega_eQTL''Exc_mega_eQTL''Inh_mega_eQTL''Oli_mega_eQTL''STARNET_eQTL_Mac''DLPFC_Bennett_pQTL''OPC_DeJager_eQTL''OPC_mega_eQTL''Ast_10_Kellis_eQTL''Mic_12_Kellis_eQTL''Mic_13_Kellis_eQTL''Mic_mega_eQTL''STARNET_eQTL''equal_effects''simple_het_1''simple_het_2''simple_het_3'

as you can see, there included some data driven U from each context. To my understanding, we should not keep the filter out contexts in this list, so this line is going to remove the U such as BM_22_MSBB_eQTL when we decide to look into ROSMAP data?

I know. Still, my reply is "After subsetting U (line 58 to 65), if the remaining elements in U are all zeros, as in the case of a singleton matrix removed from the mixture, then we should already set those matrices NULL." To elaborate, if a matrix is BM_22_MSBB_eQTL it would be non-zero only at the BM_22_MSBB_eQTL entry. Once we subset and removed that context, this entire compoment BM_22_MSBB_eQTL will be all zeros and therefore get removed.

I got your problem. The issue is, for the U as shown in the figure, even though we subset the contexts which is not Knight, there could not be all zeros because of a much smaller value than Knight itself in diagonal. that is why it is not removed.

So, we have another option here: to either check if all elements are zero to if all elements, except for the diagonal, are zero.

I see, this totally makes sense now for what @Chunmingl observed those redundant diagonal matrices!

you are right, i forgot how it works ... it was using context specific matrices only as initial values into the EM algorithm but the output will actually change the matrices as it sees fit, possibly slightly, like the above. Perhaps just to make it full rank. But this creates an issue for us.

We cannot simply remove diagonal matrices because signals on the diagonal can be strong and relevant.

If we stick to exactly our pipeline with all the column name conventions being consistent with the context names and singleton matrices also consistent, then perhaps what you did makes the most sense after all, to get things roughly working .... let's do that. My only request then is to add some comment strings in the code to point to this discussion here, for future references.

sure thing, will do

gaow · 2024-10-02T19:27:47Z

R/mash_wrapper.R

- return(list(U = U, w = w))
+ # remove the data driven U in U list, and recalculate w
+ U[phenos_out] <- NULL
+ w <- w[!names(w) %in% phenos_out]


Same here -- w is already taken care of in the for loop above. I dont think it is correct to have this line here. Unless am missing someting?

…mash_fix

rfeng2023 and others added 3 commits October 2, 2024 13:48

update mash wrapper

d85e97e

add annotation for change

bfe7f2c

Update documentation

e7d96b7

gaow reviewed Oct 2, 2024

View reviewed changes

rfeng2023 added 2 commits October 2, 2024 23:25

update mash wrapper

97dc1bf

Merge branch 'mash_fix' of https://github.com/rfeng2023/pecotmr into …

0ff68fc

…mash_fix

gaow merged commit 0602cd4 into cumc:main Oct 3, 2024
3 of 4 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Mash fix #263

Mash fix #263

rfeng2023 commented Oct 2, 2024

gaow Oct 2, 2024

gaow Oct 2, 2024

rfeng2023 Oct 2, 2024

gaow Oct 2, 2024

rfeng2023 Oct 3, 2024 •

edited

Loading

gaow Oct 3, 2024

gaow Oct 2, 2024

rfeng2023 Oct 2, 2024

gaow Oct 2, 2024

rfeng2023 Oct 2, 2024 •

edited

Loading

rfeng2023 Oct 2, 2024

gaow Oct 2, 2024 •

edited

Loading

rfeng2023 Oct 3, 2024

gaow Oct 2, 2024

Mash fix #263

Mash fix #263

Conversation

rfeng2023 commented Oct 2, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rfeng2023 Oct 3, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rfeng2023 Oct 2, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

gaow Oct 2, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rfeng2023 Oct 3, 2024 •

edited

Loading

rfeng2023 Oct 2, 2024 •

edited

Loading

gaow Oct 2, 2024 •

edited

Loading