RuntimeWarning: Mean of empty slice #17

Open
sum732 opened this issue Nov 9, 2021 · 4 comments

Comments

sum732 commented Nov 9, 2021

Hello,
I am trying to use JACKS and am running into the following issue:

python ~/Research/Programs/Jacks/JACKS/jacks/run_JACKS.py Count_Matrix.tab Exp_Summary_JACKS.tab sgRNA_Mapping_File.tab --sgrna_hdr=sgrna --gene_hdr=Gene --ctrl_sample_hdr=Sample --outprefix JACKS
[2021-11-09 18:05:01,363] jacks: INFO     Loading sample specification
[2021-11-09 18:05:01,363] jacks: INFO     Loading gene mappings
[2021-11-09 18:05:01,365] jacks: INFO     Loading data and pre-processing
[2021-11-09 18:05:01,424] jacks: INFO     Applying median normalisation
[2021-11-09 18:05:01,471] jacks: INFO     Collating 0 samples
[2021-11-09 18:05:01,487] jacks: INFO     Running JACKS inference
/home/.conda/envs/jacksenv/lib/python3.10/site-packages/scipy/_lib/deprecation.py:20: RuntimeWarning: Mean of empty slice
  return fun(*args, **kwargs)
/home/Research/Programs/Jacks/JACKS/jacks/jacks/infer.py:88: RuntimeWarning: Mean of empty slice.
  LOG.debug("After init, mean absolute error=%.3f, <x>=%.1f <w>=%.1f lower bound=%.1f"%(SP.nanmean(abs(y.T-SP.outer(w1,x1))).mean(), x1.mean(), w1.mean(), bound))
/home/.conda/envs/jacksenv/lib/python3.10/site-packages/numpy/core/_methods.py:189: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
/home/Research/Programs/Jacks/JACKS/jacks/jacks/infer.py:115: RuntimeWarning: Mean of empty slice.
  LOG.debug("After W update, <w>=%.1f, mean absolute error=%.3f"%(w1.mean(), SP.nanmean(abs(y.T-SP.outer(w1,x1))).mean()))
/home/Research/Programs/Jacks/JACKS/jacks/jacks/infer.py:96: RuntimeWarning: Mean of empty slice.
  LOG.debug("Iter %d/%d. lb: %.1f err: %.3f x:%.2f+-%.2f w:%.2f+-%.2f xw:%.2f"%(i+1, n_iter, bound, SP.nanmean(abs(y.T-SP.outer(w1,x1))).mean(), x1.mean(), SP.median((x2-x1**2)**0.5), w1.mean(), SP.median((w2-w1**2)**0.5), x1.mean()*w1.mean()))
/home/.conda/envs/jacksenv/lib/python3.10/site-packages/numpy/core/fromnumeric.py:3440: RuntimeWarning: Mean of empty slice.
  return _methods._mean(a, axis=axis, dtype=dtype,
[2021-11-09 18:05:01,801] jacks: INFO     Writing JACKS results
/gpfs/fs1/home/Research/Programs/Jacks/JACKS/jacks/jacks/jacks_io.py:28: RuntimeWarning: Mean of empty slice
  ordered_genes = [(np.nanmean(jacks_results[gene][4]),gene) for gene in jacks_results]
/home/Research/Programs/Jacks/JACKS/jacks/jacks/jacks_io.py:137: RuntimeWarning: Mean of empty slice
  ordered_genes = [(np.nanmean(jacks_results[gene][4]),gene) for gene in jacks_results]

Here are snippets of how various files look:
head -n +4 Count_Matrix.tab

sgRNA   P23H1   P23H2   Control1        Control2        Control3
Amfr_sgRNA1     150     44      602     530     302
Amfr_sgRNA2     141     24      380     350     162
Amfr_sgRNA3     203     21      443     435     303

head Exp_Summary_JACKS.tab

Replicate       Sample
P23H1   P23H
P23H2   P23H
Control1        CTRL
Control2        CTRL
Control3        CTRL

head -n +3 sgRNA_Mapping_File.tab

sgrna   Gene
Sec24d_sgRNA2   Sec24d
Gm30534_sgRNA3  Gm30534

I am not sure why any array slice would produce a "Mean of empty slice" warning (assuming that is the actual error). The count file is tab-separated, and so are the other files. I do see "Collating 0 samples" in the log; could this be the issue?

Any help will be much appreciated.

Thanks,
D
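
Since the log reports "Collating 0 samples", one quick sanity check is whether the names in the three files line up. The sketch below is not part of JACKS: it uses only the Python standard library, `read_tsv` and `check_inputs` are hypothetical helper names, and the inline strings are the file snippets pasted above. It assumes header names are compared exactly (case-sensitively), which may or may not be how JACKS behaves.

```python
import csv
import io

def read_tsv(text):
    """Parse tab-separated text into (header, rows), skipping blank lines."""
    rows = [r for r in csv.reader(io.StringIO(text), delimiter="\t") if r]
    return rows[0], rows[1:]

def check_inputs(count_text, design_text, mapping_text, sgrna_hdr, rep_hdr="Replicate"):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    count_hdr, _ = read_tsv(count_text)
    design_hdr, design_rows = read_tsv(design_text)
    mapping_hdr, _ = read_tsv(mapping_text)

    # Exact comparison: "sgRNA" and "sgrna" are treated as different names.
    for name, hdr in (("count file", count_hdr), ("mapping file", mapping_hdr)):
        if sgrna_hdr not in hdr:
            problems.append(f"{name}: no column named {sgrna_hdr!r} (found {hdr})")

    # Every replicate listed in the design file should be a count-file column.
    if rep_hdr in design_hdr:
        rep_idx = design_hdr.index(rep_hdr)
        for row in design_rows:
            if row[rep_idx] not in count_hdr:
                problems.append(f"replicate {row[rep_idx]!r} is not a count-file column")
    else:
        problems.append(f"design file: no column named {rep_hdr!r}")
    return problems

# Snippets copied from the files shown above.
count_text = "sgRNA\tP23H1\tP23H2\tControl1\tControl2\tControl3\nAmfr_sgRNA1\t150\t44\t602\t530\t302\n"
design_text = "Replicate\tSample\nP23H1\tP23H\nP23H2\tP23H\nControl1\tCTRL\nControl2\tCTRL\nControl3\tCTRL\n"
mapping_text = "sgrna\tGene\nSec24d_sgRNA2\tSec24d\n"

problems = check_inputs(count_text, design_text, mapping_text, sgrna_hdr="sgrna")
for problem in problems:
    print(problem)
```

With the files as pasted, the only thing this flags is the count-file header `sgRNA` versus the `sgrna` passed on the command line. It cannot tell whether JACKS treats that mismatch as fatal.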

lp2 commented Nov 11, 2021 via email

sum732 commented Nov 11, 2021

Hi Leo,
Thank you for responding and for the suggestion.
I tried setting things up correctly:

head -n +1 Count_Matrix-Fixed.tab sgRNA_Mapping_File-Fixed.tab 
==> Count_Matrix-Fixed.tab <==
sgrna   Gene    P23H_rep1       P23H_rep2       Control_rep1    Control_rep2    Control_rep3

==> sgRNA_Mapping_File-Fixed.tab <==
sgrna   Gene

I got this error:

python ~/Research/Programs/Jacks/JACKS/jacks/run_JACKS.py Count_Matrix-Fixed.tab Exp_Summary_JACKS.tab sgRNA_Mapping_File-Fixed.tab --sgrna_hdr=sgrna --gene_hdr=Gene --ctrl_sample_hdr=Sample --outprefix JACKS
[2021-11-11 17:39:54,074] jacks: INFO     Loading sample specification
[2021-11-11 17:39:54,075] jacks: INFO     Loading gene mappings
[2021-11-11 17:39:54,078] jacks: INFO     Loading data and pre-processing
[2021-11-11 17:39:54,136] jacks: INFO     Applying median normalisation
[2021-11-11 17:39:54,184] jacks: INFO     Collating 0 samples
[2021-11-11 17:39:54,200] jacks: INFO     Running JACKS inference
/home/.conda/envs/jacksenv/lib/python3.10/site-packages/scipy/_lib/deprecation.py:20: RuntimeWarning: Mean of empty slice
  return fun(*args, **kwargs)
/homes/Research/Programs/Jacks/JACKS/jacks/jacks/infer.py:88: RuntimeWarning: Mean of empty slice.
  LOG.debug("After init, mean absolute error=%.3f, <x>=%.1f <w>=%.1f lower bound=%.1f"%(SP.nanmean(abs(y.T-SP.outer(w1,x1))).mean(), x1.mean(), w1.mean(), bound))
/gpfs/fs1/home/mehrotras/.conda/envs/jacksenv/lib/python3.10/site-packages/numpy/core/_methods.py:189: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
/home/Research/Programs/Jacks/JACKS/jacks/jacks/infer.py:115: RuntimeWarning: Mean of empty slice.
  LOG.debug("After W update, <w>=%.1f, mean absolute error=%.3f"%(w1.mean(), SP.nanmean(abs(y.T-SP.outer(w1,x1))).mean()))
/home/Research/Programs/Jacks/JACKS/jacks/jacks/infer.py:96: RuntimeWarning: Mean of empty slice.
  LOG.debug("Iter %d/%d. lb: %.1f err: %.3f x:%.2f+-%.2f w:%.2f+-%.2f xw:%.2f"%(i+1, n_iter, bound, SP.nanmean(abs(y.T-SP.outer(w1,x1))).mean(), x1.mean(), SP.median((x2-x1**2)**0.5), w1.mean(), SP.median((w2-w1**2)**0.5), x1.mean()*w1.mean()))
/home/.conda/envs/jacksenv/lib/python3.10/site-packages/numpy/core/fromnumeric.py:3440: RuntimeWarning: Mean of empty slice.
  return _methods._mean(a, axis=axis, dtype=dtype,
[2021-11-11 17:39:54,511] jacks: INFO     Writing JACKS results
/home/Research/Programs/Jacks/JACKS/jacks/jacks/jacks_io.py:28: RuntimeWarning: Mean of empty slice
  ordered_genes = [(np.nanmean(jacks_results[gene][4]),gene) for gene in jacks_results]
/home/Research/Programs/Jacks/JACKS/jacks/jacks/jacks_io.py:137: RuntimeWarning: Mean of empty slice
  ordered_genes = [(np.nanmean(jacks_results[gene][4]),gene) for gene in jacks_results]
(jacksenv) mehrotras@login01:$/JACKS>head -n +1 Count_Matrix-Fixed.tab sgRNA_Mapping_File-Fixed.tab 

I then changed the headers to match:

head -n +1 Count_Matrix-Fixed.tab sgRNA_Mapping_File-Fixed.tab 
==> Count_Matrix-Fixed.tab <==
sgRNA   Gene    P23H_rep1       P23H_rep2       Control_rep1    Control_rep2    Control_rep3

==> sgRNA_Mapping_File-Fixed.tab <==
sgRNA   Gene

Error

python ~/Research/Programs/Jacks/JACKS/jacks/run_JACKS.py Count_Matrix-Fixed.tab Exp_Summary_JACKS.tab sgRNA_Mapping_File-Fixed.tab --sgrna_hdr=sgRNA --gene_hdr=Gene --ctrl_sample_hdr=Sample  --outprefix JACKS
[2021-11-11 17:42:37,162] jacks: INFO     Loading sample specification
[2021-11-11 17:42:37,162] jacks: INFO     Loading gene mappings
[2021-11-11 17:42:37,164] jacks: INFO     Loading data and pre-processing
[2021-11-11 17:42:37,223] jacks: INFO     Applying median normalisation
[2021-11-11 17:42:37,272] jacks: INFO     Collating 0 samples
[2021-11-11 17:42:37,288] jacks: INFO     Running JACKS inference
/home/.conda/envs/jacksenv/lib/python3.10/site-packages/scipy/_lib/deprecation.py:20: RuntimeWarning: Mean of empty slice
  return fun(*args, **kwargs)
/home/Research/Programs/Jacks/JACKS/jacks/jacks/infer.py:88: RuntimeWarning: Mean of empty slice.
  LOG.debug("After init, mean absolute error=%.3f, <x>=%.1f <w>=%.1f lower bound=%.1f"%(SP.nanmean(abs(y.T-SP.outer(w1,x1))).mean(), x1.mean(), w1.mean(), bound))
/home/.conda/envs/jacksenv/lib/python3.10/site-packages/numpy/core/_methods.py:189: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
/home/Research/Programs/Jacks/JACKS/jacks/jacks/infer.py:115: RuntimeWarning: Mean of empty slice.
  LOG.debug("After W update, <w>=%.1f, mean absolute error=%.3f"%(w1.mean(), SP.nanmean(abs(y.T-SP.outer(w1,x1))).mean()))
/home/Research/Programs/Jacks/JACKS/jacks/jacks/infer.py:96: RuntimeWarning: Mean of empty slice.
  LOG.debug("Iter %d/%d. lb: %.1f err: %.3f x:%.2f+-%.2f w:%.2f+-%.2f xw:%.2f"%(i+1, n_iter, bound, SP.nanmean(abs(y.T-SP.outer(w1,x1))).mean(), x1.mean(), SP.median((x2-x1**2)**0.5), w1.mean(), SP.median((w2-w1**2)**0.5), x1.mean()*w1.mean()))
/home/.conda/envs/jacksenv/lib/python3.10/site-packages/numpy/core/fromnumeric.py:3440: RuntimeWarning: Mean of empty slice.
  return _methods._mean(a, axis=axis, dtype=dtype,
[2021-11-11 17:42:37,602] jacks: INFO     Writing JACKS results
/home/Research/Programs/Jacks/JACKS/jacks/jacks/jacks_io.py:28: RuntimeWarning: Mean of empty slice
  ordered_genes = [(np.nanmean(jacks_results[gene][4]),gene) for gene in jacks_results]
/home/Research/Programs/Jacks/JACKS/jacks/jacks/jacks_io.py:137: RuntimeWarning: Mean of empty slice
  ordered_genes = [(np.nanmean(jacks_results[gene][4]),gene) for gene in jacks_results]

Here is the experiment summary file:

cat Exp_Summary_JACKS.tab 
Replicate       Sample
P23H_rep1       P23H
P23H_rep2       P23H
Control_rep1    CTRL
Control_rep2    CTRL
Control_rep3    CTRL

These are all tab-separated; I reconfirmed this.
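
One way to reconfirm tab separation programmatically is to count the fields on each line. This is a minimal sketch using only the standard library; `field_counts` is a hypothetical helper, and the two inline strings are illustrative stand-ins for a tab-separated and a space-separated file.

```python
import csv
import io

def field_counts(text, delimiter="\t"):
    """Return the number of delimiter-separated fields on each non-empty line."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return [len(row) for row in reader if row]

# Same content, once separated by tabs and once by runs of spaces.
tabbed = "Replicate\tSample\nP23H_rep1\tP23H\nControl_rep1\tCTRL\n"
spaced = "Replicate       Sample\nP23H_rep1       P23H\n"

print(field_counts(tabbed))  # [2, 2, 2]
print(field_counts(spaced))  # [1, 1]
```

A file that looks aligned in the terminal but yields one field per line is separated by spaces rather than tabs. For a file on disk, the same function can be applied to the result of `open(path, newline="").read()`.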

lp2 commented Nov 12, 2021 via email

sum732 commented Nov 15, 2021

Hi Leo,
I cannot tell you the version, as none is shown. I did a recent pull (the last commit shown is Oct 2, 2020).
There are different versions of the examples for setting up the experiment design, here and here, that I have tried to follow.
I think you are right that the experiment design file may not be set up correctly. The issue is that I am not sure which example is the right one.
Based on your suggestion I made this change:

cat Exp_Summary_JACKS.tab 
Replicate       Sample  CONTROL
P23H_rep1       P23H
P23H_rep2       P23H
Control_rep1    Control1        Control_rep1
Control_rep2    Control2        Control_rep2
Control_rep3    Control3        Control_rep3
python ~/Research/Programs/Jacks/JACKS/jacks/run_JACKS.py Count_Matrix-Fixed.tab Exp_Summary_JACKS.tab sgRNA_Mapping_File-Fixed.tab --sgrna_hdr=sgRNA --gene_hdr=Gene --rep_hdr=Replicate --sample_hdr=Sample --ctrl_sample_hdr=CONTROL --outprefix JACKS

I am getting

python ~/Research/Programs/Jacks/JACKS/jacks/run_JACKS.py Count_Matrix-Fixed.tab Exp_Summary_JACKS.tab sgRNA_Mapping_File-Fixed.tab --sgrna_hdr=sgRNA --gene_hdr=Gene --rep_hdr=Replicate --sample_hdr=Sample --ctrl_sample_hdr=CONTROL --outprefix JACKS
[2021-11-15 11:32:53,381] jacks: INFO     Loading sample specification
[2021-11-15 11:32:53,382] jacks: INFO     Loading gene mappings
[2021-11-15 11:32:53,384] jacks: INFO     Loading data and pre-processing
[2021-11-15 11:32:53,443] jacks: INFO     Applying median normalisation
[2021-11-15 11:32:53,468] jacks: WARNING  Undefined variances in sample Control1, set --ctrl_genes input to JACKS to infer variances from control genes
[2021-11-15 11:32:53,471] jacks: WARNING  Undefined variances in sample Control2, set --ctrl_genes input to JACKS to infer variances from control genes
[2021-11-15 11:32:53,473] jacks: WARNING  Undefined variances in sample Control3, set --ctrl_genes input to JACKS to infer variances from control genes
[2021-11-15 11:32:53,478] jacks: INFO     Collating 4 samples
Traceback (most recent call last):
  ...........
......
... 
ValueError: 'Control_rep1' is not in list

So it seems that I need the third column. However, I am not sure how to set it correctly. I have 3 WT/controls and 2 mutants/treatments.
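
The message `'Control_rep1' is not in list` has the form Python produces when `list.index` is asked for a value that is absent. The traceback is elided, so where JACKS raises it is not known. The sketch below only assumes, as a guess, that each CONTROL value is looked up among the names in the Sample column. `missing_controls` is a hypothetical helper, not a JACKS function, and the inline text is the design file shown above.

```python
import csv
import io

# Experiment design as pasted above (tab-separated).
design_text = (
    "Replicate\tSample\tCONTROL\n"
    "P23H_rep1\tP23H\n"
    "P23H_rep2\tP23H\n"
    "Control_rep1\tControl1\tControl_rep1\n"
    "Control_rep2\tControl2\tControl_rep2\n"
    "Control_rep3\tControl3\tControl_rep3\n"
)

def missing_controls(text, sample_hdr="Sample", ctrl_hdr="CONTROL"):
    """Return control names that do not appear in the sample column."""
    rows = list(csv.DictReader(io.StringIO(text), delimiter="\t"))
    samples = {row[sample_hdr] for row in rows}
    # Short rows have no control value (None), so they are skipped here.
    controls = {row[ctrl_hdr] for row in rows if row[ctrl_hdr]}
    return sorted(controls - samples)

print(missing_controls(design_text))

# list.index raises ValueError with the same wording as the traceback.
sample_names = ["P23H", "Control1", "Control2", "Control3"]
try:
    sample_names.index("Control_rep1")
except ValueError as err:
    message = str(err)
print(message)
```

Under that assumption, the CONTROL column here holds replicate names (`Control_rep1`, ...) while the Sample column holds `Control1`, ..., so none of the control values would be found.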

Making this change

cat Exp_Summary_JACKS.tab 
Replicate       Sample  CONTROL
P23H_rep1       P23H
P23H_rep2       P23H
Control_rep1    Control1        Control1
Control_rep2    Control2        Control2
Control_rep3    Control3        Control3

shows the following error:

python ~/Research/Programs/Jacks/JACKS/jacks/run_JACKS.py Count_Matrix-Fixed.tab Exp_Summary_JACKS.tab sgRNA_Mapping_File-Fixed.tab --sgrna_hdr=sgRNA --gene_hdr=Gene --rep_hdr=Replicate  --sample_hdr=Sample --ctrl_sample_hdr=CONTROL --outprefix JACKS
[2021-11-15 11:35:40,198] jacks: INFO     Loading sample specification
[2021-11-15 11:35:40,199] jacks: INFO     Loading gene mappings
[2021-11-15 11:35:40,201] jacks: INFO     Loading data and pre-processing
[2021-11-15 11:35:40,259] jacks: INFO     Applying median normalisation
[2021-11-15 11:35:40,294] jacks: INFO     Collating 1 samples
Traceback (most recent call last):
 ...
.......
ValueError: None is not in list
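
One thing visible in the file above is that the two P23H rows have no value in the CONTROL column. The sketch below is an illustration only: whether JACKS reads the design file with `csv.DictReader` is an assumption, and the traceback is elided, so this may not be the actual cause. It shows that `csv.DictReader` fills missing trailing fields with `None`, and that looking `None` up with `list.index` yields the same message.

```python
import csv
import io

# Experiment design as pasted above (tab-separated).
design_text = (
    "Replicate\tSample\tCONTROL\n"
    "P23H_rep1\tP23H\n"
    "P23H_rep2\tP23H\n"
    "Control_rep1\tControl1\tControl1\n"
    "Control_rep2\tControl2\tControl2\n"
    "Control_rep3\tControl3\tControl3\n"
)

rows = list(csv.DictReader(io.StringIO(design_text), delimiter="\t"))

# Rows shorter than the header get None for the missing columns.
incomplete = [row["Replicate"] for row in rows if row["CONTROL"] in (None, "")]
print(incomplete)  # ['P23H_rep1', 'P23H_rep2']

# Looking up a missing (None) control among the sample names fails.
samples = [row["Sample"] for row in rows]
try:
    samples.index(rows[0]["CONTROL"])
except ValueError as err:
    message = str(err)
print(message)  # None is not in list
```

If this is what happens, every row, including the P23H rows, would need a CONTROL entry. Which control a treatment row should name is not something the thread settles.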
