Error in analysing virus data #4

Open
capemaster opened this issue Apr 16, 2014 · 10 comments

Comments

@capemaster

Hi,
this is the error I get from the terminal:

Traceback (most recent call last):
  File "./shorah.py", line 142, in <module>
    keep_files=options.k, alpha=options.a)
  File "/home/capemaster/Desktop/SHORAH/shorah-master/dec.py", line 399, in main
    proposed[beg] = (get_prop(dbg_file), j)
  File "/home/capemaster/Desktop/SHORAH/shorah-master/dec.py", line 252, in get_prop
    return prop
UnboundLocalError: local variable 'prop' referenced before assignment

The tail of dec.log is shown below.

DEBUG 2014-04-16 12:06:05,696                          run_dpm 178 run  -i w-JX480631-7706-7906.reads.fas -j 54405 -t 10881 -a 0.100000 -K 20 finished
DEBUG 2014-04-16 12:06:05,696                          run_dpm 179 Child /home/capemaster/Desktop/SHORAH/shorah-master/diri_sampler returned 0
DEBUG 2014-04-16 13:30:21,533                          run_dpm 178 run  -i w-JX480631-7840-8040.reads.fas -j 56490 -t 11298 -a 0.100000 -K 20 finished
DEBUG 2014-04-16 13:30:21,534                          run_dpm 179 Child /home/capemaster/Desktop/SHORAH/shorah-master/diri_sampler returned 0
DEBUG 2014-04-16 13:30:21,534                          run_dpm 170 /home/capemaster/Desktop/SHORAH/shorah-master/diri_sampler -i w-JX480631-7907-8107.reads.fas -j 52110 -t 10422 -a 0.100000 -K 20
DEBUG 2014-04-16 14:41:19,600                          run_dpm 178 run  -i w-JX480631-7907-8107.reads.fas -j 52110 -t 10422 -a 0.100000 -K 20 finished
DEBUG 2014-04-16 14:41:19,601                          run_dpm 179 Child /home/capemaster/Desktop/SHORAH/shorah-master/diri_sampler returned 0
INFO 2014-04-16 14:41:19,601                          main 392 reading windows for start position 135
WARNING 2014-04-16 14:41:19,602                          correct_reads 227 No reads in window 135?
INFO 2014-04-16 14:41:19,602                          main 396 this is window w-JX480631-135-335

I get the same error on several platforms.
I successfully analysed another sample created in the same way.
What is going on?

@ozagordi
Collaborator

Hi,
most likely it is due to low (zero?) coverage in a region. The warning in dec.log says that there are no reads in that window. Try viewing the region with samtools tview, and perhaps stop the haplotype reconstruction before that point. Give it a try and let me know, please.

Best.
O
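
The coverage check suggested above can also be scripted. What follows is a minimal sketch, assuming per-position depth in the three-column text format written by `samtools depth -a` (reference name, 1-based position, depth). The function name, the non-overlapping windows and the threshold are illustrative assumptions and are not part of ShoRAH; the default length of 201 only mirrors window names in the log such as w-JX480631-135-335.

```python
from collections import defaultdict


def low_coverage_windows(depth_lines, window=201, min_depth=1):
    """Return (ref, start, end, mean_depth) for windows below min_depth.

    depth_lines: iterable of 'ref<TAB>pos<TAB>depth' strings.
    Windows are non-overlapping and 1-based inclusive (a simplification:
    the windows in the log above overlap).
    """
    totals = defaultdict(int)    # (ref, window index) -> summed depth
    last_pos = defaultdict(int)  # ref -> highest position seen
    for line in depth_lines:
        line = line.strip()
        if not line:
            continue
        ref, pos, depth = line.split("\t")[:3]
        pos, depth = int(pos), int(depth)
        totals[(ref, (pos - 1) // window)] += depth
        last_pos[ref] = max(last_pos[ref], pos)

    flagged = []
    for ref, end_pos in last_pos.items():
        n_windows = (end_pos - 1) // window + 1
        for w in range(n_windows):
            start = w * window + 1
            end = min(start + window - 1, end_pos)
            # Positions missing from the input count as depth 0.
            mean = totals.get((ref, w), 0) / (end - start + 1)
            if mean < min_depth:
                flagged.append((ref, start, end, mean))
    return flagged
```

Any window it reports is a candidate for the "No reads in window" warning and can be excluded from the region before running the reconstruction.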

@capemaster
Author

Dear,
I tried shortening the window to the zone where reads are present, and it ran just fine.
I suggest implementing some kind of check for this problem.
Thank you for the advice,
BEST

@ozagordi
Collaborator

Hi,
thanks for checking and reporting it here. I agree, it's a good idea to implement a check for this. I will keep this issue open until then.
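
For illustration, the failure mode in the traceback (a variable that is assigned only inside a loop body that never runs) and one possible guard can be sketched as follows. This is not the code of `get_prop` in dec.py, which is not shown in this thread; the function names and the `value=` line format are hypothetical.

```python
def last_value_unguarded(lines):
    """Mimics the bug: `prop` is only bound if some line matches."""
    for line in lines:
        if line.startswith("value="):
            prop = line.split("=", 1)[1]
    return prop  # raises UnboundLocalError when no line matched


def last_value_guarded(lines, source="<unknown>"):
    """Same logic, but fails with an explicit, actionable message."""
    prop = None
    for line in lines:
        if line.startswith("value="):
            prop = line.split("=", 1)[1]
    if prop is None:
        raise ValueError(
            f"no value found in {source}; "
            "the window may contain no reads"
        )
    return prop
```

A caller could catch the `ValueError` and skip the empty window instead of aborting the whole run; whether skipping is the right behaviour is a design decision for the maintainers.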

@fifthguy

fifthguy commented Sep 2, 2016

Hi,

I'm facing a similar problem to capemaster's, except that I first get a series of segmentation faults, which I assume come from a compiled C program that dec.py calls. More interestingly, the number of segfaults appears to vary with window size (the smaller the window, the more segfaults I get), and after them the program terminates with the same error message as capemaster's. My dec.log also ends the same way as his.

Any ideas what's going on and how to fix it?

Thanks in advance!

(Using Linux mint 17 - same as Ubuntu, installed gsl via the apt-get method)
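
Since dec.log records the return value of each `diri_sampler` child, a crash in the compiled sampler should be visible there. The following is a minimal sketch, assuming the child is launched through Python's `subprocess` module on a POSIX system, where a negative return code -N means the child was terminated by signal N. The helper name is hypothetical and not part of ShoRAH.

```python
import signal


def describe_returncode(returncode):
    """Translate a subprocess return code into a readable message."""
    if returncode == 0:
        return "finished normally"
    if returncode < 0:
        # Negative values encode the terminating signal number.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"terminated by {name}"
    return f"exited with status {returncode}"
```

Logging this message for every window would make it easier to tell which window made the sampler crash.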

@ozagordi
Collaborator

ozagordi commented Sep 3, 2016

Hi,
I would need some more info. Could you make a toy example?

@fifthguy

fifthguy commented Mar 2, 2017

Hi,

I just tried with more suitable data (higher coverage and so on) and was still getting the same error. Then I shortened the window size and restricted the region of interest (ROI), and it worked well, so I suppose the uncovered parts were the problem. Whenever the ROI contains a region with coverage of about 10 or less (maybe somewhat higher; it's a rule-of-thumb estimate), shorah throws a segfault and later the error that capemaster shows.

Do you have it written down somewhere what minimum coverage of a region shorah will accept?

Thanks and best regards, Tomaž
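
One way to restrict the ROI as described above is to look for the longest contiguous stretch whose depth stays at or above a chosen threshold. This is a minimal sketch; the function name is hypothetical, and the threshold is left to the caller because no minimum coverage is documented for ShoRAH.

```python
def longest_covered_region(depths, min_depth):
    """Return (start, end) of the longest run with depth >= min_depth.

    depths: per-position depths; index 0 corresponds to position 1.
    The result is 1-based inclusive, or None if no position qualifies.
    """
    best_start, best_len = None, 0
    run_start = None
    # A trailing sentinel below any threshold closes a run that
    # reaches the end of the sequence.
    for i, d in enumerate(list(depths) + [float("-inf")]):
        if d >= min_depth:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start > best_len:
                best_start, best_len = run_start, i - run_start
            run_start = None
    if best_start is None:
        return None
    return best_start + 1, best_start + best_len
```

The returned coordinates can then be used to limit the analysis to the well-covered part of the reference.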

@ozagordi
Collaborator

ozagordi commented Mar 2, 2017

Hi.
No, it's not written anywhere because it's hard to tell. Shorah was developed with coverage of thousands or more in mind; the initial goal was to detect variants down to 0.1% over short regions. With coverage of 100 it should still work fine, but it may behave unexpectedly if coverage goes up and down wildly. Further, one should ask why the coverage behaves like that.

Keep in mind also that global haplotype reconstruction is hard; it makes little sense if the coverage is so low, and other tools are better suited (Quasirecomb, Haplotyper, PredictHaplo). Have a look at this paper as well.
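
To make the dependence on coverage concrete: under a simple binomial model without sequencing errors (an assumption made here for illustration, not ShoRAH's actual statistical model), the number of reads carrying a variant of frequency f at coverage N follows Binomial(N, f). The sketch below prints the expected count and the probability of seeing at least k such reads; the function name and the choice k = 2 are illustrative.

```python
from math import comb


def prob_at_least(k, coverage, freq):
    """P(X >= k) for X ~ Binomial(coverage, freq)."""
    below = sum(
        comb(coverage, i) * freq**i * (1 - freq) ** (coverage - i)
        for i in range(k)
    )
    return 1.0 - below


# Expected supporting reads and chance of at least two of them
# for a 0.1% variant at increasing coverage.
for n in (100, 1000, 10000):
    print(n, n * 0.001, round(prob_at_least(2, n, 0.001), 3))
```

At coverage 100 a 0.1% variant is expected in only 0.1 reads, which is why detecting variants at that frequency needs coverage in the thousands.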

@fifthguy

fifthguy commented Mar 2, 2017

Hi again,

the overall coverage is much higher (between 200 and 2000), but the sequence is rather long (~190 kb) and contains repeats. The regions with lower coverage are identical sequential repeats whose reads piled up in one region (e.g. 1000-1500) but not in the next (1500-2000): one part of the repeats is heavily overmapped and the other undermapped. This is why the coverage is so low there. Thank you for the paper suggestions; I will read them.

Best, T

@ozagordi
Collaborator

ozagordi commented Mar 2, 2017

190 kb is way too long. The repeats don't make me feel better, either. These methods work under the assumption of uniformly spread variation, such that every region spanned by one read length will display some variant.

@fifthguy

fifthguy commented Mar 2, 2017

Yes, it is far-fetched; so far the pipeline is working on a repeat-masked version. Most likely I will see a certain degree of nonsense soon enough... :)

Also, based on the VCF I expect to see 2 to 16 major variants. I want to see what shorah will produce, and if it makes no sense whatsoever, I will use an alternative strategy.

DrYak pushed a commit that referenced this issue Feb 23, 2023
* [Added] Restructered doc and b2w interface

* [Added] argparse for b2w

* [Changed] Pytest parametrized by different b2w inputs; Old b2w version used to generate test data

* [Changed] Restructed testing for more general test sets

* [Added] tiling strategy class

* [Fixed] tests path

* [Added] Additional test data set
[Fixed] Numerous bugs in b2w

* [Added] Impl of maximum_reads

* [Added] Impl for minimum_reads

DrYak pushed a commit that referenced this issue Feb 23, 2023
* BREAKING CHANGE(argprase): sampler has to be called with flag (default: vanilla shorah)

* fix(logging): removed print

* fix(tests): shotgun e2e works again, issue with dirisampler number of interations j

* fix(cli): error if illegal arg combination

* docs: fix cli instrc

3 participants