*.ustr file and other files not generated when using a genome as a reference in the assembly method #542
Comments
Did you check the |
Hi Isaac, I should have mentioned that I used "*" in the params file to request all the output formats, but the run generated only 8 files instead of 12.
What version of ipyrad are you running? ( If it is the most recent version of ipyrad please share your |
I ran ipyrad on the server, where it's listed as ipyrad/0.9. I also thought it could be an issue with the version, so I installed the latest version (ipyrad_0.9.93) in my miniconda environment. ipyrad [v.0.9.93]
Quick update: it was not running because of a lack of memory! I'm running ipyrad for two species. For the first one, ipyrad finished successfully and generated all the outputs. I used the denovo assembly with the genome reference as a filter (parameter #29). Compared with my previous run using ipyrad 0.9.12 and the genome as a reference, the number of retained loci dropped from 22K to 6K, and the amount of missing data for both the SNPs matrix and the sequence matrix decreased from 20.5% to 10.6%. Could the newer version be applying different criteria? Regarding the second species, ipyrad has not been able to get past step 6. Attached is the json file. I hope it helps to find out what the error could be. Fingers crossed for a quick and easy solution! Thanks,
@aroavaron In general the newest version of ipyrad should be trusted more than any previous version, because we are always fixing bugs. The difference in results between 0.9.93 and 0.9.12 (very old) is not so surprising. As for the second species, can you tell me what error you are getting during step 6? If you can show me all the command line output and the full error message when it dies, that would be very helpful.
I ran the latest version of ipyrad (0.9.93) and used two assembly approaches. The reference approach retained 15,304 loci (26.3% missing sites in the SNPs matrix, 28.1% in the sequence matrix), while the denovo-reference approach, using the reference (in this case a genome) as a filter (parameter #29), recovered 6,378 loci (14.6% missing sites in the SNPs matrix, 14.8% in the sequence matrix). For downstream analyses, it would generally be better to use the data with fewer missing values. However, I am curious about the reason(s) for the difference in the number of retained loci.
I'm not sure I understand well what the two different assemblies were. In one case you did the 'reference' assembly using an 'on target' genome. In the 'denovo-reference' approach, did you use this same genome sequence as the 'reference_as_filter' parameter? In general, different assembly methods do quite different things, so they will normally produce different results.
Yes, exactly! I used the same genome (a chromosome-level assembly of the species that I'm working on) for both approaches. I fully agree with you that different approaches would generate different results, but I would like to understand a little better what is going on, as I was not expecting the second approach to retain only about 42% as many loci as the first. Thank you for the quick reply!
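For reference, the size of the change can be checked from the two loci counts quoted in this thread. The snippet below is just that arithmetic; the variable names are illustrative and nothing here calls ipyrad.

```python
# Loci counts quoted in the thread (both from ipyrad 0.9.93).
reference_loci = 15_304  # 'reference' assembly
filtered_loci = 6_378    # denovo assembly with reference_as_filter (parameter #29)

retained_fraction = filtered_loci / reference_loci
relative_drop = 1 - retained_fraction

print(f"retained: {retained_fraction:.1%}")  # retained: 41.7%
print(f"drop:     {relative_drop:.1%}")      # drop:     58.3%
```

So the second assembly keeps roughly 42% as many loci as the first, which corresponds to a drop of roughly 58%.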
Well, the reference_as_filter removes any reads that map to the reference sequence, so the 6,378 loci you retained in this assembly are all the loci that don't map well to the reference (for whatever reason). Either they are off target, or the reference is distant from the focal taxon, or the assembly quality is not perfect. Does that help? |
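To make the idea of "removing reads that map to the reference" concrete, here is a toy sketch. It is not how ipyrad implements `reference_as_filter`: the function name `filter_reads` and the sequences are hypothetical, and "mapping" is approximated by exact substring matching, which is a gross simplification of a real read aligner.

```python
def filter_reads(reads, reference):
    """Keep only the reads that do NOT 'map' to the reference.

    Assumption: a read 'maps' if it occurs verbatim in the reference.
    Real aligners tolerate mismatches and indels; this toy does not.
    """
    return [read for read in reads if read not in reference]


# Hypothetical toy data, not taken from the thread.
reference = "ACGTACGTTTGACCA"
reads = ["ACGTAC", "GGGCCC", "TTGACC", "ATATAT"]

kept = filter_reads(reads, reference)
print(kept)  # ['GGGCCC', 'ATATAT']
```

Only the reads that fail to match the reference survive the filter, which is why an assembly filtered this way is expected to keep fewer loci than one that maps to the same sequence.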
Hello,
I ran the program using the denovo assembly strategy and then using a genome as a reference. For the first one, ipyrad generated 18 output files. However, when I used the genome, it generated only 14 output files. The missing files are: *.migrate, *.treemix, *.ugeno, and, most importantly for me, *.ustr. Has anyone encountered the same issue? Any suggestions or insights would be appreciated.
Cheers,
A
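A quick way to confirm which output formats are absent from a run is to compare the file extensions in the output directory against the ones expected. The sketch below uses only the Python standard library. The helper `missing_outputs`, the temporary directory, and the `demo.*` file names are hypothetical stand-ins; the four extensions come from the post above.

```python
import tempfile
from pathlib import Path


def missing_outputs(outdir, expected_suffixes):
    """Return the expected suffixes that have no matching file in outdir."""
    present = {path.suffix for path in Path(outdir).iterdir() if path.is_file()}
    return sorted(set(expected_suffixes) - present)


# Demonstration on a throwaway directory with hypothetical file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["demo.loci", "demo.vcf", "demo.str"]:
        (Path(tmp) / name).touch()
    report = missing_outputs(
        tmp, [".ustr", ".ugeno", ".treemix", ".migrate", ".loci"]
    )

print(report)  # ['.migrate', '.treemix', '.ugeno', '.ustr']
```

To check a real run, point `missing_outputs` at that assembly's output directory instead of the temporary one.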