Issue while choosing the reference path for genotyping #329
Comments
Suppose the MSA looks like
Hi, let me know if you have any more questions for me.
Omg we have not replied to you! So sorry @AmayAgrawal, we will return to this after the Xmas vacation.
No worries. It would be nice if you could look at this now.
Hi,
I am facing an issue with the reference path that pandora uses for genotyping variants. It appears to use the less frequently supported path, rather than the most frequently supported one, as the reference. I will try to explain with a simple example:
Suppose I am using 100 strains for my analysis. First, I did the pan-genome analysis and used the MSAs to build the pan-genome reference graphs (PRGs). Next, I used these PRGs to genotype the variants in these 100 strains with pandora. Now suppose that in the graph for a particular locus (say gene A), at a particular position (say 300), three different paths are possible. If I understand correctly, the path supported by the majority of the 100 strains should be chosen as the reference, but that was not the case. As a result, for the SNP I was looking for (say C 300 T, where 'C' is the ref and 'T' is the alt allele), pandora actually chooses 'T' as ref and 'C' as alt. I saw in one of the currently open issues that pandora heavily undermaps (#325). Could it be choosing the less frequent path because of this, or am I misunderstanding something?
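To make the expectation concrete, here is a minimal sketch of the check described above: count the alleles at one MSA column and compare the most common one with the REF allele reported in the VCF. This is not how pandora picks its reference path. The function names (`majority_allele`, `ref_matches_majority`) and the toy alignment are hypothetical, and the sketch assumes the MSA column index has already been matched to the VCF position, which real data with gaps would not give you for free.

```python
from collections import Counter


def majority_allele(msa_rows, column):
    """Return (allele, count) for the most common non-gap base at a 0-based MSA column."""
    counts = Counter(row[column] for row in msa_rows if row[column] != "-")
    return counts.most_common(1)[0]


def ref_matches_majority(vcf_ref, msa_rows, column):
    """True if the REF allele from the VCF equals the majority allele in the MSA column."""
    allele, _ = majority_allele(msa_rows, column)
    return vcf_ref == allele


# Toy alignment (made-up data): 100 strains, three alleles at column 2.
msa = ["AACGT"] * 70 + ["AATGT"] * 25 + ["AAGGT"] * 5

allele, count = majority_allele(msa, 2)
print(allele, count)                        # majority allele and its support
print(ref_matches_majority("T", msa, 2))    # the situation described above: REF is a minority allele
```

With this toy data, 'C' is supported by 70 of the 100 strains, so a VCF record with 'T' as REF at that site would be flagged as not matching the majority.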