Choice of outgroup #2

Open
XMTian opened this issue Oct 13, 2022 · 1 comment

XMTian commented Oct 13, 2022

Dear Bárbara,

Thank you for developing this method and useful software.

I have some questions about the choice of outgroup. I want to use NCD2 to detect signatures of balancing selection in the genome of a fish species. There is a sister species that diverged from my species about 2 MYA, and it has a similar trait that is also hypothesized to be under balancing selection. However, I will focus only on my species, since there are not enough samples of the sister species. Do you think it is appropriate to use the sister species as an outgroup? Does the divergence time (2 MYA) matter? And, more importantly, should I choose another species without the candidate trait as an outgroup, or does it not matter, since even if the same gene contributes to this trait in both species, the substitutions should differ because those at the surrounding loci arise from independent mutations?

Thanks in advance!

Best wishes,
Xiaomeng

bbitarello (Owner)

Hello! The outgroup is used in NCD2 (but not in NCD1). NCD2 outperforms NCD1, so it is preferable. The way this information is used is as follows:

  1. You only need one individual from the sister species, so if that is available, it is enough. The information you need is the number of fixed differences (FDs) between that species and your species for every window you query.
  2. Regarding the possibility that the sister species is also under balancing selection at this locus: I have not explored this. My intuition is that both species should then have a reduced number of FDs relative to the number of SNPs. This may or may not be a problem, depending on what you are comparing your candidate locus to. If, as you said, another species is available to use as an outgroup, then it may be better to use that one.
  3. Divergence time. The two species need to be sufficiently diverged for the signature to be detectable: at least 4Ne generations, where Ne is the effective population size of your species and the sister species. For example, in humans the long-term Ne is roughly 10,000, so 4 × 10,000 = 40,000 generations. To convert this to years, multiply by the generation time. Taking 25 years for humans gives 1,000,000 years (1 million years). The sister species we use for humans is the chimpanzee; humans and chimpanzees diverged from their common ancestor at least 6 million years ago, so this is appropriate.
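The arithmetic in point 3 can be written as a small check. This is only a sketch of the rule of thumb stated above and is not part of the NCD software; the function names are made up for illustration, and the fish values for Ne and generation time are placeholders to be replaced with real estimates for your species.

```python
def min_divergence_years(ne, generation_time_years, multiplier=4):
    """Rule-of-thumb minimum divergence time, in years.

    multiplier * Ne gives the threshold in generations;
    multiplying by the generation time converts it to years.
    """
    return multiplier * ne * generation_time_years


def is_sufficiently_diverged(divergence_years, ne, generation_time_years):
    """True if the observed divergence time meets the 4*Ne threshold."""
    return divergence_years >= min_divergence_years(ne, generation_time_years)


# Human example from the text: Ne ~ 10,000, generation time 25 years.
human_threshold = min_divergence_years(ne=10_000, generation_time_years=25)
print(human_threshold)  # 1000000

# Human-chimpanzee divergence of at least 6 million years clears the threshold.
print(is_sufficiently_diverged(6_000_000, ne=10_000, generation_time_years=25))

# Fish case: the 2 MYA divergence comes from the question, but Ne and
# generation time below are HYPOTHETICAL placeholders, not real estimates.
fish_ne = 100_000          # placeholder
fish_generation_time = 2   # placeholder, in years
print(is_sufficiently_diverged(2_000_000, fish_ne, fish_generation_time))
```

Whether 2 MYA is enough therefore depends entirely on the Ne and generation time you plug in: a larger Ne or a longer generation time raises the threshold.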

I hope this helps!
