is shorah same as lofreq? #87

Open
ibseq opened this issue Aug 31, 2022 · 4 comments

Comments

@ibseq

ibseq commented Aug 31, 2022

Hi all,
do ShoRAH's algorithms follow the same principle as LoFreq's?

thanks
Ibseq

@DrYak
Member

DrYak commented Sep 9, 2022

Not quite.

LoFreq is mostly done position wise:

  • It calls SNVs per position (in a way roughly reminiscent of how samtools' pileup and bcftools work).
  • It uses some additional heuristics to increase the confidence of the results (it does more than a simple base count).
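
To make "position-wise" concrete, here is a minimal sketch of a plain per-position base count, i.e. the naive baseline that LoFreq improves on. It is not LoFreq's algorithm: the function names, the `min_freq` threshold and the assumption that reads are aligned without indels are all illustrative choices of this sketch.

```python
from collections import Counter


def pileup_counts(reads, ref_len):
    """Count observed bases at each reference position.

    `reads` is a list of (start, sequence) pairs. Assumption of this
    sketch: reads are already aligned and contain no indels.
    """
    columns = [Counter() for _ in range(ref_len)]
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < ref_len:
                columns[pos][base] += 1
    return columns


def naive_snv_calls(reference, reads, min_freq=0.2):
    """Report every non-reference base whose frequency reaches min_freq.

    Each position is judged on its own; no information is shared
    between neighbouring positions.
    """
    calls = []
    for pos, counts in enumerate(pileup_counts(reads, len(reference))):
        depth = sum(counts.values())
        if depth == 0:
            continue
        for base, n in sorted(counts.items()):
            if base != reference[pos] and n / depth >= min_freq:
                calls.append((pos, reference[pos], base, n / depth))
    return calls


reads = [(0, "ACGT"), (0, "ATGT"), (0, "ATGT"), (0, "ACGT")]
print(naive_snv_calls("ACGT", reads))
```

A fixed frequency threshold like this cannot tell a rare true variant from sequencing noise, which is why a real position-wise caller adds further evidence on top of the counts.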

ShoRAH at its core is a local haplotype caller; SNVs are a by-product of this:

  • it divides all the aligned reads into windows.
    • NOTE: there is currently a bad interaction between this division into windows and multiplex PCR amplicon protocols (e.g. ARTIC v4.1), which causes windows to be lost. A new version fixing this should come in the coming months.
  • within each window, ShoRAH clusters reads together.
    • the logic is that real SNVs coming from the same haplotype will cluster together,
    • whereas sequencing errors are randomly spread among the reads and will not cluster (i.e. no matter how the sampler creates clusters, the errors will fail to group together).
  • in each window, local haplotypes are called from the consensus of each such cluster of reads
    • (this will "average out" any sequencing errors according to the model).
  • SNVs are then called simply by comparing these local haplotypes with the reference
    • (at that stage, sequencing errors have been eliminated by the clustering, so any difference should be a "true SNV" as far as the model is concerned).
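
The steps above can be sketched in a few lines for a single window. This is only a toy illustration: the greedy Hamming-distance clustering is a deliberately naive stand-in for ShoRAH's sampler, and `max_dist`, `min_reads` and all function names are assumptions of this sketch rather than anything taken from ShoRAH.

```python
from collections import Counter


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def consensus(seqs):
    """Majority base at each position of equal-length sequences."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def cluster_reads(reads, max_dist=1):
    """Greedy clustering: a read joins the first cluster whose current
    consensus is within max_dist mismatches, else it starts a new cluster.
    (Naive stand-in for a proper probabilistic sampler.)
    """
    clusters = []
    for read in reads:
        for members in clusters:
            if hamming(read, consensus(members)) <= max_dist:
                members.append(read)
                break
        else:
            clusters.append([read])
    return clusters


def call_window(reference, reads, max_dist=1, min_reads=2):
    """Local haplotypes are cluster consensuses; SNVs are the positions
    where a haplotype differs from the reference.
    """
    haplotypes = [consensus(c) for c in cluster_reads(reads, max_dist)
                  if len(c) >= min_reads]
    snvs = set()
    for hap in haplotypes:
        for pos, (ref_base, hap_base) in enumerate(zip(reference, hap)):
            if ref_base != hap_base:
                snvs.add((pos, ref_base, hap_base))
    return haplotypes, sorted(snvs)


reference = "ACGTACGT"
reads = [
    "ACGTACGT", "ACTTACAT", "ACGTACGT",
    "ACGTCCGT",  # reference haplotype with one isolated error
    "ACTTACAT",
    "CCTTACAT",  # variant haplotype with one isolated error
    "ACGTACGT", "ACTTACAT",
]
haplotypes, snvs = call_window(reference, reads)
print(haplotypes)
print(snvs)
```

In this example the two isolated errors are absorbed into their clusters and vanish in the consensus, while the two variants that travel together on the same reads survive as a second local haplotype.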

@ibseq
Author

ibseq commented Oct 6, 2022

Hi thanks again.
I’m aware that we can use a flag to use LoFreq instead of ShoRAH to analyse the data. In that case, which LoFreq parameters are chosen?

thanks
ibseq

@DrYak
Member

DrYak commented Oct 6, 2022

The flag to select is in the section general, parameter snv_caller.

The parameters for LoFreq are taken from the section lofreq (example), unlike the parameters of ShoRAH, which are taken from the section snv (example).

See the content of the file config/config.html in your local installation for a full reference of the configuration.
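
As a rough picture of how that selection works, the sketch below models the configuration as a plain Python dict. Only the section names (general, snv, lofreq) and the parameter name snv_caller come from the explanation above. The values "shorah" and "lofreq", the empty parameter sections and the helper `caller_parameters` are assumptions made for illustration, so check config/config.html for the real options.

```python
# Hypothetical in-memory view of the configuration. The parameter
# sections are left empty on purpose: the real option names are listed
# in config/config.html and are not reproduced here.
config = {
    "general": {"snv_caller": "lofreq"},  # value assumed for illustration
    "snv": {},     # ShoRAH parameters are read from this section
    "lofreq": {},  # LoFreq parameters are read from this section
}

# Which section supplies the parameters for each caller.
PARAM_SECTION = {"shorah": "snv", "lofreq": "lofreq"}


def caller_parameters(config):
    """Return the selected caller and the section its parameters come from."""
    caller = config["general"]["snv_caller"]
    return caller, config[PARAM_SECTION[caller]]


print(caller_parameters(config))
```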

@ibseq
Author

ibseq commented Oct 6, 2022 via email
