-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pysam.pileup low max depth leads to FP #63
Comments
Thanks for spotting this! I agree that it's important to document that SComatic uses a default max_depth of 8000, and to add command line options to |
For their defense, it's a samtools/pysam default behaviour that is quite unexpected (at least to me). It can have a big impact on very deep sequencing but also in low-depth + high-cell count when using |
Dear ArthurDondi , Thanks for spotting this issue. Yes, a PR would be great, thanks a lot |
Dear users, Thanks again for your feedback, |
We have now accepted the PR. Thanks! |
pysam.pileup
max_depth
argument is by default 8000, however for RNA-seq there are edge cases as below, where a germline intronic SNP in a strongly expressed gene will appear as a false-positive somatic SNV:Top row is the cancer cells sample, mutated with 14% VAF and 19% CCF (max depth in exon ~17000) and the bottom row is the non-cancer cells sample with 4% VAF and 6% CCF (max depth in exon ~70000). This looks like a germline SNP.
With the current
max_depth=8000
I got a PASS mutation because only a fraction of the 475 reads are processed :chr11 61965525 61965525 G A PASS Cancer AAGGG AACAC 1 25 15 4 4 0.16 0.2667 0.0001 0.0 2 2 1;115;0.1465 1;92;0.1055 . PASS DP|NC|CC|BC|BQ|BCf|BCr 94|81|1:0:0:79:0:0|1:0:0:92:0:0|93:0:0:8556:0:0|0:0:0:14:0:0|1:0:0:78:0:0 25|15|4:0:0:13:0:0|4:0:0:21:0:0|372:0:0:1953:0:0|0:0:0:2:0:0|4:0:0:19:0:0
If I raise it to, say,
max_depth=200000
it's aCell_type_noise
, as it should:chr11 61965525 61965525 G A,A Cell_type_noise NonCancer,Cancer AAGGG AACAC 1 479,44 239,26 19,6 15,5 0.0397,0.1364 0.0628,0.1923 0.0017,0.0 0.0,0.0 2 2 1;498;0.3121 1;245;0.2041 . BetaBin_problem,PASS DP|NC|CC|BC|BQ|BCf|BCr 479|239|15:1:0:233:0:0|19:1:0:453:0:0|1767:93:0:42129:0:0|0:0:0:48:0:0|19:1:0:405:0:0 44|26|5:0:0:25:0:0|6:0:0:37:0:0|558:0:0:3441:0:0|0:0:0:3:0:0|6:0:0:34:0:00
pysam.pileup occurs in
BaseCellCounter.py
andSingleCellGenotype.py
. Let me know if you're interested in a PR. There is also a filter bug inbasecallingstep1
as signaled in #16.The text was updated successfully, but these errors were encountered: