31_CoriellNA12878.gatk.vcf.gz upload failed #186

Closed
davmlaw opened this issue Jan 12, 2021 · 4 comments
Labels
wontfix This will not be worked on

Comments

davmlaw commented Jan 12, 2021

See:

http://144.6.229.160/upload/view_upload_pipeline/1149

I've also made a small VCF that reproduces this (you need to unzip it before uploading): https://app.zenhub.com/files/299486514/9b9563b2-8575-4c54-a58e-28b811f55acb/download

#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT	31_CoriellNA12878
chr13	37564190	rs1253322554	A	.	109.74	PASS	AN=0;BaseQRankSum=0.674;DB;DP=129;ExcessHet=0.0025;FS=0;InbreedingCoeff=0.3955;MQ=60;MQRankSum=0;QD=27.07;ReadPosRankSum=-0.674;SOR=1.609;VQSLOD=8.09;culprit=MQRankSum	GT:AD:DP:GQ:PGT:PID:PL:PS	.|.:0:.:.:0|1:37564182_A_T:.:37564182
  File "/mnt/variantgrid/upload/vcf/vcf_import.py", line 229, in import_vcf_file
    bulk_inserter.process_entry(v)
  File "/mnt/variantgrid/upload/vcf/bulk_genotype_vcf_processor.py", line 231, in process_entry
    phred_likelihood_str = self.get_phred_likelihood_str(variant)
  File "/mnt/variantgrid/upload/vcf/bulk_genotype_vcf_processor.py", line 187, in get_phred_likelihood_str
    pl_value = int(pl[i][pl_index])

davmlaw commented Jan 13, 2021

Reopening because this was auto-closed by a git commit.

This bug appeared here for the first time because we expected PL to be a 3-element array, but it was "." since the sample had a genotype of ".|.", which we had never seen before. I added special-case code to handle that.
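For illustration, here is a minimal sketch of that kind of special case. It is not the actual VariantGrid fix: it works on the raw PL string from the sample column, and `parse_pl` and `phred_likelihood_for_genotype` are hypothetical helper names.

```python
from typing import List, Optional


def parse_pl(pl_field: str) -> Optional[List[int]]:
    """Return PL values as ints, or None when the field is missing (".")."""
    if pl_field in ("", "."):
        return None
    values = []
    for item in pl_field.split(","):
        if item == ".":
            # Assumption: a partially missing PL is treated as missing.
            return None
        values.append(int(item))
    return values


def phred_likelihood_for_genotype(pl_field: str, pl_index: int) -> Optional[int]:
    """Look up one PL value, returning None instead of raising when
    PL is missing or shorter than expected."""
    values = parse_pl(pl_field)
    if values is None or pl_index >= len(values):
        return None
    return values[pl_index]
```

With this shape, a PL of "." or a single-element PL yields `None` rather than an exception, and the caller decides how to store a missing likelihood.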

Redeployed and reloaded the VCF on vg test, but it failed again with:

  File "/mnt/variantgrid/upload/vcf/bulk_genotype_vcf_processor.py", line 191, in get_phred_likelihood_str
    pl_index = pl_index_lookup[gt]
chr7	42924353	rs552132089;rs771560161;rs869187830;rs370736797	CAA	.	57.23	PASS	AN=0;DB;DP=95;ExcessHet=0.0045;FS=0;InbreedingCoeff=0.2568;MQ=60;POSITIVE_TRAIN_SITE;QD=28.61;SOR=2.833;VQSLOD=4.88;culprit=FS	GT:AD:DP:GQ:PGT:PID:PL:PS	.|.:0:1:.:0|1:42924353_CAAA_C:0:42924353

Note: This record has an ALT of "." and GT=".|.", but cyvcf2 reports its zygosity as HOM_ALT.
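One possible workaround is to check the raw GT string for missing alleles before trusting a reported zygosity. The sketch below is an assumption about how that could look: it does not call cyvcf2, and `classify_genotype` and its return labels are hypothetical, not cyvcf2 constants.

```python
import re


def classify_genotype(gt: str) -> str:
    """Classify a raw VCF GT string such as "0|1", "1/1" or ".|."."""
    alleles = re.split(r"[|/]", gt)
    # Any missing allele (".") means zygosity cannot be determined.
    if any(allele in ("", ".") for allele in alleles):
        return "UNKNOWN"
    unique = set(alleles)
    if unique == {"0"}:
        return "HOM_REF"
    if len(unique) == 1:
        return "HOM_ALT"
    return "HET"
```

For the records above, `classify_genotype(".|.")` returns `"UNKNOWN"`, so the PL index lookup can be skipped for that sample.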

@davmlaw davmlaw reopened this Jan 13, 2021

davmlaw commented Jan 13, 2021

Raised cyvcf2 issue brentp/cyvcf2#187; will write a workaround.

davmlaw commented Jan 15, 2021

I think Frank has fixed this by removing samples with ".|." in this commit:

https://bitbucket.org/sacgf/ngs-pipelines/commits/eeb43fbff5137121fe6b5ac824bdbe3083aa5d51

This still needs to be confirmed.
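The contents of that commit are not shown in this thread. As a rough illustration of what a pipeline-side filter of this kind could look like, the sketch below drops records in which every sample's GT is fully missing. The function name `drop_missing_genotype_records` is hypothetical, and the real pipeline change may work quite differently.

```python
import re
from typing import Iterable, Iterator


def drop_missing_genotype_records(lines: Iterable[str]) -> Iterator[str]:
    """Yield VCF lines, skipping records where all sample GTs are missing."""
    for line in lines:
        if line.startswith("#"):
            yield line  # header lines pass through unchanged
            continue
        columns = line.rstrip("\n").split("\t")
        # Sites-only records (no FORMAT/sample columns) are kept as-is.
        if len(columns) < 10 or "GT" not in columns[8].split(":"):
            yield line
            continue
        gt_index = columns[8].split(":").index("GT")
        sample_gts = [sample.split(":")[gt_index] for sample in columns[9:]]
        all_missing = all(
            set(re.split(r"[|/]", gt)) <= {"."} for gt in sample_gts
        )
        if not all_missing:
            yield line
```

Usage would be along the lines of writing `drop_missing_genotype_records(open_vcf)` to a new file before upload, where `open_vcf` is an already-decompressed text file handle.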

@davmlaw davmlaw added the wontfix This will not be worked on label Jan 15, 2021

davmlaw commented Jan 15, 2021

Frank confirms that the pipeline change ensures we won't see this zygosity again, so there is no need to handle it here. Will reopen if it ever happens in another file.

@davmlaw davmlaw closed this as completed Jan 15, 2021