A deep learning SNP variant caller aimed at prokaryotic genomes.
git clone git@github.com:tmsincomb/DeepVCF.git
pip install -e ./DeepVCF
conda install -y dwgsim samtools bcftools bwa
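Before running anything, it can help to confirm that the command-line tools from the conda line above are actually on your PATH. The snippet below is a minimal sketch of such a check; `missing_tools` and `REQUIRED_TOOLS` are hypothetical names used here for illustration and are not part of DeepVCF.

```python
import shutil

# Command-line tools named in the conda install line above.
REQUIRED_TOOLS = ("dwgsim", "samtools", "bcftools", "bwa")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on PATH (hypothetical helper)."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    print("missing: " + ", ".join(missing) if missing else "all tools found")
```

An empty result means all four executables were found; anything listed probably needs the conda environment activated or reinstalled.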
from DeepVCF.core import DeepVCF
deepvcf = DeepVCF()
deepvcf.train(reference_file, alignment_file, vcf_file) # vcf treated as truth
vcf_df = deepvcf.create_vcf(
    reference_file=query_ref_file,
    alignment_file=query_align_file,
    output_folder='./',
    output_prefix='my-variants',  # '.deepvcf.vcf' is appended automatically to the output file name
)
vcf_df.head() # shows pandas DataFrame for variant outputs
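The written file can also be read back later without DeepVCF. The following is a minimal sketch, assuming the output is a standard tab-delimited VCF with a `#CHROM ...` header line; `read_vcf_records` is a hypothetical helper, not part of the DeepVCF API.

```python
def read_vcf_records(path):
    """Parse a VCF file into a list of dicts keyed by the header columns.

    Hypothetical helper: assumes a standard tab-delimited VCF layout.
    """
    records = []
    columns = None
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if not line or line.startswith("##"):
                continue  # skip blank lines and meta-information lines
            fields = line.split("\t")
            if line.startswith("#"):
                # Header line: "#CHROM" becomes "CHROM"
                columns = [name.lstrip("#") for name in fields]
                continue
            if columns is None:
                raise ValueError("VCF header line (#CHROM ...) not found before data")
            record = dict(zip(columns, fields))
            record["POS"] = int(record["POS"])  # positions are 1-based integers
            records.append(record)
    return records
```

With the call above, the file name would presumably be `./my-variants.deepvcf.vcf`, so `read_vcf_records('./my-variants.deepvcf.vcf')` should return one dict per called variant.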
Recreating Example Datasets
Usage Demo with In Silico datasets
Model Validation with human datasets from GIAB