An algorithm implemented in Python to identify putative protein-coding exons from a reference genome and a set of splice sites.

mattdoug604/exon_trap

Exon_Trap

This is a program for identifying putative protein-coding exons. Conceptually similar to the experimental technique of the same name, it takes a set of known or predicted splice sites and a reference genome, and defines the exons that could make up protein-coding genes (that is, exons that do not contain a premature stop codon).
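To illustrate the "no premature stop codon" criterion, here is a minimal sketch in plain Python. It is not the code used by Exon_Trap; the function name `open_frames` is hypothetical, it assumes the standard stop codons (TAA, TAG, TGA), and it only considers the three forward-strand reading frames.

```python
# Illustrative sketch only; not taken from the Exon_Trap source.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code


def open_frames(seq):
    """Return the forward frames (0, 1, 2) in which seq has no in-frame stop codon."""
    seq = seq.upper()
    frames = []
    for frame in range(3):
        # Walk complete codons only; a trailing partial codon is ignored.
        codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
        if not any(codon in STOP_CODONS for codon in codons):
            frames.append(frame)
    return frames
```

A candidate exon whose sequence returns an empty list here could not be coding on the forward strand in any frame. A reverse-strand check would apply the same function to the reverse complement.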

Input:

  • aligned reads (BAM format)
  • introns (GFF3 format)
  • index of "translation blocks" encoded by the reference genome
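As a rough illustration of how intron records translate into splice-site positions, the sketch below reads GFF3 lines and reports the exon boundaries that flank each intron. It is not part of Exon_Trap: the function names are hypothetical, and it assumes the feature type in column 3 is literally `intron`. GFF3 coordinates are 1-based and inclusive, so the bases adjacent to an intron are `start - 1` and `end + 1`.

```python
# Illustrative sketch only; function names are hypothetical.
def parse_introns(gff3_lines):
    """Yield (seqid, start, end, strand) for each intron feature in GFF3 lines."""
    for line in gff3_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        fields = line.split("\t")
        # Assumption: introns are labelled "intron" in the type column.
        if len(fields) < 9 or fields[2] != "intron":
            continue
        yield fields[0], int(fields[3]), int(fields[4]), fields[6]


def flanking_exon_bounds(intron):
    """Return (last base before the intron, first base after it), 1-based."""
    _, start, end, _ = intron
    return start - 1, end + 1
```

On the forward strand the first value is the end of the upstream exon and the second is the start of the downstream exon. On the reverse strand the roles are swapped.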

Requires Python 3 and the following modules:

  • Biopython
  • pysam

Before running:

The main script requires an indexed genome sequence. Run 'exon_trap/generate_index.py' on your genome sequence (in FASTA format) to build the index. Note: this only has to be done once per genome.
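This README does not define "translation blocks" or describe the index format. Purely as an assumption, one plausible reading is that a translation block is a maximal run of codons in one reading frame that is not interrupted by a stop codon. Under that assumption, a sketch of finding such blocks might look like the following. The function name `translation_blocks` is hypothetical and this is not the logic of 'generate_index.py'.

```python
# Illustrative sketch only. ASSUMPTION: a "translation block" is a maximal
# stop-codon-free run of codons in a single forward reading frame.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def translation_blocks(seq, frame):
    """Return 0-based, half-open (start, end) intervals of stop-free codon runs."""
    seq = seq.upper()
    blocks = []
    block_start = frame
    pos = frame
    while pos + 3 <= len(seq):
        if seq[pos:pos + 3] in STOP_CODONS:
            if pos > block_start:
                blocks.append((block_start, pos))  # close the block before the stop
            block_start = pos + 3  # next block starts after the stop codon
        pos += 3
    if pos > block_start:
        blocks.append((block_start, pos))  # block running to the end of the sequence
    return blocks
```

Precomputing intervals like these once per genome would make it cheap to ask whether a candidate exon lies entirely inside a stop-free region, which may be why indexing is a separate one-time step.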

To run:

exon_trap [options] <prefix/of/index/files> <introns.gff3> <sorted_alignments.bam>
