Ptolemy is a reference-free approach for analysing microbial genome architectures, in particular gene and structural diversity. In a nutshell, it uses a "top-down" approach to align multiple genomes via synteny analysis. The output is a gene-based population genome graph describing the genes and structural variants that are unique to, or shared across, a population. It requires a set of FASTA-formatted assemblies and corresponding GFF-formatted annotations.
You can read more about it in our publication.
This is an experimental branch in an ongoing collaborative project for studying genome architectures of bacteria phages.
Aside from some optimizations, this branch adds an experimental, standalone aligner for (noisy) long reads. In essence: use Ptolemy to build gene-based population genome graphs of available bacteria-phage genomes, then align long reads from a metagenomic sequencing run to identify existing or new architectures.
As an example, a graph of all 146 Pseudomonas genomes available from NCBI, followed by the alignment of a barcoded sample from a metagenomic nanopore sequencing run generated by undergraduate students:
Executable jar files are available under releases.
Ptolemy requires minimap2, which it uses for pairwise gene alignment during database creation and syntenic anchoring.
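Before starting a run, it can help to confirm that the external tools are reachable. The snippet below is a minimal sketch, not part of Ptolemy: `require_tool` is a hypothetical helper, and it assumes that Ptolemy finds minimap2 through your `PATH`.

```shell
# Hypothetical pre-flight check (not part of Ptolemy).
# Assumption: Ptolemy locates minimap2 through PATH.
require_tool() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "error: '$1' not found on PATH" >&2
    return 1
  fi
}

# Usage: require_tool minimap2 && require_tool java
```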
Ptolemy requires a tab-delimited file with one line per genome, containing a unique sample identifier, the path to the assembly, and the path to the gene annotations. For example:
Genome1 path/to/assembly/genome1.fa path/to/annotations/genome1.gff
Genome2 path/to/assembly/genome2.fa path/to/annotations/genome2.gff
Genome3 path/to/assembly/genome3.fa path/to/annotations/genome3.gff
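A malformed genome list is easy to produce by hand, so a quick sanity check before running `extract` may save time. The function below is a sketch under stated assumptions: `check_genome_list` is a hypothetical helper, not a Ptolemy command, and it only checks what is described above (three tab-separated fields, unique identifiers) plus whether the listed files exist.

```shell
# Hypothetical sanity check for a genome list (not part of Ptolemy).
# Returns non-zero if a line lacks three tab-separated fields, repeats an
# identifier, or points at a file that does not exist.
check_genome_list() {
  local list status tab id asm gff f
  list=$1
  status=0
  tab=$(printf '\t')

  # Field count and identifier uniqueness; blank lines are ignored.
  awk -F'\t' '
    NF == 0    { next }
    NF != 3    { printf "line %d: expected 3 fields, found %d\n", NR, NF; bad = 1 }
    seen[$1]++ { printf "line %d: duplicate identifier %s\n", NR, $1; bad = 1 }
    END        { exit bad }
  ' "$list" >&2 || status=1

  # Existence of every assembly and annotation file.
  while IFS="$tab" read -r id asm gff || [ -n "$id" ]; do
    [ -n "$id" ] || continue
    for f in "$asm" "$gff"; do
      if [ ! -f "$f" ]; then
        echo "$id: missing file '$f'" >&2
        status=1
      fi
    done
  done < "$list"

  return "$status"
}

# Usage: check_genome_list genome_list.txt && echo "genome list looks fine"
```

Relative paths are resolved against the current directory, so run the check from the directory in which you intend to run Ptolemy.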
There are three main steps in Ptolemy:
- Database creation ( java -jar ptolemy.jar extract ... )
- Multiple-genome alignment via syntenic anchoring ( java -jar ptolemy.jar syntenic-anchors ... )
- Canonical graph construction ( java -jar ptolemy.jar canonical-quiver ... )
The experimental steps:
- Index canonical quiver ( java -jar ptolemy.jar index-graph ... )
- Long-read alignment ( java -jar ptolemy.jar align-reads ... )
A typical workflow:
#graph construction
java -jar ptolemy.jar extract -g genome_list.txt -o ptolemy_db
java -jar ptolemy.jar syntenic-anchors --db ptolemy_db -o .
java -jar ptolemy.jar canonical-quiver -s syntenic_anchors.txt --db ptolemy_db -o .
#long-read alignment (uses the database created by the extract step above)
java -jar ptolemy.jar index-graph -c canonical_quiver.gfa --db ptolemy_db
java -jar ptolemy.jar align-reads -r reads.fa -c canonical_quiver.gfa --db ptolemy_db -o . -p alignment
The graph is stored as a GFA-formatted file and can be visualized with graph visualizers such as Bandage.
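For a quick look at the graph without opening a visualizer, the record types in the GFA file can be counted on the command line. This is a minimal sketch: `gfa_summary` is a hypothetical helper, and it assumes the file follows the usual GFA layout, in which each line is tab-delimited and the first field gives the record type (`S` for segments, `L` for links). How Ptolemy maps genes and edges onto those records is not verified here.

```shell
# Hypothetical helper (not part of Ptolemy): count segment and link records.
# Assumption: standard GFA layout, record type in the first tab-delimited field.
gfa_summary() {
  awk -F'\t' '
    $1 == "S" { segments++ }
    $1 == "L" { links++ }
    END { printf "segments\t%d\nlinks\t%d\n", segments, links }
  ' "$1"
}

# Usage: gfa_summary canonical_quiver.gfa
```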
Test data are available in the 'testing_data' directory, which contains full PacBio assemblies of a single yeast chromosome from three genomes.