Supplementary material

This is the GRASP-resources page, where you will find:

1. Supplementary material for the GRASP paper

2. Python notebooks designed to help with curating and analysing ancestral sequence reconstruction datasets

You can also learn more about GRASP at the GRASP-suite website and use GRASP now.

Supplementary material

All of the supplementary material for the GRASP paper is stored in the Supplementary Material folder. Refer to the README within that folder for further information.

Notebooks

These Jupyter notebooks are split into two sections.

Curation - aligning, curating, and handling files before ancestral inference

Post Inference Analysis - analysing data sets after ancestral inference

How to use this repository

Clone this repository to your desktop

git clone https://github.com/bodenlab/GRASP-resources.git

Install the required Python modules as specified in requirements.txt (we assume Python >= 3.5)

pip install -r requirements.txt

Some notebooks require additional code that is stored in the /src folder. The notebooks will run correctly as long as the src folder stays in the same location relative to them.
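If you do move a notebook, one possible workaround is to add the src folder to the Python import path yourself. The snippet below is only a sketch: it assumes the notebook was started from the main folder of the repository, and it is not taken from the notebooks themselves.

```python
import os
import sys

# Assumption: the current working directory is the main folder of the
# repository, so the helper code lives in ./src
src_dir = os.path.abspath(os.path.join(os.getcwd(), "src"))

# Put src at the front of the import path so modules in it can be imported
if src_dir not in sys.path:
    sys.path.insert(0, src_dir)

print(src_dir)
```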

For Curation 5, the standard package of MAFFT is required for multiple sequence alignment.

Follow the MAFFT installation instructions for your platform.
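To confirm that MAFFT is visible to Python before opening Curation 5, a quick check along these lines may help. It assumes the executable is installed under the name mafft and is on your PATH; your installation may differ.

```python
import shutil

# Assumption: the MAFFT executable is called "mafft" and is on the PATH
mafft_path = shutil.which("mafft")

if mafft_path is None:
    print("MAFFT was not found on the PATH; install it before running Curation 5")
else:
    print("MAFFT found at", mafft_path)
```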

Now you can start a Jupyter notebook from the main folder

jupyter-notebook

You can then navigate to the individual notebooks and run the Python code within them.

Notebooks Table of Contents

Curation 1 - Basic file handling

This notebook shows ways to read FASTA files into Python and perform basic operations on them.
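As a rough illustration of the kind of operation covered, the sketch below reads FASTA text into a dictionary using only the standard library. The function name read_fasta and the example sequences are made up for this illustration; the notebook itself may rely on other modules, such as the code in the src folder.

```python
from io import StringIO


def read_fasta(handle):
    """Parse FASTA text from a file-like object into a dict of header -> sequence."""
    records = {}
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one
            if header is not None:
                records[header] = "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines can wrap over several lines, so collect them
            chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)
    return records


# Hypothetical two-record FASTA file held in memory
example = StringIO(">seq1 example one\nMKV\nLAA\n>seq2 example two\nMKI\n")
records = read_fasta(example)
lengths = {name: len(seq) for name, seq in records.items()}
print(lengths)
```

To read from disk, pass an open file instead of the StringIO object, for example with open("my_sequences.fasta") as handle (the file name here is a placeholder).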

Curation 2 - Sequence curation

This notebook shows how to filter sequence data sets on the basis of their headers and how to summarise the species information within them.
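A minimal sketch of header-based filtering and species counting follows. It assumes UniProt-style headers in which the species appears after OS=; the function names, the example headers, and the choice of "hypothetical" as an unwanted term are illustrative only and are not drawn from the notebook.

```python
import re
from collections import Counter

# Hypothetical records keyed by UniProt-style header
records = {
    "sp|P1|A_HUMAN Protein A OS=Homo sapiens OX=9606": "MKV",
    "tr|Q2|B_MOUSE Hypothetical protein OS=Mus musculus OX=10090": "MKI",
    "sp|P3|C_HUMAN Protein C OS=Homo sapiens OX=9606": "MRV",
}


def filter_by_header(records, unwanted_terms):
    """Drop records whose header contains any unwanted term (case-insensitive)."""
    lowered = [term.lower() for term in unwanted_terms]
    return {
        header: seq
        for header, seq in records.items()
        if not any(term in header.lower() for term in lowered)
    }


def species_counts(records):
    """Count records per species, reading the OS= field of each header."""
    # Capture the text after OS= up to the next two-letter field such as OX=
    pattern = re.compile(r"OS=(.+?)(?= [A-Z]{2}=|$)")
    counts = Counter()
    for header in records:
        match = pattern.search(header)
        counts[match.group(1) if match else "unknown"] += 1
    return counts


kept = filter_by_header(records, ["hypothetical"])
summary = species_counts(kept)
print(summary)
```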

Curation 3 - Checking exon counts

This notebook shows how to query the NCBI database to retrieve exon counts for a sequence data set.

Curation 4 - Mapping exon structure

This notebook shows how to map the exon structure information onto a multiple sequence alignment.
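The core of such a mapping is translating positions in an ungapped sequence to columns of the alignment. The sketch below illustrates that idea under simplifying assumptions: exon sizes are given as residue counts per exon, and the function names are invented for this example. The notebook's own data structures may differ.

```python
def ungapped_to_column(aligned_seq, gap_chars="-."):
    """Return a list in which index i holds the alignment column of residue i."""
    return [col for col, char in enumerate(aligned_seq) if char not in gap_chars]


def map_exons(aligned_seq, exon_lengths):
    """Map exon lengths (in residues) to inclusive (start, end) alignment columns."""
    columns = ungapped_to_column(aligned_seq)
    if sum(exon_lengths) != len(columns):
        raise ValueError("exon lengths do not add up to the ungapped sequence length")
    spans = []
    start = 0
    for length in exon_lengths:
        end = start + length - 1
        spans.append((columns[start], columns[end]))
        start = end + 1
    return spans


# Hypothetical aligned sequence with 7 residues split into exons of 3 and 4
aligned = "MK--VLA-AG"
spans = map_exons(aligned, [3, 4])
print(spans)
```

Because gaps shift residues to the right, the first exon of three residues spans alignment columns 0 to 4 in this example.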

Curation 5 - Sequence curation for ancestral sequence reconstruction

This notebook shows how to automatically and iteratively remove sequences from a data set on the basis of length, bad characters, motifs, and internal deletions.
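The sketch below shows a single pass of this kind of filtering, covering length, bad characters, and a required motif. It is an assumption-laden illustration rather than the notebook's method: the function name curate, the default bad-character set, and the C.H motif are all hypothetical, and the internal-deletion check (which needs an alignment) is left out.

```python
import re


def curate(records, min_length=0, bad_chars="XBZJ", required_motif=None):
    """Split records into kept sequences and removed names with a reason."""
    kept, removed = {}, {}
    motif = re.compile(required_motif) if required_motif else None
    for name, seq in records.items():
        if len(seq) < min_length:
            removed[name] = "too short"
        elif any(char in bad_chars for char in seq.upper()):
            removed[name] = "bad character"
        elif motif is not None and not motif.search(seq):
            removed[name] = "motif missing"
        else:
            kept[name] = seq
    return kept, removed


# Hypothetical sequences; "C.H" is a made-up motif for illustration
records = {"a": "MKVLCGHAA", "b": "MKV", "c": "MKXLCGHAA", "d": "MKVLAAAAA"}
kept, removed = curate(records, min_length=5, required_motif=r"C.H")
print(kept)
print(removed)
```

Recording a reason for each removal makes it easier to review what was dropped before repeating the process on the remaining sequences.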

Post inference analysis 1 - Analysis of fractional distance

This notebook allows you to analyse how the amino acid sequence at equivalent nodes changes as we increase data set size. You can specify nodes of interest in the smallest data set, which are then mapped to the equivalent nodes in the larger data sets, and then the fractional distance is calculated and plotted for all given nodes. This analysis was performed in the GRASP paper (see Figure 3).

The default notebook uses the DHAD and CYP2 data sets and recreates figures from the GRASP paper, but it can easily be adapted to your own data sets.
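For orientation, one simple way to express a fractional distance between two aligned sequences is the fraction of compared columns at which they differ. The sketch below uses that definition as an assumption; this page does not spell out the exact calculation, so check the notebook and the GRASP paper before relying on it. The function name and example sequences are hypothetical.

```python
def fractional_distance(seq_a, seq_b, gap="-"):
    """Fraction of aligned columns at which two sequences differ.

    Assumption: columns where both sequences have a gap are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    different = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap and b == gap:
            continue  # nothing to compare in a column of two gaps
        compared += 1
        if a != b:
            different += 1
    return different / compared if compared else 0.0


# Hypothetical sequences for the same node inferred from two data set sizes
small = "MKV-LA"
large = "MRV-LG"
d = fractional_distance(small, large)
print(d)
```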
