
1. Downloading and processing MitoFish data

shenjean edited this page Nov 8, 2024 · 16 revisions

MitoFish website: http://mitofish.aori.u-tokyo.ac.jp/download.html

Complete + partial mtDNA sequence file

mkdir NCBI
cd NCBI
wget http://mitofish.aori.u-tokyo.ac.jp/files/complete_partial_mitogenomes.zip
unzip complete_partial_mitogenomes.zip
grep ">" mito-all | cut -d "|" -f2 > mitofish.accession
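A quick sanity check on the extracted list can catch empty or repeated entries before the accessions are used downstream. The sketch below is only an illustration: the helper name `summarize_accessions` is hypothetical, and it assumes `mitofish.accession` holds one accession per line, as produced by the `cut` command above.

```shell
# Hypothetical helper (not part of MitoFish): summarize an accession list
# that contains one accession per line.
summarize_accessions() {
    acc_file="$1"
    # Count non-empty lines.
    total=$(grep -c . "$acc_file")
    # Count distinct non-empty lines; tr strips the padding some wc builds add.
    unique=$(grep . "$acc_file" | sort -u | wc -l | tr -d ' ')
    echo "total=$total unique=$unique"
}

# Run the check only if the file from the step above exists.
if [ -f mitofish.accession ]; then
    summarize_accessions mitofish.accession
fi
```

If `total` and `unique` differ, some accessions appear more than once in the downloaded file.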

Complete mitogenomes

Download complete mitogenomes from MitoFish

mkdir mitogenomes
cd mitogenomes
wget http://mitofish.aori.u-tokyo.ac.jp/files/mitogenomes.zip
unzip mitogenomes.zip

Prepare reference file from complete full-length mitogenomes

  • Get list of accession numbers from mitogenomes folder:
ls *.fa | cut -d "_" -f1,2 | grep -v complete >../complete.full.accession
cd ..
  • Generate a file of gene descriptions. For complete mitogenomes, each description has the format: [species name], complete mitogenome
grep ">" mitogenomes/*.fa | cut -d "|" -f7 | sed "s/$/, complete mitogenome/" >complete.full.def
  • Combine accession and description files into a tab-separated table with header:
printf 'accession\tgene definition\n' >complete.gene.header
paste -d "\t" complete.full.accession complete.full.def >complete.full.list
cat complete.gene.header complete.full.list >complete.full.gene.tsv
  • Contents of complete.full.gene.tsv:
accession       gene definition
NC_000860       Salvelinus fontinalis, complete mitogenome
NC_000861       Salvelinus alpinus, complete mitogenome
NC_000890       Mustelus manazo, complete mitogenome
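Because `paste` joins the two files purely by line position, the table is only correct when `complete.full.accession` and `complete.full.def` have the same number of lines, in the same order. A minimal check along these lines may help; the helper name `same_line_count` is hypothetical, and the check assumes each `.fa` file contributes exactly one header line.

```shell
# Hypothetical helper: succeed only if both files have the same number of lines.
same_line_count() {
    count_a=$(wc -l < "$1" | tr -d ' ')
    count_b=$(wc -l < "$2" | tr -d ' ')
    [ "$count_a" -eq "$count_b" ]
}

# Run the check only if both intermediate files exist.
if [ -f complete.full.accession ] && [ -f complete.full.def ]; then
    if same_line_count complete.full.accession complete.full.def; then
        echo "OK: line counts match"
    else
        echo "WARNING: line counts differ; rows in complete.full.gene.tsv may be misaligned" >&2
    fi
fi
```

A mismatch would suggest that some file has more than one header line, or that a file name did not yield an accession, so the rows should be inspected before the table is used.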