RAD Tools

This repository contains several tools built by CoBiG² members to help us deal with RAD and GBS data. The existing tools are mentioned below.

Baypass_workflow.R

This script has moved to a new repository!

geste2baypass.py

This script will convert a file from the GESTE format to BayPass input.

VCF2phy

This is a converter from VCF to phylip format. ~~It currently does not create the phylip header, you have to do it yourself.~~ Headers are created fine too.

FDR.R

This is a script made in R used to calculate the Q-values from the p-values csv table that is outputed from lositan. The input is the mentioned csv, and a new csv is outputed, with an extra q-values column. That simple.

struct_to_distruct.py

This is an ugly script that will transform an output from STRUCTURE into an input for DISTRUCT. You need to have a trailing "/" at the end of the output location, or you'll get strangely named files in the parent directory of the one you are pointing at...

Loci_counter.py

A simple script to add a number to each loci in .loci files. Outputs to stdout, so "pipe away"!

structure_filter.py

Another simple script that will filter the columns of a .structure file according to a "subset" file that contains nothing but the wanted marker names, one per line.

bayenv2_results_miner.py

A simple script to look mine the data files produced by Bayenv2. It will look for any SNPs which have a BF>=10 AND a spearman p-value < 0.05 (these values can be changed in the source).

filter_loci_by_vcf.py

This script will discard any information in a .loci file that is not referenced in the provided vcf file.

singleton_site_remover.py

This script will remove any singleton sites from a "phylip" formatted file. Takes one argument - the phylip file. The output will be written to a new file which has the same name as the original, but with the prefix "_no_singletons.phy".

correct_geste.py

This script will correct some inconsistencies in GESTE files created by PGDSpider. Sometimes, when all alleles are of the same type, the get presented in the geste files as:

## X    2   X 0

Other time they show up instead like this:

## X    1   X

This script modifies the file so that they are always presented in the first form. It reads a GESTE file and spits the output to STDOUT, so you might want to use a shell redirect.

vcf2dadi.py

This script will convert any VCF file into dadi input. It take 3 arguments (input, output and population_information). The "population_information" argument should be a text file with one line per population. Each line should start with the population name, followed by ":" followed by the names of the samples that belong to the referred population, separated by whitespace, like this:

Pop1:Sample1 Sample2 Sample3
Pop2:Sample4 Sample5 Sample6 Sample7

This script requires the PyVCF package

snp_pca.R / snp_pca_static.R

Script to perform a PCA based on a VCF file. It will optionally take a "populations" file to gather individuals into populations. This is a TAB separated file where each line contains the individual name and the respective population. It will look somewhat like this:


Indiv1  Pop1
Indiv2  Pop1
Indiv3  Pop2
Indiv4  Pop2

The "static" version will produce a PDF, whereas the "non-static" will produce a plotly HTML file.

segregating_loci_finder.py

This script will find any loci that are fully segregated between two groups in a VCF file. It is very basic so far, and requires all individuals from a group to be sequential, eg. the first 3 individuals are group 1 and the remainder are group 2. It takes 2 arguments, the first is the path to your VCF file, and the second is the number of individuals in the first group. Also, only supports 2 groups. The segregation can be of any type, but it has to be 100%. This means that a locus that is 100% heterozygous in group 1 is considered segregated from between groups as long as group 2 has no heterozygous individuals.

bl_gff_2_annotation.py

Script to filter annotations on a GFF file based on the contents of a tabular blast output.

filter_replicates_vcf

Script that receives a vcf file with technical replicated samples, removes the lines with contrary information between replicates and marks as missing data the snp's that show a percentage of missing data better than [-m X] between their replicates.

It also takes as an input a file with the names of the replicates for each individual (as shown in the /Examples directory).

License

Everything is under the GPLv3.

Name		Name	Last commit message	Last commit date
Latest commit History 234 Commits
Examples		Examples
.Rhistory		.Rhistory
.gitignore		.gitignore
.pairwise_ld.py.swp		.pairwise_ld.py.swp
BioGenepop.py		BioGenepop.py
Demerelate_wrapper.R		Demerelate_wrapper.R
FDR.R		FDR.R
LICENSE		LICENSE
Loci_counter.py		Loci_counter.py
Loci_filter_from_vcf.py		Loci_filter_from_vcf.py
README.md		README.md
SpaceMix_workflow.R		SpaceMix_workflow.R
VCF2phy.py		VCF2phy.py
VCF_SNP_number_matcher.py		VCF_SNP_number_matcher.py
VCF_corresponder.py		VCF_corresponder.py
bayenv2_results_miner.py		bayenv2_results_miner.py
bl_gff_2_annotation.py		bl_gff_2_annotation.py
blast_tab_parser.py		blast_tab_parser.py
compare_maps.py		compare_maps.py
compare_pairs.py		compare_pairs.py
correct_geste.py		correct_geste.py
filter_loci_by_vcf.py		filter_loci_by_vcf.py
filter_replicates_vcf.py		filter_replicates_vcf.py
geste2baypass.py		geste2baypass.py
keep_central_snp.py		keep_central_snp.py
loci2phy.py		loci2phy.py
loci_and_vcf_to_GPhoCS.py		loci_and_vcf_to_GPhoCS.py
loci_consensus.py		loci_consensus.py
oldloci2phy.py		oldloci2phy.py
outlier_removal.py		outlier_removal.py
pairLD.R		pairLD.R
pairwise_ld.py		pairwise_ld.py
plot_structure.py		plot_structure.py
segregating_loci_finder.py		segregating_loci_finder.py
singleton_site_remover.py		singleton_site_remover.py
skyline_creator.py		skyline_creator.py
snp_pca.R		snp_pca.R
snp_pca_static.R		snp_pca_static.R
strcture_filter.py		strcture_filter.py
struct_to_distruct.py		struct_to_distruct.py
vcf2bcg.py		vcf2bcg.py
vcf2dadi.py		vcf2dadi.py
vcf2dapc.R		vcf2dapc.R
vcf2fas.py		vcf2fas.py
vcf2snapp.py		vcf2snapp.py
vcf2spacemix.sh		vcf2spacemix.sh
vcf2treemix.py		vcf2treemix.py
vcf_parser.py		vcf_parser.py
vcftools_stats.py		vcftools_stats.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

RAD Tools

Baypass_workflow.R

geste2baypass.py

VCF2phy

FDR.R

struct_to_distruct.py

Loci_counter.py

structure_filter.py

bayenv2_results_miner.py

filter_loci_by_vcf.py

singleton_site_remover.py

correct_geste.py

vcf2dadi.py

snp_pca.R / snp_pca_static.R

segregating_loci_finder.py

bl_gff_2_annotation.py

filter_replicates_vcf

License

About

Releases

Packages

Contributors 4

Languages

License

CoBiG2/RAD_Tools

Folders and files

Latest commit

History

Repository files navigation

RAD Tools

Baypass_workflow.R

geste2baypass.py

VCF2phy

FDR.R

struct_to_distruct.py

Loci_counter.py

structure_filter.py

bayenv2_results_miner.py

filter_loci_by_vcf.py

singleton_site_remover.py

correct_geste.py

vcf2dadi.py

snp_pca.R / snp_pca_static.R

segregating_loci_finder.py

bl_gff_2_annotation.py

filter_replicates_vcf

License

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 4

Languages

Packages