Introduction

An identical-by-descent (IBD) segment is an identical-by-state (IBS) segment shared by a pair of haplotypes/individuals that is inherited from a common ancestor without recombination. The concept is important in understanding the genetic relatedness, population structure and effective population size in recent generations. As different individual pairs may share IBD segments of different lengths over different genomic regions, the number of IBD segments increase quadratically with increasing sample size, and proportionally with genome size. High recombination rate as well as genotyping/phasing errors can break down long IBD segments into multiple shorter segments, making the scale even larger.

Recently, several IBD calling tools, such as hap-IBD, iLASH, RaPID, and TPBWT, have been designed to efficiently detect IBD segments using phased genotype data. However, fewer tools were designed to handle this large-scale IBD data. The convenience of R and python in handling the IBD result (table/data frame) comes with a cost of memory and a possibly slower computation speed. Depending on the computing environment, high requirement of memory and dynamically reallocating memory might lead to instability of these programs.

ibdtools is an attempt to improve the efficiency of handling large scale IBD segments information by:

better encoding the IBD segments
stabler and tighter memory management
implementing some commonly-used algorithms in c/c++

Currently, the functionalities and algorithms are very basic and probably not the most efficient, but has been used in real research projects. As projects progress, we will add more functions into this program.

ibdtools encode: encode the IBD file, VCF file and plink map file into binary format for better/quicker IO
ibdtools split: remove IBD segments overlapping with given regions or calculated regions with low SNP density. This can be useful for removing false positive IBD segments in regions of low SNP density, or correcting selection-induced bias.
ibdtools sort: sort large IBD files by implementing an external sorting algorithm. It must be run before calling ibdtools merge.
ibdtools merge: flatten haplotype-pair IBD segments into individual-pair IBD segments by merging all IBD segments shared by a pair of individuals when they overlap or are separated by a short gap with only a few discordant sites. This sub-command reimplements the algorithm in Dr. Browning's merge-ibd-segments tool for stability and consistency purposes.
ibdtools matrix: allow aggregating IBD segments into chromosome-wide and genome-wide total IBD and output total IBD matrix for downstream analyses. During the aggregation process, IBD can be filtered at different levels including the subpopulation, IBD segment length, and total IBD length.
ibdtools snpdens: calculate SNP density across chromosome.
ibdtools coverage: calculate IBD coverage across chromosome.
ibdtools view: search and print IBD segments belonging to a specified pair of individuals.
ibdtools decode: convert the processed, encoded IBD back to text format for readability and downstream analysis.
ibdtools stat: currently, only calculate the IBD length distribution.

Clone the repo and the submodules

git clone [email protected]:gbinux/ibdtools.git --recurse-submodules
cd ibdtools

Dependencies

htslib
fmt
gtest

They can be installed using conda

conda env create -f env.yml

Compile `ibdtools`

conda activate ibdtools
meson build
ninja -C build

ibdtools binary file will be made under ./build/

Simulated data and example

cd example

# simulating data
./simulate_data.py

# running ibdtools on the simulated input data
./run_ibdtools.sh

# Read ibdtools matrix to numpy array
./read_mat.py

Documentation

List all available subcommands

./ibdtools

Show help message/documentation for a subcommand

./ibdtools [subcommand] -h

Name		Name	Last commit message	Last commit date
Latest commit History 137 Commits
appimage		appimage
data		data
example		example
include		include
src		src
.clang-format		.clang-format
.gitignore		.gitignore
.gitmodules		.gitmodules
LICENSE		LICENSE
Makefile		Makefile
Readme.md		Readme.md
env.yml		env.yml
git_version.txt		git_version.txt
meson.build		meson.build

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Introduction

Clone the repo and the submodules

Dependencies

Compile `ibdtools`

Simulated data and example

Documentation

About

Releases

Packages

Languages

License

umb-oconnorgroup/ibdtools

Folders and files

Latest commit

History

Repository files navigation

Introduction

Clone the repo and the submodules

Dependencies

Compile ibdtools

Simulated data and example

Documentation

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Compile `ibdtools`

Packages