Integrating Single-Cell Results for Exploring and Analyzing Methylation
Analysis and visualization of Whole Genome Bisulfite Sequencing (WGBS) data require reading aligned sequencing data into formats that existing packages like BSseq and scMET can analyze. Getting the data from on-disk formats like BED files to a matrix of methylation values can be difficult because, with nearly 30 million CpGs, WGBS data can be quite large.
iscream aims to efficiently read aligned (sc)WGBS data into formats that can be used by other packages. iscream uses htslib to query genomic regions and either build matrices for BSseq or aggregate the methylated reads for scMET.
iscream depends on the htslib header files. These may be installed with your package manager:
- ubuntu/debian: libhts-dev
- fedora/RHEL: htslib-devel
- brew: htslib
- nixpkgs: htslib
or built manually: https://www.htslib.org/download/.
The header files may also be found among your HPC modules; make sure the PKG_CONFIG_PATH environment variable includes the pkgconfig location for your installation of htslib. You can verify that the htslib development libraries are installed with pkg-config:
pkg-config --cflags --libs htslib
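For a manual or module-based installation, the PKG_CONFIG_PATH step above might look like the following sketch. The HTSLIB_PREFIX variable and its default path are hypothetical placeholders, not something iscream defines, and the lib/pkgconfig subdirectory is only the usual location of htslib.pc, so adjust both to match your installation.

```shell
# Hypothetical install prefix; replace with where htslib lives on your system.
HTSLIB_PREFIX="${HTSLIB_PREFIX:-$HOME/opt/htslib}"

# Prepend the directory expected to contain htslib.pc, keeping existing entries.
export PKG_CONFIG_PATH="$HTSLIB_PREFIX/lib/pkgconfig${PKG_CONFIG_PATH:+:$PKG_CONFIG_PATH}"

# Report whether pkg-config can now resolve htslib.
if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists htslib; then
    echo "htslib found: $(pkg-config --modversion htslib)"
else
    echo "htslib not found; check HTSLIB_PREFIX and PKG_CONFIG_PATH"
fi
```

Run this in the same shell session that will install iscream, since the exported variable does not persist across sessions.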
GCC must be installed for OpenMP support. It is usually installed by default on Linux systems, but may need to be installed manually on macOS to use iscream with multiple threads [1].
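One rough way to check whether a compiler supports OpenMP is to try building a tiny OpenMP program with it. The has_openmp helper below is a hypothetical name used only for this sketch, and defaulting to gcc is an assumption: R may be configured to build packages with a different compiler than the one found on your PATH.

```shell
# Return 0 if the given C compiler can build a small OpenMP program.
has_openmp() {
    cc="$1"
    tmpdir=$(mktemp -d) || return 1
    # Minimal program that needs both the OpenMP header and runtime library.
    printf '#include <omp.h>\nint main(void) { return omp_get_max_threads() > 0 ? 0 : 1; }\n' \
        > "$tmpdir/check.c"
    "$cc" -fopenmp "$tmpdir/check.c" -o "$tmpdir/check" >/dev/null 2>&1
    status=$?
    rm -rf "$tmpdir"
    return "$status"
}

# Assumption: check $CC if it is set, otherwise fall back to gcc.
if has_openmp "${CC:-gcc}"; then
    echo "OpenMP is available with ${CC:-gcc}"
else
    echo "OpenMP is not available with ${CC:-gcc}"
fi
```

A failed check only means that this particular compiler invocation could not build with -fopenmp; it says nothing about other compilers installed on the system.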
You can install the development version from GitHub by cloning the repository and installing it with R CMD INSTALL:
git clone https://github.com/huishenlab/iscream
R CMD INSTALL iscream
You can also use the R devtools package:
devtools::install_github("huishenlab/iscream")
or pak:
pak::pkg_install("huishenlab/iscream")
A user guide is available on the package website. Bug reports may be submitted through GitHub issues.
Footnotes
1. Using OpenMP is also possible with Clang on macOS (https://mac.r-project.org/openmp/), but installing GCC with Homebrew may be easier (https://formulae.brew.sh/formula/gcc).