HLApers

License

HLApers integrates software such as kallisto, Salmon and STAR. Before using it, please read the license notices in license.txt.

Getting started

Install required software

1. HLApers

git clone https://github.com/genevol-usp/HLApers.git

2. R v3.4+

3. In R, install the following packages

from Bioconductor:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("Biostrings")

from GitHub:

if (!requireNamespace("devtools", quietly = TRUE))
    install.packages("devtools")

devtools::install_github("genevol-usp/hlaseqlib")

4. For STAR-Salmon-based pipeline, install:

STAR v2.5.3a+
Salmon v0.8.2+
samtools 1.3+
seqtk

5. For kallisto-based pipeline, install:

kallisto
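Before moving on, it can help to confirm that the command-line tools are reachable from your shell. The following is a minimal sketch, not part of HLApers: the helper name missing_tools is hypothetical, and the executable names (STAR, salmon, samtools, seqtk, kallisto) are assumed to be the ones your installation provides.

```shell
# Hypothetical helper (not part of HLApers): print each executable that
# cannot be found on PATH, one per line; return non-zero if any is missing.
missing_tools() {
    local rc=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            rc=1
        fi
    done
    return "$rc"
}

# Executable names are assumptions about your installation; adjust as needed.
missing_tools STAR salmon samtools seqtk \
    || echo "Install the tools printed above for the STAR-Salmon pipeline." >&2
missing_tools kallisto \
    || echo "Install kallisto for the kallisto pipeline." >&2
```

This only checks that the programs are present; it does not compare versions against the minimums listed above.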

Download data:

1. IMGT database

git clone https://github.com/ANHIG/IMGTHLA.git

2. Gencode:

transcripts fasta (e.g., Gencode v37 fasta)
corresponding annotations GTF (e.g., Gencode v37 GTF)

HLApers usage

Link the hlapers executable into a directory on your PATH, or change to the HLApers directory and run the program as ./hlapers.

Getting help

HLApers is composed of the following modes:

hlapers --help

Usage: hlapers [modes]

prepare-ref          Prepare transcript fasta files.
index                Create index for read alignment.
bam2fq               Convert BAM to fastq.
genotype             Infer HLA genotypes.
quant                Quantify HLA expression.

1. Building a transcriptome supplemented with HLA sequences

The first step is to use hlapers prepare-ref to build a transcriptome composed of Gencode transcripts in which the HLA transcripts are replaced with IMGT HLA allele sequences.

hlapers prepare-ref --help

Usage: hlapers prepare-ref [options]

-t | --transcripts   Fasta with Gencode transcript sequences.
-a | --annotations   GTF from Gencode for the same Genome version.
-i | --imgt          Path to IMGT directory.
-o | --out           Output directory.

Example:

hlapers prepare-ref -t gencode.v37.transcripts.fa.gz -a gencode.v37.annotation.gtf.gz -i IMGTHLA -o hladb

2. Creating an index for read alignment

hlapers index --help

Usage: hlapers index [options]

-t | --transcripts   Fasta with Gencode transcript sequences.
-p | --threads       Number of threads.
-o | --out           Output directory.
--kallisto           Create index for kallisto pipeline instead of STARsalmon.

Example:

hlapers index -t hladb/transcripts_MHC_HLAsupp.fa -p 4 -o index

3. HLA genotyping

Given a BAM file from a previous alignment to the genome, we first need to extract the reads mapped to the MHC region together with the unmapped reads. The bam2fq utility does this.

hlapers bam2fq --help

Usage: hlapers bam2fq [options]

-m | --mhc-coords    Genomic coordinates of the MHC region in chrN:start-end format if MHC fastq is desired.
-b | --bam           BAM file (if -m is specified, needs to be sorted by coordinate; otherwise use --sort).
-o | --outprefix     Output prefix name.
--sort               Sort input BAM file by coordinate (REQUIRED if -m is specified and BAM is not sorted by coordinate).

Example:

hlapers bam2fq -b HG00096.bam -m ./hladb/mhc_coords.txt -o HG00096
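Because -m requires a coordinate-sorted BAM, one way to decide whether to add --sort is to inspect the BAM header. The sketch below is an illustration, not part of HLApers: the function name mhc_bam2fq_command is hypothetical. It relies on the SAM convention that a coordinate-sorted file has an @HD header line with the tag SO:coordinate, and it only prints the command rather than running it. When the header does not declare a sort order, it adds --sort to be safe.

```shell
# Hypothetical helper (not part of HLApers): read a SAM/BAM header on stdin
# and print the bam2fq command for the MHC region, adding --sort unless the
# @HD line declares SO:coordinate.
mhc_bam2fq_command() {
    local bam=$1 prefix=$2
    local cmd="hlapers bam2fq -b $bam -m ./hladb/mhc_coords.txt -o $prefix"
    if awk -F'\t' '$1 == "@HD" { for (i = 2; i <= NF; i++) if ($i == "SO:coordinate") found = 1 }
                   END { exit !found }'; then
        echo "$cmd"
    else
        echo "$cmd --sort"
    fi
}

# Usage: 'samtools view -H' prints only the header of the BAM file.
if [ -f HG00096.bam ]; then
    samtools view -H HG00096.bam | mhc_bam2fq_command HG00096.bam HG00096
fi
```

Sorting a BAM that is already sorted costs time but does no harm, which is why the sketch falls back to --sort when the header is silent.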

Then we run the genotyping module.

hlapers genotype --help

Usage: hlapers genotype [options]

-i | --index         Index generated by 'hlapers index'.
-t | --transcripts   Fasta with Gencode transcripts sequences used for 'hlapers index'.
-1 | --fq1           Fastq for READ 1.
-2 | --fq2           Fastq for READ 2.
-p | --threads       Number of threads.
-o | --outprefix     Output prefix name.
--kallisto           Use kallisto for genotyping.

Example:

hlapers genotype -i index/STARMHC -t ./hladb/transcripts_MHC_HLAsupp.fa -1 HG00096_mhc_1.fq -2 HG00096_mhc_2.fq -p 8 -o results/HG00096
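When several samples need genotyping, the same command can be generated once per sample. This is a minimal sketch under stated assumptions: the function name genotype_commands and the second sample ID (HG00097) are hypothetical, and each sample is assumed to have MHC fastq files named like those in the example above (<sample>_mhc_1.fq and <sample>_mhc_2.fq). The function prints the commands so they can be reviewed before anything runs.

```shell
# Hypothetical helper (not part of HLApers): print one 'hlapers genotype'
# command per sample ID. Sample IDs are assumed to contain no whitespace.
genotype_commands() {
    local sample
    for sample in "$@"; do
        echo "hlapers genotype -i index/STARMHC -t ./hladb/transcripts_MHC_HLAsupp.fa -1 ${sample}_mhc_1.fq -2 ${sample}_mhc_2.fq -p 8 -o results/${sample}"
    done
}

# Print the commands for two samples (HG00097 is an illustrative ID).
genotype_commands HG00096 HG00097
```

Once the printed commands look right, they can be executed by piping the output to a shell, for example genotype_commands HG00096 HG00097 | sh.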

4. Quantifying HLA expression

To quantify expression, we use the quant module. If the original fastq files are available, we can proceed directly to the quantification step. If only a BAM file from a previous alignment to the genome is available, we first need to convert it to fastq with the bam2fq utility.

Example:

hlapers bam2fq -b HG00096.bam -o HG00096

Proceed to the quantification step.

hlapers quant --help

Usage: hlapers quant [options]

-t | --transcripts   Reference transcripts directory.
-g | --genotypes     *_genotypes.tsv file generated by 'hlapers genotype'.
-1 | --fq1           Fastq for READ 1.
-2 | --fq2           Fastq for READ 2.
-p | --threads       Number of threads.
-o | --out           Output prefix name.
--salmonreads        Use Salmon lightweight alignment for quantification (NOT TESTED)
--kallisto           Use kallisto for quantification.

Example:

hlapers quant -t ./hladb -g ./results/HG00096_genotypes.tsv -1 HG00096_1.fq.gz -2 HG00096_2.fq.gz -o ./results/HG00096 -p 8
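The choice between starting from fastq files and starting from a BAM file can be sketched as a small planning function. This is an illustration only: the name plan_quant is hypothetical, and it assumes that bam2fq writes fastq files with the names you pass as the second and third arguments, which this README does not state explicitly. The function prints the commands instead of running them.

```shell
# Hypothetical helper (not part of HLApers): print the commands needed to
# quantify one sample. If both fastq files exist, only 'quant' is printed;
# if only the BAM exists, 'bam2fq' is printed first. Returns 1 if no input
# is found. Assumes bam2fq produces files named $fq1 and $fq2 (unverified).
plan_quant() {
    local sample=$1 fq1=$2 fq2=$3 bam=$4
    if [ -f "$fq1" ] && [ -f "$fq2" ]; then
        :   # fastq files are available; no conversion needed
    elif [ -f "$bam" ]; then
        echo "hlapers bam2fq -b $bam -o $sample"
    else
        return 1
    fi
    echo "hlapers quant -t ./hladb -g ./results/${sample}_genotypes.tsv -1 $fq1 -2 $fq2 -o ./results/$sample -p 8"
}

plan_quant HG00096 HG00096_1.fq.gz HG00096_2.fq.gz HG00096.bam \
    || echo "Provide fastq files or a BAM file for HG00096." >&2
```

If bam2fq names its output differently on your system, pass those names as the fastq arguments instead.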
