README

Install

First, we need to install devtools, if not already installed.

> install.packages("devtools")

Then, we install hlahaps using the install_github() function.

> devtools::install_github("VitorAguiar/hlahaps")

Now, we can load hlahaps as any other R package using the base function library().

> library(hlahaps)

Usage

When we load the package, the Gourraud et al. (2014) data is loaded.

> pag

# A tibble: 1,910 × 5
   subject        A        C        B       DRB1
     <chr>    <chr>    <chr>    <chr>      <chr>
1  HG00096 A*01:01g C*07:01g B*08:01g DRB1*03:01
2  HG00096  A*29:02  C*16:01  B*44:03 DRB1*07:01
3  HG00097 A*03:01g C*07:02g B*07:02g DRB1*13:03
4  HG00097 A*24:02g C*07:02g B*07:02g DRB1*15:01
5  HG00099 A*68:01g C*07:04g B*44:02g DRB1*03:01
6  HG00099 A*01:01g C*07:01g B*08:01g DRB1*11:01
7  HG00100 A*01:01g C*07:01g B*08:01g DRB1*03:01
8  HG00100 A*01:01g  C*06:02  B*57:01 DRB1*07:01
9  HG00101  A*32:01  C*02:02 B*27:05g DRB1*04:04
10 HG00101 A*11:01g  C*06:02  B*57:01 DRB1*15:01
# ... with 1,900 more rows

We will use the individual HG00096 to demonstrate how the query_nmdp() function works:

> test_ind <- dplyr::filter(pag, subject == "HG00096")
> test_ind

# A tibble: 2 × 5
  subject        A        C        B       DRB1
    <chr>    <chr>    <chr>    <chr>      <chr>
1 HG00096 A*01:01g C*07:01g B*08:01g DRB1*03:01
2 HG00096  A*29:02  C*16:01  B*44:03 DRB1*07:01

Applying query_nmdp():

> query_nmdp(test_ind)

$`original haplotypes`
# A tibble: 2 × 4
         A        C        B       DRB1
     <chr>    <chr>    <chr>      <chr>
1 A*01:01g C*07:01g B*08:01g DRB1*03:01
2  A*29:02  C*16:01  B*44:03 DRB1*07:01

$`haplotypes found`
# A tibble: 2 × 14
         A        C        B       DRB1    AFA_freq AFA_rank     API_freq
     <chr>    <chr>    <chr>      <chr>       <dbl>    <int>        <dbl>
1 A*01:01g C*07:01g B*08:01g DRB1*03:01 0.010938011        2 0.0011127143
2 A*29:02g  C*16:01  B*44:03 DRB1*07:01 0.003515995       14 0.0002314976
# ... with 7 more variables: API_rank <int>, CAU_freq <dbl>,
#   CAU_rank <int>, HIS_freq <dbl>, HIS_rank <int>, NAM_freq <dbl>,
#   NAM_rank <int>

$`haplotypes not found`
# A tibble: 0 × 4
# ... with 4 variables: A <chr>, C <chr>, B <chr>, DRB1 <chr>

$`possible haplotypes`
# A tibble: 14 × 4
          A        C        B       DRB1
      <chr>    <chr>    <chr>      <chr>
1  A*01:01g C*07:01g B*08:01g DRB1*03:01
2  A*01:01g C*07:01g B*08:01g DRB1*07:01
3  A*01:01g C*07:01g  B*44:03 DRB1*03:01
4  A*01:01g C*07:01g  B*44:03 DRB1*07:01
5  A*01:01g  C*16:01 B*08:01g DRB1*03:01
6  A*01:01g  C*16:01  B*44:03 DRB1*03:01
7  A*01:01g  C*16:01  B*44:03 DRB1*07:01
8  A*29:02g C*07:01g B*08:01g DRB1*03:01
9  A*29:02g C*07:01g B*08:01g DRB1*07:01
10 A*29:02g C*07:01g  B*44:03 DRB1*03:01
11 A*29:02g C*07:01g  B*44:03 DRB1*07:01
12 A*29:02g  C*16:01 B*08:01g DRB1*03:01
13 A*29:02g  C*16:01  B*44:03 DRB1*03:01
14 A*29:02g  C*16:01  B*44:03 DRB1*07:01

query_nmdp() returns a list with 4 elements:

A data.frame with the original haplotypes
A data.frame with the haplotypes found at the nmdp table
A data.frame with the haplotypes which were not found at the nmdp table
A data.frame with possible haplotypes at nmdp table given the individual's alleles

It is possible to apply query_nmdp() to the whole data, e.g. by using plyr::dlply() with a parallel backend provided by doMC::registerDoMC():

> # using 16 cores:
> doMC::registerDoMC(16)
> 
> results_list <- plyr::dlply(pag, ~subject, . %>% query_nmdp(), .parallel = TRUE)

Using purrr::map_df() it is possible to extract the info for all individuals as a data.frame. For example, let's create a data.frame with all the haplotypes found in the NMDP table:

> haps_found <- 
+   purrr::map_df(results_list, "haplotypes found", .id = "subject") %>%
+   tibble::as_tibble()
> 
> haps_found

# A tibble: 1,352 × 15
   subject        A        C        B        DRB1     AFA_freq AFA_rank
     <chr>    <chr>    <chr>    <chr>       <chr>        <dbl>    <int>
1  HG00096 A*01:01g C*07:01g B*08:01g  DRB1*03:01 1.093801e-02        2
2  HG00096 A*29:02g  C*16:01  B*44:03  DRB1*07:01 3.515995e-03       14
3  HG00097 A*03:01g C*07:02g B*07:02g  DRB1*13:03 8.835382e-05     2047
4  HG00097 A*24:02g C*07:02g B*07:02g  DRB1*15:01 1.471206e-03       54
5  HG00099 A*68:01g C*07:04g B*44:02g  DRB1*03:01 7.084284e-05     2518
6  HG00099 A*01:01g C*07:01g B*08:01g DRB1*11:01g 2.326729e-04      741
7  HG00100 A*01:01g C*07:01g B*08:01g  DRB1*03:01 1.093801e-02        2
8  HG00100 A*01:01g C*06:02g B*57:01g  DRB1*07:01 2.086870e-03       33
9  HG00101  A*32:01 C*02:02g B*27:05g  DRB1*04:04 5.512488e-06    12841
10 HG00101 A*11:01g C*06:02g B*57:01g  DRB1*15:01 0.000000e+00       NA
# ... with 1,342 more rows, and 8 more variables: API_freq <dbl>,
#   API_rank <int>, CAU_freq <dbl>, CAU_rank <int>, HIS_freq <dbl>,
#   HIS_rank <int>, NAM_freq <dbl>, NAM_rank <int>

Name		Name	Last commit message	Last commit date
Latest commit History 39 Commits
R		R
data-raw		data-raw
data		data
man		man
DESCRIPTION		DESCRIPTION
NAMESPACE		NAMESPACE
README.Rmd		README.Rmd
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

README

Install

Usage

About

Releases

Packages

Languages

genevol-usp/hlahaps

Folders and files

Latest commit

History

Repository files navigation

README

Install

Usage

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages