combineList: BACKEND Argument not working as intended #119

Apompetti-Cori opened this issue Apr 8, 2023 · 1 comment

Apompetti-Cori commented Apr 8, 2023

# Test combineList() from bsseq.
# Noticed that the assays are still loaded into memory after combining.
library(bsseq)
library(HDF5Array)
M <- matrix(0:8, 3, 3)
Cov <- matrix(1:9, 3, 3)
hdf5_M <- writeHDF5Array(M)
hdf5_Cov <- writeHDF5Array(Cov)
hdf5_BS1 <- BSseq(chr = c("chr1", "chr2", "chr1"),
                  pos = c(1, 2, 3),
                  M = hdf5_M,
                  Cov = hdf5_Cov,
                  sampleNames = c("A", "B", "C"))


hdf5_BS2 <- BSseq(chr = c("chr1", "chr1", "chr1"),
                  pos = c(3, 4, 5),
                  M = hdf5_M,
                  Cov = hdf5_Cov,
                  sampleNames = c("D", "E", "F"))


x <- combineList(list(hdf5_BS1, hdf5_BS2), BACKEND = "HDF5Array")
When BACKEND = "HDF5Array" is supplied as an argument to combineList(), the assays of the resulting combined object are still held in memory.

Here is my output:

> x <- combineList(list(hdf5_BS1, hdf5_BS2), BACKEND = "HDF5Array")
> x
An object of type 'BSseq' with
  5 methylation loci
  6 samples
has not been smoothed
All assays are in-memory
Thanks for your patience while I was on leave.

I can confirm that bsseq::combineList() seems to be ignoring the BACKEND argument in this case.
I'm not sure exactly why and haven't had time to dig into this further.
But a workaround is to call DelayedArray::setAutoRealizationBackend("HDF5Array") before running bsseq::combineList(), as shown in the example below.


library(bsseq)
library(HDF5Array)

M <- matrix(0:8, 3, 3)
Cov <- matrix(1:9, 3, 3)
hdf5_M <- writeHDF5Array(M)
hdf5_Cov <- writeHDF5Array(Cov)
hdf5_BS1 <- BSseq(
  chr = c("chr1", "chr2", "chr1"),
  pos = c(1, 2, 3),
  M = hdf5_M,
  Cov = hdf5_Cov,
  sampleNames = c("A", "B", "C"))
hdf5_BS2 <- BSseq(
  chr = c("chr1", "chr1", "chr1"),
  pos = c(3, 4, 5),
  M = hdf5_M,
  Cov = hdf5_Cov,
  sampleNames = c("D", "E", "F"))

# Assay is in-memory despite specifying `BACKEND = "HDF5Array"`
x <- combineList(list(hdf5_BS1, hdf5_BS2), BACKEND = "HDF5Array")
#> An object of type 'BSseq' with
#>   5 methylation loci
#>   6 samples
#> has not been smoothed
#> All assays are in-memory
#> 5x6 integer: DelayedMatrix object
#> └─ 5x6 integer: Set dimnames
#>    └─ 5x6 integer: [seed] matrix object

# Assay is on-disk (as expected)
y <- combineList(list(hdf5_BS1, hdf5_BS2))
#> An object of type 'BSseq' with
#>   5 methylation loci
#>   6 samples
#> has not been smoothed
#> Some assays are HDF5Array-backed
#> 5x6 integer: DelayedMatrix object
#> └─ 5x6 integer: Set dimnames
#>    └─ 5x6 integer: [seed] HDF5ArraySeed object
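
For anyone wanting to try the workaround end to end, here is a minimal sketch. It assumes that setAutoRealizationBackend() and getAutoRealizationBackend() are available from DelayedArray (which is attached alongside HDF5Array), and the object names bs1, bs2, z and old_backend are illustrative only. The previous backend is saved and restored so that the global setting does not leak into the rest of the session.

```r
library(bsseq)
library(HDF5Array)

# Toy HDF5-backed inputs, mirroring the example above.
M <- matrix(0:8, 3, 3)
Cov <- matrix(1:9, 3, 3)
hdf5_M <- writeHDF5Array(M)
hdf5_Cov <- writeHDF5Array(Cov)
bs1 <- BSseq(
  chr = c("chr1", "chr2", "chr1"),
  pos = c(1, 2, 3),
  M = hdf5_M,
  Cov = hdf5_Cov,
  sampleNames = c("A", "B", "C"))
bs2 <- BSseq(
  chr = c("chr1", "chr1", "chr1"),
  pos = c(3, 4, 5),
  M = hdf5_M,
  Cov = hdf5_Cov,
  sampleNames = c("D", "E", "F"))

# Remember the current automatic realization backend (NULL means in-memory),
# then switch to HDF5Array for the duration of the combine step.
old_backend <- getAutoRealizationBackend()
setAutoRealizationBackend("HDF5Array")

# Note: no BACKEND argument here; the global setting is relied on instead.
z <- combineList(list(bs1, bs2))

# Restore whatever backend was active before.
setAutoRealizationBackend(old_backend)

# Inspect the result: the summary line and the delayed-operation tree
# indicate whether the assays are HDF5-backed or in-memory.
z
showtree(assay(z))
```

If the workaround behaves as in the output above, the summary for z should report HDF5Array-backed assays rather than "All assays are in-memory".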
Session info
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R Under development (unstable) (2023-02-13 r83829)
#>  os       macOS Ventura 13.3.1
#>  system   aarch64, darwin20
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Australia/Melbourne
#>  date     2023-05-02
#>  pandoc   2.19.2 @ /Applications/ (via rmarkdown)
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package              * version   date (UTC) lib source
#>  beachmat               2.17.0    2023-04-25 [1] Bioconductor
#>  Biobase              * 2.61.0    2023-04-25 [1] Bioconductor
#>  BiocGenerics         * 0.47.0    2023-04-25 [1] Bioconductor
#>  BiocIO                 1.11.0    2023-04-25 [1] Bioconductor
#>  BiocParallel           1.35.0    2023-04-25 [1] Bioconductor
#>  Biostrings             2.69.0    2023-04-25 [1] Bioconductor
#>  bitops                 1.0-7     2021-04-24 [1] CRAN (R 4.3.0)
#>  BSgenome               1.69.0    2023-04-25 [1] Bioconductor
#>  bsseq                * 1.37.0    2023-04-25 [1] Bioconductor
#>  cli                    3.6.1     2023-03-23 [1] CRAN (R 4.3.0)
#>  codetools              0.2-19    2023-02-01 [1] CRAN (R 4.3.0)
#>  colorspace             2.1-0     2023-01-23 [1] CRAN (R 4.3.0)
#>  crayon                 1.5.2     2022-09-29 [1] CRAN (R 4.3.0)
#>  data.table             1.14.8    2023-02-17 [1] CRAN (R 4.3.0)
#>  DelayedArray         * 0.27.0    2023-04-25 [1] Bioconductor
#>  DelayedMatrixStats     1.23.0    2023-04-25 [1] Bioconductor
#>  digest                 0.6.31    2022-12-11 [1] CRAN (R 4.3.0)
#>  evaluate               0.20      2023-01-17 [1] CRAN (R 4.3.0)
#>  fastmap                1.1.1     2023-02-24 [1] CRAN (R 4.3.0)
#>  fs                     1.6.2     2023-04-25 [1] CRAN (R 4.3.0)
#>  GenomeInfoDb         * 1.37.0    2023-04-25 [1] Bioconductor
#>  GenomeInfoDbData       1.2.10    2023-03-26 [1] Bioconductor
#>  GenomicAlignments      1.37.0    2023-04-25 [1] Bioconductor
#>  GenomicRanges        * 1.53.0    2023-04-25 [1] Bioconductor
#>  glue                   1.6.2     2022-02-24 [1] CRAN (R 4.3.0)
#>  gtools                 3.9.4     2022-11-27 [1] CRAN (R 4.3.0)
#>  HDF5Array            * 1.29.0    2023-04-25 [1] Bioconductor
#>  htmltools              0.5.5     2023-03-23 [1] CRAN (R 4.3.0)
#>  IRanges              * 2.35.0    2023-04-25 [1] Bioconductor
#>  knitr                  1.42      2023-01-25 [1] CRAN (R 4.3.0)
#>  lattice                0.21-8    2023-04-05 [1] CRAN (R 4.3.0)
#>  lifecycle              1.0.3     2022-10-07 [1] CRAN (R 4.3.0)
#>  limma                  3.57.0    2023-04-25 [1] Bioconductor
#>  locfit                 1.5-9.7   2023-01-02 [1] CRAN (R 4.3.0)
#>  Matrix               * 1.5-4     2023-04-04 [1] CRAN (R 4.3.0)
#>  MatrixGenerics       * 1.13.0    2023-04-25 [1] Bioconductor
#>  matrixStats          * 0.63.0    2022-11-18 [1] CRAN (R 4.3.0)
#>  munsell                0.5.0     2018-06-12 [1] CRAN (R 4.3.0)
#>  permute                0.9-7     2022-01-27 [1] CRAN (R 4.3.0)
#>  R.methodsS3            1.8.2     2022-06-13 [1] CRAN (R 4.3.0)
#>  R.oo                   1.25.0    2022-06-12 [1] CRAN (R 4.3.0)
#>  R.utils                2.12.2    2022-11-11 [1] CRAN (R 4.3.0)
#>  R6                     2.5.1     2021-08-19 [1] CRAN (R 4.3.0)
#>  Rcpp                   1.0.10    2023-01-22 [1] CRAN (R 4.3.0)
#>  RCurl                  1.98-1.12 2023-03-27 [1] CRAN (R 4.3.0)
#>  reprex                 2.0.2     2022-08-17 [1] CRAN (R 4.3.0)
#>  restfulr               0.0.15    2022-06-16 [1] CRAN (R 4.3.0)
#>  rhdf5                * 2.45.0    2023-04-25 [1] Bioconductor
#>  rhdf5filters           1.13.2    2023-04-30 [1] Bioconductor
#>  Rhdf5lib               1.23.0    2023-04-25 [1] Bioconductor
#>  rjson                  0.2.21    2022-01-09 [1] CRAN (R 4.3.0)
#>  rlang                  1.1.1     2023-04-28 [1] CRAN (R 4.3.0)
#>  rmarkdown              2.21      2023-03-26 [1] CRAN (R 4.3.0)
#>  Rsamtools              2.17.0    2023-04-25 [1] Bioconductor
#>  rstudioapi             0.14      2022-08-22 [1] CRAN (R 4.3.0)
#>  rtracklayer            1.61.0    2023-04-25 [1] Bioconductor
#>  S4Vectors            * 0.39.0    2023-04-25 [1] Bioconductor
#>  scales                 1.2.1     2022-08-20 [1] CRAN (R 4.3.0)
#>  sessioninfo            1.2.2     2021-12-06 [1] CRAN (R 4.3.0)
#>  sparseMatrixStats      1.13.0    2023-04-25 [1] Bioconductor
#>  SummarizedExperiment * 1.31.0    2023-04-25 [1] Bioconductor
#>  withr                  2.5.0     2022-03-03 [1] CRAN (R 4.3.0)
#>  xfun                   0.39      2023-04-20 [1] CRAN (R 4.3.0)
#>  XML                    3.99-0.14 2023-03-19 [1] CRAN (R 4.3.0)
#>  XVector                0.41.0    2023-04-25 [1] Bioconductor
#>  yaml                   2.3.7     2023-01-23 [1] CRAN (R 4.3.0)
#>  zlibbioc               1.47.0    2023-04-25 [1] Bioconductor
#>  [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library
#> ──────────────────────────────────────────────────────────────────────────────

