Slow numerics in operations #4

vsbuffalo · 2024-02-27T18:43:07Z

On large (1M ranges) datasets, bedtools map is beating GRanges. It looks like it comes down to numerics:

# using inline: 
command       bedtools time    granges time      granges speedup (%)
------------  ---------------  --------------  ---------------------
map_multiple  26.89 min        688.60 s                     57.3189
map_max       21.34 min        21.54 min                    -0.94889
adjust        21.35 min        499.54 s                     60.9959
filter        27.01 min        20.38 min                    24.5583
map_min       935.69 s         17.48 min                   -12.1008
flank         30.26 min        943.36 s                     48.0353
map_mean      20.13 min        931.21 s                     22.913
map_sum       17.99 min        18.44 min                    -2.5225
windows       507.84 s         84.79 s                      83.304
map_median    16.90 min        17.81 min                    -5.36811

# not using inline
command       bedtools time    granges time      granges speedup (%)
------------  ---------------  --------------  ---------------------
map_multiple  26.89 min        688.60 s                     57.3189
map_max       21.34 min        21.54 min                    -0.94889
adjust        21.35 min        499.54 s                     60.9959
filter        27.01 min        20.38 min                    24.5583
map_min       935.69 s         17.48 min                   -12.1008
flank         30.26 min        943.36 s                     48.0353
map_mean      20.13 min        931.21 s                     22.913
map_sum       17.99 min        18.44 min                    -2.5225
windows       507.84 s         84.79 s                      83.304
map_median    16.90 min        17.81 min                    -5.36811

Here, "inline" refers to adding #[inline(always)] above Operations::run() (suggested by @molpopgen).
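
For readers unfamiliar with the attribute, here is a minimal sketch of what placing `#[inline(always)]` on a `run()` method looks like. The `Operation` enum, its variants, and the method signature below are hypothetical stand-ins for illustration only, not the actual GRanges code. The attribute is a strong hint to the compiler rather than a guarantee of a speedup, which fits the two tables above being identical as posted.

```rust
// Hypothetical stand-in for the operations discussed above; these names
// are illustrative and are not the actual GRanges API.
#[derive(Clone, Copy, Debug)]
pub enum Operation {
    Sum,
    Min,
    Max,
    Mean,
}

impl Operation {
    /// Reduce a slice of values to one number; returns None for empty input.
    #[inline(always)] // strong hint: inline this method at its call sites
    pub fn run(self, data: &[f64]) -> Option<f64> {
        if data.is_empty() {
            return None;
        }
        let value = match self {
            Operation::Sum => data.iter().sum::<f64>(),
            Operation::Min => data.iter().copied().fold(f64::INFINITY, f64::min),
            Operation::Max => data.iter().copied().fold(f64::NEG_INFINITY, f64::max),
            Operation::Mean => data.iter().sum::<f64>() / data.len() as f64,
        };
        Some(value)
    }
}

fn main() {
    let scores = [1.0, 4.0, 2.5];
    for op in [Operation::Sum, Operation::Min, Operation::Max, Operation::Mean] {
        println!("{:?}: {:?}", op, op.run(&scores));
    }
}
```

Whether inlining helps depends on how the call site is compiled, so it is worth confirming with a benchmark on the target platform rather than assuming an effect.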

vsbuffalo · 2024-02-27T23:18:36Z

So interestingly this isn't an issue on Apple silicon:

command       bedtools time    granges time      granges speedup (%)
------------  ---------------  --------------  ---------------------
map_multiple  133.10 s         67.61 s                       49.2089
adjust        63.30 s          33.18 s                       47.5765
map_min       414.12 s         291.08 s                      29.7102
map_mean      410.42 s         276.49 s                      32.6327
map_max       417.44 s         281.10 s                      32.6611
map_sum       425.98 s         276.15 s                      35.1739
map_median    419.66 s         294.14 s                      29.9112
flank         86.00 s          51.95 s                       39.5932
filter        78.36 s          43.85 s                       44.0442
windows       287.13 s         49.90 s                       82.6225

vs Ubuntu x86_64

command     bedtools time    granges time      granges speedup (%)
----------  ---------------  --------------  ---------------------
map_max     530.72 s         571.12 s                     -7.61291
map_min     537.00 s         611.83 s                    -13.9352
map_mean    534.22 s         572.59 s                     -7.18139
map_sum     561.18 s         601.37 s                     -7.16121
map_median  570.75 s         549.84 s                      3.6649
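
The "granges speedup (%)" column is not defined in the thread. The sketch below assumes it is the percentage of bedtools' time that granges saves, i.e. (bedtools - granges) / bedtools * 100, which matches the reported values to within the rounding of the displayed times. The helper names `parse_seconds` and `speedup_percent` are made up for this example; negative values mean granges was slower.

```rust
/// Parse a benchmark time such as "26.89 min" or "688.60 s" into seconds.
/// Returns None for any other unit or a malformed number.
fn parse_seconds(field: &str) -> Option<f64> {
    let mut parts = field.split_whitespace();
    let value: f64 = parts.next()?.parse().ok()?;
    match parts.next()? {
        "s" => Some(value),
        "min" => Some(value * 60.0),
        _ => None,
    }
}

/// Assumed formula: percentage of bedtools' time saved by granges.
fn speedup_percent(bedtools_s: f64, granges_s: f64) -> f64 {
    (bedtools_s - granges_s) / bedtools_s * 100.0
}

fn main() {
    // Rows copied from the Ubuntu x86_64 table above.
    let rows = [
        ("map_max", "530.72 s", "571.12 s"),
        ("map_min", "537.00 s", "611.83 s"),
        ("map_mean", "534.22 s", "572.59 s"),
        ("map_sum", "561.18 s", "601.37 s"),
        ("map_median", "570.75 s", "549.84 s"),
    ];
    for (command, bedtools, granges) in rows {
        let b = parse_seconds(bedtools).expect("valid bedtools time");
        let g = parse_seconds(granges).expect("valid granges time");
        println!("{:<12} {:>9.4}", command, speedup_percent(b, g));
    }
}
```

Converting everything to seconds first also makes rows that mix "min" and "s", as in the first tables of this issue, directly comparable.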

…` added, etc.
- New `Sequences::region_apply_into_granges()`, which applies a function to regions of a sequence, and puts the results in a new `GRanges`.
- New `Sequences` test cases corresponding to `granges_test_case_01()`, and lots of new `Sequence`-related tests.
- Added `map_into_array1()` and `map_into_array2()` for the --feature=ndarray, with tests.
- `take_ranges()` and `take_both()` added.
- New `GRanges::midpoints()` method, with tests.
- New `GRanges::data_indices()` method, to build a `GenomeMap` out of the data indices.
- New `GRanges::data_by_seqnames()` which builds a `GenomeMap` of `Vec<U>` of the data, by sequence name. `GRanges::data_refs_by_seqnames()` also added, for references.
- `GRanges::into_vecranegs()` added, with `From` method for `COITreesEmpty` to `VecRangesEmpty` added since it's needed.
- Cleaned lifetime out of `IndexedDataContainer`, brought down to assoc. type level. Cleans up things a lot!
- Remove section on readme on slow benchmarks -- this is now GH issue #4.

…added, etc.
- Renamed `region_apply` in `region_map` to be more consistent with Rust's naming.
- `retain_seqnames`, etc for `TsvRecordIterator`.
- New `Sequences::region_map_into_granges()`, which applies a function to regions of a sequence, and puts the results in a new `GRanges`.
- New `Sequences` test cases corresponding to `granges_test_case_01()`, and lots of new `Sequence`-related tests.
- Added `map_into_array1()` and `map_into_array2()` for the --feature=ndarray, with tests.
- `take_ranges()` and `take_both()` added.
- New `GRanges::midpoints()` method, with tests.
- New `GRanges::data_indices()` method, to build a `GenomeMap` out of the data indices.
- New `GRanges::data_by_seqnames()` which builds a `GenomeMap` of `Vec<U>` of the data, by sequence name. `GRanges::data_refs_by_seqnames()` also added, for references.
- `GRanges::into_vecranegs()` added, with `From` method for `COITreesEmpty` to `VecRangesEmpty` added since it's needed.
- Cleaned lifetime out of `IndexedDataContainer`, brought down to assoc. type level. Cleans up things a lot!
- `into_array1()` and `into_array2()` and tests.
- Remove section on readme on slow benchmarks -- this is now GH issue #4.

molpopgen · 2024-02-28T20:01:49Z

Is this "apples to apples"? (Pun intended.) The raw times reported for Apple Silicon are 10x smaller than those for Ubuntu/x86. Are you sure that the Apple numbers reflect the "large data" tests?

molpopgen · 2024-02-28T22:24:11Z

On a second read, maybe the Apple Silicon numbers are okay? They're not 10x smaller but more like 2x?
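
To put a number on the "more like 2x" estimate, here is a small sketch that divides each Ubuntu x86_64 time by the matching Apple Silicon time for the five map commands reported on both platforms. The times are copied from the two tables earlier in this thread; the `MAP_TIMES` constant and the `slowdown` helper are names made up for this example. On these numbers the ratio comes out at roughly 1.3x for bedtools and roughly 2x for granges.

```rust
/// (command, bedtools Apple, granges Apple, bedtools Ubuntu, granges Ubuntu),
/// all in seconds, copied from the tables in this thread.
const MAP_TIMES: [(&str, f64, f64, f64, f64); 5] = [
    ("map_max", 417.44, 281.10, 530.72, 571.12),
    ("map_min", 414.12, 291.08, 537.00, 611.83),
    ("map_mean", 410.42, 276.49, 534.22, 572.59),
    ("map_sum", 425.98, 276.15, 561.18, 601.37),
    ("map_median", 419.66, 294.14, 570.75, 549.84),
];

/// How many times longer the Ubuntu run took than the Apple Silicon run.
fn slowdown(ubuntu_s: f64, apple_s: f64) -> f64 {
    ubuntu_s / apple_s
}

fn main() {
    println!("{:<12} {:>10} {:>10}", "command", "bedtools", "granges");
    for (command, bed_apple, gr_apple, bed_ubuntu, gr_ubuntu) in MAP_TIMES {
        println!(
            "{:<12} {:>9.2}x {:>9.2}x",
            command,
            slowdown(bed_ubuntu, bed_apple),
            slowdown(gr_ubuntu, gr_apple)
        );
    }
}
```

The gap between the two ratios is the interesting part: granges loses about twice as much as bedtools when moving from Apple Silicon to x86_64, which is what turns its lead into a deficit on the map operations.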

vsbuffalo added the performance/optimization label Feb 27, 2024
