Lot of unsafe code could probably be replaced by safe code without perf. impact #18

Ten0 · 2020-11-30T17:57:54Z

Looking at the source code:
https://docs.rs/slice-group-by/0.2.6/src/slice_group_by/linear_group/linear_group_by.rs.html#131-136

It seems this crate makes extensive use of unsafe.
However, it also looks like this unsafe is unnecessary as it creates no performance improvement:

Representation with the two pointers is exactly the memory representation of a slice, so it looks like this:

pub struct LinearGroupBy<'a, T: 'a, P> {
    ptr: *const T,
    end: *const T,
    predicate: P,
    _phantom: marker::PhantomData<&'a T>,
}

could be replaced by this:

pub struct LinearGroupBy<'a, T: 'a, P> {
    elements_left: &'a [T],
    predicate: P,
}

while this:

                let mut i = 0;
                let mut ptr = self.ptr;

                // we use an unsafe block to avoid bounds checking here.
                // this is safe because the only thing we do here is to get
                // two elements at `ptr` and `ptr + 1`, bounds checking is done by hand.

                // we need to get *two* contiguous elements so we check that:
                //  - the first element is at the `end - 1` position because
                //  - the second one will be read from `ptr + 1` that must
                //    be lower or equal to `end`
                unsafe {
                    while ptr != self.end.sub(1) {
                        let a = &*ptr;
                        ptr = ptr.add(1);
                        let b = &*ptr;

                        i += 1;

                        if !(self.predicate)(a, b) {
                            let slice = $mkslice(self.ptr, i);
                            self.ptr = ptr;
                            return Some(slice)
                        }
                    }
                }

could be rewritten using for (i,s) in self.elements_left.windows(2).enumerate() and slice::split_at/slice::split_at_mut.

(and obviously this:

                // `i` is either `0` or the `slice length - 1` because either:
                //  - we have not entered the loop and so `i` is equal to `0`
                //    the slice length is necessarily `1` because we ensure it is not empty
                //  - we have entered the loop and we have not early returned
                //    so `i` is equal to the slice `length - 1`
                let slice = unsafe { $mkslice(self.ptr, i + 1) };
                self.ptr = self.end;
                Some(slice)

by this:

Some(std::mem::replace(&mut self.elements_left, &[]))

)

I don't think this change would create additional runtime cost, as most double-bounds-checking (or constant-bounds-checking in particular with the constants 2 and 1 here) usually gets optimized-out by the compiler, and I would feel much more comfortable using this crate, as I wouldn't like to introduce so much unsafe to our codebase for such a simple algorithm.

The text was updated successfully, but these errors were encountered:

Kerollmops · 2020-12-24T16:25:42Z

Hey @Ten0,

I am on your side on this. I implemented the linear version of this algorithm in safe Rust, and proposed it to the standard library. I am now waiting for it to be merged.

For your information, I didn't take the time to check the performance impact, fortunately, I have an important set of benchmarks.

Ten0 · 2021-10-11T13:18:42Z

Hello!
I'm in need of this algorithm again.
It looks like this is fixed on the current master (#19), however I also noticed that slice-group-by 0.2.7 that fixes this has been released then yanked.
What is the current status of this? Was there an issue with 0.2.7?

Kerollmops · 2021-10-12T09:24:25Z

Hey @Ten0,

Sorry about that, I just published version 0.3.0 which is the real version that contains the changes you are looking for. Don't hesitate to ask me anything if you have any requests about this library.

Ten0 · 2023-08-16T14:52:49Z

Hmm looks like linear group by key has the same issue that could be fixed in the same way, no? Or even more simply, by making linear_group_by_key wrap a linear_group_by?

Kerollmops · 2023-12-14T13:56:17Z

This could be the case, but I don't have time right now. Maybe you can propose a pull request? I am pretty sure it will not be breaking.

Kerollmops mentioned this issue Dec 24, 2020

Implement LinearGroupBy with almost only safe Rust #19

Merged

Ten0 closed this as completed Oct 13, 2021

Ten0 reopened this Oct 13, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Lot of unsafe code could probably be replaced by safe code without perf. impact #18

Lot of unsafe code could probably be replaced by safe code without perf. impact #18

Ten0 commented Nov 30, 2020

Kerollmops commented Dec 24, 2020 •

edited

Loading

Ten0 commented Oct 11, 2021 •

edited

Loading

Kerollmops commented Oct 12, 2021 •

edited

Loading

Ten0 commented Aug 16, 2023

Kerollmops commented Dec 14, 2023 •

edited

Loading

Lot of unsafe code could probably be replaced by safe code without perf. impact #18

Lot of unsafe code could probably be replaced by safe code without perf. impact #18

Comments

Ten0 commented Nov 30, 2020

Kerollmops commented Dec 24, 2020 • edited Loading

Ten0 commented Oct 11, 2021 • edited Loading

Kerollmops commented Oct 12, 2021 • edited Loading

Ten0 commented Aug 16, 2023

Kerollmops commented Dec 14, 2023 • edited Loading

Kerollmops commented Dec 24, 2020 •

edited

Loading

Ten0 commented Oct 11, 2021 •

edited

Loading

Kerollmops commented Oct 12, 2021 •

edited

Loading

Kerollmops commented Dec 14, 2023 •

edited

Loading