Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow reclaiming the current allocation #686

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

shahn
Copy link

@shahn shahn commented Mar 31, 2024

This is based on #680, where it was noted that it is hard to use BytesMut without additional allocations in some circumstances.

@shahn
Copy link
Author

shahn commented Mar 31, 2024

This is a draft to get some comments on whether such an API addition would be acceptable

src/bytes_mut.rs Outdated Show resolved Hide resolved
@shahn shahn marked this pull request as ready for review April 7, 2024 12:22
src/bytes_mut.rs Outdated Show resolved Hide resolved
src/bytes_mut.rs Outdated Show resolved Hide resolved
src/bytes_mut.rs Outdated Show resolved Hide resolved
@braddunbar
Copy link
Contributor

braddunbar commented Apr 8, 2024

This looks a lot like BytesMut::reserve with the allocating parts cut out, which sounds a lot like a special case of BytesMut::try_reserve(ORIGINAL_CAPACITY). What would y'all think of a more general implementation in that vein?

@shahn
Copy link
Author

shahn commented Apr 8, 2024

I think it is hard to know what the original capacity is, generally, as it could've been influenced by lots of hidden operations before.
In addition, try_reclaim also doesn't move any bytes internally, with the goal of being very cheap to call. To me, the allocation behaviour of the entire crate is sometimes a bit hard to predict, so a separate function for it seems sensible

@braddunbar
Copy link
Contributor

I think it is hard to know what the original capacity is, generally, as it could've been influenced by lots of hidden operations before.

That's true, but presumably you know how much data you'll need to reserve regardless (otherwise dynamic allocation seems like the best strategy). So it doesn't particularly matter if you can reclaim the entire buffer, only if it can fit the data you're receiving. Or am I missing the point here?

In addition, try_reclaim also doesn't move any bytes internally, with the goal of being very cheap to call.

Very cheap relative to what though? Based on your description in #680, your only other alternative would be to allocate anyway, right? And that's definitely expensive relative to moving bytes. If that's not the alternative, what is?

@shahn
Copy link
Author

shahn commented Apr 8, 2024

My usecase is that I have a couple of BytesMuts and I would only allocate more memory for the current BytesMut if I couldn't reclaim the other. Basically, switch between the buffers, filling them up, while the data is processed on another thread. Once all the handles have been dropped I could reclaim the allocation and start filling the other buffer, etc.

@shahn
Copy link
Author

shahn commented May 13, 2024

rebased this on top of current master, to make it mergable again. But I am also not sure if it is deemed merge-worthy or not :)

There is currently a test case failure because BytesMut::advance() changed from its documented behaviour, but if this is generally acceptable I would try to reconcile that.

@braddunbar
Copy link
Contributor

In my mind, reclaiming memory is always in service of then writing a specific amount of data so I would prefer a try_reserve approach here that allows both at once.

@shahn
Copy link
Author

shahn commented May 15, 2024

Wouldn't that just be reserve? Or maybe more specifically, where would the try_ part come in?

@braddunbar
Copy link
Contributor

reserve allocates if it can't reclaim enough space for the requested number of bytes. try_reserve would return false rather than allocate, giving you an opportunity to swap buffers instead.

@shahn
Copy link
Author

shahn commented May 16, 2024

Ah, I misunderstood. Yeah, that'd work as well for me! I will update the branch when I get a chance

@Darksonn
Copy link
Contributor

Naming nit: The name try_reserve is currently used on Vec for an operation that reallocates but returns an error on allocation failure instead of calling abort. So we shouldn't add a method of that name that does something else.

This fixes tokio-rs#680, where it was noted that it is hard to use
BytesMut without additional allocations in some circumstances.
@shahn
Copy link
Author

shahn commented May 17, 2024

Implemented the suggestion, I kept the name try_reclaim based on Darksonn's suggestion to avoid try_reserve. Thanks a lot for the review and comments

Comment on lines +769 to +773
/// Reclaiming the allocation cheaply is possible if the `BytesMut` is currently empty and
/// there are no outstanding references through other `BytesMut`s or `Bytes` which point to the
/// same underlying storage.
///
/// A call to `try_reclaim` never copies inside the allocation nor allocates new storage.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably worth thinking about the behavior when it is non-empty. The reserve method will copy data backwards only if the full capacity is at least twice the amount of data being copied (I think). What logic makes sense here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it depends on usage patterns. For my use case, it is likely the buffer is either completely empty or more than one reference to the underlying storage still exists, therefore it doesn't make a big difference.

I guess it could make a bigger difference when you are reading a lot into a large buffer, start reading the second thing which also takes a while, then the first thing is handled and when you call try_reclaim you either copy the large thing to the front of the buffer or reclaim tells you "couldn't reclaim", but I am not sure what you would want in that case

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this further, it would mean try_reclaim would become more expensive to call, because we wouldn't just check whether the BytesMut is empty but also which kind it is and whether we can move bytes, before returning. I wonder if @braddunbar has an opinion?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants