Deserializing CQL collections causes frequent reallocations of Vec #1058

Open
pkolaczk opened this issue Aug 16, 2024 · 1 comment

pkolaczk commented Aug 16, 2024

I found this issue when testing the performance of the vector type, but it apparently applies to all collections.
The code collects fallible items into a Result<Vec<...>, ...> using Iterator::collect:

    ListlikeIterator::<'frame, T>::deserialize(typ, v)
        .and_then(|it| it.collect::<Result<_, DeserializationError>>())
        .map_err(deser_error_replace_rust_name::<Self>)

Unfortunately, collecting into a Result this way does not use the size hint and does not reserve the proper space in the returned vector.
Instead, it grows the vector as more items are added.

Also, the size hint for ListlikeIterator doesn't look correct anyway (it delegates to the raw iterator, which provides the default (0, None) hint).
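As a rough illustration of what an exact size hint could look like, here is a minimal sketch. `CountedIter` and its fields `remaining` and `raw` are hypothetical names invented for this example, not the driver's actual types; the sketch only assumes that the element count is known before iteration starts.

```rust
/// Hypothetical wrapper (not the driver's type): pairs a raw iterator
/// with the number of elements that is known up front.
struct CountedIter<I> {
    remaining: usize,
    raw: I,
}

impl<I: Iterator> Iterator for CountedIter<I> {
    type Item = I::Item;

    fn next(&mut self) -> Option<Self::Item> {
        let item = self.raw.next()?;
        // Keep the count in sync with the number of items already yielded.
        self.remaining = self.remaining.saturating_sub(1);
        Some(item)
    }

    // Report the exact remaining count instead of the default (0, None).
    fn size_hint(&self) -> (usize, Option<usize>) {
        (self.remaining, Some(self.remaining))
    }
}
```

With a hint like this, consumers that do consult `size_hint` (for example `Vec::extend`) can reserve space once instead of growing step by step.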

This can be trivially fixed by creating the vector with the proper capacity first (we know the number of elements very early on, and it is stored in a field of ListlikeIterator) and then adding the items with e.g. extend.
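The proposed fix could be sketched roughly as follows. `collect_with_capacity` and its `count` parameter are hypothetical names used only for illustration; the sketch assumes the element count is available before iteration, and it uses a plain `push` loop with `?` rather than `extend` so that the first error still short-circuits.

```rust
/// Hypothetical helper (not part of the driver): collects fallible items
/// into a Vec that is allocated once with the known element count.
fn collect_with_capacity<T, E, I>(count: usize, items: I) -> Result<Vec<T>, E>
where
    I: Iterator<Item = Result<T, E>>,
{
    // Reserve space for all elements up front to avoid repeated reallocation.
    let mut out = Vec::with_capacity(count);
    for item in items {
        // Stop at the first error, like collect::<Result<Vec<_>, _>>() does.
        out.push(item?);
    }
    Ok(out)
}
```

Called with a count of 3 and three `Ok` items, this returns an `Ok` vector holding those items in order; any `Err` item is returned as soon as it is encountered.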

wprzytula self-assigned this Aug 19, 2024
wprzytula added the performance and area/deserialization labels Aug 19, 2024
@wprzytula
Collaborator

@pkolaczk Thanks for pointing this out!

wprzytula added this to the 0.15.0 milestone Aug 20, 2024