-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
editions: maps are incorrectly serialized if file has default message encoding of DELIMITED #1615
Comments
@stapelberg, @lfolger, have either of you seen this? Now that v27.0 of |
(Lasse is on vacation until June 10th.) Thanks for the report. We haven’t encountered this issue elsewhere. Yes, if you send pull requests, that would help to get a new release out. I’m out of office on Thursday and Friday, so timeline-wise, I think we should be able to cut a new release next week. To confirm: importing the new descriptor version merely side-steps this issue, yes? For delimited encoding, it sounds like we still need a fix eventually. |
Sorry that I brought up the new descriptor in the same sentence. They are separate -- that and a fix to this bug are two independent things I'd like to get into a release. |
Ah, thanks for clarifying. Will you also send a PR for the bug fix? I’m afraid we’re currently stretched very thin (with others on vacation and parental leave), so to make progress in a reasonable time frame we’re reliant on contributions. Thanks |
Just opened one: https://go-review.googlesource.com/c/protobuf/+/588976 |
Apologies for the delay, I was out sick the last few days. Should I start a new release, or would you like to wait for https://go.dev/cl/589336 to land as well? |
We can wait to see the outcome of that other CL. If it is approved, it would be nice for it to be included as well. |
For one, the
protoreflect.FieldDescriptor.Kind()
method returnsprotoreflect.GroupKind
for the map field itself (which is defined under-the-hood as a repeated message field) and also for any values in the map entry that are also messages. This was recently confirmed by the Protobuf team to be a bug (protocolbuffers/protobuf#16549) and then fixed in the C++ runtime/protoc implementation in protocolbuffers/protobuf@0dbd99a#diff-eccabec4b932653c8b1a80b49eb3eeecdbab939a444d9518f5c240e3e55c8c29R5546-R5567.But the issue in the Go runtime (even as recent as v1.34.1) is actually a little bit worse than the C++ behavior: instead of serializing the maps using group encoding (which is technically not correct as maps cannot be configured to have delimited encoding), it serializes them in a way that is not valid Protobuf binary data.
It actually uses the "start group" wire type (3), but then emits a length-prefix and has no "end group" tag at the end.
Here's an example:
How this gets interpreted by a decoder:
0b
: Wire type "start group", tag 108
: Since the previous wire type was group this gets interpreted as a field in the group. So wire type "varint", tag 1.0a
: A valid varint value of 10.03
: The next tag in the group: but this one is invalid 💀. It encodes a wire type of 3 but a tag of zero which is not legal 💥However, what's actually going on:
0b
: Wire type "start group", tag 108
: The varint length of 8 bytes. Oops! (Pretend the previous byte was0a
: wire type "bytes", tag 1.). The contents are a nested message entry.0a
: Wire type "bytes", tag 103
: Varint length of 3 bytes616263
: The string "abc"12
: Wire type "bytes", tag 201
: Varint length of 1 byte31
: The string "1"Interestingly, there is no trailing tag with an "end group" wire type. It uses "start group" in the initial tag but then acts as it if it otherwise encoded with length-prefix instead of delimited.
At the very end, we can see another example, but this time it's the value of a map entry that is similarly encoded incorrectly. The last 9 bytes of the above output are
0b 060a 0161 1201 3114
. Like the above entries, this one also starts with a "group start" wire type, but is then followed by a length (6 bytes). The value inside that entry is the final four bytes1201 3114
. Interestingly, this has the reverse issue as the entry itself: it starts off with12
, which is wire type "bytes", tag 2. It then has the length (01
) and the data (31
). But, despite starting with a "bytes" wire type, it does conclude with an "end group" tag:14
is wire type "end group", tag 2.The above invalid output was produced using the following file:
It was compiled using v27.0-rc1 of protoc and v1.34.1 of protoc-gen-go:
protoc test.proto --go_opt paths=source_relative --go_opt Mtest.proto=tmp/main --go_out .
And then the above was produced using this small Go program (
main.go
in the same folder/package as above generated code):It fails to round-trip the message with the an error:
It writes the actual encoded bytes to stdout, which is the example binary data above.
The text was updated successfully, but these errors were encountered: