[RFC] Binary IDL With MessagePack Bytes #5742
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## master #5742 +/- ##
==========================================
+ Coverage 36.21% 36.30% +0.08%
==========================================
Files 1303 1305 +2
Lines 109644 109991 +347
==========================================
+ Hits 39710 39927 +217
- Misses 65810 65909 +99
- Partials 4124 4155 +31
Flags with carried forward coverage won't be shown.
|-----------------------------------|----------------------------------------------|
| Protobuf Struct -> JSON String -> Python Val | Binary (value: MessagePack Bytes, tag: msgpack) IDL Object -> Bytes -> (Dict ->) -> Python Val |

Note: if a Python value can't be converted directly to `MessagePack Bytes`, we can convert it to a `Dict` first, and then convert that to `MessagePack Bytes`.
Add a note here to say very clearly that there is no JSON in the new type at all. JSON plays no part in the new spec (except for the schema).
No problem
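The `Dict` fallback described in the note under review can be sketched in Python as follows. This is a minimal illustration under assumptions: `to_encodable` and `NATIVE_TYPES` are hypothetical names, not flytekit APIs, and the actual MessagePack encoding step is deliberately left out. No JSON is involved at any point.

```python
from dataclasses import asdict, dataclass, is_dataclass
from typing import Any

# Assumption: types a MessagePack encoder can typically handle directly.
NATIVE_TYPES = (type(None), bool, int, float, str, bytes, list, dict)


def to_encodable(value: Any) -> Any:
    """Return a value that a MessagePack encoder could serialize.

    Dataclass instances are first converted to a plain dict (the
    "(Dict ->)" step in the table); natively supported values pass through.
    """
    if is_dataclass(value) and not isinstance(value, type):
        return asdict(value)  # recursively converts nested dataclasses
    if isinstance(value, NATIVE_TYPES):
        return value
    raise TypeError(f"cannot convert {type(value).__name__} to an encodable value")


@dataclass
class Inner:
    a: int


@dataclass
class Outer:
    name: str
    inner: Inner


encodable = to_encodable(Outer(name="x", inner=Inner(a=1)))
print(encodable)
```

The resulting plain `dict` would then be handed to whichever MessagePack encoder the implementation settles on.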
I left a few comments. Overall, I like the direction.
    msgpack_bytes = lv.scalar.json.value
else:
    raise ValueError(f"{tag} is not supported to decode this Binary Literal: {lv.scalar.binary}.")
return msgpack.loads(msgpack_bytes)
Can you add the full signature of `to_python_value`? Specifically, we can use `typing.cast(expected_python_type, msgpack.loads(msgpack_bytes))` to get type-checkers to agree with this, right?
- We don't need this, because we will use `from mashumaro.codecs.msgpack import MessagePackDecoder, MessagePackEncoder` to encode and decode. It will make sure we convert it back to a type we 100% want.
- `msgpack.dumps` will only be used when dealing with `untyped dict`.
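To make the distinction in this thread concrete: `typing.cast` only informs static type checkers and returns its argument unchanged at runtime, so it cannot turn a decoded `dict` into the expected type. The sketch below uses only the standard library. `cast_only` and `construct` are hypothetical helpers, and `construct` is a deliberately simplified stand-in for a typed decoder, not mashumaro's actual behavior.

```python
from dataclasses import dataclass
from typing import Any, Dict, Type, TypeVar, cast

T = TypeVar("T")


@dataclass
class Point:
    x: int
    y: int


def cast_only(decoded: Any, expected_python_type: Type[T]) -> T:
    # typing.cast is a no-op at runtime: the value is returned as-is.
    return cast(T, decoded)


def construct(decoded: Dict[str, Any], expected_python_type: Type[T]) -> T:
    # Simplified stand-in for a typed decoder: build the expected type.
    return expected_python_type(**decoded)


# Stands in for the result of msgpack.loads(msgpack_bytes).
decoded = {"x": 1, "y": 2}

casted = cast_only(decoded, Point)
built = construct(decoded, Point)

print(type(casted).__name__)  # dict
print(type(built).__name__)   # Point
```

The cast satisfies the type checker but still hands the caller a `dict`, which is why a typed decoder is the safer choice when the expected Python type is known.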
}
}
// Use Message Pack as Default Tag for deserialization.
func MakeBinaryLiteral(v []byte) *core.Literal {
This is not called anywhere yet (besides a few tests). We can make the tag part of the signature (and assume a default value).
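The suggestion to make the tag part of the signature with a default value could look roughly like this. It is a Python analogue for illustration only, not the Go helper from the diff: `BinaryLiteral`, `make_binary_literal`, and `payload_for_decoding` are hypothetical names, and the `"msgpack"` default is taken from the tag shown in the RFC table.

```python
from dataclasses import dataclass

# Assumption: "msgpack" is the default tag, as in the RFC table.
MESSAGEPACK_TAG = "msgpack"


@dataclass(frozen=True)
class BinaryLiteral:
    """Hypothetical stand-in for the Binary IDL object (value plus tag)."""

    value: bytes
    tag: str = MESSAGEPACK_TAG


def make_binary_literal(value: bytes, tag: str = MESSAGEPACK_TAG) -> BinaryLiteral:
    # The tag is an explicit parameter, but callers may omit it.
    return BinaryLiteral(value=value, tag=tag)


def payload_for_decoding(literal: BinaryLiteral) -> bytes:
    # Only the MessagePack tag is understood in this sketch.
    if literal.tag == MESSAGEPACK_TAG:
        return literal.value
    raise ValueError(f"{literal.tag} is not supported to decode this Binary Literal.")


default_literal = make_binary_literal(b"\x81\xa1a\x01")
print(default_literal.tag)  # msgpack
```

Keeping the tag in the signature leaves room for other serialization formats later without changing existing call sites.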
MsgPack is a good choice because it is smaller and faster than a UTF-8 encoded JSON string.

You can see the performance comparison here: https://github.com/flyteorg/flyte/pull/5607#issuecomment-2333174325
We will use `msgpack` to do it.
This comparison should move to the Alternatives section. The conclusion should be about how this design addresses the two problems we set out to solve (1. a better representation for JSON objects in Flyte, and 2. fixing Attribute Access once and for all).
1. No JSON Schema provided:

Input is expected as an `Object` (e.g., `{"a": 1}`).
This is a JavaScript object, right? Let's add that.
Can you also add a section about pasting? How do users paste something in? You can add it to unresolved questions if you want. If this is a JavaScript object, we should allow JSON pasting, but what about YAML? What about msgpack bytes if they were copied from a binary file?
Input is expected as an `Object` (e.g., `{"a": 1}`).

2. JSON Schema provided:
In both cases, with or without JSON schema... what happens after the user enters data? Can you add a short description of what happens? I assume it's some JS msgpack library that will turn the object into bytes.
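As a rough picture of the "object into bytes" step asked about here, the sketch below hand-encodes a small subset of the MessagePack format using only the Python standard library. It is not the msgpack library and not the frontend implementation: `pack` is a hypothetical helper covering only `None`, booleans, integers, floats, short strings, and small lists and dicts. It shows what the example input `{"a": 1}` becomes and how that compares with its compact JSON text.

```python
import json
import struct
from typing import Any


def pack(obj: Any) -> bytes:
    """Encode a small subset of Python values as MessagePack bytes."""
    if obj is None:
        return b"\xc0"  # nil
    if isinstance(obj, bool):  # check before int: bool subclasses int
        return b"\xc3" if obj else b"\xc2"
    if isinstance(obj, int):
        if 0 <= obj <= 0x7F:
            return bytes([obj])  # positive fixint
        if -(2**63) <= obj < 2**63:
            return b"\xd3" + struct.pack(">q", obj)  # int 64
        raise ValueError("integer out of range for this sketch")
    if isinstance(obj, float):
        return b"\xcb" + struct.pack(">d", obj)  # float 64
    if isinstance(obj, str):
        data = obj.encode("utf-8")
        if len(data) <= 31:
            return bytes([0xA0 | len(data)]) + data  # fixstr
        if len(data) <= 0xFF:
            return b"\xd9" + bytes([len(data)]) + data  # str 8
        raise ValueError("string too long for this sketch")
    if isinstance(obj, list):
        if len(obj) > 15:
            raise ValueError("list too long for this sketch")
        return bytes([0x90 | len(obj)]) + b"".join(pack(x) for x in obj)  # fixarray
    if isinstance(obj, dict):
        if len(obj) > 15:
            raise ValueError("dict too large for this sketch")
        out = bytes([0x80 | len(obj)])  # fixmap
        for key, value in obj.items():
            out += pack(key) + pack(value)
        return out
    raise TypeError(f"unsupported type: {type(obj).__name__}")


payload = {"a": 1}
packed = pack(payload)
as_json = json.dumps(payload, separators=(",", ":")).encode("utf-8")

print(packed.hex())               # 81a16101
print(len(packed), len(as_json))  # 4 7
```

In the console, a JavaScript MessagePack library would presumably perform the equivalent encoding, as the comment assumes. The size difference for a single tiny object says nothing about speed; the linked comparison covers that.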
ff8c99a
Signed-off-by: Future-Outlier <[email protected]>
bf2a8af to 3716255
Thank you for all your work on this, @Future-Outlier! This feature is going to solve a massive pain in the Flyte ecosystem.
return true
}

return isSameTypeInJSON(upstreamMetadata, downstreamMetadata) ||
Thank you @Future-Outlier for identifying a solution to this problem 🙇
Tracking issue
#5318