[WIP] Binary IDL With MessagePack Bytes #2751

Closed
wants to merge 11 commits into from

Conversation

Future-Outlier
Member

@Future-Outlier Future-Outlier commented Sep 13, 2024

Summary

This work will be merged as a stack of separate PRs.

flytekit

  1. simple type only
  2. dict transformer
  3. pure dataclass and nested dataclass
  4. List and Dict Transformer
  5. attribute access
  6. flyte types

Tracking issue

flyteorg/flyte#5318

Why are the changes needed?

What changes were proposed in this pull request?

  • dict transformer to_literal
  • dict transformer to_python_val
  • simple transformer to_python_val
  • pure dataclass
  • nested dataclass
  • attribute access (not including Flyte types)
  • non-strict types (e.g. Dict[int, str]) are supported
  • flyte types
  • propeller's attribute access

How was this patch tested?

flyte type attribute access

from dataclasses import dataclass, field
from flytekit.types.file import FlyteFile
from flytekit import task, workflow, ImageSpec

flytekit_hash = "442b2bf7a7c221777afa37793fd598e4e1ea4ca9"
flytekit = f"git+https://github.com/flyteorg/flytekit.git@{flytekit_hash}"

image = ImageSpec(
    packages=[flytekit],
    apt_packages=["git"],
    registry="localhost:30000",
)

@dataclass
class DC:
    f: FlyteFile = field(default_factory=lambda: FlyteFile("s3://my-s3-bucket/example.txt"))

@task(container_image=image)
def t_all(dc: DC):
    assert isinstance(dc.f, FlyteFile), f"Expected FlyteFile, but got {type(dc.f)}"
    print(dc.f)
    print("FlyteFile Content:", open(dc.f, "r").read())

@task(container_image=image)
def t_flytefile(f: FlyteFile):
    print("FlyteFile Content:", open(f, "r").read())

@workflow
def wf(dc: DC):
    t_all(dc=dc)
    t_flytefile(f=dc.f)


if __name__ == "__main__":
    from flytekit.clis.sdk_in_container import pyflyte
    from click.testing import CliRunner
    import os

    runner = CliRunner()
    path = os.path.realpath(__file__)
    result = runner.invoke(pyflyte.main, ["run", path, "wf", "--dc", "{}"])
    print("Local Execution: ", result.output)

    result = runner.invoke(pyflyte.main, ["run", "--remote", path, "wf", "--dc", "{}"])
    print("Remote Execution: ", result.output)

nested cases

from dataclasses import dataclass, field
from typing import Dict, List
from flytekit import task, workflow, ImageSpec

flytekit_hash = "c24077bce6e63bf8df0d80dbc2c5e2ff3322bca8"

flytekit = f"git+https://github.com/flyteorg/flytekit.git@{flytekit_hash}"

image = ImageSpec(
    packages=[flytekit],
    apt_packages=["git"],
    registry="localhost:30000",
)

@dataclass
class InnerDC:
    x: int = 1
    y: float = 2.0
    z: str = "inner_string"

@dataclass
class MiddleDC:
    inner: InnerDC = field(default_factory=InnerDC)
    b_list: List[InnerDC] = field(default_factory=lambda: [InnerDC(), InnerDC()])
    c_dict: Dict[str, InnerDC] = field(default_factory=lambda: {"key1": InnerDC(), "key2": InnerDC()})

@dataclass
class OuterDC:
    middle: MiddleDC = field(default_factory=MiddleDC)
    list_of_lists: List[List[InnerDC]] = field(default_factory=lambda: [[InnerDC(), InnerDC()], [InnerDC()]])
    dict_of_dicts: Dict[str, Dict[str, InnerDC]] = field(default_factory=lambda: {"key1": {"subkey1": InnerDC()}})
    dict_of_lists: Dict[str, List[InnerDC]] = field(default_factory=lambda: {"key1": [InnerDC(), InnerDC()]})
    list_of_dicts: List[Dict[str, InnerDC]] = field(
        default_factory=lambda: [{"key1": InnerDC(), "key2": InnerDC()}])

@task(container_image=image)
def t_x(x: int):
    assert isinstance(x, int), f"Expected int, but got {type(x)}"
    print(f"x: {x}")
    print("================")

@task(container_image=image)
def t_y(y: float):
    assert isinstance(y, float), f"Expected float, but got {type(y)}"
    print(f"y: {y}")
    print("================")

@task(container_image=image)
def t_middle(middle: MiddleDC):
    assert isinstance(middle, MiddleDC), f"Expected MiddleDC, but got {type(middle)}"
    print(f"middle.inner.x: {middle.inner.x}, middle.inner.y: {middle.inner.y}")
    print(f"middle.b_list: {middle.b_list}")
    print(f"middle.c_dict: {middle.c_dict}")
    print("================")

@task(container_image=image)
def t_inner(inner: InnerDC):
    assert isinstance(inner, InnerDC), f"Expected InnerDC, but got {type(inner)}"
    print(f"inner.x: {inner.x}, inner.y: {inner.y}, inner.z: {inner.z}")
    print("================")

@task(container_image=image)
def t_list_of_dc(list_of_dc: List[InnerDC]):
    for i, inner in enumerate(list_of_dc):
        assert isinstance(inner, InnerDC), f"Expected InnerDC, but got {type(inner)}"
        print(f"list_of_dc[{i}] - x: {inner.x}, y: {inner.y}, z: {inner.z}")
    print("================")

@task(container_image=image)
def t_list_of_lists(list_of_lists: List[List[InnerDC]]):
    for i, inner_list in enumerate(list_of_lists):
        for j, inner in enumerate(inner_list):
            assert isinstance(inner, InnerDC), f"Expected InnerDC, but got {type(inner)}"
            print(f"list_of_lists[{i}][{j}] - x: {inner.x}, y: {inner.y}, z: {inner.z}")
    print("================")

@task(container_image=image)
def create_outer_dc() -> OuterDC:
    return OuterDC()

@workflow
def dataclass_wf() -> OuterDC:
    # Named `outer` rather than `input` to avoid shadowing the Python builtin.
    outer = create_outer_dc()
    t_x(x=outer.middle.inner.x)
    t_y(y=outer.middle.inner.y)
    t_middle(middle=outer.middle)
    t_inner(inner=outer.middle.inner)
    t_list_of_dc(list_of_dc=outer.list_of_lists[0])
    t_list_of_lists(list_of_lists=outer.list_of_lists)
    t_x(x=outer.list_of_lists[0][0].x)
    t_x(x=outer.list_of_lists[0][1].x)
    t_x(x=outer.dict_of_dicts["key1"]["subkey1"].x)

    return outer

if __name__ == "__main__":
    import os

    from click.testing import CliRunner
    from flytekit.clis.sdk_in_container import pyflyte

    runner = CliRunner()
    # Resolve this file's own path instead of hard-coding a machine-specific location.
    path = os.path.realpath(__file__)

    # result = runner.invoke(pyflyte.main, ["run", path, "dataclass_wf", "--input", '{}'])
    # result = runner.invoke(pyflyte.main, ["run", path, "dataclass_wf"])
    # print("Local Execution: ", result.output)
    # print("================")
    #
    # result = runner.invoke(pyflyte.main, ["run", "--remote", path, "dataclass_wf", "--input", '{}'])
    result = runner.invoke(pyflyte.main, ["run", "--remote", path, "dataclass_wf"])
    print("Remote Execution: ", result.output)
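
The workflow above passes expressions such as `.list_of_lists[0][1].x` and `.dict_of_dicts["key1"]["subkey1"].x` directly to tasks. One way to reason about this kind of attribute access is as a path of keys and indices resolved against the decoded value. The sketch below illustrates that idea only; it is not how flytekit or propeller implement it, and `Leaf`, `Root`, and `resolve` are hypothetical names.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List, Sequence, Union


@dataclass
class Leaf:  # hypothetical example type
    x: int = 1


@dataclass
class Root:  # hypothetical example type
    list_of_lists: List[List[Leaf]] = field(
        default_factory=lambda: [[Leaf(1), Leaf(2)], [Leaf(3)]])
    dict_of_dicts: Dict[str, Dict[str, Leaf]] = field(
        default_factory=lambda: {"key1": {"subkey1": Leaf(4)}})


def resolve(value: Any, path: Sequence[Union[str, int]]) -> Any:
    """Walk `path` through nested dicts and lists.

    String steps are dict keys (dataclass field names after `asdict`);
    integer steps are list indices.
    """
    for depth, step in enumerate(path):
        try:
            value = value[step]
        except (KeyError, IndexError, TypeError) as exc:
            raise LookupError(f"cannot resolve step {step!r} at depth {depth}") from exc
    return value


plain = asdict(Root())  # nested dataclasses become nested dicts and lists

print(resolve(plain, ["list_of_lists", 0, 1, "x"]))               # 2
print(resolve(plain, ["dict_of_dicts", "key1", "subkey1", "x"]))  # 4
```

Resolving against the plain decoded structure means a single field can be extracted without reconstructing the whole dataclass, which is the property the attribute-access test cases exercise.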

Setup process

Screenshots

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

Docs link

@Future-Outlier Future-Outlier changed the title from "Binary IDL With MessagePack Bytes" to "[WIP] Binary IDL With MessagePack Bytes" Sep 13, 2024
@Future-Outlier Future-Outlier marked this pull request as draft September 13, 2024 13:53

codecov bot commented Sep 16, 2024

Codecov Report

Attention: Patch coverage is 15.38462% with 66 lines in your changes missing coverage. Please review.

Project coverage is 49.52%. Comparing base (7f54171) to head (8873507).
Report is 4 commits behind head on master.

Files with missing lines        Patch %   Lines
flytekit/core/type_engine.py    18.51%    42 Missing and 2 partials ⚠️
flytekit/core/promise.py        8.33%     22 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##            master    #2751       +/-   ##
============================================
- Coverage   100.00%   49.52%   -50.48%     
============================================
  Files            5      194      +189     
  Lines          122    19821    +19699     
  Branches         0     4132     +4132     
============================================
+ Hits           122     9816     +9694     
- Misses           0     9475     +9475     
- Partials         0      530      +530     


Signed-off-by: Future-Outlier <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>
…lass_int in binary idl in dataclass transformer

Signed-off-by: Future-Outlier <[email protected]>
Signed-off-by: Future-Outlier <[email protected]>