.Net Processes - Map Step Feature #9339

Draft: wants to merge 51 commits into base: main

@crickman (Contributor) commented Oct 20, 2024

DRAFT

Motivation and Context

Fixes: #9193

Description

Applies a map operation to each value in an input set and presents the results as a set for potential reduction.

Includes:

  • ProcessMapBuilder (Core)
  • KernelProcessMap / KernelProcessMapState (Abstractions)
  • LocalMap (LocalRuntime)

Features:

  • Handles map operations whose output type differs from the input type
  • Accepts either step or subprocess for map operation
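
As a rough illustration of the semantics described above (not this PR's implementation), a map step can be thought of as running one operation for every input value and collecting the results, whose type may differ from the input type, for a later reduction. `MapAsync` and the sample operation are hypothetical names invented for this sketch; they are not part of the Process Framework API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical helper (not a Semantic Kernel API): start the same operation
// for every input value and return the results in input order.
static async Task<IReadOnlyList<TOut>> MapAsync<TIn, TOut>(
    IEnumerable<TIn> inputs,
    Func<TIn, Task<TOut>> operation)
{
    Task<TOut>[] pending = inputs.Select(operation).ToArray();
    return await Task.WhenAll(pending);
}

// The operation transforms the type: int in, string out.
IReadOnlyList<string> mapped = await MapAsync(
    new[] { 1, 2, 3 },
    value => Task.FromResult($"#{value * value}"));

// Optional reduction over the mapped set.
string reduced = string.Join(",", mapped);
Console.WriteLine(reduced); // prints "#1,#4,#9"
```

In the PR itself the operation is a step or subprocess rather than a delegate; the delegate here only stands in for that.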

Contribution Checklist

@crickman added the PR: in progress, .NET, experimental, enhancement, and processes labels Oct 20, 2024
@crickman self-assigned this Oct 20, 2024
@github-actions bot changed the title from ".NET Processes - Map Step Feature" to ".Net Processes - Map Step Feature" Oct 20, 2024
ProcessBuilder mapProcess = this.MapProcess;

KernelProcessMapState state = new(this.Name, this.Id);
return new KernelProcessMap(state, mapProcess.Build(), this.TargetFunction.ParameterName!, builtEdges);

Thinking about state propagation: at this stage there may need to be a state validation that either passes the state or passes null (as if no state was given) before letting the build logic take care of it.

Also wondering whether a stepId is needed here, or whether we should just keep treating the map steps as an array.

@esttenorio (Contributor)

General thoughts about State + Map:

Since the map is a subprocess with N steps (defined at runtime), I see two scenarios:

  • Map steps are volatile and no state is saved. If the root process pauses while the map is running, the last event received by the map is re-propagated and the map steps are re-spawned -> this only works if the map steps are not too complex.
  • If the map steps are complex enough (for example, another subprocess as the step, plus state), different strategies are possible, since at this point the map steps act as workers that each run the same job and return a result:
    • All map steps are singletons and share the same resources, so on pause/resume only one state per map step is saved -> using the food prep example: if 5 fish orders are triggered, the global ingredient stock should drop by 5.
      • The singleton step may also be an interesting feature to explore for reusing steps: the same resources would be used multiple times by different processes, with results emitted to the different parent processes.
    • On pause, the state of the most complete process is saved, and on resume that state is passed as the initial state to all map steps:
      • With this approach, when loading a process with state, the most we offer for map steps is to pass the same initial state to all map children.
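
To make the shared-resource ("singleton") scenario concrete, here is a minimal sketch, assuming the map steps are plain tasks that all decrement one shared counter. The names `fishStock` and `PrepareFishAsync` are invented for illustration and do not come from the food prep sample or the Process Framework.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Shared resource: every map step works against this one counter (assumed starting stock).
int fishStock = 10;

// Hypothetical map operation: each fish order consumes one unit of the shared stock.
// Interlocked keeps the decrement atomic when the orders run in parallel.
Task PrepareFishAsync() => Task.Run(() => Interlocked.Decrement(ref fishStock));

// Five fish orders are mapped onto the same operation.
await Task.WhenAll(Enumerable.Range(0, 5).Select(_ => PrepareFishAsync()));

Console.WriteLine($"Remaining fish stock: {fishStock}"); // prints "Remaining fish stock: 5"
```

With this shape only one piece of state (the counter) would need to be saved on pause, regardless of how many orders were mapped.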

@esttenorio (Contributor)

Additional considerations - not for this PR but for the feature in general:

  • timeout management: when spawning N events, apply timeout logic, and return null for any map steps that have not finished in time
  • limiting the number of elements N: the right limit depends on how complex the inner steps are (nested subprocesses)
  • volatile state: since map steps are spawned at runtime, take checkpoints before and after the map step
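
The timeout and element-limit ideas above could look roughly like the following sketch. It is not part of this PR: `WithTimeoutAsync`, `SimulatedStepAsync`, `MaxItems`, and the 200 ms timeout are all assumptions chosen for illustration, and the map operation is reduced to a plain task.

```csharp
#nullable enable
using System;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical helper: wait for one map operation and yield null if it
// has not finished before the timeout elapses.
static async Task<string?> WithTimeoutAsync(Task<string> operation, TimeSpan timeout)
{
    Task finished = await Task.WhenAny(operation, Task.Delay(timeout));
    return finished == operation ? await operation : null;
}

// Stand-in for a map step that takes a given time to complete.
static async Task<string> SimulatedStepAsync(int delayMs)
{
    await Task.Delay(delayMs);
    return $"done after {delayMs}ms";
}

const int MaxItems = 100;                           // assumed cap on N
TimeSpan timeout = TimeSpan.FromMilliseconds(200);  // assumed per-step timeout

int[] delays = { 10, 5000 };
if (delays.Length > MaxItems)
{
    throw new InvalidOperationException("Too many map inputs.");
}

// The first step finishes in time; the second exceeds the timeout and yields null.
string?[] results = await Task.WhenAll(
    delays.Select(d => WithTimeoutAsync(SimulatedStepAsync(d), timeout)));

foreach (string? result in results)
{
    Console.WriteLine(result ?? "<null: timed out>");
}
```

Note that this only stops waiting for the slow step; it does not cancel it. Actually stopping the work would require passing a `CancellationToken` into the operation.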

@crickman (Contributor, Author)

> Additional considerations - not for this PR but for the feature in general:
>
>   • timeout management: if spawning N events, have timeout logic and if some map steps are not done, null is returned
>   • limiting number of N elements: depending on how complex the inner steps are (nested subprocesses)
>   • volatile state: since map steps are spawned at runtime, have checkpoints before and after the map step

I like these considerations. I feel that timeout and depth considerations apply generally to the entire framework, not just the map-step.

I very much like the volatile concept. I have to admit, I'm not entirely clear on the pause or cancel mechanics in the runtime. I don't see the cancellation token being propagated much, but I need to take a closer look.

@crickman (Contributor, Author) commented Oct 26, 2024

>   • all map steps are singletons and share the same resources, so on pause/resume only 1 state per map step is saved -> using as example the foodprep -> if 5 fish orders get triggered the ingredient stock should be globally -5
>     • also the singleton step may be an interesting feature to explore when reusing steps so they all use the same resources, just used multiple times by different processes, emitting results to different parent processes
>   • on pause, the most completed process state gets saved, and on resume this one gets passed as initial state to all map steps:

Ben and I discussed this "functoid" constraint for the map operation and felt it might be too limiting. At this point, the feature is functional without constraining the map operation. Also, if we apply the "volatile" concept, I wonder whether there is any need to further constrain the map operation.

Another question that comes to mind is how to enforce such a constraint (singleton / functoid). I suspect such enforcement may not be trivial.

What do you think?

Labels: enhancement, experimental, .NET, PR: in progress, processes
Projects
Status: Sprint: In Progress
Development

Successfully merging this pull request may close these issues.

.Net: Process Framework: Map Step
2 participants