improve workflow creation logic, and boundaries between r-server and r-w-controller #478

Open
VMois opened this issue Apr 7, 2022 · 0 comments


While working on #474, I realized that the logic for fetching GitLab repositories is quite complicated and spread across r-server and r-w-controller. This makes it hard to add new features and easy to introduce bugs. The issue mentioned above is an example; there, the workflow creation flow looked like this:

  1. Detect in r-server that this is a GitLab webhook request, so we are dealing with a repository that recently received a commit;
  2. Fetch only reana.yaml from the GitLab repo and create reana_specification; this specification is incomplete, as engines like Yadage, CWL, and Snakemake require more files;
  3. Send a request to r-w-controller to create a workflow, passing reana_specification; r-w-controller also fetches the GitLab repository;
  4. The result is returned to r-server, where, before the fix for #474 ("gitlab: specification for Snakemake and CWL workflows is not loaded correctly"), only the Yadage specification was updated; that is why CWL and Snakemake were failing: we forgot to update reana_specification there. Below is the chunk responsible for the update:

    # This is necessary for GitLab integration
    if workflow.type_ == "yadage":
        _load_and_save_yadage_spec(
            workflow, workflow_dict["operational_options"]
        )
    elif workflow.type_ in ["cwl", "snakemake"]:
        reana_yaml_path = os.path.join(workflow.workspace_path, "reana.yaml")
        workflow.reana_specification = load_reana_spec(
            reana_yaml_path, workflow.workspace_path
        )
        Session.commit()

We also have similar update logic in a different endpoint; I am not sure why it is there:

    if "yadage" in (workflow.type_, restart_type):
        _load_and_save_yadage_spec(workflow, operational_options)

  5. Publish the workflow to the workflow-submission queue, so it can start.

The flow above spreads the logic for fetching and creating workflows across two services.
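To illustrate why the per-engine branches above are easy to get wrong, here is a minimal sketch of a single refresh helper that treats every engine the same way. It assumes one loader could cover all engines, which this issue does not confirm (Yadage currently goes through _load_and_save_yadage_spec with operational options). The Workflow dataclass, refresh_specification, and fake_loader are hypothetical stand-ins, not existing REANA code.

```python
import os
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Workflow:
    """Hypothetical stand-in for the REANA workflow model."""

    type_: str
    workspace_path: str
    reana_specification: Dict = field(default_factory=dict)


# Hypothetical loader signature: (reana_yaml_path, workspace_path) -> spec dict.
SpecLoader = Callable[[str, str], Dict]


def refresh_specification(workflow: Workflow, load_spec: SpecLoader) -> Dict:
    """Reload the specification from the fetched workspace, whatever the engine."""
    reana_yaml_path = os.path.join(workflow.workspace_path, "reana.yaml")
    workflow.reana_specification = load_spec(reana_yaml_path, workflow.workspace_path)
    return workflow.reana_specification


def fake_loader(reana_yaml_path: str, workspace_path: str) -> Dict:
    """Hypothetical loader used only for this demo; it reads no files."""
    return {"loaded_from": reana_yaml_path, "complete": True}


# Demo: a CWL workflow created from an incomplete specification gets refreshed.
workflow = Workflow(type_="cwl", workspace_path="/tmp/ws",
                    reana_specification={"complete": False})
refresh_specification(workflow, fake_loader)
```

With one code path there is no engine list to forget to extend, which is the kind of omission that caused #474.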

An example of a better flow is the recently added /launch endpoint:

  1. Fetch the repository in full;
  2. Create reana_specification based on all files from the fetched repository;
  3. Send a request to r-w-controller to create a workflow;
  4. Publish workflow to the workflow-submission queue, so it can start.

It still has parts that are probably a better fit for r-w-controller, such as creating reana_specification and publishing the workflow, but at least it avoids the tricky logic of updating reana_specification.
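The four /launch steps can be sketched as one linear function. This is only an illustration of the ordering, not the actual endpoint code: launch_flow and the four injected callables are hypothetical names, and the stubs below do no network or queue work.

```python
from typing import Callable, Dict, List


def launch_flow(
    url: str,
    fetch_repository: Callable[[str], str],
    build_specification: Callable[[str], Dict],
    create_workflow: Callable[[Dict], str],
    publish_workflow: Callable[[str], None],
) -> str:
    """Run the four steps in order; no later step has to patch the specification."""
    workspace = fetch_repository(url)               # 1. fetch the repository in full
    specification = build_specification(workspace)  # 2. build the spec from all files
    workflow_id = create_workflow(specification)    # 3. ask r-w-controller to create it
    publish_workflow(workflow_id)                   # 4. publish to the submission queue
    return workflow_id


# Hypothetical stubs that only record the order in which they are called.
calls: List[str] = []


def _fetch(url: str) -> str:
    calls.append("fetch")
    return "/tmp/workspace"


def _build(workspace: str) -> Dict:
    calls.append("spec")
    return {"workspace": workspace}


def _create(specification: Dict) -> str:
    calls.append("create")
    return "workflow-1"


def _publish(workflow_id: str) -> None:
    calls.append("publish")


workflow_id = launch_flow("https://example.org/repo", _fetch, _build, _create, _publish)
```

Because the specification is built after the full fetch, it is complete when the workflow is created, so no update step is needed afterwards.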

Proposed actions

  1. We need to define clear boundaries between r-server and r-w-controller and enforce them in the future.
  2. Based on the defined boundaries, we will need to move some logic from r-server to r-w-controller and vice versa. For example (this is subject to discussion):
  • the GitLab fetcher can be moved to r-server and merged with the launch fetcher, forming a single fetcher codebase responsible for all external workflow sources;
  • instead of creating reana_specification in r-server, we can move it to r-w-controller; in addition, we might change our approach and load the specification directly from the reana.yaml file instead of relying on a client to generate it; related: reana-db#162 ("models: rethink reana_specification storage philosophy");
  • instead of publishing workflows in r-server, we can do it in r-w-controller; this will be a nice abstraction for r-server;
  • with all of the above, r-server will be responsible for preparing and fetching files, and afterwards it will delegate workflow initiation to r-w-controller.
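As a starting point for discussion, a single fetcher codebase could look roughly like the sketch below. All class and function names (WorkflowFetcher, GitLabFetcher, ArchiveFetcher, select_fetcher) are hypothetical, the URL matching rules are assumptions made for the example, and the fetch methods are placeholders that do not download anything.

```python
from abc import ABC, abstractmethod
from typing import List
from urllib.parse import urlparse


class WorkflowFetcher(ABC):
    """Hypothetical common interface for all external workflow sources."""

    @abstractmethod
    def can_handle(self, url: str) -> bool:
        """Return True if this fetcher knows how to retrieve the given URL."""

    @abstractmethod
    def fetch(self, url: str, destination: str) -> str:
        """Place the workflow files in `destination` and return that path."""


class GitLabFetcher(WorkflowFetcher):
    def can_handle(self, url: str) -> bool:
        # Assumption for this sketch: GitLab hosts are named "gitlab.<domain>".
        return (urlparse(url).hostname or "").startswith("gitlab.")

    def fetch(self, url: str, destination: str) -> str:
        # Placeholder: a real implementation would clone the repository here.
        return destination


class ArchiveFetcher(WorkflowFetcher):
    def can_handle(self, url: str) -> bool:
        parsed = urlparse(url)
        return parsed.scheme in ("http", "https") and parsed.path.endswith(".zip")

    def fetch(self, url: str, destination: str) -> str:
        # Placeholder: a real implementation would download and unpack the archive.
        return destination


FETCHERS: List[WorkflowFetcher] = [GitLabFetcher(), ArchiveFetcher()]


def select_fetcher(url: str) -> WorkflowFetcher:
    """Return the first registered fetcher that accepts the URL."""
    for fetcher in FETCHERS:
        if fetcher.can_handle(url):
            return fetcher
    raise ValueError(f"No fetcher available for {url}")
```

In this layout r-server would call select_fetcher(url).fetch(url, workspace) for both webhook and launch requests, and then hand the prepared workspace to r-w-controller.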

As mentioned by @audrium, this issue fits nicely into our future work on making the REANA CLI client smaller (the so-called "thin client").
