Architecture

Vladyslav Moisieienkov edited this page Aug 1, 2022 · 6 revisions

This page describes the high-level architecture of the REANA platform for versions 0.8.x and 0.9.x.

Contents

  1. Overview
  2. Workflows

Overview

Here you can find an overview diagram (and its drawio source) of the REANA architecture. It describes some common operations and components. This is a good starting point before diving into the codebase.

REANA platform general architecture diagram

Researchers (users) can interact with the REANA platform in two different ways:

Summary of core platform components:

  • rest-api (reana-server): REST API server that processes user requests and dispatches them to internal REANA services;
  • workflow scheduler: service responsible for scheduling workflows from different users according to scheduling rules;
  • rest-api (reana-workflow-controller): internal REST API responsible for managing workflows (creating them, starting them, etc.);
  • reana-workflow-engine-xxxx: controls the execution of a single workflow of a given type, such as Yadage, CWL, Snakemake or Serial;
  • reana-job-controller: works alongside the workflow engine and dispatches workflow jobs to different compute backends such as Kubernetes, HTCondor or Slurm;
  • reana-message-broker: RabbitMQ message broker used for distributing workflow scheduling messages and progress updates;
  • job-status-consumer: receives update messages from all running workflows via a queue and updates the database accordingly;
  • SQL database (DB): depending on the setup, it can be an external database connected to REANA or be hosted as part of the REANA cluster.
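
The status-update path described above (running workflows publish messages to a queue, and job-status-consumer applies them to the database) can be illustrated with a minimal, self-contained sketch. This is not REANA code: the in-memory `queue.Queue` stands in for RabbitMQ, the `database` dictionary stands in for the SQL database, and the message fields `workflow_id` and `status` as well as both function names are assumptions made purely for illustration.

```python
import json
import queue

# Hypothetical in-memory stand-ins; a real deployment uses RabbitMQ and an SQL database.
status_queue = queue.Queue()
database = {}  # workflow identifier -> latest known status name


def publish_status(workflow_id, status):
    """Simulate a running workflow sending a progress update to the broker."""
    # The message layout is an assumption of this sketch, not REANA's wire format.
    status_queue.put(json.dumps({"workflow_id": workflow_id, "status": status}))


def consume_statuses():
    """Simulate job-status-consumer: drain the queue and update the database."""
    processed = 0
    while not status_queue.empty():
        message = json.loads(status_queue.get())
        database[message["workflow_id"]] = message["status"]
        processed += 1
    return processed


publish_status("wf-1", "running")
publish_status("wf-2", "running")
publish_status("wf-1", "finished")
consume_statuses()
print(database)
```

Because messages are applied in arrival order, the last update for each workflow is the one that remains in the database, which is the point of routing all updates through a single consumer.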

Workflows

Workflows are the basic units in REANA. Each workflow has a name (e.g. myanalysis), a run number (e.g. 42), and is identified with a unique UUID.

The workflow name stays the same throughout the analysis lifetime, the run number is incremented every time the workflow is run, and a new UUID is generated for each run.

The workspace is the place where a user stores the files for their workflows.

Workflows can also be restarted. In that case, the run number is followed by a dot and a restart number (e.g. 42.1). The workspace is shared between workflow restarts.
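
The naming rules above can be summarised in a short sketch. The `WorkflowRun` class, its field names, and the choice of `0` to mean "not a restart" are hypothetical and exist only for illustration; the sketch also assumes that a restart receives a fresh UUID like any other run.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class WorkflowRun:
    """Illustrative model of a workflow run (not REANA's actual data model)."""

    name: str
    run_number: int
    restart_number: int = 0  # assumption: 0 means this run is not a restart
    id_: uuid.UUID = field(default_factory=uuid.uuid4)  # new UUID per instance

    @property
    def run_label(self) -> str:
        """Run number, followed by a dot and the restart number for restarts."""
        if self.restart_number:
            return f"{self.run_number}.{self.restart_number}"
        return str(self.run_number)

    def next_run(self) -> "WorkflowRun":
        """Same name, incremented run number, new UUID."""
        return WorkflowRun(self.name, self.run_number + 1)

    def restart(self) -> "WorkflowRun":
        """Same name and run number, incremented restart number, new UUID."""
        return WorkflowRun(self.name, self.run_number, self.restart_number + 1)


first = WorkflowRun("myanalysis", 42)
restarted = first.restart()
print(first.run_label, restarted.run_label, first.next_run().run_label)
```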

Statuses

A workflow run passes through several statuses between its creation and its conclusion. The table below lists the workflow statuses and their corresponding values in the database:

Status name    Value in database
created        0
running        1
finished       2
failed         3
deleted        4
stopped        5
queued         6
pending        7
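
When reading raw database values, the mapping in the table can be mirrored with an enumeration. This is a minimal sketch: the names and integers are taken from the table, while the class name `WorkflowStatus` and the helper `status_name` are hypothetical and not necessarily what the REANA codebase uses.

```python
from enum import IntEnum


class WorkflowStatus(IntEnum):
    """Workflow statuses and their database values, as listed in the table."""

    created = 0
    running = 1
    finished = 2
    failed = 3
    deleted = 4
    stopped = 5
    queued = 6
    pending = 7


def status_name(db_value: int) -> str:
    """Translate a database value into its status name (hypothetical helper)."""
    return WorkflowStatus(db_value).name


print(status_name(6))
```

Passing a value outside the 0 to 7 range raises `ValueError`, which makes unexpected database contents visible early.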

Throughout its lifetime, a workflow transitions from one status to another. Here is the diagram (and its drawio source) of the possible transitions between workflow statuses:

Workflow transitions diagram