Database Schema

Goscheduler uses Cassandra as its database. The schema is divided into two keyspaces:

`schedule_management`

This keyspace contains all the data related to one-time schedules, recurring schedules, status, etc.

`schedules`

Purpose: Stores one-time schedule information. Application pollers query this table every minute to check whether any schedules are due to be triggered.

Columns:

app_id: The application ID.
partition_id: The partition ID.
schedule_time_group: The scheduled time rounded to the minute.
schedule_id: The unique identifier for the schedule.
callback_type: The type of callback.
callback_details: The details of the callback.
payload: The payload data.
schedule_time: The scheduled time.
parent_schedule_id: The parent schedule ID. Applicable for cron use-cases.

Primary Key: ((app_id, partition_id, schedule_time_group), schedule_id)

Clustering Order: schedule_id DESC
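
The relationship between schedule_time and the partition key above can be sketched as follows. This is a minimal Python illustration, not Goscheduler's own code: it assumes that "rounded to a minute level" means truncating down to the start of the minute, and the function names `schedule_time_group` and `partition_key` are hypothetical.

```python
from datetime import datetime, timezone


def schedule_time_group(schedule_time: datetime) -> datetime:
    """Truncate a timestamp to the start of its minute (assumed rounding rule)."""
    return schedule_time.replace(second=0, microsecond=0)


def partition_key(app_id: str, partition_id: int, schedule_time: datetime) -> tuple:
    """Build the (app_id, partition_id, schedule_time_group) partition key."""
    return (app_id, partition_id, schedule_time_group(schedule_time))


# Two schedules in the same minute share one partition for a given app partition.
first = datetime(2024, 1, 1, 10, 15, 3, tzinfo=timezone.utc)
second = datetime(2024, 1, 1, 10, 15, 42, tzinfo=timezone.utc)
print(partition_key("test", 0, first) == partition_key("test", 0, second))
```

Because every schedule in the same minute for a given app partition shares one partition key, a poller can read the schedules due in a minute with a single-partition query.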

`view_schedules` (Materialized View)

Purpose: Provides an alternative view of the schedules. Mainly used to query schedules by schedule_id.

Primary Key: (schedule_id, app_id, partition_id, schedule_time_group)

Clustering Order: app_id ASC, partition_id ASC, schedule_time_group ASC

`status`

Purpose: Stores the status of the schedules.

Columns:

app_id: The application ID.
partition_id: The partition ID.
schedule_time_group: The scheduled time rounded to the minute.
schedule_id: The unique identifier for the schedule.
schedule_status: The status of the schedule.
error_msg: The error message, if any.
reconciliation_history: The reconciliation history.

Primary Key: ((app_id, partition_id), schedule_id)

Clustering Order: schedule_id DESC

`recurring_schedules_by_partition`

Purpose: Stores information about recurring schedules by partition. A special app with a fixed number of pollers is configured to query this table every minute. If the current time matches the cron_expression, a one-time schedule is created for that time.

Columns:

app_id: The application ID.
partition_id: The partition ID.
schedule_id: The unique identifier for the schedule.
callback_type: The type of callback.
callback_details: The details of the callback.
payload: The payload data.
cron_expression: The cron expression for recurring schedules.
status: The status of the recurring schedule.

Primary Key: (partition_id, schedule_id, app_id)
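
The per-minute check described in the purpose above can be sketched as a small matcher. This is only an illustration under stated assumptions: the cron syntax Goscheduler accepts is not specified here, so the sketch assumes a five-field expression (minute, hour, day of month, month, day of week) and supports only `*`, `*/n`, plain numbers, and comma lists. It also requires all five fields to match, which is simpler than standard cron's day-of-month/day-of-week rule. The names `field_matches` and `cron_matches` are hypothetical.

```python
from datetime import datetime


def field_matches(field: str, value: int, lowest: int) -> bool:
    """Return True if one cron field accepts the given value."""
    for part in field.split(","):
        if part == "*":
            return True
        if part.startswith("*/"):
            step = int(part[2:])
            # Steps count from the lowest legal value of the field.
            if (value - lowest) % step == 0:
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expression: str, when: datetime) -> bool:
    """Check whether a five-field cron expression matches the given minute."""
    minute, hour, day, month, weekday = expression.split()
    # Python uses Monday = 0; cron uses Sunday = 0.
    cron_weekday = (when.weekday() + 1) % 7
    return (
        field_matches(minute, when.minute, 0)
        and field_matches(hour, when.hour, 0)
        and field_matches(day, when.day, 1)
        and field_matches(month, when.month, 1)
        and field_matches(weekday, cron_weekday, 0)
    )
```

A poller following this idea would call `cron_matches(row.cron_expression, now)` once per minute for each recurring schedule in its partition and create a one-time schedule whenever the call returns true.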

`recurring_schedules_by_id`

Purpose: Stores information about recurring schedules by ID. Mainly used to get recurring schedules by schedule_id.

Columns:

app_id: The application ID.
partition_id: The partition ID.
schedule_id: The unique identifier for the schedule.
callback_type: The type of callback.
callback_details: The details of the callback.
payload: The payload data.
cron_expression: The cron expression for recurring schedules.
status: The status of the recurring schedule.

Primary Key: (schedule_id)

`recurring_schedule_runs`

Purpose: Stores information about the runs of recurring schedules. This table stores the parent-child mapping between each cron schedule and its individual runs.

Columns:

app_id: The application ID.
partition_id: The partition ID.
schedule_time_group: The scheduled time rounded to the minute.
schedule_id: The unique identifier for the schedule.
callback_type: The type of callback.
callback_details: The details of the callback.
payload: The payload data.
schedule_time: The scheduled time.
parent_schedule_id: The parent schedule ID.

Primary Key: (parent_schedule_id, schedule_time_group)

Clustering Order: schedule_time_group DESC

`cluster`

This keyspace contains all the metadata, such as the onboarded apps and the poller-to-node mapping.

`apps`

Purpose: Stores information about applications within the cluster. This table stores all the onboarded client apps, their resource quota (number of pollers), and their status.

Columns:

id: The application ID.
partitions: The number of partitions.
active: Whether the application is active or not.

Primary Key: (id)

`entity`

Purpose: Stores information about cluster entities. This table is used during node bootstrap: each node reads the table and, with the help of the Ringpop library, decides whether it should start a given poller entity.

Columns:

id: The poller ID, formed by concatenating app_id and partition_id. For example, an app named test with 5 partitions has the IDs test.0, test.1, ..., test.4.
nodename: The Ringpop node on which the poller entity is running.
status: The status of the entity.
history: The history of the entity.

Primary Key: (id)
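
The poller ID format described above can be sketched as follows. This is a minimal Python illustration based only on the test.0 ... test.4 example; the helper name `poller_ids` is hypothetical and not part of Goscheduler.

```python
from typing import List


def poller_ids(app_id: str, partitions: int) -> List[str]:
    """Build one poller ID per partition as "<app_id>.<partition_id>"."""
    return [f"{app_id}.{partition_id}" for partition_id in range(partitions)]


print(poller_ids("test", 5))
```

Each ID produced this way corresponds to one row in the entity table, so an app with 5 partitions contributes 5 poller entities to the cluster.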

`nodes` (Materialized View)

Purpose: Provides an alternative view of the cluster nodes.

Columns:

nodename: The Ringpop node on which the poller entity is running.
id: The poller ID.
status: The status of the entity.

Primary Key: (nodename, id)
