Handle HealthyQueueNotFoundError for new processes. #894

Merged

@Alex-Izquierdo Alex-Izquierdo commented May 7, 2024

  • Handle HealthyQueueNotFoundError when a new process is about to be created (restart, start, autostart).
  • Update the status message when an activation is pending because there are no healthy nodes.
  • Fix a regression where an activation running on an unhealthy node was automatically scheduled on a new node instead of being set to the "workers-offline" status.
  • Fix a bug where the health of the queue was not checked when only one queue was set (the default queue for k8s and single-node deployments).

Test flow:

  1. Create an activation and wait until it reaches the running state.
  2. Kill the activation workers of the node where the activation is running (node A from now on); the activation goes into the workers-offline state.
  3. (Optional) Relaunch the workers to test recoverability.
  4. Restart the activation; it should be running on node B.
  5. Kill the workers of node B; the activation goes into the workers-offline state.
  6. Create a new activation; it should go into the pending state with the proper status message.
  7. Restart the first activation, which is in the workers-offline state; it should go into the pending state with the proper status message.
  8. (Optional) Relaunch some activation workers to test that both activations run.
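
For reviewers who want a mental model of the behaviour and the test flow above, here is a minimal, self-contained sketch. It is not the code in this PR: only the name `HealthyQueueNotFoundError` and the status names come from the description. `Activation`, `pick_healthy_queue`, `start_process`, `monitor`, the status message wording, and the one-queue-per-node layout are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class HealthyQueueNotFoundError(Exception):
    """No queue with live workers exists (name taken from the PR title)."""


class Status(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    WORKERS_OFFLINE = "workers-offline"


# Hypothetical wording; the PR does not quote the real status message.
NO_HEALTHY_NODES_MSG = "No healthy nodes available; waiting for workers."


@dataclass
class Activation:
    name: str
    status: Status = Status.PENDING
    status_message: str = ""
    queue: Optional[str] = None


def pick_healthy_queue(queues: Dict[str, bool]) -> str:
    """Return a healthy queue; health is checked even with a single queue."""
    for name, healthy in queues.items():
        if healthy:
            return name
    raise HealthyQueueNotFoundError("no healthy queue found")


def start_process(activation: Activation, queues: Dict[str, bool]) -> None:
    """Start, restart, or autostart; stay pending when nothing is healthy."""
    try:
        activation.queue = pick_healthy_queue(queues)
    except HealthyQueueNotFoundError:
        activation.queue = None
        activation.status = Status.PENDING
        activation.status_message = NO_HEALTHY_NODES_MSG
        return
    activation.status = Status.RUNNING
    activation.status_message = ""


def monitor(activation: Activation, queues: Dict[str, bool]) -> None:
    """Mark a running activation workers-offline instead of moving it."""
    is_running = activation.status is Status.RUNNING
    if is_running and not queues.get(activation.queue, False):
        activation.status = Status.WORKERS_OFFLINE


# Walk through the test flow with two nodes, one (assumed) queue per node.
queues = {"node-a": True, "node-b": True}
first = Activation("first")
start_process(first, queues)      # step 1: running on node A
queues["node-a"] = False
monitor(first, queues)            # step 2: workers-offline, not rescheduled
start_process(first, queues)      # step 4: restart lands on node B
queues["node-b"] = False
monitor(first, queues)            # step 5: workers-offline again
second = Activation("second")
start_process(second, queues)     # step 6: pending with status message
start_process(first, queues)      # step 7: pending with status message
queues["node-a"] = True
for activation in (first, second):
    start_process(activation, queues)  # step 8: both run once a worker is back
```

The sketch keeps the two concerns separate on purpose: `start_process` is the only place that may assign a queue, while `monitor` only downgrades the status, which mirrors the regression fix described in the third bullet.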

Replaces #892
Fixes: https://issues.redhat.com/browse/AAP-23378 and https://issues.redhat.com/browse/AAP-22907

@Alex-Izquierdo Alex-Izquierdo requested a review from a team as a code owner May 7, 2024 22:07

@jshimkus-rh jshimkus-rh left a comment

The title still has WIP and is not descriptive (within title limitations) of all that the PR contains.

The CI/test is indicating failure.

@Alex-Izquierdo Alex-Izquierdo changed the title from "WIP: Handle HealthyQueueNotFoundError for new processes." to "Handle HealthyQueueNotFoundError for new processes." May 8, 2024
@Alex-Izquierdo Alex-Izquierdo requested review from jshimkus-rh and a team May 8, 2024 18:58
@Alex-Izquierdo

@jshimkus-rh It was marked WIP because it was not finished, as the failing CI shows. The description contains the details. It is ready now.

@Alex-Izquierdo Alex-Izquierdo requested a review from bzwei May 8, 2024 21:22
@Alex-Izquierdo Alex-Izquierdo merged commit d08cc36 into ansible:main May 8, 2024
3 checks passed
@Alex-Izquierdo Alex-Izquierdo deleted the handle-not-healthy-queues branch May 8, 2024 21:34
jshimkus-rh pushed a commit to jshimkus-rh/eda-server that referenced this pull request May 9, 2024
4 participants