Allow configuring the default disruption budget to deny/allow #1190

Open
dschunack opened this issue Apr 17, 2024 · 5 comments
Assignees
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@dschunack

Description

What problem are you trying to solve?
Hi,

We have three maintenance windows per week in which pods may be evicted, but expressing them through disruption budgets is not easy, and in some cases not possible.

We want to allow disruption on Mon, Wed, and Fri for 2 hours at night.

The current config looks like this, and it is not easy to understand at first glance.

  disruption:
    budgets:
    - nodes: 10%
    - nodes: "5"
    - duration: 46h
      nodes: "0"
      schedule: 0 2 * * mon
    - duration: 46h
      nodes: "0"
      schedule: 0 2 * * wed
    - duration: 70h
      nodes: "0"
      schedule: 0 2 * * fri
    consolidationPolicy: WhenUnderutilized
    expireAfter: 1440h
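To make the current config easier to read, here is a minimal sketch that works out which hours of the week remain open once the three `nodes: "0"` budgets are applied. It assumes the week is modelled as 168 hourly slots starting Monday 00:00, that each schedule fires exactly on the hour, and that a `nodes: "0"` budget blocks all disruption while it is active. The names `blocking_budgets`, `blocked_hours`, and `open_windows` are hypothetical helpers, not part of Karpenter.

```python
HOURS_PER_WEEK = 7 * 24
DAY_INDEX = {"mon": 0, "tue": 1, "wed": 2, "thu": 3, "fri": 4, "sat": 5, "sun": 6}
DAY_NAMES = list(DAY_INDEX)

# (weekday, start hour, duration in hours) for each `nodes: "0"` budget above.
blocking_budgets = [("mon", 2, 46), ("wed", 2, 46), ("fri", 2, 70)]


def blocked_hours(budgets):
    """Return the set of hour-of-week slots covered by any blocking budget."""
    blocked = set()
    for day, hour, duration in budgets:
        start = DAY_INDEX[day] * 24 + hour
        for offset in range(duration):
            # Wrap around so a budget can run past Sunday midnight.
            blocked.add((start + offset) % HOURS_PER_WEEK)
    return blocked


def open_windows(budgets):
    """Return (weekday, start hour, length in hours) for each unblocked run.

    Simplification: a run that crosses the Sunday/Monday boundary is
    reported as two separate windows.
    """
    blocked = blocked_hours(budgets)
    windows = []
    for h in range(HOURS_PER_WEEK):
        if h in blocked:
            continue
        if windows and windows[-1][1] == h:
            windows[-1][1] = h + 1  # extend the current run
        else:
            windows.append([h, h + 1])  # start a new run
    return [(DAY_NAMES[s // 24], s % 24, e - s) for s, e in windows]


current_windows = open_windows(blocking_budgets)
print(current_windows)
```

Under these assumptions, the only unblocked slots are the two hours from 00:00 to 02:00 on Mon, Wed, and Fri, which is the intent that the config hides behind the 46h and 70h durations.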

Our proposal is to configure it like this:

  disruption:
    budgets:
    - nodes: "0"
    - duration: 2h
      nodes: "5"
      schedule: 0 2 * * mon,wed,fri
    consolidationPolicy: WhenUnderutilized
    expireAfter: 720h

This would make the disruption budget easier to configure than it is today.

How important is this feature to you?

It's important for us to be able to set the disruption budget in a simple way that anyone can understand.
We think this is not the case at the moment.

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@dschunack dschunack added kind/feature Categorizes issue or PR as related to a new feature. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Apr 17, 2024
@dschunack dschunack changed the title Ability to define multiple specific disruption windows by budget Ability to define multiple disruption windows by budget Apr 18, 2024
@jonathan-innis
Member

cc: @akestner

@jonathan-innis jonathan-innis removed the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Apr 22, 2024
@jonathan-innis
Member

Just to clarify: You are basically wanting the most permissive budget to be respected rather than the most restrictive -- in reality I don't think you care too much about restrictive/permissive, I think you are just looking for a way to have the configuration more closely match the semantic meaning of what you are intending.
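A minimal sketch of the difference, assuming the effective limit is computed by combining the `nodes` values of all budgets that are active at a given moment: `min` stands in for "most restrictive wins" and `max` for "most permissive wins". The function `effective_limit` is a hypothetical helper for illustration, not Karpenter's implementation, and the inputs are the two budgets from the proposal above.

```python
def effective_limit(active_node_limits, combine=min):
    """Combine the node limits of all currently active budgets."""
    return combine(active_node_limits)


# Proposal from the issue: `nodes: "0"` is always active,
# `nodes: "5"` is active only during the Mon/Wed/Fri window.
inside_window = [0, 5]
outside_window = [0]

# Most restrictive wins: the always-on "0" blocks everything, even in the window.
print(effective_limit(inside_window, min))   # 0
# Most permissive wins: the window budget opens up 5 nodes.
print(effective_limit(inside_window, max))   # 5
# Outside the window both readings agree.
print(effective_limit(outside_window, max))  # 0
```

Read this way, the proposed config only behaves as intended if the most permissive active budget is the one that is respected.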

Does it make sense to update the title of this issue to something like "Improve semantic meaning of disruption budgets configuration"? The current title describes something that we already support, so it's a bit confusing when looking through the issues.

Next -- in terms of getting something that's closer to what you intend -- I'd agree that we should be less extemporaneous about schedules. I think the key struggle here is the different use-cases that we are trying to meet with this single feature. Today, we're basically trying to meet the following cases:

  1. Configuring a default parallelism
  2. Configuring a default base parallelism with the ability to extend into higher parallelism if your cluster reaches certain sizes (percentages)
  3. Configuring block days (holidays where you don't want to roll, weekends when you don't want to roll etc.)
  4. Configuring specific times of day to disrupt

Number 4 is basically the only one that's difficult to write with budgets today, since it follows an "open during this window, consider everything else closed by default" concept, whereas the others follow an "open as wide as possible by default, then constrain or block during specific times" concept.

I'm not suggesting that this is a good idea, but to achieve something like this, you basically need to override what the "default open window" is. Today, the "default open window" has infinite node parallelism and infinite time. You can block this window by applying constraints (blocking budgets) on top of it. To achieve what you want, you would need to override this open window so that it is only open at the times you want, and then layer blocking windows on top of it. Effectively:

budgets:
- duration: 2h
  nodes: "5"
  schedule: 0 2 * * mon,wed,fri
  policy: Open

This kind of semantics would imply that the following are the default budgets applied if you didn't specify any:

budgets:
- nodes: 100%
  policy: Open
- nodes: 10%
  policy: Closed  # this is the default and implicit if not specified
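One possible reading of these two examples, sketched below: active `Open` budgets define how much disruption is permitted at all, active `Closed` budgets can only lower that amount, and nothing is permitted when no `Open` budget is active. The `policy` field is only floated in this comment and does not exist in the API; the `Budget` class, its `active` flag, and `allowed_disruptions` are hypothetical names for illustration, and percentages are assumed to be resolved to node counts beforehand.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    nodes: int    # allowed concurrent disruptions, already resolved from any percentage
    policy: str   # "Open" or "Closed" (hypothetical field from the comment above)
    active: bool  # whether the schedule and duration currently cover "now"


def allowed_disruptions(budgets):
    """Assumed semantics: Open budgets grant capacity, Closed budgets cap it."""
    open_limits = [b.nodes for b in budgets if b.active and b.policy == "Open"]
    closed_limits = [b.nodes for b in budgets if b.active and b.policy == "Closed"]
    if not open_limits:
        return 0  # no open window is active, so everything is closed by default
    return min([max(open_limits)] + closed_limits)


# First example: a single Open budget for the Mon/Wed/Fri window.
print(allowed_disruptions([Budget(5, "Open", active=True)]))   # inside the window: 5
print(allowed_disruptions([Budget(5, "Open", active=False)]))  # outside the window: 0

# Second example on a 100-node cluster: 100% Open capped by 10% Closed.
implicit_defaults = [Budget(100, "Open", True), Budget(10, "Closed", True)]
print(allowed_disruptions(implicit_defaults))                  # 10
```

Other combinations, such as several overlapping `Open` budgets, are resolved here by taking the largest one. That choice is an assumption of this sketch and is not stated in the thread.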

@dschunack dschunack changed the title Ability to define multiple disruption windows by budget Improve semantic meaning of disruption budgets configuration Apr 23, 2024
@ellistarn
Contributor

Are we deviating from the Linux cron specification? https://www.ibm.com/docs/en/db2oc?topic=task-unix-cron-format

@njtran
Contributor

njtran commented May 8, 2024

Seems like this issue is asking for the feature add of "toggle budget default behavior", where the current behavior is that when no budget is active, it's maximally permissive (meaning no limit), and some users want maximally restrictive (meaning a budget of 0). Is this right? @dschunack
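The toggle described here can be sketched as a single parameter that decides what happens when no budget is active. This is a minimal illustration under assumptions: the `default` parameter and the function `limit_with_default` are hypothetical, "no limit" is represented as `math.inf`, and active budgets are still combined with "most restrictive wins".

```python
import math


def limit_with_default(active_node_limits, default="allow"):
    """Return the allowed disruptions given the limits of all active budgets.

    `default` only matters when no budget is active:
    "allow" means maximally permissive (no limit),
    "deny" means maximally restrictive (a budget of 0).
    """
    if active_node_limits:
        return min(active_node_limits)
    return math.inf if default == "allow" else 0


# No budget active: the behaviour depends entirely on the default.
print(limit_with_default([], default="allow"))  # inf
print(limit_with_default([], default="deny"))   # 0

# A scheduled `nodes: "5"` budget is active: the default no longer matters.
print(limit_with_default([5], default="deny"))  # 5
```

With `default="deny"`, the proposal from the issue would need only the scheduled `nodes: "5"` budget, because everything outside the window is already closed.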

@dschunack
Author

dschunack commented May 9, 2024

Yes, this is correct. We need to restrict disruption further on some k8s clusters that run a lot of static and critical workloads.

@njtran njtran changed the title Improve semantic meaning of disruption budgets configuration Allow configuring the default disruption budget to deny/allow May 14, 2024
Projects
None yet
Development

No branches or pull requests

4 participants