Add kubernetes-mixin runbooks #11
✔️ Deploy Preview for distracted-northcutt-e0bccc ready!
🔨 Explore the source changes: c9a73f4
🔍 Inspect the deploy log: https://app.netlify.com/sites/distracted-northcutt-e0bccc/deploys/620fbc637e7ed50008f25a71
😎 Browse the preview: https://deploy-preview-11--distracted-northcutt-e0bccc.netlify.app
Some runbooks still contain TODO sections, though.
I guess certain sections should be extracted and added to
@paulfantom ping
@@ -1,5 +1,5 @@
 ---
-title: Alertmanager ConfigInconsistent
+title: Alertmanager Config Inconsistent
WDYT about keeping it consistent with the alert name in Prometheus and removing spaces instead of adding them?
Yeah, I was wondering about it:
- with spaces - the page title with spaces is shown on the left side of the web page and is easier to read
- without spaces - some alert names are really long and may look ugly there.
On the other hand, the first header on the page is left as is.
I will also check whether searching for alerts from the search bar works with names both with and without spaces.
I would rather keep it with spaces and hide the name without spaces somewhere :)
Removing the entries without spaces makes them impossible to find via the short alert name, which is a bit problematic.
I will have to look into the Hugo options or something.
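One way to keep both forms available is to derive the spaced title from the alert name and index both. The following is a minimal Python sketch of that idea, not code from this PR: `spaced_title` and `search_keys` are hypothetical helpers, and how a Hugo site would consume the extra key is left open.

```python
import re

# Hypothetical helpers for illustration; nothing here is part of the PR.
# A space is inserted at each CamelCase word boundary:
#   - between a lowercase letter or digit and an uppercase letter
#   - between an acronym and the capitalized word that follows it
_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])")


def spaced_title(alert_name: str) -> str:
    """Turn an alert name into a readable page title."""
    return _BOUNDARY.sub(" ", alert_name)


def search_keys(alert_name: str) -> set:
    """Return both forms, lowercased, so a search index could match either."""
    return {alert_name.lower(), spaced_title(alert_name).lower()}


if __name__ == "__main__":
    for name in ("AlertmanagerConfigInconsistent", "KubePersistentVolumeErrors"):
        print(name, "->", spaced_title(name))
```

With this approach the page title stays readable while the original alert name remains a valid search term.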
## Meaning

Given container in the pod is throttled to avoid excessive CPU usage.
We should clearly state that this alert is purely informative and that the user shouldn't increase CPU limits unless the application is behaving erratically (that is, another alert is firing). For this reason, the alert is inhibited by default in kube-prometheus and is sent only if another alert in the same namespace is firing.
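The inhibition behaviour described in this comment can be modelled roughly as follows. This is a sketch of the rule as stated here, not the actual kube-prometheus or Alertmanager configuration: the `Alert` type, the `should_notify` function, and the severity values are assumptions made for illustration, as is the choice that only a non-informational alert lifts the inhibition.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Alert:
    name: str
    namespace: str
    severity: str  # assumed values: "info", "warning", "critical"


def should_notify(alert: Alert, firing: List[Alert]) -> bool:
    """Decide whether an alert would be sent under the rule described above.

    Informational alerts are sent only when another, non-informational
    alert is firing in the same namespace (assumption: other "info"
    alerts do not count). Everything else is always sent.
    """
    if alert.severity != "info":
        return True
    return any(
        other != alert
        and other.namespace == alert.namespace
        and other.severity != "info"
        for other in firing
    )
```

For example, a throttling alert with severity `info` in namespace `shop` would stay silent on its own and would be sent once a `warning` alert fires in `shop`, but not when the warning fires in a different namespace.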
Maybe we should link to kubernetes-monitoring/kubernetes-mixin#108 for more curious folks?
fixed in e30e62a
Note that there will be a lot of such entries, unfortunately.
Overall great work!!! 🎉
I've only reviewed up to the KubePersistentVolumeErrors runbook and will resume in the next few days.
A few generic nits:
- Since "Service degradation or unavailability." is very vague, I would like to refrain from using it as an Impact; it would be good to specify the direct consequence instead. I've put a few suggestions on what I mean by that.
- When linking to some concept, like "APIServer aggregation", let's maybe put those links between <details></details> in the "Meaning" section. WDYT?
- If we create issues for TODO sections, we can increase the visibility of what needs to be done.
Co-authored-by: Paweł Krupa <[email protected]>
After a talk on Slack, it is better to close this and split it into smaller commits.
@nvtkaszpir Do we currently have any PRs open to fix the dead links? I'm still seeing https://runbooks.prometheus-operator.dev/runbooks/kubernetes/kubecontainerwaiting and many other alerts as dead links.
Yeah, AFAIR it has not been merged yet.
Hi @paulfantom, is there any update on this?
remove excess icons
fixes #8 and a lot of other dead urls.
todo:
fix references between files.