USE method dashboards are broken #30
Can you show what you mean? For me everything works as far as I can tell. |
For me all of these work. I think you'll need to dig deeper into the recording rules and figure out which labeling is off or which metrics you are missing. Most likely it's the same problem for all of them. My first guess would be the node-exporter -> node name mapping that is done through kube-state-metrics metrics. If I recall correctly, @tomwilkie mentioned that for the kausal ksonnet prometheus setup he had to set the |
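To illustrate the kind of mapping being referred to, here is a minimal sketch of a recording rule that maps (namespace, pod) pairs to node names via kube-state-metrics' kube_pod_info. The rule name appears later in this thread; the job selector and exact expression are assumptions, not necessarily the mixin's version:

    groups:
      - name: node.rules
        rules:
          # map each (namespace, pod) to the node it runs on, using
          # kube_pod_info exposed by kube-state-metrics
          - record: 'node_namespace_pod:kube_pod_info:'
            expr: max by (node, namespace, pod) (kube_pod_info{job="kube-state-metrics"})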
We saw this when using node_exporter version v0.16.0. There are breaking changes to many metric names: https://github.com/prometheus/node_exporter/blob/master/CHANGELOG.md#0160--2018-05-15 Reverting node_exporter to v0.15.2 fixed it for us. |
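For context, the 0.16.0 release renamed many node_exporter metrics to include base units, so recording rules written against the old names silently return no data. A few examples of the renames (see the linked changelog for the full list):

    # node_exporter <= 0.15.x      node_exporter >= 0.16.0
    node_cpu                   ->  node_cpu_seconds_total
    node_memory_MemTotal       ->  node_memory_MemTotal_bytes
    node_filesystem_avail      ->  node_filesystem_avail_bytes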
There’s a branch on the node exporter repo with some updates for 0.16; I intend to move the node exporter specific stuff there.
|
I'm having the same problem. Looking at one empty dashboard shows this recording rule doesn't have any data. There are others, but they look to be the same issue as this.
A quick change to the window (
This works, but the recording rules would need their names changed when the metrics are fixed. |
I also have this problem, but changing the range window to |
@serathius do the individual metrics return results for you?
and
|
Yes. In #38 I also checked if I'm missing any labels. |
Could you share some results for each of them so we can see if the join should be possible? This might be due to the labeling of your time-series. |
Sure
For
|
Your |
Isn't |
The point is that the join only works if those labels are present on the node-exporter series. I recommend changing the relabeling rules for your node-exporter scrape job to relabel the namespace and pod labels onto those targets. |
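As a rough sketch of that suggestion (the job name, discovery role and pod selector are assumptions; adjust them to your own setup), the relabeling for a node-exporter scrape job could look like this:

    - job_name: 'node-exporter'        # assumed job name
      kubernetes_sd_configs:
        - role: pod                    # discover the node-exporter pods
      relabel_configs:
        # keep only node-exporter pods (label selector is an assumption)
        - source_labels: [__meta_kubernetes_pod_label_app]
          regex: node-exporter
          action: keep
        # copy namespace and pod name onto the target so joins against
        # kube_pod_info can match on (namespace, pod)
        - source_labels: [__meta_kubernetes_namespace]
          action: replace
          target_label: namespace
        - source_labels: [__meta_kubernetes_pod_name]
          action: replace
          target_label: pod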
I'm working around this another way. I have federation set up across a few prometheus setups and was scraping every 60s, which I think was too slow for |
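For reference, rate() and irate() need at least two samples inside the range selector, so with a 60s scrape interval a one-minute window is usually too narrow. A rough illustration (metric and window sizes are examples, not the mixin's exact rule):

    # with scrape_interval: 60s a 1m window typically holds one sample,
    # so this returns nothing
    irate(node_cpu_seconds_total{job="node-exporter", mode="idle"}[1m])

    # a window covering at least two scrapes returns data again
    irate(node_cpu_seconds_total{job="node-exporter", mode="idle"}[5m])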
Migrating node_exporter job from node to pod and labeling with |
I'm having a similar issue where
I deployed prometheus server (+ kube state metrics + node exporter + alertmanager) through the prometheus helm chart using the chart's default values, including the chart's default scrape_configs. What would I need to change to make the custom rules and thus the dashboards work? Sorry if this is obvious, I'm new to prometheus. Some additional info:
|
I figured out how to get these labels added thanks to the helpful prometheus service-discovery status UI page. Below is a diff of what I changed in the helm chart's scrape_configs:

  - job_name: 'kubernetes-service-endpoints'
+   honor_labels: true
    kubernetes_sd_configs:
      - role: endpoints
    relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
        action: replace
        target_label: __scheme__
        regex: (https?)
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
        action: replace
        target_label: __address__
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
      - action: labelmap
        regex: __meta_kubernetes_service_label_(.+)
      - source_labels: [__meta_kubernetes_namespace]
        action: replace
-       target_label: kubernetes_namespace
+       target_label: namespace
      - source_labels: [__meta_kubernetes_service_name]
        action: replace
-       target_label: kubernetes_name
+       target_label: service
+     - source_labels: [__meta_kubernetes_pod_name]
+       action: replace
+       target_label: pod
+     - source_labels: [__meta_kubernetes_pod_node_name]
+       action: replace
+       target_label: node
|
Generally speaking I recommend not adding non-identifying information to a target's label-set, and instead adding this information at query time. That's also the approach this mixin largely takes; it makes queries, and thus dashboards and alerting rules, a lot more reusable and not as tightly coupled to the individual configuration of Prometheus. |
@brancz Are you saying there is a better way to solve the problem? So many things in this mixin seem to be reliant on having
If these labels aren't being added at the prometheus configuration level, I don't know of a way to add them at "query time". |
The |
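One way to add such information at query time is a PromQL vector match rather than extra target labels. A minimal sketch (the metric selectors are assumptions) that pulls the node label from kube_pod_info onto node-exporter series:

    # attach the node label from kube-state-metrics' kube_pod_info to
    # node-exporter series sharing the same (namespace, pod) labels
    node_cpu_seconds_total{job="node-exporter"}
      * on (namespace, pod) group_left (node)
        kube_pod_info{job="kube-state-metrics"}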
In case it helps, I was experiencing the same issue, but it turned out to be because I was using an installation from (a random commit on) master instead of a particular release. When I checked out and applied v0.28.0, the metric collection and dashboards were fixed. |
@brancz Why do we need those joins? For calculating "node:node_num_cpu:sum" we join "node_cpu_seconds_total" with "node_namespace_pod:kube_pod_info:". I'm bad at promql, but my understanding is that the only change from joining is that we only use "node_cpu_seconds_total" values whose "pod" and "namespace" labels correspond to existing pods. The node exporter metric is already filtered by "nodeExporterSelector", so I don't understand what the benefit of this join is. This also disallows scraping node_exporter from outside of the cluster or outside kubernetes (as a systemd service). I would prefer to have my node_exporter not labeled with "pod" and "namespace". My main problem is that all dashboard queries need to be rewritten to remove those labels, because every time we redeploy node_exporter (update, config change) I get different timeseries. |
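For reference, the join being questioned has roughly this shape in the recording rule (reconstructed for illustration from the metric names mentioned above; the mixin's exact expression and selectors may differ):

    - record: node:node_num_cpu:sum
      expr: |
        count by (node) (
          sum by (node, cpu) (
            node_cpu_seconds_total{job="node-exporter"}
            * on (namespace, pod) group_left (node)
              node_namespace_pod:kube_pod_info:
          )
        )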
It's being joined to reliably get the |
Shouldn't node_exporter then have a separate port for its own metrics, like kube-state-metrics does? We don't have any dashboards or alerts on node_exporter internal metrics. Should we care about them being properly labeled? For me this is a tradeoff where I would prefer simpler node metrics over correctly labeled node_exporter internal metrics. |
I do agree with your first statement, but there are various tradeoffs at work. For example, node-exporter, as opposed to kube-state-metrics, runs on every node, so on very large kubernetes clusters a separate port would double the number of scrape requests for node metrics. While there are no dashboards or alerts, it's still very valuable information that has helped us numerous times in detecting cpu/memory (mainly cpu) issues/leaks. Having all the labels is the common denominator, so while I understand your point, I think the trade-off we have currently chosen is the more appropriate and exact one. |
This issue has not had any activity in the past 30 days, so the
Thank you for your contributions! |
Some recording rules are missing; we either add those or remove the dashboards entirely. WDYT @tomwilkie @brancz?