no graphs in UI #850

Open
sstarcher opened this issue Aug 9, 2023 · 10 comments

Comments

@sstarcher

No graphs show up in the UI. The links to Prometheus show the correct data, but every graph is empty for every time frame. The Pyrra logs contain only debug messages and no errors.

Screenshot 2023-08-09 at 9 16 15 AM
Screenshot 2023-08-09 at 9 16 31 AM

@sstarcher
Author

In addition, availability and error budget on the main page both show no data.
Screenshot 2023-08-09 at 9 17 32 AM

@sstarcher
Author

Looking at the query being run:
QueryRange is responding with {"matrix":{}}

The query in the payload looks correct. It is the same query that opens when you click the Prometheus link, where it renders a correct graph.
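
For readers following along: the `query` field in the payload below evaluates the remaining error budget, ((1 - target) - errors / total) / (1 - target), with a target of 0.99. Here is a minimal Python sketch of that arithmetic. The function name and the sample counts are made up for illustration. Returning `None` is an assumption that models PromQL yielding no sample when the `total` selector matches no series, which is one way a range query can come back empty.

```python
from typing import Optional


def remaining_error_budget(
    errors: float, total: Optional[float], target: float = 0.99
) -> Optional[float]:
    """Mirror ((1 - target) - errors / total) / (1 - target) from the payload.

    `errors` plays the role of `sum(...) or vector(0)`, so it is never missing.
    `total` is None when the denominator selector matches no series.
    """
    if total is None:
        # Nothing to divide by: PromQL would return an empty result here.
        return None
    budget = 1 - target
    return (budget - errors / total) / budget


# Hypothetical counts: 5 errors out of 1000 requests uses about half the budget.
print(remaining_error_budget(errors=5, total=1000))
# No matching series for the total: there is nothing to plot.
print(remaining_error_budget(errors=0, total=None))
```

A value of 1.0 means the full budget is left, 0.0 means it is exhausted, and negative values mean the objective has been missed.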

end: "1691592397"
query: "((1 - 0.99) - (sum(grpc_server_handled:increase2w{app_kubernetes_io_name=\"xx\",code=~\"yyy\",grpc_method=~\"x\",slo=\"xx\"} or vector(0)) / sum(grpc_server_handled:increase2w{app_kubernetes_io_name=\"xx\",grpc_method=~\"xx\",slo=\"xx\"}))) / (1 - 0.99)"
start: "1691588797"
step: "3"

@metalmatze
Member

Does the data show up after a bit?
This is most likely due to the recording rules only being created and computed once the SLO has been created.

If the graphs eventually fill with data this issue would be actually related to prometheus/prometheus#10202 where one idea would be to add an API to Prometheus to trigger backfilling of data for newly created recording rules.

@sstarcher
Author

I have about 2 hours of data now. Do I need to wait longer, or would you expect 2 hours of data to show up already?

@metalmatze
Member

The first data should show up within 3min and from there every 2.5min a new data point will be added to the recording rule's metric name. I suspect something about the time series and the config you have must be mismatched. It's hard to say with the information we have in this issue now. I would try looking into the grpc_server_handled:increase2w time series in Prometheus and check the queries in Prometheus directly that Pyrra sends to Prometheus (the one you have pasted above).
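
One way to run that check outside the UI is to replay the same range against Prometheus' own `/api/v1/query_range` HTTP endpoint. The sketch below only builds the URL and does not contact any server. The base address and the helper names are assumptions, the query is shortened to the recording rule name, and the `step` from the payload above is assumed to be in seconds.

```python
from urllib.parse import urlencode

# Assumption: Prometheus is reachable at this address; adjust to your setup.
PROMETHEUS_URL = "http://localhost:9090"


def query_range_url(query: str, start: int, end: int, step: int) -> str:
    """Build a URL for Prometheus' /api/v1/query_range HTTP endpoint."""
    params = urlencode({"query": query, "start": start, "end": end, "step": step})
    return f"{PROMETHEUS_URL}/api/v1/query_range?{params}"


def expected_points(start: int, end: int, step: int) -> int:
    """Upper bound on samples per series for a fully populated range."""
    return (end - start) // step + 1


# start, end and step are taken from the payload pasted earlier in this thread.
url = query_range_url("grpc_server_handled:increase2w", 1691588797, 1691592397, 3)
print(url)
print(expected_points(1691588797, 1691592397, 3))  # 1201
```

Opening the printed URL in a browser or with `curl` returns JSON. If its `data.result` list is empty, the selector matched no series in that one-hour window.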

@sstarcher
Author

grpc_server_handled:increase2w shows up perfectly fine in Prometheus.

An entry looks like this and I have about a dozen of them

grpc_server_handled:increase2w{app_kubernetes_io_name="xxx", code="xxx", grpc_method="xxx", slo="xxx", team="xxx"}
record:[grpc_server_handled:increase2w](https://dataservices.dev.telemetry.bwce.io/graph?g0.expr=grpc_server_handled%3Aincrease2w&g0.tab=1&g0.stacked=0&g0.show_exemplars=0.g0.range_input=1h.)
expr:[sum by (code, grpc_method, route) (increase(grpc_server_handled_total{app_kubernetes_io_name="xxx",grpc_method=~"xxx"}[2w]))](https://dataservices.dev.telemetry.bwce.io/graph?g0.expr=sum%20by%20(code%2C%20grpc_method%2C%20route)%20(increase(grpc_server_handled_total%7Bapp_kubernetes_io_name%3D%22cachefiller%22%2Cgrpc_method%3D~%22Cache%22%7D%5B2w%5D))&g0.tab=1&g0.stacked=0&g0.show_exemplars=0.g0.range_input=1h.)
labels:
app_kubernetes_io_name: xxx
slo: xxx
team: xxx

Could the addition of code be causing your logic some issue?
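
As a rough illustration of where the `code` label in the entry above comes from, here is a small Python emulation of `sum by (code, grpc_method, route)` followed by the rule's static `labels:`. It assumes the underlying `grpc_server_handled_total` series carry no `route` label, which the sample entry suggests but this thread does not confirm. The input series and their values are hypothetical, and the sketch does not claim this is the cause of the empty graphs.

```python
from collections import defaultdict


def sum_by(series, by):
    """Roughly emulate PromQL `sum by (...)`: keep only `by` labels that exist."""
    groups = defaultdict(float)
    for labels, value in series:
        key = tuple(sorted((k, v) for k, v in labels.items() if k in by))
        groups[key] += value
    return [(dict(key), value) for key, value in groups.items()]


# Hypothetical input series: note that none of them has a `route` label.
raw = [
    ({"grpc_method": "Get", "code": "OK", "pod": "a"}, 40.0),
    ({"grpc_method": "Get", "code": "OK", "pod": "b"}, 60.0),
    ({"grpc_method": "Get", "code": "NOT_FOUND", "pod": "a"}, 2.0),
]
# Static labels from the rule's `labels:` block (placeholder values).
rule_labels = {"app_kubernetes_io_name": "xxx", "slo": "xxx", "team": "xxx"}

for labels, value in sum_by(raw, by={"code", "grpc_method", "route"}):
    # Recorded series = aggregated labels plus the rule's static labels.
    print({**labels, **rule_labels}, value)
```

In this emulation each recorded series ends up with `code` and `grpc_method` from the `by` clause plus the three static labels, and without `route`, which matches the label set shown in the entry above.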

As I mentioned the queries that Pyrra links work when I run them in Prometheus. This has now been running for over 24 hours and I still see the same result.

Let me know if I can provide anything else or do anything on my side, but I don't see anything in the Pyrra logs.

My SLO looks like this


---
apiVersion: pyrra.dev/v1alpha1
kind: ServiceLevelObjective
metadata:
  creationTimestamp: "2023-08-09T12:13:08Z"
  generation: 2
  labels:
    prometheus: k8s
    pyrra.dev/team: xxx
    role: alert-rules
  name: xxx
  namespace: qqqqq
spec:
  description: Pyrra's API requests and response errors over time grouped by route.
  indicator:
    ratio:
      errors:
        metric: grpc_server_handled_total{app_kubernetes_io_name="xxx",
          grpc_method=~"xxx", code=~"OK|NOT_FOUND|INVALID_ARGUMENT|PERMISSION_DENIED|FAILED_PRECONDITION|RESOURCE_EXHAUSTED"}
      grouping:
      - route
      total:
        metric: grpc_server_handled_total{app_kubernetes_io_name="xxx",
          grpc_method=~"xxx"}
  target: "99"
  window: 2w
---
apiVersion: pyrra.dev/v1alpha1
kind: ServiceLevelObjective
metadata:
  creationTimestamp: "2023-08-09T12:28:00Z"
  generation: 1
  labels:
    prometheus: k8s
    pyrra.dev/method: yyyyy
    pyrra.dev/team: test
    role: alert-rules
  name: fffff
  namespace: qqqqq
spec:
  description: Pyrra's API requests and response errors over time grouped by route.
  indicator:
    ratio:
      errors:
        metric: grpc_server_handled_total{grpc_method=~"yyyyy",
          code=~"OK|NOT_FOUND|INVALID_ARGUMENT|PERMISSION_DENIED|FAILED_PRECONDITION|RESOURCE_EXHAUSTED"}
      grouping:
      - route
      total:
        metric: grpc_server_handled_total{grpc_method=~"yyyyy"}
  target: "99"
  window: 2w
---
apiVersion: pyrra.dev/v1alpha1
kind: ServiceLevelObjective
metadata:
  creationTimestamp: "2023-08-09T12:07:16Z"
  generation: 1
  labels:
    prometheus: k8s
    pyrra.dev/team: operations
    role: alert-rules
  name: pyrra-api-errors
  namespace: qqqqq
spec:
  description: Pyrra's API requests and response errors over time grouped by route.
  indicator:
    ratio:
      errors:
        metric: grpc_server_handled_total{grpc_method=~"xxx", code=~"OK|NOT_FOUND|INVALID_ARGUMENT|PERMISSION_DENIED|FAILED_PRECONDITION|RESOURCE_EXHAUSTED"}
      grouping:
      - route
      total:
        metric: grpc_server_handled_total{grpc_method=~"xxx"}
  target: "99"
  window: 2w
---
apiVersion: pyrra.dev/v1alpha1
kind: ServiceLevelObjective
metadata:
  creationTimestamp: "2023-08-09T12:16:28Z"
  generation: 2
  labels:
    prometheus: k8s
    pyrra.dev/method: zzzz
    pyrra.dev/team: zzzz
    role: alert-rules
  name: zzzzz
  namespace: qqqqq
spec:
  description: Pyrra's API requests and response errors over time grouped by route.
  indicator:
    ratio:
      errors:
        metric: grpc_server_handled_total{app_kubernetes_io_name="zzzz",
          grpc_method=~"zzzz", code=~"OK|NOT_FOUND|INVALID_ARGUMENT|PERMISSION_DENIED|FAILED_PRECONDITION|RESOURCE_EXHAUSTED"}
      grouping:
      - route
      total:
        metric: grpc_server_handled_total{app_kubernetes_io_name="zzzz",
          grpc_method=~"zzzz"}
  target: "99"
  window: 2w
---
apiVersion: pyrra.dev/v1alpha1
kind: ServiceLevelObjective
metadata:
  creationTimestamp: "2023-08-09T12:16:28Z"
  generation: 2
  labels:
    prometheus: k8s
    pyrra.dev/method: zzzzz
    pyrra.dev/team: zzzz
    role: alert-rules
  name: qqqq
  namespace: qqqqq
spec:
  description: Pyrra's API requests and response errors over time grouped by route.
  indicator:
    ratio:
      errors:
        metric: grpc_server_handled_total{app_kubernetes_io_name="zzzz",
          grpc_method=~"zzzzz", code=~"OK|NOT_FOUND|INVALID_ARGUMENT|PERMISSION_DENIED|FAILED_PRECONDITION|RESOURCE_EXHAUSTED"}
      grouping:
      - route
      total:
        metric: grpc_server_handled_total{app_kubernetes_io_name="zzzz",
          grpc_method=~"zzzzz"}
  target: "99"
  window: 2w
---
apiVersion: pyrra.dev/v1alpha1
kind: ServiceLevelObjective
metadata:
  creationTimestamp: "2023-08-09T11:57:05Z"
  generation: 2
  labels:
    prometheus: k8s
    pyrra.dev/team: operations
    role: alert-rules
  name: pyrra-api-errors
  namespace: monitoring
spec:
  description: Pyrra's API requests and response errors over time grouped by route.
  indicator:
    ratio:
      errors:
        metric: grpc_server_handled_total{grpc_method=~"xxx", code=~"OK|NOT_FOUND|INVALID_ARGUMENT|PERMISSION_DENIED|FAILED_PRECONDITION|RESOURCE_EXHAUSTED"}
      grouping:
      - route
      total:
        metric: grpc_server_handled_total{grpc_method=~"xxx"}
  target: "99"
  window: 2w

@sstarcher
Author

Checked back and still no info in logs and no graphs are visible.

@giz33

giz33 commented Apr 2, 2024

I have the same issue here.
On the Pyrra home page, Availability and Budget both show "No data":

image

But when I click on it, the values show up normally (I don't know whether 100% is correct for them, but at least some values are shown):

image

Can someone help, please?

Thanks in advance.

@giz33

giz33 commented Apr 2, 2024

I was able to make it work!
Again, it was a mismatch between the label that the Prometheus operator uses and the labels that Pyrra puts on the ServiceLevelObjectives and PrometheusRules.
After changing Prometheus to look for the labels used on Pyrra's objects, it works.
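
The kind of mismatch described here can be sketched with Kubernetes `matchLabels` semantics, where an object is selected only if it carries every key/value pair of the selector. In the Python sketch below, the rule labels are copied from the manifests earlier in this thread, on the assumption that the generated PrometheusRule carries the same labels. Both selector values are hypothetical stand-ins for the `ruleSelector` on a Prometheus custom resource.

```python
def selector_matches(match_labels: dict, object_labels: dict) -> bool:
    """Kubernetes `matchLabels` semantics: every selector pair must be present."""
    return all(object_labels.get(k) == v for k, v in match_labels.items())


# Labels as they appear on the ServiceLevelObjective manifests in this thread.
rule_labels = {"prometheus": "k8s", "role": "alert-rules"}

# Hypothetical selectors on the Prometheus custom resource.
mismatched = {"release": "kube-prometheus-stack"}
matching = {"prometheus": "k8s", "role": "alert-rules"}

print(selector_matches(mismatched, rule_labels))  # False
print(selector_matches(matching, rule_labels))    # True
```

With the mismatched selector the rules are never loaded, so the recording rules that the UI queries would not exist in Prometheus.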

image

image

Thanks again.

@prein

prein commented Apr 25, 2024

Same here. No data in the Pyrra UI, but everything is visible in Grafana. I noticed the same issue on this public Pyrra UI
Could it be related to the Prometheus / Thanos version? I think it happens more often since an upgrade I performed recently.

@giz33 can you elaborate on the labels change? In my case it started working after adding some labels but I don't understand how it worked.

PS. Possibly the same issue as #632, which was solved by moving the SLO resources into the same namespace as Pyrra.

@metalmatze do you think it would be possible to make issues like this visible in SLO CR status (available with kubectl describe ServiceLevelObjective)? Or is it strictly UI?

image
