gnatsd Prometheus alerts #165

Open
sommerit opened this issue Mar 16, 2022 · 1 comment

Comments

@sommerit

Hello Guys,

I integrated the NATS exporter into my K8s / Prometheus stack and everything works like a charm.
Thanks to the community for that.

Now I am looking for some monitoring rules, because I do not have much experience with NATS.

For other services I like to use https://awesome-prometheus-alerts.grep.to/rules.

Does anyone have experience with this and can share some rules?

I will of course keep researching and post here if I find something.

Thanks

Greetings

@manuelottlik

Hey, I was also looking for some Prometheus alerts for JetStream but have not found anything yet. I am really inexperienced when it comes to PromQL and alerts, but this is what I came up with:

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: i3t-nats
spec:
  groups:
    - name: nats.rules
      rules:
        - alert: NatsConsumerPendingMessagesTooHigh
          expr: nats_consumer_num_pending > {{ .Values.alerting.rules.natsMessagesPendingThreshold }}
          for: 3m
          labels:
            severity: critical
          annotations:
            description: {{` Consumer "{{$labels.consumer_name}}" has {{ $value }} pending messages. `}}
            summary: {{` The amount of pending messages is too high for 3 minutes. `}}
        - alert: NatsConsumerPendingMessagesIncreasing
          expr: deriv(nats_consumer_num_pending[1m]) > 0
          for: 3m
          labels:
            severity: critical
          annotations:
            description: {{` Consumer "{{$labels.consumer_name}}" is receiving more messages than it can process. `}}
            summary: {{` The amount of pending messages has increased for more than 3 minutes. `}}
        - alert: NatsConsumerRedeliveredMessagePercentageTooHigh
          expr: rate(nats_consumer_num_redelivered[1m]) / rate(nats_consumer_delivered_stream_seq[1m]) > {{ .Values.alerting.rules.natsMessagesRedeliveredPercentageThreshold }}
          for: 1m
          labels:
            severity: critical
          annotations:
            description: {{` Consumer "{{$labels.consumer_name}}" gets {{ $value }} of its messages redelivered. `}}
            summary: {{` The percentage of redelivered messages is too high. `}}

It's written to be processed by Helm, so if you use it directly you will probably want to remove the {{` ... `}} wrappers and replace the .Values... references with literal values.
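
For anyone applying this without Helm, here is a rough sketch of how the first rule might look once the templating is stripped. The threshold of 1000 is only a placeholder I picked for illustration, not a recommended value. The `{{ $labels... }}` and `{{ $value }}` parts stay, since Prometheus expands those itself when the alert fires.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: i3t-nats
spec:
  groups:
    - name: nats.rules
      rules:
        - alert: NatsConsumerPendingMessagesTooHigh
          # 1000 is a placeholder threshold (assumption); tune it for your workload.
          expr: nats_consumer_num_pending > 1000
          for: 3m
          labels:
            severity: critical
          annotations:
            # Single quotes keep the inner double quotes and the Prometheus
            # template braces as one plain YAML string.
            description: 'Consumer "{{ $labels.consumer_name }}" has {{ $value }} pending messages.'
            summary: The amount of pending messages is too high for 3 minutes.
```

The other two rules can be converted the same way: drop the Helm raw-string wrappers around the annotations and put a literal number where the .Values... reference was.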

If anyone has more experience or other ideas for Prometheus rules, I would love to see them!
