wichert/k8s-sentry

k8s-sentry

k8s-sentry is a simple tool to monitor a Kubernetes cluster and report all operational issues to Sentry.

There are two alternative implementations:

k8s-sentry watches for several things:

  • All warning and error events
  • Pod containers terminating with a non-zero exit code
  • Pods failing completely
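The three conditions above can be sketched in Go as follows. This is only an illustration under stated assumptions: `Pod`, `ContainerStatus`, `isReportableEvent` and `podProblems` are hypothetical, trimmed-down stand-ins and are not taken from the k8s-sentry source, which works with the real Kubernetes API types.

```go
package main

import "fmt"

// ContainerStatus is a hypothetical, simplified view of a container's state.
type ContainerStatus struct {
	Name       string
	Terminated bool  // true once the container has exited
	ExitCode   int32 // only meaningful when Terminated is true
}

// Pod is a hypothetical, simplified view of a Pod.
type Pod struct {
	Name       string
	Phase      string // Kubernetes pod phase, for example "Running" or "Failed"
	Containers []ContainerStatus
}

// isReportableEvent mirrors "all warning and error events".
// Assumption: events are classified by a plain type string.
func isReportableEvent(eventType string) bool {
	return eventType == "Warning" || eventType == "Error"
}

// podProblems lists the pod-level conditions named above:
// a failed Pod, and containers that terminated with a non-zero exit code.
func podProblems(p Pod) []string {
	var problems []string
	if p.Phase == "Failed" {
		problems = append(problems, fmt.Sprintf("pod %s failed", p.Name))
	}
	for _, c := range p.Containers {
		if c.Terminated && c.ExitCode != 0 {
			problems = append(problems,
				fmt.Sprintf("container %s in pod %s exited with code %d", c.Name, p.Name, c.ExitCode))
		}
	}
	return problems
}

func main() {
	pod := Pod{
		Name:  "worker-1",
		Phase: "Running",
		Containers: []ContainerStatus{
			{Name: "app", Terminated: true, ExitCode: 2},
			{Name: "sidecar", Terminated: true, ExitCode: 0},
		},
	}
	for _, problem := range podProblems(pod) {
		fmt.Println(problem)
	}
	fmt.Println(isReportableEvent("Normal"), isReportableEvent("Warning"))
}
```

In this sketch a container that exits with code 0 is not treated as a problem, and a Pod is reported on its own only when its phase is "Failed".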

Deployment

See the deploy directory for Kubernetes manifests and installation instructions.

Configuration

All configuration is done via environment variables.

Variable            Description
SENTRY_DSN          Required. DSN for a Sentry project.
SENTRY_ENVIRONMENT  Environment for Sentry issues. If not set, the namespace is used as the environment.
NAMESPACE           Comma-separated list of namespaces to monitor. If not set, all namespaces are monitored (as far as permissions allow).
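As a rough illustration of how these variables could be read, here is a minimal Go sketch. The `Config` struct, `parseNamespaces` and `loadConfig` are hypothetical names chosen for this example. Only the three variable names come from the table above. Trimming whitespace around namespace names is likewise an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strings"
)

// Config mirrors the three documented environment variables.
// The struct and its field names are hypothetical.
type Config struct {
	DSN         string   // SENTRY_DSN
	Environment string   // SENTRY_ENVIRONMENT; empty means "use the namespace"
	Namespaces  []string // NAMESPACE; empty means "all namespaces"
}

// parseNamespaces splits a comma-separated list, trimming whitespace
// and dropping empty entries.
func parseNamespaces(raw string) []string {
	var out []string
	for _, part := range strings.Split(raw, ",") {
		if ns := strings.TrimSpace(part); ns != "" {
			out = append(out, ns)
		}
	}
	return out
}

// loadConfig reads the configuration through getenv, so it can be
// exercised without touching the process environment.
func loadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{
		DSN:         getenv("SENTRY_DSN"),
		Environment: getenv("SENTRY_ENVIRONMENT"),
		Namespaces:  parseNamespaces(getenv("NAMESPACE")),
	}
	if cfg.DSN == "" {
		return cfg, errors.New("SENTRY_DSN is not set")
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		// This sketch only warns and carries on.
		fmt.Println("warning:", err)
	}
	fmt.Printf("environment=%q namespaces=%v\n", cfg.Environment, cfg.Namespaces)
}
```

Passing the lookup function as a parameter keeps the parsing logic easy to test. In production code `os.Getenv` is the obvious argument.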

Issue grouping

k8s-sentry tries to group related issues sensibly. Several strategies are used:

  • all issues use the event type, event reason and event message as part of the fingerprint
  • events related to controlled Pods (for example Pods created through a ReplicaSet, which a Deployment creates automatically) are grouped by the ReplicaSet
  • other events are grouped by the involved object
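The grouping rules above could be expressed roughly as follows. This is a sketch under assumptions, not the project's actual code: the `Event` struct and the `fingerprint` function are hypothetical, and the order of the fingerprint parts is chosen only for illustration.

```go
package main

import "fmt"

// Event is a hypothetical, simplified stand-in for a Kubernetes event
// together with the owner of the Pod it refers to, if any.
type Event struct {
	Type, Reason, Message      string
	InvolvedKind, InvolvedName string // object the event refers to
	OwnerKind, OwnerName       string // controller of the Pod, for example a ReplicaSet
}

// fingerprint builds the grouping key: type, reason and message always
// take part; controlled Pods are grouped by their owner, and everything
// else by the involved object.
func fingerprint(e Event) []string {
	fp := []string{e.Type, e.Reason, e.Message}
	if e.InvolvedKind == "Pod" && e.OwnerName != "" {
		return append(fp, e.OwnerKind, e.OwnerName)
	}
	return append(fp, e.InvolvedKind, e.InvolvedName)
}

func main() {
	a := Event{
		Type: "Warning", Reason: "BackOff", Message: "Back-off restarting failed container",
		InvolvedKind: "Pod", InvolvedName: "web-5d9c7b-abcde",
		OwnerKind: "ReplicaSet", OwnerName: "web-5d9c7b",
	}
	b := a
	b.InvolvedName = "web-5d9c7b-fghij" // a different replica of the same ReplicaSet

	// Both events yield the same fingerprint, so they would land in one issue.
	fmt.Println(fingerprint(a))
	fmt.Println(fingerprint(b))
}
```

Grouping by the owner rather than the Pod keeps replicas with generated names from each opening a separate issue.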

Building

This project uses Go modules and requires Go 1.13 or later. From a git checkout you can build the binary using go build:

$ go build
go: downloading k8s.io/apimachinery v0.0.0-20191020214737-6c8691705fc5
go: downloading k8s.io/client-go v0.0.0-20191016111102-bec269661e48
go: downloading k8s.io/api v0.0.0-20191016110408-35e52d86657a
...

You can then run k8s-sentry directly (assuming you have a valid kubectl configuration):

$ ./k8s-sentry
2019/10/22 15:55:41 Warning: DSN environment variable not set. Can not report to Sentry
2019/10/22 15:55:41 Warning HorizontalPodAutoscaler/istio-ingressgateway: unable to get metrics for resource cpu: no metrics returned from resource metrics API
2019/10/22 15:55:41 Warning HorizontalPodAutoscaler/istio-pilot: unable to get metrics for resource cpu: no metrics returned from resource metrics API