Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

poc: a metrics module for pebble #519

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

IronCore864
Copy link
Contributor

@IronCore864 IronCore864 commented Nov 13, 2024

A self-implemented metrics module for Pebble. I referred to the Prometheus Golang client a bit, not so much to runtime/metrics because it seems it's too complicated.

According to the spec, it seems we only need the counter type and the gauge type, but I added the histogram type anyway. Prometheus Golang client has more types but I don't think we will need it any time soon, so I ignored it.

Binary size doesn't increase (no extra libs used), still ~7.8MB with -trimpath -ldflags='-s -w'.

Manual test:

$ curl localhost:8000/metrics
# HELP my_counter A simple counter
# TYPE my_counter counter
my_counter 3

Some basic unit test is added to make sure the functionalities work as expected and there is no deadlock issue.

TODO: investigate how the official Prometheus Golang client gathers basic metrics like CPU usage and stuff.

@IronCore864 IronCore864 requested a review from benhoyt November 13, 2024 09:11
@IronCore864 IronCore864 marked this pull request as ready for review November 13, 2024 09:11
@IronCore864
Copy link
Contributor Author

IronCore864 commented Nov 14, 2024

Trying to integrate with Prometheus to make sure they are compatible:

Screenshot-counter Screenshot-gauge

@IronCore864
Copy link
Contributor Author

IronCore864 commented Nov 14, 2024

Some investigation into the default metrics that come with the Prometheus Go client:

1 List of Metrics from Go runtime/metrics

c.a. 80 items.

  • /cgo/go-to-c-calls:calls
  • /cpu/*
  • /gc/cycles/*
  • /gc/gogc:percent
  • /gc/gomemlimit:bytes
  • /gc/heap/*
  • /gc/limiter/last-enabled:gc-cycle
  • /gc/pauses:seconds
  • /gc/scan/*
  • /gc/stack/starting-size:bytes
  • /godebug/*
  • /memory/*
  • /sched/gomaxprocs:threads
  • /sched/goroutines:goroutines
  • /sched/latencies:seconds
  • /sched/pauses/*
  • /sync/mutex/wait/total:seconds

2 List of Default Metrics from prometheus/client_golang

c.a. 40 items.

  • go_gc_duration_seconds{quantile="*"}
  • go_gc_duration_seconds_sum
  • go_gc_duration_seconds_count
  • go_gc_gogc_percent
  • go_gc_gomemlimit_bytes
  • go_goroutines
  • go_info{version="go1.23.1"}
  • go_memstats_*
  • go_sched_gomaxprocs_threads
  • go_threads
  • promhttp_metric_handler_requests_in_flight
  • promhttp_metric_handler_requests_total{code="200"}
  • promhttp_metric_handler_requests_total{code="500"}
  • promhttp_metric_handler_requests_total{code="503"}

3 How Are the Default Metrics from prometheus/client_golang Fetched/Calculated

3.1 From runtime and runtime/debug

  • go_goroutines
  • go_threads
  • go_gc_duration_seconds
  • go_memstats_last_gc_time_seconds
  • go_info

3.2 From runtime.MemStats with Some Calculations

  • go_memstats_alloc_bytes
  • go_memstats_alloc_bytes_total
  • go_memstats_buck_hash_sys_bytes
  • go_memstats_frees_total
  • go_memstats_gc_sys_bytes
  • go_memstats_heap_alloc_bytes
  • go_memstats_heap_idle_bytes
  • go_memstats_heap_inuse_bytes
  • go_memstats_heap_objects
  • go_memstats_heap_released_bytes
  • go_memstats_heap_sys_bytes
  • go_memstats_last_gc_time_seconds
  • go_memstats_mallocs_total
  • go_memstats_mcache_inuse_bytes
  • go_memstats_mcache_sys_bytes
  • go_memstats_mspan_inuse_bytes
  • go_memstats_mspan_sys_bytes
  • go_memstats_next_gc_bytes
  • go_memstats_other_sys_bytes
  • go_memstats_stack_inuse_bytes
  • go_memstats_stack_sys_bytes
  • go_memstats_sys_bytes

3.3 Directly from runtime/metrics but Go-Version-Dependant

  • go_gc_gogc_percent
  • go_gc_gomemlimit_bytes
  • go_sched_gomaxprocs_threads

3.4 Prometheus-Related

  • promhttp_metric_handler_requests_in_flight
  • promhttp_metric_handler_requests_total{code="200"}
  • promhttp_metric_handler_requests_total{code="500"}
  • promhttp_metric_handler_requests_total{code="503"}

Self-processed.

4 Summary

If we want to include these metrics in our self-implemented metrics module, a lot of duplicated work (duplicated with Prometheus) needs to be done, some of which seem to be Go-version-dependent, which means extra work and tests. This creates a bunch of operational overhead.

If these metrics are not necessary, we probably should go with a self-implemented module but if they are nice to have, a 2.2MB size increase of the Pebble binary is worth the price in the long run.

@IronCore864
Copy link
Contributor Author

IronCore864 commented Nov 26, 2024

A PoC to add a new type of identity (basicauth) for the metrics endpoint.

1 Manually Add an Identity from a YAML File

$ cat identity.yaml
identities:
    bob:
        access: read
        basicauth:
            username: foo
            password: bar
$ ./pebble add-identities --from ./identity.yaml
Added 1 new identity.

2 Start Pebble

$ ./pebble run --http=:4000
2024-11-26T14:27:53.682Z [pebble] HTTP API server listening on ":4000".
2024-11-26T14:27:53.682Z [pebble] Started daemon.
2024-11-26T14:27:53.686Z [pebble] POST /v1/services 63.667µs 400
2024-11-26T14:27:53.686Z [pebble] Cannot start default services: no default services

3 Access the Metrics Endpoint with the Newly Created Identity

$ curl -u foo:bar localhost:4000/metrics
# HELP my_counter A simple counter
# TYPE my_counter counter
my_counter 4

4 Access without Identity or with an Invalid Username/Password

$ curl localhost:4000/metrics
{"type":"error","status-code":401,"status":"Unauthorized","result":{"message":"access denied","kind":"login-required"}}
$ curl -u invalid:invalid localhost:4000/metrics
{"type":"error","status-code":401,"status":"Unauthorized","result":{"message":"access denied","kind":"login-required"}}

@IronCore864
Copy link
Contributor Author

According to the last spec review, the following changes have been made:

  1. The identity type is renamed from "basicauth" to just "basic".
  2. The username field in the basic identity is removed -- we can just use the name of the identity instead.
  3. Adding a new access type "metrics".
  4. Passwords are hashed using OpenSSL (TODO).

After the first round of refactoring, here are some results:

1 Baisc Identity Name with Special Characters

$ cat identity.yaml
identities:
    "bob:asdf":
        access: read
        basic:
            password: bar
$ ./pebble add-identities --from ./identity.yaml
error: identity "bob:asdf" invalid: identity name "bob:asdf" contains invalid characters (only
       alphanumeric, underscore, and hyphen allowed)

2 Baisc Identity without Username

$ cat identity.yaml
identities:
    bob:
        access: read
        basic:
            password: bar
ubuntu@primary:~/work/pebble2$ ./pebble add-identities --from ./identity.yaml
Added 1 new identity.

3 Basic Identity Type "metrics"

$ # access type: read
$ cat identity.yaml
identities:
    bob:
        access: read
        basic:
            password: bar
$ ./pebble add-identities --from ./identity.yaml
Added 1 new identity.
$ # open access is fine
$ curl -u bob:bar localhost:4000/v1/health
{"type":"sync","status-code":200,"status":"OK","result":{"healthy":true}}
$ # no access on the metrics endpoint
$ curl -u bob:bar localhost:4000/metrics
{"type":"error","status-code":401,"status":"Unauthorized","result":{"message":"access denied","kind":"login-required"}}
$ # access type: metrics
$ cat identity.yaml
identities:
    bob:
        access: metrics
        basic:
            password: bar
$ ./pebble update-identities --from ./identity.yaml
Updated 1 identity.
$ # open access is fine
$ curl -u bob:bar localhost:4000/v1/health
{"type":"sync","status-code":200,"status":"OK","result":{"healthy":true}}
$ # accessing metrics
$ curl -u bob:bar localhost:4000/metrics
# HELP my_counter Total number of something processed.
# TYPE my_counter counter
my_counter{operation=read,status=success} 11
my_counter{operation=write,status=success} 22
my_counter{operation=read,status=failed} 11
# HELP my_gauge Current value of something.
# TYPE my_gauge gauge
my_gauge{sensor=temperature} 28.12
$ # no access on other endpoints
$ curl -u bob:bar localhost:4000/v1/changes
{"type":"error","status-code":401,"status":"Unauthorized","result":{"message":"access denied","kind":"login-required"}}
$ # access type: admin
$ cat identity.yaml
identities:
    bob:
        access: admin
        basic:
            password: bar
$ ./pebble update-identities --from ./identity.yaml
Updated 1 identity.
$ # admin can read metrics
$ curl -u bob:bar localhost:4000/v1/metrics
# HELP my_counter Total number of something processed.
# TYPE my_counter counter
my_counter{operation=read,status=success} 176
my_counter{operation=write,status=success} 352
my_counter{operation=read,status=failed} 176
# HELP my_gauge Current value of something.
# TYPE my_gauge gauge
my_gauge{sensor=temperature} 24.48

TODO: hashing password.

BasicAuth *marshalledBasicAuthIdentity `json:"basicauth,omitempty"`
Access string `json:"access"`
Local *marshalledLocalIdentity `json:"local,omitempty"`
Basic *marshalledBasicIdentity `json:"basi,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just skimming through this PR (not reviewing it yet), but noticed this, so commenting before I forget: basi -> basic

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants