Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prometheus stats naming and histogram cleanup #195

Open
alexmv opened this issue Jul 13, 2023 · 0 comments
Open

Prometheus stats naming and histogram cleanup #195

alexmv opened this issue Jul 13, 2023 · 0 comments

Comments

@alexmv
Copy link
Contributor

alexmv commented Jul 13, 2023

The default Prometheus metrics contain histograms for in and out byte counts. However, they use the default histogram buckets, which means that they measure up to 10 bytes, and down to 0.005 bytes:

# HELP cn_bytes_in
# TYPE cn_bytes_in histogram
cn_bytes_in_bucket{role="",le="0.005"} 0
cn_bytes_in_bucket{role="",le="0.01"} 0
cn_bytes_in_bucket{role="",le="0.025"} 0
cn_bytes_in_bucket{role="",le="0.05"} 0
cn_bytes_in_bucket{role="",le="0.1"} 0
cn_bytes_in_bucket{role="",le="0.25"} 0
cn_bytes_in_bucket{role="",le="0.5"} 0
cn_bytes_in_bucket{role="",le="1"} 0
cn_bytes_in_bucket{role="",le="2.5"} 0
cn_bytes_in_bucket{role="",le="5"} 0
cn_bytes_in_bucket{role="",le="10"} 0
cn_bytes_in_bucket{role="",le="+Inf"} 1
cn_bytes_in_sum{role=""} 22925
cn_bytes_in_count{role=""} 1
# HELP cn_bytes_out
# TYPE cn_bytes_out histogram
cn_bytes_out_bucket{role="",le="0.005"} 0
cn_bytes_out_bucket{role="",le="0.01"} 0
cn_bytes_out_bucket{role="",le="0.025"} 0
cn_bytes_out_bucket{role="",le="0.05"} 0
cn_bytes_out_bucket{role="",le="0.1"} 0
cn_bytes_out_bucket{role="",le="0.25"} 0
cn_bytes_out_bucket{role="",le="0.5"} 0
cn_bytes_out_bucket{role="",le="1"} 0
cn_bytes_out_bucket{role="",le="2.5"} 0
cn_bytes_out_bucket{role="",le="5"} 0
cn_bytes_out_bucket{role="",le="10"} 0
cn_bytes_out_bucket{role="",le="+Inf"} 1
cn_bytes_out_sum{role=""} 516
cn_bytes_out_count{role=""} 1

Per the prometheus best practices, bytes is the canonical unit to use here, which means that we should be adjusting the bucketcount. The same applies to a lesser degree with proxy_duration_ms_timer_bucket -- but in general Prometheus timing metrics use seconds as their unit, rather than shifting the buckets upwards. cn_duration_bucket, by contrast, is in seconds (and doesn't use TimingWithTags, possibly for that reason?)

Based on that same link, the names for the Prometheus metrics are also pretty non-standard -- cn_atpt_total would more canonically be smokescreen_connection_attempt_count, for instance. We could hardcode a mapping of statsd name to prometheus name in sanitisePrometheusMetricName, but I don't love that -- nor polluting the interface with a function per metric name.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant