feat: add TunnelTime metric #171

sbruens · 2024-03-14T21:24:09Z

Over a given timeframe, we observe the duration that a given IPKey pair is sending traffic through the server (i.e. how long is it active) and measure it, aggregated by key or location. This will allow service providers to measure TunnelTime.

…nfo.

…per access key and location.

…per_location`.

sbruens · 2024-03-15T22:20:06Z

Actually just realized I need to still put this behind a flag, so reverting to draft. But feel free to leave comments on what's there already.

fortuna

I think we can move some things around a bit. We can also lean more on the "Tunnel Time" name. And we need to make the code concurrency-safe.

cmd/outline-ss-server/metrics.go

cmd/outline-ss-server/main.go

cmd/outline-ss-server/metrics.go

This reverts commit f5e3cf5.

`MustParseAddr` is meant to be used for tests only, according to the docs.

cmd/outline-ss-server/metrics.go

Instead look it up when we first create an `activeClient`.

sbruens · 2024-03-25T18:38:03Z

Made some big changes to use the Prometheus Collector interface (thanks for the pointer, much better).

fortuna

Thanks, this looks much cleaner. I think the service code looks pretty good, only one small note.

As for the Prometheus code, I'm pretty sure you need to accumulate the counter. Prometheus only mirrors it. So you will need to keep track of the two counters (per key and per location+asn).

cmd/outline-ss-server/metrics.go

service/tcp.go

fortuna · 2024-03-25T19:33:37Z

cmd/outline-ss-server/metrics_test.go

It would be helpful to run the benchmarks with and without your changes, then diff, to make sure that's no significant difference we are not accounting for.

panic: duplicate metrics collector registration attempted [recovered] goos: linux goarch: amd64 pkg: github.com/Jigsaw-Code/outline-ss-server/internal/integration_test cpu: Intel(R) Xeon(R) W-2135 CPU @ 3.70GHz │ bench_master.log │ bench_branch.log │ │ sec/op │ sec/op vs base │ TCPThroughput-6 11.51µ ± 2% 11.54µ ± 3% ~ (p=0.796 n=10) TCPMultiplexing-6 122.2µ ± 9% 125.2µ ± 8% ~ (p=0.190 n=10) UDPEcho-6 135.4µ ± 53% 128.9µ ± 35% ~ (p=0.853 n=10) UDPManyKeys-6 1.449m ± 66% 1.113m ± 78% ~ (p=0.165 n=10) geomean 128.9µ 120.0µ -6.93% │ bench_master.log │ bench_branch.log │ │ mbps │ mbps vs base │ TCPThroughput-6 694.8 ± 2% 693.0 ± 2% ~ (p=0.781 n=10) │ bench_master.log │ bench_branch.log │ │ B/op │ B/op vs base │ TCPThroughput-6 11.00 ± 55% 10.00 ± 50% ~ (p=0.897 n=10) TCPMultiplexing-6 81.96Ki ± 3% 82.05Ki ± 2% ~ (p=0.684 n=10) UDPEcho-6 5.282Ki ± 1% 5.287Ki ± 1% ~ (p=0.540 n=10) UDPManyKeys-6 128.3Ki ± 63% 128.3Ki ± 74% ~ (p=0.382 n=10) geomean 4.942Ki 4.828Ki -2.30% │ bench_master.log │ bench_branch.log │ │ allocs/op │ allocs/op vs base │ TCPThroughput-6 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ TCPMultiplexing-6 646.5 ± 5% 644.0 ± 3% ~ (p=0.493 n=10) UDPEcho-6 97.00 ± 0% 97.00 ± 0% ~ (p=1.000 n=10) ¹ UDPManyKeys-6 909.0 ± 58% 909.0 ± 69% ~ (p=0.960 n=10) geomean ² -0.10% ² ¹ all samples are equal ² summaries must be >0 to compute geomean pkg: github.com/Jigsaw-Code/outline-ss-server/internal/slicepool │ bench_master.log │ bench_branch.log │ │ sec/op │ sec/op vs base │ Pool-6 49.55n ± 26% 43.32n ± 43% ~ (p=0.645 n=10) │ bench_master.log │ bench_branch.log │ │ B/op │ B/op vs base │ Pool-6 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ ¹ all samples are equal │ bench_master.log │ bench_branch.log │ │ allocs/op │ allocs/op vs base │ Pool-6 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ ¹ all samples are equal pkg: github.com/Jigsaw-Code/outline-ss-server/ipinfo │ bench_master.log │ bench_branch.log │ │ sec/op │ sec/op vs base │ GetIPInfoFromAddr-6 748.1n ± 38% 700.5n ± 29% ~ (p=0.515 n=10) NewMMDBIPInfoMap-6 19.87µ ± 36% 19.15µ ± 28% ~ (p=0.912 n=10) geomean 3.856µ 3.663µ -5.01% │ bench_master.log │ bench_branch.log │ │ B/op │ B/op vs base │ GetIPInfoFromAddr-6 56.00 ± 0% 56.00 ± 0% ~ (p=1.000 n=10) ¹ NewMMDBIPInfoMap-6 1.617Ki ± 0% 1.617Ki ± 0% ~ (p=1.000 n=10) ¹ geomean 304.5 304.5 +0.00% ¹ all samples are equal │ bench_master.log │ bench_branch.log │ │ allocs/op │ allocs/op vs base │ GetIPInfoFromAddr-6 4.000 ± 0% 4.000 ± 0% ~ (p=1.000 n=10) ¹ NewMMDBIPInfoMap-6 61.00 ± 0% 61.00 ± 0% ~ (p=1.000 n=10) ¹ geomean 15.62 15.62 +0.00% ¹ all samples are equal pkg: github.com/Jigsaw-Code/outline-ss-server/service │ bench_master.log │ bench_branch.log │ │ sec/op │ sec/op vs base │ Locking-6 247.1n ± 40% 190.8n ± 96% ~ (p=0.684 n=10) Snapshot-6 2.927µ ± 18% 2.717µ ± 12% ~ (p=0.247 n=10) ReplayCache_Creation-6 24.60µ ± 8% 24.66µ ± 5% ~ (p=0.912 n=10) ReplayCache_Max-6 111.2n ± 10% 102.3n ± 17% ~ (p=0.516 n=10) ReplayCache_Min-6 294.8n ± 86% 306.0n ± 55% ~ (p=1.000 n=10) ReplayCache_Parallel-6 262.8n ± 10% 242.8n ± 14% ~ (p=0.247 n=10) ServerSaltGenerator-6 1.903µ ± 5% 1.840µ ± 3% ~ (p=0.052 n=10) TCPFindCipherFail-6 932.6µ ± 128% 1039.9µ ± 25% ~ (p=0.631 n=10) TCPFindCipherRepeat-6 121.48µ ± 736% 80.13µ ± 1178% ~ (p=0.315 n=10) UDPUnpackFail-6 1.063m ± 46% 1.201m ± 27% ~ (p=0.684 n=10) UDPUnpackRepeat-6 312.22µ ± 89% 62.81µ ± 1226% ~ (p=0.971 n=10) UDPUnpackSharedKey-6 13.29µ ± 25% 13.10µ ± 20% ~ (p=0.853 n=10) geomean 8.517µ 7.036µ -17.39% │ bench_master.log │ bench_branch.log │ │ B/op │ B/op vs base │ Locking-6 8.000 ± 0% 8.000 ± 0% ~ (p=1.000 n=10) ¹ Snapshot-6 8.000Ki ± 0% 8.000Ki ± 0% ~ (p=1.000 n=10) ¹ ReplayCache_Creation-6 208.0Ki ± 0% 208.0Ki ± 0% ~ (p=1.000 n=10) ReplayCache_Max-6 10.00 ± 0% 10.00 ± 0% ~ (p=1.000 n=10) ¹ ReplayCache_Min-6 95.00 ± 0% 95.00 ± 0% ~ (p=1.000 n=10) ¹ ReplayCache_Parallel-6 10.00 ± 0% 10.00 ± 0% ~ (p=1.000 n=10) ¹ ServerSaltGenerator-6 976.0 ± 0% 976.0 ± 0% ~ (p=1.000 n=10) ¹ TCPFindCipherFail-6 116.9Ki ± 0% 116.9Ki ± 0% ~ (p=0.724 n=10) TCPFindCipherRepeat-6 27.37Ki ± 188% 25.64Ki ± 207% ~ (p=0.353 n=10) UDPUnpackFail-6 116.7Ki ± 0% 116.7Ki ± 0% ~ (p=1.000 n=10) ¹ UDPUnpackRepeat-6 19.603Ki ± 82% 4.918Ki ± 805% ~ (p=1.000 n=10) UDPUnpackSharedKey-6 1.166Ki ± 0% 1.166Ki ± 0% ~ (p=1.000 n=10) ¹ geomean 1.768Ki 1.567Ki -11.37% ¹ all samples are equal │ bench_master.log │ bench_branch.log │ │ allocs/op │ allocs/op vs base │ Locking-6 1.000 ± 0% 1.000 ± 0% ~ (p=1.000 n=10) ¹ Snapshot-6 1.000 ± 0% 1.000 ± 0% ~ (p=1.000 n=10) ¹ ReplayCache_Creation-6 2.000 ± 0% 2.000 ± 0% ~ (p=1.000 n=10) ¹ ReplayCache_Max-6 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ ReplayCache_Min-6 1.000 ± 0% 1.000 ± 0% ~ (p=1.000 n=10) ¹ ReplayCache_Parallel-6 0.000 ± 0% 0.000 ± 0% ~ (p=1.000 n=10) ¹ ServerSaltGenerator-6 10.00 ± 0% 10.00 ± 0% ~ (p=1.000 n=10) ¹ TCPFindCipherFail-6 1.605k ± 0% 1.606k ± 0% ~ (p=0.771 n=10) TCPFindCipherRepeat-6 123.00 ± 579% 99.50 ± 738% ~ (p=0.403 n=10) UDPUnpackFail-6 1.602k ± 0% 1.602k ± 0% ~ (p=1.000 n=10) ¹ UDPUnpackRepeat-6 259.00 ± 86% 56.50 ± 967% ~ (p=1.000 n=10) UDPUnpackSharedKey-6 17.00 ± 0% 17.00 ± 0% ~ (p=1.000 n=10) ¹ geomean ² -13.46% ² ¹ all samples are equal ² summaries must be >0 to compute geomean

cmd/outline-ss-server/metrics.go

…the hood.

fortuna

This looks great. Just one small adjustment. Also, were you able to diff the benchmarks?

cmd/outline-ss-server/metrics.go

Commit suggestions. Co-authored-by: Vinicius Fortuna <[email protected]>

sbruens added 15 commits February 1, 2024 00:12

Fix log message.

70f1f3e

Fix Prometheus instructions.

46dcef4

Add new proposed metrics.

50a39bf

Pass the IP to Add(Open|Closed)TCPConnection instead of the clientI…

8efddb8

…nfo.

Add ipinfo helper function to get IP info using a net.IP.

2e1c4d9

Add an activeIPKeyTracker struct and use it to measure IPKeyTime …

54b18c5

…per access key and location.

Fix casing of GetIpInfoFromIP.

8f200a5

Fix existing TCP and UDP tests.

db1be10

Fix more tests.

e8d0fad

Merge remote-tracking branch 'origin' into sbruens/ip-key-metrics

00c1045

Exercise each scenario in its own test.

7f3d9d0

Add tests for metrics.

fb7bc2e

Use in IPInfo map to test `shadowsocks_ip_key_connectivity_seconds_…

db15a2f

…per_location`.

Merge branch 'master' into sbruens/ip-key-metrics

f49c40d

Remove key_connectivity_seconds for the time being.

0520fd2

sbruens marked this pull request as ready for review March 15, 2024 22:17

sbruens requested a review from a team as a code owner March 15, 2024 22:17

sbruens requested a review from fortuna March 15, 2024 22:17

sbruens marked this pull request as draft March 15, 2024 22:20

sbruens added 3 commits March 15, 2024 20:00

Merge branch 'master' into sbruens/ip-key-metrics

423565f

Fix bad merge.

6613942

Put the IPKey metrics behind a flag for now.

f5e3cf5

sbruens marked this pull request as ready for review March 18, 2024 19:56

fortuna requested changes Mar 19, 2024

View reviewed changes

sbruens added 3 commits March 20, 2024 12:26

Revert "Put the IPKey metrics behind a flag for now."

959de84

This reverts commit f5e3cf5.

Address review comments.

5c75727

Remove ipinfo.IPInfo from RemoveUDPNatEntry, as it doesn't need it.

4d0a20a

sbruens changed the title ~~feat: add IPKey time metric~~ feat: add TunnelTime metric Mar 20, 2024

Undo changes to ipinfo.

73fb6ab

sbruens added 2 commits March 21, 2024 13:56

Use netip.Addr as IP type in the IPKey struct.

b14464d

Replace MustParseAddr with a new toIPAddr helper function.

2d1252b

`MustParseAddr` is meant to be used for tests only, according to the docs.

fortuna requested changes Mar 21, 2024

View reviewed changes

sbruens mentioned this pull request Mar 21, 2024

feat(server): add an experimental opt-in API to enable ASN metrics Jigsaw-Code/outline-server#1523

Merged

sbruens added 10 commits March 22, 2024 10:37

Remove the clientInfo param from startConnection.

e60e53a

Instead look it up when we first create an `activeClient`.

Make startConnection and stopConnection use IPKey as param.

cfa143d

Consolidate toIPAddr() and toIPKey().

c56bcfd

Add comment to mutex.

bfd381c

Remove lock from activeClient struct.

55ba178

Make stubbable time.Now private.

e428f8d

Pass around the IPKey instead of storing it on the activeClient.

a5731fc

Initialize TunnelTime for new IPKeys with default value of 0.

4112238

Refactor to implement a custom Collector to track TunnelTime.

7cb66b3

Update comments and log.

ced36bd

sbruens force-pushed the sbruens/ip-key-metrics branch from 188ca9d to ced36bd Compare March 22, 2024 22:07

sbruens requested a review from fortuna March 25, 2024 18:38

fortuna requested changes Mar 25, 2024

View reviewed changes

fortuna reviewed Mar 25, 2024

View reviewed changes

cmd/outline-ss-server/metrics.go Outdated Show resolved Hide resolved

sbruens added 3 commits March 25, 2024 17:02

Handle error case from SplitHostPort().

0ee630d

Address review comments.

3960510

Refactor the tunnel time collector to use 2 CounterVecs underneath …

3eb0114

…the hood.

fortuna self-requested a review March 26, 2024 19:19

fortuna approved these changes Mar 26, 2024

View reviewed changes

cmd/outline-ss-server/metrics.go Outdated Show resolved Hide resolved

fortuna reviewed Mar 26, 2024

View reviewed changes

cmd/outline-ss-server/metrics.go Outdated Show resolved Hide resolved

Apply suggestions from code review

714252f

Commit suggestions. Co-authored-by: Vinicius Fortuna <[email protected]>

fortuna approved these changes Mar 26, 2024

View reviewed changes

sbruens merged commit 53fd283 into master Mar 27, 2024
4 checks passed

sbruens deleted the sbruens/ip-key-metrics branch March 27, 2024 17:46

sbruens added a commit that referenced this pull request Mar 28, 2024

feat: add TunnelTime metric (#171)

4c35a51

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add TunnelTime metric #171

feat: add TunnelTime metric #171

sbruens commented Mar 14, 2024 •

edited

Loading

sbruens commented Mar 15, 2024

fortuna left a comment

sbruens commented Mar 25, 2024

fortuna left a comment

fortuna Mar 25, 2024

sbruens Mar 27, 2024

fortuna left a comment

feat: add TunnelTime metric #171

feat: add TunnelTime metric #171

Conversation

sbruens commented Mar 14, 2024 • edited Loading

sbruens commented Mar 15, 2024

fortuna left a comment

Choose a reason for hiding this comment

sbruens commented Mar 25, 2024

fortuna left a comment

Choose a reason for hiding this comment

fortuna Mar 25, 2024

Choose a reason for hiding this comment

sbruens Mar 27, 2024

Choose a reason for hiding this comment

fortuna left a comment

Choose a reason for hiding this comment

sbruens commented Mar 14, 2024 •

edited

Loading