-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
local rate limit: add cross local cluster rate limit support #34276
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: wbpcode <[email protected]>
Signed-off-by: wbpcode <[email protected]>
…ross-instance-local-limit
CC @envoyproxy/api-shepherds: Your approval is needed for changes made to |
Signed-off-by: wbpcode <[email protected]>
api/envoy/extensions/filters/http/local_ratelimit/v3/local_rate_limit.proto
Outdated
Show resolved
Hide resolved
Signed-off-by: wbpcode <[email protected]>
Can we have Tianyu take a pass and then I will send to a maintainer? |
Seems @markdroth is busy recently, re-assign this to @adisuissa for API review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting idea, thanks!
Left high-level comments/questions.
api/envoy/extensions/filters/http/local_ratelimit/v3/local_rate_limit.proto
Show resolved
Hide resolved
/retest |
@adisuissa looks like this is ready for review - the main merge is just for review notes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
I've left a few high-level comments.
One other idea: will it be possible to redesign this so that each instance owns a reference to the endpointStats() object or the membership_total object of the static local cluster, and when tokensPerFill is called, it will fetch the current number of endpoints?
I think this will still need to by guarded (mutex), if the membership_total is not atomic. However, it will remove the additional share-monitor, and the member-update callback (may simplify the solution).
source/extensions/filters/common/local_ratelimit/local_ratelimit_impl.cc
Show resolved
Hide resolved
source/extensions/filters/common/local_ratelimit/local_ratelimit_impl.cc
Show resolved
Hide resolved
source/extensions/filters/common/local_ratelimit/local_ratelimit_impl.cc
Show resolved
Hide resolved
…ross-instance-local-limit
It's my initial implementation when doing the POC. It's actually simpler but means it's hard to use different algorithms to calculate the ratio. For example, we may want take the weight into the account in the future. Current design provides a well defined interface and abstraction for more complex share calculating. |
source/extensions/filters/common/local_ratelimit/local_ratelimit_impl.cc
Show resolved
Hide resolved
source/extensions/filters/common/local_ratelimit/local_ratelimit_impl.cc
Show resolved
Hide resolved
Signed-off-by: wbpcode <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, overall LGTM.
Left some minor comments.
source/extensions/filters/common/local_ratelimit/local_ratelimit_impl.cc
Show resolved
Hide resolved
source/extensions/filters/common/local_ratelimit/local_ratelimit_impl.cc
Show resolved
Hide resolved
@@ -84,6 +124,9 @@ class LocalRateLimiterImpl { | |||
TokenState tokens_; | |||
absl::flat_hash_set<LocalDescriptorImpl, LocalDescriptorHash, LocalDescriptorEqual> descriptors_; | |||
std::vector<LocalDescriptorImpl> sorted_descriptors_; | |||
|
|||
ShareProviderSharedPtr share_provider_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OOC does this really need to be a shared-ptr and not just a reference. AFAIK the singleton should outlive the filter, but I may be mistaken.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the singleton has longer lifetime than the filter. But note the getShareProvider()
not guarantee that the
singleton will keep a copy of returned shared pointer.
For example, if we support new algorithms in the future, then the singleton will only keep a map from the proto to the weak pointer of share provider.
So, I think a shared pointer here would be better.
api/envoy/extensions/filters/http/local_ratelimit/v3/local_rate_limit.proto
Show resolved
Hide resolved
api/envoy/extensions/filters/http/local_ratelimit/v3/local_rate_limit.proto
Outdated
Show resolved
Hide resolved
Co-authored-by: Adi (Suissa) Peleg <[email protected]> Signed-off-by: code <[email protected]>
…e_limit.proto Co-authored-by: Adi (Suissa) Peleg <[email protected]> Signed-off-by: code <[email protected]>
Signed-off-by: wbpcode <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm api
@@ -315,5 +431,44 @@ TEST_P(LocalRateLimitFilterIntegrationTest, BasicTestPerRouteAndRds) { | |||
cleanUpXdsConnection(); | |||
} | |||
|
|||
#ifdef NDEBUG |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the late observation, but IMHO this is a red-flag when it comes to tests (especially integration tests). Which assertion in the singleton manager is triggered?
Specifically I would like to see a validation of the MAIN_THREAD assertions that were added being exercised in an integration test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The get
method of singleton manager. The singleton manager could only be accessed in the thread where it's created. (the server main thread for integration test).
But these tests will access singleton manger in the test thread which is not allowed and will trigger the assertion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have removed this macro because I created another PR to resolve this problem. See #34766
/retest |
…ross-instance-local-limit
Signed-off-by: wbpcode <[email protected]>
Signed-off-by: wbpcode <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
/lgtm api
@tyxia can you PTAL?
Commit Message: local rate limit: add cross local cluster rate limit support
Additional Description:
Envoy provides lots of rate limit related filters/features. The global rate limit provides the most powerful feature. But it also introduce additional dependencies (rate limit server, redis, etc) and latency (additional calling to rate limit server).
The local rate limit is more stable and has better performance, and has no dependency to external server. But the local rate limit is work at single instance or connection scope. That means that if local rate limit is used, we cannot get a stable total limit for a Envoy cluster. Because the total limit of local rate limit filter is single instance limit multiply the instance number of Envoy. But the instance number may changed when the cluster scaling.
This PR add a new interesting feature. It make the local rate limit filter could aware the membership of the local cluster (the cluster contains current Envoy self). That means the local rate limit could compute it's tokens based on the membership of local cluster and achieve the target to share the total limit between multiple Envoy instances (a Envoy cluster).
See #34230 for more discussion.
Risk Level: low. (nothing will be changed if we don't enable the feature explicitly)
Testing: unit.
Docs Changes: n/a.
Release Notes: n/a.
Platform Specific Features: n/a.