-
Notifications
You must be signed in to change notification settings - Fork 896
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support ramping up endpoints for XdsEndpointGroup
#5688
Conversation
🔍 Build Scan® (commit: 30644c0)
|
1ca79d3
to
0c83dc9
Compare
0c83dc9
to
90f35f6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically looks great! Left a few comments. 😉
core/src/main/java/com/linecorp/armeria/client/endpoint/EndpointWeightTransition.java
Outdated
Show resolved
Hide resolved
xds/src/main/java/com/linecorp/armeria/xds/client/endpoint/DelegatingEndpointGroup.java
Outdated
Show resolved
Hide resolved
xds/src/main/java/com/linecorp/armeria/xds/client/endpoint/ClusterEntry.java
Outdated
Show resolved
Hide resolved
xds/src/main/java/com/linecorp/armeria/xds/client/endpoint/DelegatingEndpointGroup.java
Outdated
Show resolved
Hide resolved
xds/src/main/java/com/linecorp/armeria/xds/client/endpoint/DelegatingEndpointGroup.java
Outdated
Show resolved
Hide resolved
xds/src/main/java/com/linecorp/armeria/xds/client/endpoint/EndpointUtil.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fantastic now. Thanks! 😄
timestamp = createdAtNanos(endpoint); | ||
} else { | ||
timestamp = createdTimestamps.getOrDefault(endpoint, defaultTimestamp); | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
}; | |
} |
xds/src/main/java/com/linecorp/armeria/xds/client/endpoint/EndpointsPool.java
Show resolved
Hide resolved
xds/src/main/java/com/linecorp/armeria/xds/client/endpoint/EndpointsPool.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice. 👍👍
xds/src/main/java/com/linecorp/armeria/xds/client/endpoint/EndpointUtil.java
Outdated
Show resolved
Hide resolved
void updateClusterSnapshot(ClusterSnapshot clusterSnapshot) { | ||
final EndpointSnapshot endpointSnapshot = clusterSnapshot.endpointSnapshot(); | ||
assert endpointSnapshot != null; | ||
endpointsPool.updateClusterSnapshot(clusterSnapshot); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional) The cross-coupling between EndpointsPool
and ClusterEntry
could be eliminated by passing a callback to endpointsPool.updateClusterSnapshot()
.
endpointsPool.updateClusterSnapshot(clusterSnapshot); | |
endpointsPool.updateClusterSnapshot(clusterSnapshot, endpoints -> { | |
accept(clusterSnapshot, endpoints); | |
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, done 👍
Motivation: The motivation for this PR stems from #5688. The current `WeightRampingUpStrategy` internally maintains the ramping up status of endpoints. However, it is possible that although an `EndpointGroup` has just been created certain endpoints must be considered already ramped up. For instance, in xDS we newly create an `EndpointGroup` every time a `ClusterSnapshot` is updated. However, even if a `ClusterSnapshot` has changed, we need to retain an endpoint's ramping up status. To resolve this, I propose a timestamp based solution. Specifically, I believe an `Endpoint` can maintain a state `createdAtNanos` and `WeightRampingUpStrategy` can refer to this value to determine the ramp-up status. We can also allow users to specify their own timestamp via an attribute if they are maintaining their own `Endpoint` pool like done in xDS. While implementing a timestamp based solution, I found that the previous logic is probably simpler to manage if it were refactored. Specifically, I propose that a `rampingUpInterval` is divided into `rampingUpTaskWindow`s. Each `rampingUpTaskWindow` will be assigned a scheduler. If a scheduler doesn't have any more endpoints to ramp up, the scheduler is stopped. I've also removed the endpoint deduplication related logic to make the implementation simpler. Modifications: - Defined an attribute `createdAtNanos` which signifies when an `Endpoint` has been created. - Refactored `WeightRampingUpStrategy` to refer to `createdAtNanos` when computing the initial ramping up step - Refactored `WeightRampingUpStrategy` overall to always use the created timestamp when 1) determining the initial ramping up step 2) determining which ramping up scheduler to use - Removed deduplication logic for simplifying logic - Removed weight update logic for simplifying logic. Result: - We can prepare to support `WeightRampingUpStrategy` for xDS. <!-- Visit this URL to learn more about how to write a pull request description: https://armeria.dev/community/developer-guide#how-to-write-pull-request-description -->
Oops, could you resolve the conflict? |
Did an |
Motivation: The motivation for this PR stems from line#5688. The current `WeightRampingUpStrategy` internally maintains the ramping up status of endpoints. However, it is possible that although an `EndpointGroup` has just been created certain endpoints must be considered already ramped up. For instance, in xDS we newly create an `EndpointGroup` every time a `ClusterSnapshot` is updated. However, even if a `ClusterSnapshot` has changed, we need to retain an endpoint's ramping up status. To resolve this, I propose a timestamp based solution. Specifically, I believe an `Endpoint` can maintain a state `createdAtNanos` and `WeightRampingUpStrategy` can refer to this value to determine the ramp-up status. We can also allow users to specify their own timestamp via an attribute if they are maintaining their own `Endpoint` pool like done in xDS. While implementing a timestamp based solution, I found that the previous logic is probably simpler to manage if it were refactored. Specifically, I propose that a `rampingUpInterval` is divided into `rampingUpTaskWindow`s. Each `rampingUpTaskWindow` will be assigned a scheduler. If a scheduler doesn't have any more endpoints to ramp up, the scheduler is stopped. I've also removed the endpoint deduplication related logic to make the implementation simpler. Modifications: - Defined an attribute `createdAtNanos` which signifies when an `Endpoint` has been created. - Refactored `WeightRampingUpStrategy` to refer to `createdAtNanos` when computing the initial ramping up step - Refactored `WeightRampingUpStrategy` overall to always use the created timestamp when 1) determining the initial ramping up step 2) determining which ramping up scheduler to use - Removed deduplication logic for simplifying logic - Removed weight update logic for simplifying logic. Result: - We can prepare to support `WeightRampingUpStrategy` for xDS. <!-- Visit this URL to learn more about how to write a pull request description: https://armeria.dev/community/developer-guide#how-to-write-pull-request-description -->
Consider reviewing line#5693 before this PR. Motivation: This changeset attempts to support [slow start mode](https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/upstream/load_balancing/slow_start). While simply supporting this is trivial with the existing `WeightRampingUpStrategy`, ramping up needs to work irregardless of whether a 1) cluster is updated 2) or the set of endpoints is updated. The current implementation creates a new `EndpointGroup` whenever one of the two updates above is triggered. ##### The `ClusterEntry` itself is reconstructed https://github.com/line/armeria/blob/806556e5a4d27b549c4c6c24f75d96a1447d85d1/xds/src/main/java/com/linecorp/armeria/xds/client/endpoint/ClusterManager.java#L128 ##### The set of endpoints is updated https://github.com/line/armeria/blob/806556e5a4d27b549c4c6c24f75d96a1447d85d1/xds/src/main/java/com/linecorp/armeria/xds/client/endpoint/ClusterEntry.java#L60-L62 Consequentially, endpoints will always be ramped up from the beginning with the current implementation. e.g. if `endpointA` is added, the new `EndpointGroup` will trigger all endpoints to be ramped up all together. As a result `endpointA` won't receive less traffic compared to the other endpoints. I propose the following changes to handle this. ##### 1. The `ClusterEntry` itself is reconstructed I propose that `ClusterEntry` is maintained per cluster name. If a `ClusterEntry` already exists for a `ClusterSnapshot`, then `ClusterEntry#updateClusterSnapshot` will be called instead of creating a new `ClusterEntry`. `SubsetLoadBalancer` also cannot keep `ClusterSnapshot` as a final field, so the related logic has also been refactored. This change is actually probably more in-sync with envoy's implementation as well. ref: https://github.com/envoyproxy/envoy/blob/2e055e40bb93e4bbd97a10b7fedee9c50c5ba8af/source/common/upstream/cluster_manager_impl.cc#L1318 ##### 2. The set of endpoints is updated In order to handle this, I propose that we keep a pool of endpoints in a class named `DelegatingEndpointGroup`. The `EndpointGroup` will store the `createdAtNanos` timestamp of previous endpoints, and set these values if a new `Endpoint` is added. These timestamps will be used to calculate which ramping up step an `Endpoint` was in previously. Note that `DelegatingEndpointGroup` is at the outermost layer of `EndpointGroup`s. This is done to ramp up an `Endpoint` if it was filtered out by a `HealthCheckedEndpointGroup` and then added back in. Once a `ClusterSnapshot` is updated, `DelegatingEndpointGroup#updateClusterSnapshot` will be called. `DelegatingEndpointGroup` will listen for updates to endpoints, set the `createdAtNanos` attribute for each endpoint, and call `ClusterEntry#accept`. It is now possible for `ClusterEntry#accept` to be called from multiple threads (xDS event loop, health check event loop, etc..) so I've also added logic to reschedule. Lastly, in order to fully support xDS's slow-start I've added `EndpointWeightTransition#aggression`. If a `SlowStartConfig` is provided, the `EndpointGroup` is constructed with a `WeightRampingUpStrategy`. Modifications: - Modified `ClusterManager` not create a new `ClusterEntry` as long as the `clusterName` is the same. To support this change, `ClusterEntry#clusterSnapshotUpdated` is added. - Added a `DelegatingEndpointGroup` which keeps a reference to the `EndpointGroup` associated with the `ClusterSnapshot`. `DelegatingEndpointGroup` also sets `createdAtNanos` to endpoints. - Added `EndpointWeightTransition#aggression` which allows non-linear weight transition. - `WeightRampingUpStrategy` is not set when the `SlowStartConfig` parameter is set. Result: - `XdsEndpointGroup` now supports slow start. <!-- Visit this URL to learn more about how to write a pull request description: https://armeria.dev/community/developer-guide#how-to-write-pull-request-description --> --------- Co-authored-by: minux <[email protected]>
Consider reviewing #5693 before this PR.
Motivation:
This changeset attempts to support slow start mode.
While simply supporting this is trivial with the existing
WeightRampingUpStrategy
, ramping up needs to work irregardless of whether a 1) cluster is updated 2) or the set of endpoints is updated.The current implementation creates a new
EndpointGroup
whenever one of the two updates above is triggered.The
ClusterEntry
itself is reconstructedarmeria/xds/src/main/java/com/linecorp/armeria/xds/client/endpoint/ClusterManager.java
Line 128 in 806556e
The set of endpoints is updated
armeria/xds/src/main/java/com/linecorp/armeria/xds/client/endpoint/ClusterEntry.java
Lines 60 to 62 in 806556e
Consequentially, endpoints will always be ramped up from the beginning with the current implementation. e.g. if
endpointA
is added, the newEndpointGroup
will trigger all endpoints to be ramped up all together. As a resultendpointA
won't receive less traffic compared to the other endpoints.I propose the following changes to handle this.
1. The
ClusterEntry
itself is reconstructedI propose that
ClusterEntry
is maintained per cluster name. If aClusterEntry
already exists for aClusterSnapshot
, thenClusterEntry#updateClusterSnapshot
will be called instead of creating a newClusterEntry
.SubsetLoadBalancer
also cannot keepClusterSnapshot
as a final field, so the related logic has also been refactored.This change is actually probably more in-sync with envoy's implementation as well.
ref: https://github.com/envoyproxy/envoy/blob/2e055e40bb93e4bbd97a10b7fedee9c50c5ba8af/source/common/upstream/cluster_manager_impl.cc#L1318
2. The set of endpoints is updated
In order to handle this, I propose that we keep a pool of endpoints in a class named
DelegatingEndpointGroup
. TheEndpointGroup
will store thecreatedAtNanos
timestamp of previous endpoints, and set these values if a newEndpoint
is added. These timestamps will be used to calculate which ramping up step anEndpoint
was in previously. Note thatDelegatingEndpointGroup
is at the outermost layer ofEndpointGroup
s. This is done to ramp up anEndpoint
if it was filtered out by aHealthCheckedEndpointGroup
and then added back in.Once a
ClusterSnapshot
is updated,DelegatingEndpointGroup#updateClusterSnapshot
will be called.DelegatingEndpointGroup
will listen for updates to endpoints, set thecreatedAtNanos
attribute for each endpoint, and callClusterEntry#accept
. It is now possible forClusterEntry#accept
to be called from multiple threads (xDS event loop, health check event loop, etc..) so I've also added logic to reschedule.Lastly, in order to fully support xDS's slow-start I've added
EndpointWeightTransition#aggression
. If aSlowStartConfig
is provided, theEndpointGroup
is constructed with aWeightRampingUpStrategy
.Modifications:
ClusterManager
not create a newClusterEntry
as long as theclusterName
is the same. To support this change,ClusterEntry#clusterSnapshotUpdated
is added.DelegatingEndpointGroup
which keeps a reference to theEndpointGroup
associated with theClusterSnapshot
.DelegatingEndpointGroup
also setscreatedAtNanos
to endpoints.EndpointWeightTransition#aggression
which allows non-linear weight transition.WeightRampingUpStrategy
is not set when theSlowStartConfig
parameter is set.Result:
XdsEndpointGroup
now supports slow start.