-
Notifications
You must be signed in to change notification settings - Fork 309
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merge the gateway handlers into the standard handlers. #1261
Merge the gateway handlers into the standard handlers. #1261
Conversation
The two `WebSocketChannelsHandler` and `GatewayResourceHandler` classes are removed and their corresponding functionality is merged into the respective `KernelWebsocketHandler` and `KernelSpecResourceHandler` classes. For the `KernelSpecResourceHandler` class, this change is rather straightforward as we can simply make the existing handler check if the kernel spec manager has a `get_kernel_spec_resource` method, and if so delegate to that method instead of trying to read resources from disk. The `KernelWebsocketHandler` conversion is more complicated, though. The handling of websocket connections was generalized/extended in jupyter-server#1047 to allow the definition of a `kernel_websocket_connection_class` as an alternative to replacing the entire websocket handler. This change builds on that by converting the `GatewayWebSocketClient` class to be an instance of the kernel websocket connection class, and accordingly renames it to `GatewayWebSocketConnection`. When the gateway client is enabled, the default `kernel_websocket_connection_class` is changed to this `GatewayWebSocketConnection` class similarly to how the kernel and kernel spec manager default classes are updated.
@kevin-bates @Zsailer FYI, this is my attempt at first steps around what we had discussed for the "Kernel Providers" proposal. The idea here is to first remove the gateway client specific handlers so that the subsequent work to multiplex between various kernel providers has fewer moving pieces to deal with. This is almost certainly in need of more work before merging as I had multiple open questions I need to ask regarding the
Thanks in advance for any advice and/or suggestions you might have here. |
Thank you for the PR, @ojarjur! This is great stuff.
I think renaming to We may need to leave the old handlers around for a bit with a deprecation warning (set to 3.0), in case someone has subclassed these handlers somewhere in their own setup.
Good point. Let me look at this a little bit closer and see what makes sense. I'll post back here when I can a chance too look at this closer
Ugh, this is annoyingly specific to the I would propose that we use this opportunity to do the |
1. Rename the package holding the GatewayWebSocketConnection class from `handlers` to `connections`. 2. Make the serverapp call to the `close_all` method on the websocket connection class more resilient. Specifically, the `self.kernel_websocket_connection_class.close_all()` method invocation relies on two different attribute lookups, either of which might fail: 1. The `self.kernel_websocket_connection_class` lookup can raise an AttributeError if the method that sets the default value of that trait raises an AttributeError. That, in turn, occurs when the `self.gateway_config` attribute is not yet set because the `initialize` method has not yet been invoked. During server startup this scenario is gracefully handled, but during cleanup it was previously not. 2. The `self.kernel_websocket_connection_class.close_all` lookup can fail if the provided class does not implement the `close_all` class method. For both cases, we catch the raised AttributeError and treat it as a no-op. 3. Remove the `close_all` method from the GatewayWebSocketConnection class since it was a no-op, and is no longer required by serverapp.
The previous version of the code used the wrong future callback for messages when the underlying gateway websocket connection had not yet been created. That resulted in messages received before the end-to-end websocket creation completing getting dropped.
Hi @ojarjur - this is looking good - thank you. I have built this branch and targeted a K8s cluster fronted by Jupyter Kernel Gateway. When creating a kernel for the first time, I see the following stack trace in the jupyter server log:
I've included preceding lines up to the informational messages in hopes that might provide richer context for troubleshooting. Items of note:
Since this first time behavior is fairly consistent (there was one time (out of several) in which I didn't see the stack on the first attempt following server startup), I was wondering if it might have to do with the resource file acquisition, since that's an area of change and might only occur once in the life of a server's duration. |
@kevin-bates Thank you so much for the thorough testing and the detailed report. I'm going to try to see if I can modify the existing
I don't think so... from the stacktrace, it looks like the websocket connection to the backend is working and received a message, but the websocket connection to the client is closed. I'll also compare against the existing code to see why the behavior between the two is different in this scenario. |
@kevin-bates I see the issue now. The new Restoring that check is easy enough, but I also want to try to extend the tests to make sure we don't accidentally drop this again in the future. |
Awesome!
Excellent idea - thank you. I'll look for the update and spin it again at that time. |
…t connection is closed
…ketChannelsHandler to gateway.GatewayWebSocketConnection
@kevin-bates I was eventually able to get a (minimal) test case that reproduces that error showing up in the logs, and fails if it sees that (or any) error in the logs. I also replicated the corresponding logic from the existing codebase into the new connection class, and it makes that new test now pass. Thanks for catching that! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @ojarjur!
I've confirmed behavior outside of the test framework as well.
def __init__(self, **kwargs): | ||
super().__init__(**kwargs) | ||
self.ws = None | ||
self.ws_future: Future = Future() | ||
self.disconnected = False | ||
self.retry = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is a LoggingConfigurable
, can we make these attributes into traits
? In general, it's best to avoid calling __init__
on a subclass of traitlets.HasTraits
object.
This would looks something like:
import tornado.websocket as tornado_websocket
...
class GatewayWebSocketConnection(BaseKernelWebsocketConnection):
...
ws = Instance(klass=tornado_websocket.WebSocketHandler, allow_none=True)
ws_future = Instance(default_value=Future(), klass=Future)
disconnected = Bool(False)
retry = Int(0)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the guidance and explanation!
Done.
I dug into this a bit. We do need to handle the subprotocol here in the As you mentioned, we should move the kernel_ws_protocol flag and logic for serialization/deserialization into the We may want to do this work in a separate PR, since the current PR should probably stay focused on absorbing the gateway handlers into the main handlers. That way, our changelog decouples these two changes. Would you like to make the separate PR moving the subprotocol flag to the base handler? Ideally, that PR would get reviewed and merged first, then rebase here (though this isn't strictly necessary). |
Otherwise, this PR looks good to me 👍 |
… into ojarjur/kernel-servers
…erver into ojarjur/kernel-servers
…ebSocketConnection class
Thank you for digging into it!
I'll give it a shot, but this is well outside my area of experience and my bandwidth for the next week or so is quite limited. If someone else manages to get that in before me, I would be happy about it. |
…socketConnection This change removes a temporary work-around to the required `kernel_ws_protocol` field not being defined in the `BaseKernelWebsocketConnection` base class. Previously, that field was only defined in the ZMQChannelsWebsocketConnection class, so we had to reproduce it in the `GatewayWebsocketConnection` class in order to get that class to work correctly. However, we only provided a minimal default value of `None` rather than properly supporting the user setting their preferred websocket protocol. This workaround was made unnecessary by jupyter-server#1264 which moved the `kernel_ws_protocol` field from the ZMQChannelsWebsocketConnection class to the base class. Accordingly, this commit removes that temporary work-around that is no longer needed. We needed to provide a value for that,
Head branch was pushed to by a user without write access
Thanks for the reviews, All! I finished up the clean-up on this PR needed to (semantically) merge in the changes from #1264, and pushed it, so the I don't know why some of the tests are failing now; GitHub isn't showing me any logs for them, which is a different behavior than I have seen before. |
It looks like there is an ongoing GitHub incident that affects actions: https://www.githubstatus.com/incidents/pr3498h3qkfy |
The two
WebSocketChannelsHandler
andGatewayResourceHandler
classes are deprecated and their corresponding functionality is merged into the respectiveKernelWebsocketHandler
andKernelSpecResourceHandler
classes.For the
KernelSpecResourceHandler
class, this change is rather straightforward as we can simply make the existing handler check if the kernel spec manager has aget_kernel_spec_resource
method, and if so delegate to that method instead of trying to read resources from disk.The
KernelWebsocketHandler
conversion is more complicated, though. The handling of websocket connections was generalized/extended in #1047 to allow the definition of akernel_websocket_connection_class
as an alternative to replacing the entire websocket handler.This change builds on that by converting the
GatewayWebSocketClient
class to be an instance of the kernel websocket connection class, and accordingly renames it toGatewayWebSocketConnection
.When the gateway client is enabled, the default
kernel_websocket_connection_class
is changed to thisGatewayWebSocketConnection
class similarly to how the kernel and kernel spec manager default classes are updated.