`ruff server`: Support Jupyter Notebook (`*.ipynb`) files #11206

snowsignal · 2024-04-29T17:49:46Z

Summary

ruff server now supports *.ipynb (aka Jupyter Notebook) files. Extensive internal changes have been made to facilitate this, which I've done some work to contextualize with documentation and an pre-review that highlights notable sections of the code.

*.ipynb cells should behave similarly to *.py documents, with one major exception. The format command ruff.applyFormat will only apply to the currently selected notebook cell - if you want to format an entire notebook document, use Format Notebook from the VS Code context menu.

Test Plan

The VS Code extension does not yet have Jupyter Notebook support enabled, so you'll first need to enable it manually. To do this, checkout the pre-release branch and modify src/common/server.ts as follows:

Before:

After:

I recommend testing this PR with large, complicated notebook files. I used notebook files from this popular repository in my preliminary testing.

The main thing to test is ensuring that notebook cells behave the same as Python documents, besides the aforementioned issue with ruff.applyFormat. You should also test adding and deleting cells (in particular, deleting all the code cells and ensure that doesn't break anything), changing the kind of a cell (i.e. from markup -> code or vice versa), and creating a new notebook file from scratch. Finally, you should also test that source actions work as expected (and across the entire notebook).

Note: ruff.applyAutofix and ruff.applyOrganizeImports are currently broken for notebook files, and I suspect it has something to do with #11248. Once this is fixed, I will update the test plan accordingly.

github-actions · 2024-04-29T18:07:29Z

`ruff-ecosystem` results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

snowsignal

I've left some comments to add context for a few parts of the code and also to remind myself to refactor a few things before we merge 🙂

crates/ruff_server/src/edit/range.rs

snowsignal · 2024-05-14T05:26:51Z

crates/ruff_server/src/session/index.rs

+
+/// Stores and tracks all open documents in a session, along with their associated settings.
+#[derive(Default)]
+pub(crate) struct Index {


Some context for this:

This is the replacement for Workspaces, which used to be the encapsulating struct that stored all documents.

Index uses a flat structure to store documents and workspace settings, which makes it marginally more efficient to look-up documents by URL. The big advantage, however, is that we can map arbitrary notebook cells to document paths really easily, without having to search each workspace in turn.

I've moved the logic for mutating documents to here - in other words, DocumentController is private to index.rs, and to update documents you need to call the associated update function on Index. The reason why I made this change is because we need to update the notebook cell index after cells get added or deleted, and I wanted to centralize that logic here.

crates/ruff_server/src/session/index.rs

crates/ruff_server/src/fix.rs

snowsignal · 2024-05-14T05:50:59Z

crates/ruff_server/src/lint.rs

+ /// Edits to fix the diagnostic. If this is empty, a fix
+ /// does not exist.
+ pub(crate) edits: Vec<lsp_types::TextEdit>,


For context: we now generate the text edits in advance and store that in the associated data struct, instead of generating them later.

snowsignal · 2024-05-14T05:54:59Z

crates/ruff_server/src/lint.rs

+ let tokens: Vec<LexResult> =
+ ruff_python_parser::tokenize(source_kind.source_code(), source_type.as_mode());
+
+ let index = LineIndex::from_source_text(source_kind.source_code());


Notebook documents don't have a line index, so we need to build a new line index from the combined source. The side effect of this is that we also re-generate the line index for regular documents instead of using their existing one.

(I'd like to eventually remove the line index from TextDocument, since at the moment we wastefully re-generate it whenever it gets updated - and we don't even use it in that many places now).

Can we separate the logic for Notebook and text document to avoid building the index multiple times? But, I see that you want to get remove the index altogether for text document so maybe this isn't worth looking into right now.

We could separate the logic but I don't think it's worth the technical investment IMO if we plan to eventually remove this anyway.

crates/ruff_server/src/lint.rs

dhruvmanila

I've provided some initial comments and I plan to look at other changes soon. It's quite a big PR :)

It would be helpful if you could briefly enumerate all the changes made in the PR description. I'd especially find the list of refactoring useful. It would be also useful for posterity to provide any context on the reasoning behind the implementation and if there were any other solutions that you thought of but didn't implement it.

Can you also update the test plan with all the cases that you've tested this against? Ideally it could contain the VS Code settings used and the action performed (command, code action, etc.). There are so many VS Code settings, commands and code actions for Notebook 😅

crates/ruff_notebook/src/notebook.rs

crates/ruff_server/src/edit.rs

crates/ruff_server/src/edit/notebook.rs

crates/ruff_server/src/edit/range.rs

dhruvmanila · 2024-05-15T13:35:36Z

crates/ruff_server/src/lint.rs

+ let tokens: Vec<LexResult> =
+ ruff_python_parser::tokenize(source_kind.source_code(), source_type.as_mode());
+
+ let index = LineIndex::from_source_text(source_kind.source_code());


Can we separate the logic for Notebook and text document to avoid building the index multiple times? But, I see that you want to get remove the index altogether for text document so maybe this isn't worth looking into right now.

dhruvmanila · 2024-05-15T13:44:09Z

crates/ruff_server/src/server.rs

+ types::NotebookDocumentFilter::ByType {
+ notebook_type: "jupyter-notebook".to_string(),
+ scheme: Some("file".to_string()),
+ pattern: None,
+ },


Is this specific to VS Code? I'm referring to the "jupyter-notebook" type. For reference, we don't use any notebook document filter in ruff-lsp:

https://github.com/astral-sh/ruff-lsp/blob/b8a48009f35756dc51268beb6bbc39c21a1e3a46/ruff_lsp/server.py#L129-L140

This asks to only send the request to sync the Python code cells as we aren't interested in any kind of cell.

I got this from the LSP specification:

I assume this would be the same across editors.

I see. I'm not sure where does the jupyter-notebook type comes from.

I assume this would be the same across editors.

I think it depends - If it is set by VS Code, then I don't think it would be same.

I don't think we need to worry about this now but, just to be on the safer side, I'd probably keep this same as ruff-lsp.

Does this work on unsaved Notebook files? Like, a new notebook file is opened in VS Code but it doesn't exists on disk yet.

Does this work on unsaved Notebook files? Like, a new notebook file is opened in VS Code but it doesn't exists on disk yet.

This does fail - I think it might be because an unsaved URL isn't a valid file path.

Hmmm.... it seems like, in general, that the server doesn't support files created from Ctrl+N in VS Code if they aren't saved to the file system. I'll create a follow-up issue for this.

This does fail, but only because new notebook cell url gets passed into didOpenTextDocument, which isn't expected. Do we need to support 'unattached' notebook cells?

I just checked, we do support it. It's probably because we never touch the filesystem in ruff-lsp and always work with stdin with ruff command.

crates/ruff_server/src/server/api.rs

crates/ruff_server/src/server/api/notifications/did_close.rs

crates/ruff_server/src/server/api/notifications/did_open_notebook.rs

crates/ruff_server/src/session.rs

crates/ruff_server/src/session/index.rs

crates/ruff_server/src/format.rs

dhruvmanila · 2024-05-16T05:44:50Z

I think there's another problem here. To demonstrate that refer:

`ruff-lsp`

Screen.Recording.2024-05-16.at.11.01.41.mov

`ruff server`

Screen.Recording.2024-05-16.at.10.59.05.mov

You can see that there's an additional newline which gets added at the end for ruff server. The reason this is happening is because the cell content are concatenated by a newline but the newline doesn't actually exists in the original source code.

The reason it doesn't happen in ruff-lsp is because we post-process the Notebook where we strip this additional newline before updating the cell content:

ruff/crates/ruff_notebook/src/notebook.rs

Lines 243 to 273 in 75f3022

 /// Update the cell contents with the transformed content. 

 /// 

 /// ## Panics 

 /// 

 /// Panics if the transformed content is out of bounds for any cell. This 

 /// can happen only if the cell offsets were not updated before calling 

 /// this method or the offsets were updated incorrectly. 

 fn update_cell_content(&mut self, transformed: &str) { 

 for (&idx, (start, end)) in self 

 .valid_code_cells 

 .iter() 

 .zip(self.cell_offsets.iter().tuple_windows::<(_, _)>()) 

 { 

 let cell_content = transformed 

 .get(start.to_usize()..end.to_usize()) 

 .unwrap_or_else(|| { 

 panic!( 

 "Transformed content out of bounds ({start:?}..{end:?}) for cell at {idx:?}" 

 ); 

 }); 

 self.raw.cells[idx as usize].set_source(SourceValue::StringArray( 

 UniversalNewlineIterator::from( 

 // We only need to strip the trailing newline which we added 

 // while concatenating the cell contents. 

 cell_content.strip_suffix('\n').unwrap_or(cell_content), 

 ) 

 .map(|line| line.as_full_str().to_string()) 

 .collect::<Vec<_>>(), 

 )); 

 } 

 }

This post-processing doesn't happen for the new server because it uses a wrapper to lint / format (lint.rs and format.rs).

dhruvmanila · 2024-05-16T12:23:22Z

Notes from testing

Did you implement the notebook.* variants of the source code actions? It doesn't seem to be working. They are notebook.source.fixAll and notebook.source.organizeImports which are specific to Notebook. Here, the server will receive only the first cell and we're responsible on returning the edits for the entire notebook. I'm using the following VS Code settings:
```
  "notebook.codeActionsOnSave": {
    "notebook.source.fixAll": "explicit"
  }
```
For reference, https://github.com/astral-sh/ruff-lsp/blob/b8a48009f35756dc51268beb6bbc39c21a1e3a46/ruff_lsp/server.py#L219-L229

This might be unrelated to this PR but I'm getting the following error quite frequently:

2024-05-16 17:34:08.101 [info]  196.046579s WARN ruff_server::server::api Received notification $/setTrace which does not have a handler.

2024-05-16 17:34:10.471 [info]  198.416243s ERROR ruff_server::server::api An error occurred while running textDocument/didClose: Unable to take snapshot for document with URL file:///Users/dhruv/.pyenv/versions/3.7.17/lib/python3.7/random.py

It's usually occurs after the setTrace notification. The confusing part is that I have no idea why is it occurring on textDocument/didClose when I haven't closed any document.

Based on this in the PR description:

Note: ruff.applyAutofix and ruff.applyOrganizeImports are currently broken for notebook files, and I suspect it has something to do with Workspace edits from ruff server commands occasionally fail in VS Code when operating on an yet-to-be-created file #11248. Once this is fixed, I will update the test plan accordingly.

Were you able to look into this and fix it? I'm unable to run any "Ruff:" prefixed commands except for "Format Document".

snowsignal · 2024-05-16T15:38:22Z

@dhruvmanila

Did you implement the notebook.* variants of the source code actions?

No, this still needs to be added.

This might be unrelated to this PR but I'm getting the following error quite frequently

This is an unrelated bug that I've seen several times before. I'll open a proper issue for it.

Were you able to look into this and fix it? I'm unable to run any "Ruff:" prefixed commands except for "Format Document".

No, this still needs to be fixed.

snowsignal added the server Related to the LSP server label Apr 29, 2024

snowsignal added this to the Ruff Server: Beta milestone Apr 29, 2024

snowsignal force-pushed the jupyter-notebook branch 5 times, most recently from 3a721e2 to ef8727f Compare May 9, 2024 09:02

snowsignal force-pushed the jupyter-notebook branch 13 times, most recently from 4fb4f0f to a8e00a6 Compare May 14, 2024 05:40

snowsignal commented May 14, 2024

View reviewed changes

snowsignal force-pushed the jupyter-notebook branch from a8e00a6 to 4ddd930 Compare May 14, 2024 06:38

snowsignal marked this pull request as ready for review May 14, 2024 06:55

snowsignal requested a review from dhruvmanila as a code owner May 14, 2024 06:55

dhruvmanila reviewed May 14, 2024

View reviewed changes

dhruvmanila reviewed May 15, 2024

View reviewed changes

crates/ruff_server/src/session/index.rs Outdated Show resolved Hide resolved

snowsignal requested a review from dhruvmanila May 16, 2024 04:39

Forked lsp-types and register notebook capabilities (boilerplate only).

9301f7a

nolanking90 and others added 6 commits May 15, 2024 21:42

Wrong comment syntax in Cargo.toml

f617211

Update lsp-types crate dependency to use Git fork

e974562

Switch to Astral fork

cdc4a32

Implement support for Jupyter Notebooks in ruff server

efb08ec

Address some suggestions

884dba1

Address additional suggestions

8cb7d55

snowsignal force-pushed the jupyter-notebook branch from 38f718d to 8cb7d55 Compare May 16, 2024 04:45

dhruvmanila reviewed May 16, 2024

View reviewed changes

crates/ruff_server/src/format.rs Outdated Show resolved Hide resolved

snowsignal added 2 commits May 15, 2024 22:32

Pass in correct source type to formatter

b4a1ff2

Remove notebook type check

75f3022

Update notebook selector to match ruff-lsp

f45cd55

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`ruff server`: Support Jupyter Notebook (`*.ipynb`) files #11206

`ruff server`: Support Jupyter Notebook (`*.ipynb`) files #11206

snowsignal commented Apr 29, 2024 •

edited

github-actions bot commented Apr 29, 2024 •

edited

snowsignal left a comment

snowsignal May 14, 2024

snowsignal May 14, 2024

snowsignal May 14, 2024

dhruvmanila May 15, 2024

snowsignal May 16, 2024

dhruvmanila left a comment

dhruvmanila May 15, 2024

dhruvmanila May 15, 2024

snowsignal May 16, 2024

dhruvmanila May 16, 2024

snowsignal May 16, 2024 •

edited

snowsignal May 16, 2024

dhruvmanila May 16, 2024 •

edited

dhruvmanila commented May 16, 2024 •

edited

dhruvmanila commented May 16, 2024

snowsignal commented May 16, 2024

ruff server: Support Jupyter Notebook (*.ipynb) files #11206

Are you sure you want to change the base?

ruff server: Support Jupyter Notebook (*.ipynb) files #11206

Conversation

snowsignal commented Apr 29, 2024 • edited

Summary

Test Plan

github-actions bot commented Apr 29, 2024 • edited

ruff-ecosystem results

Linter (stable)

Linter (preview)

Formatter (stable)

Formatter (preview)

snowsignal left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dhruvmanila left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

snowsignal May 16, 2024 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dhruvmanila May 16, 2024 • edited

Choose a reason for hiding this comment

dhruvmanila commented May 16, 2024 • edited

ruff-lsp

ruff server

dhruvmanila commented May 16, 2024

Notes from testing

snowsignal commented May 16, 2024

`ruff server`: Support Jupyter Notebook (`*.ipynb`) files #11206

`ruff server`: Support Jupyter Notebook (`*.ipynb`) files #11206

snowsignal commented Apr 29, 2024 •

edited

github-actions bot commented Apr 29, 2024 •

edited

`ruff-ecosystem` results

snowsignal May 16, 2024 •

edited

dhruvmanila May 16, 2024 •

edited

dhruvmanila commented May 16, 2024 •

edited

`ruff-lsp`

`ruff server`