New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ruff server
: Support Jupyter Notebook (*.ipynb
) files
#11206
base: main
Are you sure you want to change the base?
Conversation
|
3a721e2
to
ef8727f
Compare
4fb4f0f
to
a8e00a6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've left some comments to add context for a few parts of the code and also to remind myself to refactor a few things before we merge 🙂
|
||
/// Stores and tracks all open documents in a session, along with their associated settings. | ||
#[derive(Default)] | ||
pub(crate) struct Index { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some context for this:
- This is the replacement for
Workspaces
, which used to be the encapsulating struct that stored all documents. Index
uses a flat structure to store documents and workspace settings, which makes it marginally more efficient to look-up documents by URL. The big advantage, however, is that we can map arbitrary notebook cells to document paths really easily, without having to search each workspace in turn.- I've moved the logic for mutating documents to here - in other words,
DocumentController
is private toindex.rs
, and to update documents you need to call the associated update function onIndex
. The reason why I made this change is because we need to update the notebook cell index after cells get added or deleted, and I wanted to centralize that logic here.
/// Edits to fix the diagnostic. If this is empty, a fix | ||
/// does not exist. | ||
pub(crate) edits: Vec<lsp_types::TextEdit>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For context: we now generate the text edits in advance and store that in the associated data struct, instead of generating them later.
let tokens: Vec<LexResult> = | ||
ruff_python_parser::tokenize(source_kind.source_code(), source_type.as_mode()); | ||
|
||
let index = LineIndex::from_source_text(source_kind.source_code()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Notebook documents don't have a line index, so we need to build a new line index from the combined source. The side effect of this is that we also re-generate the line index for regular documents instead of using their existing one.
(I'd like to eventually remove the line index from TextDocument
, since at the moment we wastefully re-generate it whenever it gets updated - and we don't even use it in that many places now).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we separate the logic for Notebook and text document to avoid building the index multiple times? But, I see that you want to get remove the index altogether for text document so maybe this isn't worth looking into right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could separate the logic but I don't think it's worth the technical investment IMO if we plan to eventually remove this anyway.
a8e00a6
to
4ddd930
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've provided some initial comments and I plan to look at other changes soon. It's quite a big PR :)
It would be helpful if you could briefly enumerate all the changes made in the PR description. I'd especially find the list of refactoring useful. It would be also useful for posterity to provide any context on the reasoning behind the implementation and if there were any other solutions that you thought of but didn't implement it.
Can you also update the test plan with all the cases that you've tested this against? Ideally it could contain the VS Code settings used and the action performed (command, code action, etc.). There are so many VS Code settings, commands and code actions for Notebook 😅
let tokens: Vec<LexResult> = | ||
ruff_python_parser::tokenize(source_kind.source_code(), source_type.as_mode()); | ||
|
||
let index = LineIndex::from_source_text(source_kind.source_code()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we separate the logic for Notebook and text document to avoid building the index multiple times? But, I see that you want to get remove the index altogether for text document so maybe this isn't worth looking into right now.
crates/ruff_server/src/server.rs
Outdated
types::NotebookDocumentFilter::ByType { | ||
notebook_type: "jupyter-notebook".to_string(), | ||
scheme: Some("file".to_string()), | ||
pattern: None, | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this specific to VS Code? I'm referring to the "jupyter-notebook" type. For reference, we don't use any notebook document filter in ruff-lsp
:
This asks to only send the request to sync the Python code cells as we aren't interested in any kind of cell.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I'm not sure where does the jupyter-notebook
type comes from.
I assume this would be the same across editors.
I think it depends - If it is set by VS Code, then I don't think it would be same.
I don't think we need to worry about this now but, just to be on the safer side, I'd probably keep this same as ruff-lsp
.
Does this work on unsaved Notebook files? Like, a new notebook file is opened in VS Code but it doesn't exists on disk yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this work on unsaved Notebook files? Like, a new notebook file is opened in VS Code but it doesn't exists on disk yet.
This does fail - I think it might be because an unsaved URL isn't a valid file path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm.... it seems like, in general, that the server doesn't support files created from Ctrl+N
in VS Code if they aren't saved to the file system. I'll create a follow-up issue for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does fail, but only because new notebook cell url gets passed into
didOpenTextDocument
, which isn't expected. Do we need to support 'unattached' notebook cells?
I just checked, we do support it. It's probably because we never touch the filesystem in ruff-lsp
and always work with stdin with ruff
command.
crates/ruff_server/src/server/api/notifications/did_open_notebook.rs
Outdated
Show resolved
Hide resolved
38f718d
to
8cb7d55
Compare
I think there's another problem here. To demonstrate that refer:
|
/// Update the cell contents with the transformed content. | |
/// | |
/// ## Panics | |
/// | |
/// Panics if the transformed content is out of bounds for any cell. This | |
/// can happen only if the cell offsets were not updated before calling | |
/// this method or the offsets were updated incorrectly. | |
fn update_cell_content(&mut self, transformed: &str) { | |
for (&idx, (start, end)) in self | |
.valid_code_cells | |
.iter() | |
.zip(self.cell_offsets.iter().tuple_windows::<(_, _)>()) | |
{ | |
let cell_content = transformed | |
.get(start.to_usize()..end.to_usize()) | |
.unwrap_or_else(|| { | |
panic!( | |
"Transformed content out of bounds ({start:?}..{end:?}) for cell at {idx:?}" | |
); | |
}); | |
self.raw.cells[idx as usize].set_source(SourceValue::StringArray( | |
UniversalNewlineIterator::from( | |
// We only need to strip the trailing newline which we added | |
// while concatenating the cell contents. | |
cell_content.strip_suffix('\n').unwrap_or(cell_content), | |
) | |
.map(|line| line.as_full_str().to_string()) | |
.collect::<Vec<_>>(), | |
)); | |
} | |
} |
This post-processing doesn't happen for the new server because it uses a wrapper to lint / format (lint.rs
and format.rs
).
Notes from testing
|
No, this still needs to be added.
This is an unrelated bug that I've seen several times before. I'll open a proper issue for it.
No, this still needs to be fixed. |
Summary
Closes #10858.
ruff server
now supports*.ipynb
(aka Jupyter Notebook) files. Extensive internal changes have been made to facilitate this, which I've done some work to contextualize with documentation and an pre-review that highlights notable sections of the code.*.ipynb
cells should behave similarly to*.py
documents, with one major exception. The format commandruff.applyFormat
will only apply to the currently selected notebook cell - if you want to format an entire notebook document, useFormat Notebook
from the VS Code context menu.Test Plan
The VS Code extension does not yet have Jupyter Notebook support enabled, so you'll first need to enable it manually. To do this, checkout the
pre-release
branch and modifysrc/common/server.ts
as follows:Before:
After:
I recommend testing this PR with large, complicated notebook files. I used notebook files from this popular repository in my preliminary testing.
The main thing to test is ensuring that notebook cells behave the same as Python documents, besides the aforementioned issue with
ruff.applyFormat
. You should also test adding and deleting cells (in particular, deleting all the code cells and ensure that doesn't break anything), changing the kind of a cell (i.e. from markup -> code or vice versa), and creating a new notebook file from scratch. Finally, you should also test that source actions work as expected (and across the entire notebook).Note:
ruff.applyAutofix
andruff.applyOrganizeImports
are currently broken for notebook files, and I suspect it has something to do with #11248. Once this is fixed, I will update the test plan accordingly.