perf(treesitter): use child_containing_descendant() in has-ancestor? #28512
Obviously we need to bump the tree-sitter dependency in |
@@ -725,6 +725,7 @@ static struct luaL_Reg node_meta[] = {
{ "descendant_for_range", node_descendant_for_range },
{ "named_descendant_for_range", node_named_descendant_for_range },
{ "parent", node_parent },
{ "child_containing_descendant", node_child_containing_descendant },
Is `TSNode` intended to directly mirror the TS API, or can we choose better names?
The ancestor-checking part of the predicate can be moved into C, which would reduce the time down to 1.5ms from 7.5ms (measured on a different computer).

    ['has-ancestor?'] = function(match, _, _, predicate)
      local nodes = match[predicate[2]]
      if not nodes or #nodes == 0 then
        return true
      end
      for _, node in ipairs(nodes) do
        if node:__has_ancestor(predicate) then
          return true
        end
      end
      return false
    end,

    static int __has_ancestor(lua_State *L)
    {
      TSNode descendant = node_check(L, 1);
      if (lua_type(L, 2) != LUA_TTABLE) {
        lua_pushboolean(L, false);
        return 1;
      }
      int const pred_len = lua_objlen(L, 2);
      TSNode node = ts_tree_root_node(descendant.tree);
      while (!ts_node_is_null(node)) {
        char const *node_type = ts_node_type(node);
        size_t node_type_len = strlen(node_type);
        for (int i = 3; i <= pred_len; i++) {
          lua_rawgeti(L, 2, i);
          if (lua_type(L, -1) == LUA_TSTRING) {
            size_t check_len;
            char const *check_str = lua_tolstring(L, -1, &check_len);
            if (node_type_len == check_len && memcmp(node_type, check_str, check_len) == 0) {
              lua_pushboolean(L, true);
              return 1;
            }
          }
          lua_pop(L, 1);
        }
        node = ts_node_child_containing_descendant(node, descendant);
      }
      lua_pushboolean(L, false);
      return 1;
    }

Is this overkill? Or how should I name the function?
I wonder if that is not something upstream would be interested in as well? @amaanq |
Upstream doesn't use the Lua API, so this would need to be significantly rewritten to use only TS structs/types. I think this is fine as it is. Whether it's done in C or Lua doesn't matter too much IMO. |
The only worry here is that we're injecting our own API functions into the upstream tree-sitter API; that may lead to confusion. But maybe it's worth it? We're already not exposing the API exactly (e.g., I'd be fine with keeping it in Lua for now, but we could also keep it internal at first and discuss exposing it (as |
@vanaigr I've just bumped to tree-sitter 0.22.6 on |
Yeah has-ancestor is a pretty good candidate for a predicate for upstream to support - it'd be worth potentially opening a PR for Max's thoughts |
11445a3 to 589d39a
@vanaigr This PR needs one of two things:
As we need to bump anyway for wasm parsers, I would prefer 1. for simplicity. |
Bumping the min version sounds good to me. |
required for `ts_node_child_containing_descendant()`
589d39a to 2f89f59
Force-pushed. @vanaigr is there any reason not to squash these commits before merging? |
Either way is fine for me. |
Squashed, then, with notes from the PR description added to the commit message. Thank you! |
…eovim#28512)
Problem: `has-ancestor?` is O(n²) for the depth of the tree since it iterates over each of the node's ancestors (bottom-up), and each ancestor takes O(n) time. This happens because tree-sitter's nodes don't store their parent nodes, and the tree is searched (top-down) each time a new parent is requested.
Solution: Make use of new `ts_node_child_containing_descendant()` in tree-sitter v0.22.6 (which is now the minimum required version) to rewrite the `has-ancestor?` predicate in C to become O(n).
For a sample file, decreases the time taken by `has-ancestor?` from 360ms to 6ms.
it appears the treesitter version has to be bumped :) |
Closes #24965.
`has-ancestor?` is O(n²) for the depth of the tree since it iterates over each of the node's ancestors (bottom-up), and each ancestor takes O(n) time. This happens because tree-sitter's nodes don't store their parent nodes, and the tree is searched (top-down) each time a new parent is requested.
`ts_node_child_containing_descendant()` matches how tree-sitter searches for the node's parent internally and makes `has-ancestor?` O(n). The predicate is also rewritten in C to avoid allocations for each ancestor node and their type strings.
For the file in the issue, this decreases the time taken by `has-ancestor?` from 360ms to 6ms.
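To make the complexity argument concrete, here is a minimal self-contained sketch in C. It does not use tree-sitter at all: `Node`, `find_parent`, `child_containing`, `make_chain` and the `steps` counter are hypothetical names invented for this illustration, and the "tree" is a single-child chain so that only depth matters. It is a toy model of the two search strategies described above, not the code from this PR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy node that knows its child but not its parent (hypothetical type).
 * A single-child chain keeps the sketch short; only depth matters here. */
typedef struct Node {
  const char *type;
  struct Node *child;
} Node;

static long steps; /* number of nodes visited by the most recent search */

/* Parent lookup without parent pointers: walk down from the root until we
 * reach the node whose child is `node`. Costs O(depth of node). */
static Node *find_parent(Node *root, Node *node) {
  if (node == root) return NULL; /* the root has no parent */
  for (Node *cur = root; cur != NULL; cur = cur->child) {
    steps++;
    if (cur->child == node) return cur;
  }
  return NULL;
}

/* Bottom-up search: one top-down parent lookup per ancestor, O(depth^2). */
static bool has_ancestor_bottom_up(Node *root, Node *node, const char *type) {
  steps = 0;
  for (Node *p = find_parent(root, node); p != NULL; p = find_parent(root, p)) {
    if (strcmp(p->type, type) == 0) return true;
  }
  return false;
}

/* Toy stand-in for a "child containing descendant" query: the child of
 * `node` on the path to `descendant`, or NULL when that child would be
 * `descendant` itself (an assumption of this toy version). */
static Node *child_containing(Node *node, Node *descendant) {
  if (node == descendant || node->child == descendant) return NULL;
  return node->child;
}

/* Top-down search: visit each ancestor exactly once, O(depth). */
static bool has_ancestor_top_down(Node *root, Node *node, const char *type) {
  steps = 0;
  Node *cur = (root == node) ? NULL : root;
  while (cur != NULL) {
    steps++;
    if (strcmp(cur->type, type) == 0) return true;
    cur = child_containing(cur, node);
  }
  return false;
}

/* Build a chain of `depth` nodes: a "module" root followed by "block"
 * nodes. Returns the array (caller frees); nodes[depth - 1] is the leaf. */
static Node *make_chain(size_t depth) {
  Node *nodes = calloc(depth, sizeof *nodes);
  if (nodes == NULL) return NULL;
  for (size_t i = 0; i < depth; i++) {
    nodes[i].type = (i == 0) ? "module" : "block";
    nodes[i].child = (i + 1 < depth) ? &nodes[i + 1] : NULL;
  }
  return nodes;
}
```

For a chain of 1000 nodes and a type that never matches, the bottom-up search visits 499500 nodes in this model while the top-down search visits 999. These counts only illustrate the quadratic versus linear growth; they say nothing about the timings measured in the PR.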