perf(treesitter): use child_containing_descendant() in has-ancestor? #28512

vanaigr · 2024-04-26T01:20:34Z

has-ancestor? is O(n²) in the depth n of the tree, since it iterates over each of the node's ancestors (bottom-up), and fetching each ancestor takes O(n) time.
This happens because tree-sitter's nodes don't store their parent nodes, and the tree is searched (top-down) each time a new parent is requested.

ts_node_child_containing_descendant() matches how tree-sitter searches for the node's parent internally and makes has-ancestor? O(n).

The predicate is also rewritten in C to avoid allocations for each ancestor node and their type strings.

For the file in the issue, this decreases the time taken by has-ancestor? from 360ms to 6ms.
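The complexity argument can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyNode`, `toy_parent()`, `toy_child_containing()` and the step counter are hypothetical names invented for illustration and are not tree-sitter's types or API. The tree is a simple chain in which nodes store no parent pointer, so a parent lookup has to descend from the root.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy node: like tree-sitter nodes, it stores no parent pointer. */
typedef struct ToyNode {
  size_t start, end;      /* range [start, end) covered by the node */
  struct ToyNode *child;  /* a single child keeps the sketch short */
} ToyNode;

static size_t toy_steps;  /* counts node visits so both strategies can be compared */

/* Link `depth` nodes into a chain with strictly nested ranges. */
static void toy_build_chain(ToyNode *nodes, size_t depth)
{
  assert(depth > 0);
  for (size_t i = 0; i < depth; i++) {
    nodes[i].start = i;
    nodes[i].end = 2 * depth - i;
    nodes[i].child = (i + 1 < depth) ? &nodes[i + 1] : NULL;
  }
}

/* One top-down step: the child of `node` that contains `target`,
 * or NULL when that child would be `target` itself (or none contains it). */
static ToyNode *toy_child_containing(ToyNode *node, const ToyNode *target)
{
  toy_steps++;
  ToyNode *c = node->child;
  if (c == NULL || c == target || target->start < c->start || target->end > c->end) {
    return NULL;
  }
  return c;
}

/* Parent lookup without parent pointers: descend from the root every time.
 * Assumes `target` is part of the tree rooted at `root`. */
static ToyNode *toy_parent(ToyNode *root, const ToyNode *target)
{
  if (root == target) {
    return NULL;
  }
  ToyNode *node = root;
  for (;;) {
    ToyNode *next = toy_child_containing(node, target);
    if (next == NULL) {
      return node;
    }
    node = next;
  }
}

/* Bottom-up: every parent request restarts from the root, O(depth^2) steps. */
static size_t toy_count_ancestors_bottom_up(ToyNode *root, ToyNode *target)
{
  size_t n = 0;
  for (ToyNode *p = toy_parent(root, target); p != NULL; p = toy_parent(root, p)) {
    n++;
  }
  return n;
}

/* Top-down: a single descent visits each ancestor once, O(depth) steps. */
static size_t toy_count_ancestors_top_down(ToyNode *root, ToyNode *target)
{
  if (root == target) {
    return 0;
  }
  size_t n = 0;
  for (ToyNode *node = root; node != NULL; node = toy_child_containing(node, target)) {
    n++;
  }
  return n;
}
```

On a chain of depth 100, both functions find the same 99 ancestors of the deepest node, but the bottom-up variant visits 4950 nodes while the single top-down descent visits 99.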

vanaigr · 2024-04-26T01:31:14Z

ts_node_child_containing_descendant() is in tree-sitter's master, but not yet available in any release.

clason · 2024-04-26T05:39:51Z

Obviously we need to bump the tree-sitter dependency in deps.txt and the minimal version. I don't think that will happen before the 0.10 release, but we can bump to a prerelease commit early in the 0.11 cycle. The new function(ality) will have to be hidden behind an #ifdef guard, similar to how we did it for the match limit, before we bump to a release.
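As a rough illustration of what such a guard could look like, here is a minimal sketch. The macro `TOY_HAVE_CHILD_CONTAINING_DESCENDANT` and the function `toy_uses_fast_path()` are hypothetical names made up for this example; the guard actually used for the match limit is not shown in this thread.

```c
#include <stdbool.h>

/* HYPOTHETICAL macro: a build system would define it only when the linked
 * tree-sitter provides ts_node_child_containing_descendant().
 * It is left undefined here, so the fallback branch is compiled. */
/* #define TOY_HAVE_CHILD_CONTAINING_DESCENDANT 1 */

/* Reports which implementation was selected at compile time. */
static bool toy_uses_fast_path(void)
{
#ifdef TOY_HAVE_CHILD_CONTAINING_DESCENDANT
  return true;   /* the new top-down code path would go here */
#else
  return false;  /* older library: keep the existing bottom-up code path */
#endif
}
```

With the macro undefined, as above, `toy_uses_fast_path()` returns `false`; a build that defines the macro would compile the new code path instead.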


justinmk · 2024-04-26T11:20:26Z

src/nvim/lua/treesitter.c

@@ -725,6 +725,7 @@ static struct luaL_Reg node_meta[] = {
 { "descendant_for_range", node_descendant_for_range },
 { "named_descendant_for_range", node_named_descendant_for_range },
 { "parent", node_parent },
+ { "child_containing_descendant", node_child_containing_descendant },


is TSNode intended to directly mirror the TS api? or can we choose better names?

vanaigr · 2024-05-02T23:57:22Z

The ancestor checking part of the predicate can be moved into C, which would reduce the time down to 1.5ms from 7.5ms (measured on a different computer).

  ['has-ancestor?'] = function(match, _, _, predicate)
    local nodes = match[predicate[2]]
    if not nodes or #nodes == 0 then
      return true
    end

    for _, node in ipairs(nodes) do
      if node:__has_ancestor(predicate) then
        return true
      end
    end
    return false
  end,

static int __has_ancestor(lua_State *L)
{
  TSNode descendant = node_check(L, 1);
  // The predicate table is expected as the second argument.
  if (lua_type(L, 2) != LUA_TTABLE) {
    lua_pushboolean(L, false);
    return 1;
  }
  int const pred_len = (int)lua_objlen(L, 2);

  // Walk top-down from the root towards `descendant`, visiting each ancestor once.
  TSNode node = ts_tree_root_node(descendant.tree);
  while (!ts_node_is_null(node)) {
    char const *node_type = ts_node_type(node);
    size_t node_type_len = strlen(node_type);

    // Entries 3..pred_len of the predicate are compared against the node type.
    for (int i = 3; i <= pred_len; i++) {
      lua_rawgeti(L, 2, i);
      if (lua_type(L, -1) == LUA_TSTRING) {
        size_t check_len;
        char const *check_str = lua_tolstring(L, -1, &check_len);
        if (node_type_len == check_len && memcmp(node_type, check_str, check_len) == 0) {
          lua_pushboolean(L, true);
          return 1;
        }
      }
      lua_pop(L, 1);
    }

    node = ts_node_child_containing_descendant(node, descendant);
  }

  lua_pushboolean(L, false);
  return 1;
}

Is this overkill? Or how should I name the function?

clason · 2024-05-03T08:42:24Z

I wonder if that is not something upstream would be interested in as well? @amaanq

lewis6991 · 2024-05-03T09:32:49Z

> I wonder if that is not something upstream would be interested in as well? @amaanq

Upstream doesn't use the Lua API, so this would need to be significantly rewritten to use only TS structs/types.

I think this is fine as it is. Whether it's done in C or Lua doesn't matter too much IMO.

clason · 2024-05-03T09:59:53Z

The only worry here is that we're injecting our own API functions into the upstream tree-sitter API; that may lead to confusion. But maybe it's worth it? We're already not exposing the API exactly (e.g., named_descendant_for_range is not a tree-sitter API function).

I'd be fine with keeping it in Lua for now, but we could also keep it internal at first and discuss exposing it (as node_has_ancestor()) if some other plugin shows a separate use for it.

clason · 2024-05-05T21:23:36Z

@vanaigr I've just bumped to tree-sitter 0.22.6 on master, so if you rebase, CI should pass.

amaanq · 2024-05-05T21:52:51Z

> I wonder if that is not something upstream would be interested in as well? @amaanq

Yeah has-ancestor is a pretty good candidate for a predicate for upstream to support - it'd be worth potentially opening a PR for Max's thoughts

clason · 2024-05-16T10:49:29Z

@vanaigr This PR needs one of two things:

1. bump the required tree-sitter version to 0.22.6, or
2. put this change behind a feature guard as in https://github.com/neovim/neovim/pull/22710/files

As we need to bump anyway for wasm parsers, I would prefer 1. for simplicity.

@lewis6991 @jamessan @justinmk ?

jamessan · 2024-05-16T11:10:31Z

Bumping the min version sounds good to me.


clason · 2024-05-16T13:36:05Z

Force-pushed. @vanaigr is there any reason not to squash these commits before merging?

vanaigr · 2024-05-16T14:54:32Z

> is there any reason not to squash these commits before merging?

Either way is fine for me.

clason · 2024-05-16T14:58:20Z

Squashed, then, with notes from the PR description added to the commit message. Thank you!

…eovim#28512)

Problem: `has-ancestor?` is O(n²) for the depth of the tree since it iterates over each of the node's ancestors (bottom-up), and each ancestor takes O(n) time. This happens because tree-sitter's nodes don't store their parent nodes, and the tree is searched (top-down) each time a new parent is requested.

Solution: Make use of new `ts_node_child_containing_descendant()` in tree-sitter v0.22.6 (which is now the minimum required version) to rewrite the `has-ancestor?` predicate in C to become O(n). For a sample file, decreases the time taken by `has-ancestor?` from 360ms to 6ms.

MangoIV · 2024-05-17T10:18:40Z

it appears the treesitter version has to be bumped :)



github-actions bot added the treesitter label Apr 26, 2024

zeertzjq added the performance issues reporting performance problems label Apr 26, 2024

clason added this to the 0.11 milestone Apr 26, 2024

justinmk reviewed Apr 26, 2024



justinmk reviewed Apr 26, 2024


vanaigr force-pushed the query-perf-has-ancestor branch 2 times, most recently from 11445a3 to 589d39a Compare May 6, 2024 00:16

vanaigr marked this pull request as ready for review May 6, 2024 00:16

github-actions bot requested review from bfredl, clason and lewis6991 May 6, 2024 00:19

clason added the ci:skip-news label May 16, 2024

github-actions bot requested a review from wookayin May 16, 2024 10:42

clason and others added 5 commits May 16, 2024 15:35

build(deps): require tree-sitter v0.22.6

7e5a698

required for `ts_node_child_containing_descendant()`

feat(treesitter): add TSNode:child_containing_descendant()

9c53ae5

docs(treesitter): add doc for child_containing_descendant()

910c5b9

perf(treesitter): speed up has_ancestor? predicate

2294f5f

perf(treesitter): rewrite has-ancestor? in C

2f89f59

clason force-pushed the query-perf-has-ancestor branch from 589d39a to 2f89f59 Compare May 16, 2024 13:35

clason merged commit 4b02916 into neovim:master May 16, 2024
31 checks passed

github-actions bot removed request for wookayin, bfredl, clason and lewis6991 May 16, 2024 14:58
