perf(treesitter): use child_containing_descendant() in has-ancestor? #28512

Merged
merged 5 commits into from May 16, 2024

Conversation

vanaigr
Contributor

@vanaigr vanaigr commented Apr 26, 2024

Closes #24965.

has-ancestor? is O(n²) in the depth of the tree: it iterates over each of the node's ancestors (bottom-up), and looking up each ancestor takes O(n) time.
This happens because tree-sitter's nodes don't store their parent nodes, and the tree is searched (top-down) each time a new parent is requested.

ts_node_child_containing_descendant() matches how tree-sitter searches for a node's parent internally and makes has-ancestor? O(n).

The predicate is also rewritten in C to avoid allocations for each ancestor node and their type strings.

For the file in the issue, this decreases the time taken by has-ancestor? from 360ms to 6ms.
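
The complexity argument above can be illustrated with a toy model. The sketch below is not tree-sitter or Neovim code: `Node`, `child_containing`, `parent_of`, `build_chain` and the step counter are hypothetical names, and a simple chain of nested nodes stands in for a deeply nested if-else. It compares repeated parent lookups from the root (bottom-up) with a single walk down from the root (top-down).

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a syntax node: like tree-sitter's nodes, it stores its
   range and children but no parent pointer. All names are hypothetical. */
typedef struct Node {
  int start, end;         /* range [start, end) covered by the node */
  struct Node *children;  /* array of child_count children */
  size_t child_count;
} Node;

static size_t steps;      /* children inspected, to compare both strategies */

/* Returns the child of `node` whose range contains `desc`, or NULL if no
   child does or if that child is `desc` itself. */
static const Node *child_containing(const Node *node, const Node *desc)
{
  if (node == desc) {
    return NULL;
  }
  for (size_t i = 0; i < node->child_count; i++) {
    const Node *child = &node->children[i];
    steps++;
    if (child == desc) {
      return NULL;
    }
    if (child->start <= desc->start && desc->end <= child->end) {
      return child;
    }
  }
  return NULL;
}

/* Parent lookup without parent pointers: search down from the root.
   Assumes `desc` is in the tree under `root`. Costs O(depth) per call. */
static const Node *parent_of(const Node *root, const Node *desc)
{
  assert(root != NULL && desc != NULL);
  if (root == desc) {
    return NULL;
  }
  const Node *parent = root;
  const Node *next;
  while ((next = child_containing(parent, desc)) != NULL) {
    parent = next;
  }
  return parent;
}

/* Bottom-up: one root-to-node search per ancestor, O(depth^2) in total. */
static size_t count_ancestors_bottom_up(const Node *root, const Node *desc)
{
  size_t n = 0;
  for (const Node *p = parent_of(root, desc); p != NULL; p = parent_of(root, p)) {
    n++;
  }
  return n;
}

/* Top-down: a single walk from the root towards `desc`, O(depth) in total. */
static size_t count_ancestors_top_down(const Node *root, const Node *desc)
{
  size_t n = 0;
  if (root == desc) {
    return 0;
  }
  for (const Node *a = root; a != NULL; a = child_containing(a, desc)) {
    n++;
  }
  return n;
}

#define DEPTH 100
static Node chain[DEPTH];

/* Builds a chain where node i is the only child of node i - 1. */
static void build_chain(void)
{
  for (int i = 0; i < DEPTH; i++) {
    chain[i].start = i;
    chain[i].end = 2 * DEPTH - i;
    chain[i].children = (i + 1 < DEPTH) ? &chain[i + 1] : NULL;
    chain[i].child_count = (i + 1 < DEPTH) ? 1 : 0;
  }
}
```

In this model, calling `build_chain()` and counting the ancestors of `&chain[DEPTH - 1]` gives the same answer either way, but the bottom-up variant inspects on the order of DEPTH²/2 children while the top-down variant inspects about DEPTH.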

@vanaigr
Contributor Author

vanaigr commented Apr 26, 2024

ts_node_child_containing_descendant() is in tree-sitter's master, but not yet available in any release.

@zeertzjq zeertzjq added the performance label (issues reporting performance problems) Apr 26, 2024
@clason clason added this to the 0.11 milestone Apr 26, 2024
@clason
Member

clason commented Apr 26, 2024

Obviously we need to bump the tree-sitter dependency in deps.txt and the minimal version. I don't think that will happen before the 0.10 release, but we can bump to a prerelease commit early in the 0.11 cycle. The new function(ality) will have to be hidden behind an #ifdef guard, similar to how we did it for the match limit, before we bump to a release.

@@ -725,6 +725,7 @@ static struct luaL_Reg node_meta[] = {
{ "descendant_for_range", node_descendant_for_range },
{ "named_descendant_for_range", node_named_descendant_for_range },
{ "parent", node_parent },
{ "child_containing_descendant", node_child_containing_descendant },
Member

is TSNode intended to directly mirror the TS api? or can we choose better names?

@vanaigr
Contributor Author

vanaigr commented May 2, 2024

The ancestor checking part of the predicate can be moved into C, which would reduce the time down to 1.5ms from 7.5ms (measured on a different computer).

  ['has-ancestor?'] = function(match, _, _, predicate)
    local nodes = match[predicate[2]]
    if not nodes or #nodes == 0 then
      return true
    end

    for _, node in ipairs(nodes) do
      if node:__has_ancestor(predicate) then
        return true
      end
    end
    return false
  end,

static int __has_ancestor(lua_State *L)
{
  TSNode descendant = node_check(L, 1);
  if(lua_type(L, 2) != LUA_TTABLE) {
    lua_pushboolean(L, false);
    return 1;
  }
  int const pred_len = lua_objlen(L, 2);

  TSNode node = ts_tree_root_node(descendant.tree);
  while(!ts_node_is_null(node)) {
    char const *node_type = ts_node_type(node);
    size_t node_type_len = strlen(node_type);

    for (int i = 3; i <= pred_len; i++) {
      lua_rawgeti(L, 2, i);
      if (lua_type(L, -1) == LUA_TSTRING) {
        size_t check_len;
        char const *check_str = lua_tolstring(L, -1, &check_len);
        if(node_type_len == check_len && memcmp(node_type, check_str, check_len) == 0) {
          lua_pushboolean(L, true);
          return 1;
        }
      }
      lua_pop(L, 1);
    }

    node = ts_node_child_containing_descendant(node, descendant);
  }

  lua_pushboolean(L, false);
  return 1;
}

Is this overkill? If not, how should I name the function?

@clason
Member

clason commented May 3, 2024

I wonder if that is not something upstream would be interested in as well? @amaanq

@lewis6991
Member

> I wonder if that is not something upstream would be interested in as well? @amaanq

Upstream doesn't use the Lua API, so this would need to be significantly rewritten to use only TS structs/types.

I think this is fine as it is. Whether it's done in C or Lua doesn't matter too much IMO.

@clason
Member

clason commented May 3, 2024

The only worry here is that we're injecting our own API functions into the upstream tree-sitter API; that may lead to confusion. But maybe it's worth it? We're already not exposing the API exactly (e.g., named_descendant_for_range is not a tree-sitter API function).

I'd be fine with keeping it in Lua for now, but we could also keep it internal at first and discuss exposing it (as node_has_ancestor()) if some other plugin shows a separate use for it.

@clason
Member

clason commented May 5, 2024

@vanaigr I've just bumped to tree-sitter 0.22.6 on master, so if you rebase, CI should pass.

@amaanq
Contributor

amaanq commented May 5, 2024

> I wonder if that is not something upstream would be interested in as well? @amaanq

Yeah, has-ancestor? is a pretty good candidate for a predicate for upstream to support. It might be worth opening a PR to get Max's thoughts.

@vanaigr vanaigr force-pushed the query-perf-has-ancestor branch 2 times, most recently from 11445a3 to 589d39a Compare May 6, 2024 00:16
@vanaigr vanaigr marked this pull request as ready for review May 6, 2024 00:16
@github-actions github-actions bot requested a review from wookayin May 16, 2024 10:42
@clason
Member

clason commented May 16, 2024

@vanaigr This PR needs one of two things:

  1. bump the required tree-sitter version to 0.22.6, or
  2. put this change behind a feature guard as in https://github.com/neovim/neovim/pull/22710/files

As we need to bump anyway for wasm parsers, I would prefer 1. for simplicity.

@lewis6991 @jamessan @justinmk ?

@jamessan
Member

Bumping the min version sounds good to me.

@clason
Member

clason commented May 16, 2024

Force-pushed. @vanaigr is there any reason not to squash these commits before merging?

@vanaigr
Contributor Author

vanaigr commented May 16, 2024

> is there any reason not to squash these commits before merging?

Either way is fine for me.

@clason clason merged commit 4b02916 into neovim:master May 16, 2024
31 checks passed
@clason
Member

clason commented May 16, 2024

Squashed, then, with notes from the PR description added to the commit message. Thank you!

altermo pushed a commit to altermo/neovim-fork that referenced this pull request May 16, 2024
…eovim#28512)

Problem: `has-ancestor?` is O(n²) for the depth of the tree since it iterates over each of the node's ancestors (bottom-up), and each ancestor takes O(n) time.
This happens because tree-sitter's nodes don't store their parent nodes, and the tree is searched (top-down) each time a new parent is requested.

Solution: Make use of new `ts_node_child_containing_descendant()` in tree-sitter v0.22.6 (which is now the minimum required version) to rewrite the `has-ancestor?` predicate in C to become O(n).

For a sample file, decreases the time taken by `has-ancestor?` from 360ms to 6ms.
@MangoIV

This comment was marked as off-topic.

@MangoIV
Contributor

MangoIV commented May 17, 2024

It appears the tree-sitter version has to be bumped :)

icholy pushed a commit to icholy/neovim that referenced this pull request May 17, 2024
…eovim#28512)

Problem: `has-ancestor?` is O(n²) for the depth of the tree since it iterates over each of the node's ancestors (bottom-up), and each ancestor takes O(n) time.
This happens because tree-sitter's nodes don't store their parent nodes, and the tree is searched (top-down) each time a new parent is requested.

Solution: Make use of new `ts_node_child_containing_descendant()` in tree-sitter v0.22.6 (which is now the minimum required version) to rewrite the `has-ancestor?` predicate in C to become O(n).

For a sample file, decreases the time taken by `has-ancestor?` from 360ms to 6ms.
dccampbell added a commit to dccampbell/neovim that referenced this pull request May 20, 2024
* version bump

* docs: news neovim#28773

* perf(treesitter): use child_containing_descendant() in has-ancestor? (neovim#28512)

Problem: `has-ancestor?` is O(n²) for the depth of the tree since it iterates over each of the node's ancestors (bottom-up), and each ancestor takes O(n) time.
This happens because tree-sitter's nodes don't store their parent nodes, and the tree is searched (top-down) each time a new parent is requested.

Solution: Make use of new `ts_node_child_containing_descendant()` in tree-sitter v0.22.6 (which is now the minimum required version) to rewrite the `has-ancestor?` predicate in C to become O(n).

For a sample file, decreases the time taken by `has-ancestor?` from 360ms to 6ms.

* feat: remove deprecated features

Remove following functions:
- vim.lsp.util.extract_completion_items
- vim.lsp.util.get_progress_messages
- vim.lsp.util.parse_snippet()
- vim.lsp.util.text_document_completion_list_to_complete_items
- LanguageTree:for_each_child
- health#report_error
- health#report_info
- health#report_ok
- health#report_start
- health#report_warn
- vim.health.report_error
- vim.health.report_info
- vim.health.report_ok
- vim.health.report_start
- vim.health.report_warn

* fix(version): fix vim.version().prerelease

fixes neovim#28782 (when backported)

* fix: extend the life of vim.tbl_flatten to 0.13

`vim.iter(t):flatten():totable()` doesn't handle nil so isn't a good
enough replacement.

* docs(gen_help_html.lua): handle modeline and note nodes

Problem:

'modeline' and 'note' are unhandled in the online HTML documentation.

Some (not all) modelines are parsed by the vimdoc parser as a node of
type 'modeline'.

Solution:

- Ignore 'modeline' in HTML rendering.
- Render 'note' text in boldface.

* fix(health): broken ruby detect neovim#28804

* fix(path): avoid chdir() when resolving path (neovim#28799)

Use uv_fs_realpath() instead.

It seems that uv_fs_realpath() has some problems on non-Linux platforms:
- macOS and other BSDs: this function will fail with UV_ELOOP if more
  than 32 symlinks are found while resolving the given path.  This limit
  is hardcoded and cannot be sidestepped.
- Windows: while this function works in the common case, there are a
  number of corner cases where it doesn't:
  - Paths in ramdisk volumes created by tools which sidestep the Volume
    Manager (such as ImDisk) cannot be resolved.
  - Inconsistent casing when using drive letters.
  - Resolved path bypasses subst'd drives.

Ref: https://docs.libuv.org/en/v1.x/fs.html#c.uv_fs_realpath

I don't know if the old implementation that uses uv_chdir() and uv_cwd()
also suffers from the same problems.
- For the ELOOP case, chdir() seems to have the same limitations.
- On Windows, Vim doesn't use anything like chdir() either. It uses
  _wfullpath(), while libuv uses GetFinalPathNameByHandleW().

* feat(api): broadcast events to ALL channels neovim#28487

Problem:
`vim.rpcnotify(0)` and `rpcnotify(0)` are documented as follows:

    If {channel} is 0, the event is broadcast to all channels.

But that's not actually true. Channels must call `nvim_subscribe` to
receive "broadcast" events, so it's actually "multicast".

- Assuming there is a use-case for "broadcast", the current model adds
  an extra step for broadcasting: all channels need to "subscribe".
- The presence of `nvim_subscribe` is a source of confusion for users,
  because its name implies something more generally useful than what it
  does.

Presumably the use-case of `nvim_subscribe` is to avoid "noise" on RPC
channels not expecting a broadcast notification, and potentially an error
if the channel client reports an unknown event.

Solution:
- Deprecate `nvim_subscribe`/`nvim_unsubscribe`.
  - If applications want to multicast, they can keep their own multicast
    list. Or they can use `nvim_list_chans()` and `nvim_get_chan_info()`
    to enumerate and filter the clients they want to target.
- Always send "broadcast" events to ALL channels. Don't require channels
  to "subscribe" to receive broadcasts. This matches the documented
  behavior of `rpcnotify()`.

* vim-patch:9.1.0414: Unable to leave long line with 'smoothscroll' and 'scrolloff'

Problem:  Unable to leave long line with 'smoothscroll' and 'scrolloff'.
          Corrupted screen near the end of a long line with 'scrolloff'.
          (Ernie Rael, after 9.1.0280)
Solution: Only correct cursor in case scroll_cursor_bot() was not itself
          called to make the cursor visible. Avoid adjusting for
          'scrolloff' beyond the text line height (Luuk van Baal)

vim/vim@b32055e

vim-patch:9.1.0416: some screen dump tests can be improved

Problem:  some screen dump tests can be improved (after 9.1.0414)
Solution: Make sure screen state changes properly and is captured in the
          screen dumps (Luuk van Baal)

vim/vim@2e64273

* fix(vim.iter): enable optimizations for arrays (lists with holes) (neovim#28781)

The optimizations that vim.iter uses for array-like tables don't require
that the source table has no holes. The only thing that needs to change
is the determination if a table is "list-like": rather than requiring
consecutive, integer keys, we can simply test for (positive) integer
keys only, and remove any holes in the original array when we make a
copy for the iterator.

* ci: change label `backport` to `target:release`

`backport` is too similar to `ci:backport release-x.y` and causes
confusion.

* fix(move): half-page scrolling with resized grid at eob (neovim#28821)

* vim-patch:9.1.0418: Cannot move to previous/next rare word (neovim#28822)

Problem:  Cannot move to previous/next rare word
          (Colin Kennedy)
Solution: Add the ]r and [r motions (Christ van Willegen)

fixes: vim/vim#14773
closes: vim/vim#14780

vim/vim@8e4c4c7

Co-authored-by: Christ van Willegen - van Noort <[email protected]>

* vim-patch:cf78d0df51f2

runtime(sshdconfig): add basic ftplugin file for sshdconfig (vim/vim#14790)

vim/vim@cf78d0d

Co-authored-by: Yinzuo Jiang <[email protected]>

* vim-patch:94043780196c (neovim#28831)

runtime(matchparen): fix :NoMatchParen not working (vim/vim#14797)

fixes: neovim#28828

vim/vim@9404378

* refactor(path.c): add nonnull attributes (neovim#28829)

This possibly fixes the coverity warning.

* refactor!: remove `nvim` and `provider` module for checkhealth

The namespacing for healthchecks for neovim modules is inconsistent and
confusing. The completion for `:checkhealth` with `--clean` gives

```
nvim
provider.clipboard
provider.node
provider.perl
provider.python
provider.ruby
vim.lsp
vim.treesitter
```

There are now three top-level module names for nvim: `nvim`, `provider`
and `vim` with no signs of stopping. The `nvim` name is especially
confusing as it does not contain all neovim checkhealths, which makes it
almost a decoy healthcheck.

The confusion only worsens if you add plugins to the mix:

```
lazy
mason
nvim
nvim-treesitter
provider.clipboard
provider.node
provider.perl
provider.python
provider.ruby
telescope
vim.lsp
vim.treesitter
```

Another problem with the current approach is that it's not easy to run
nvim-only healthchecks since they don't share the same namespace. The
current approach would be to run `:che nvim vim.* provider.*` and would
also require the user to know these are the neovim modules.

Instead, use this alternative structure:

```
vim.health
vim.lsp
vim.provider.clipboard
vim.provider.node
vim.provider.perl
vim.provider.python
vim.provider.ruby
vim.treesitter
```

and

```
lazy
mason
nvim-treesitter
telescope
vim.health
vim.lsp
vim.provider.clipboard
vim.provider.node
vim.provider.perl
vim.provider.python
vim.provider.ruby
vim.treesitter
```

Now, the entries are properly sorted and running nvim-only healthchecks
requires running only `:che vim.*`.

* fix(diagnostic): show backtrace for deprecation warnings

Problem: On nvim 11.0-dev, deprecation warnings due to the use of
hard-deprecated APIs such as:
- `vim.diagnostic.disable()`
- `vim.diagnostic.is_disabled()`
etc. are not accompanied by backtrace information. This makes it difficult
for users to figure out which lines or which plugins are still using
deprecated APIs.

Solution: use `backtrace = true` in vim.deprecate() call.

* vim-patch:df859a36d390

runtime(sql): set commentstring for sql files in ftplugin

closes: vim/vim#14800

vim/vim@df859a3

Co-authored-by: Riley Bruins <[email protected]>

* vim-patch:36e974fdf3f5

runtime(graphql): basic ftplugin file for graphql

closes: vim/vim#14801

vim/vim@36e974f

Co-authored-by: Riley Bruins <[email protected]>

* vim-patch:4d7892bfb1db

runtime(dart): add basic dart ftplugin file

fixes vim/vim#14793
closes vim/vim#14802

vim/vim@4d7892b

Co-authored-by: Riley Bruins <[email protected]>

* vim-patch:9.1.0421: filetype: hyprlang files are not recognized

Problem:  filetype: hyprlang files are not recognized
Solution: recognize 'hypr{land,paper,idle,lock}.conf' files
          as 'hyprlang' filetype, add hyprlang ftplugin
          (Riley Bruins)

closes: vim/vim#14803

vim/vim@5f1b115

Co-authored-by: Riley Bruins <[email protected]>

* Update CMakeLists.txt

* Create health.lua

---------

Co-authored-by: Justin M. Keyes <[email protected]>
Co-authored-by: vanaigr <[email protected]>
Co-authored-by: dundargoc <[email protected]>
Co-authored-by: bfredl <[email protected]>
Co-authored-by: Lewis Russell <[email protected]>
Co-authored-by: Jongwook Choi <[email protected]>
Co-authored-by: MoonFruit <[email protected]>
Co-authored-by: zeertzjq <[email protected]>
Co-authored-by: Luuk van Baal <[email protected]>
Co-authored-by: Gregory Anders <[email protected]>
Co-authored-by: Christ van Willegen - van Noort <[email protected]>
Co-authored-by: Christian Clason <[email protected]>
Co-authored-by: Yinzuo Jiang <[email protected]>
Co-authored-by: Riley Bruins <[email protected]>
Labels
ci:skip-news, performance (issues reporting performance problems), treesitter
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Poor performance with Treesitter highlighting on deep if-else in c
8 participants