
Router creation performance degradation when using many routes in comparison to v3 #2132

Open
BobbieGoede opened this issue Feb 6, 2024 · 12 comments
Labels
enhancement New feature or request

Comments

@BobbieGoede

Reproduction

https://stackblitz.com/edit/github-afjnf8?file=src%2Frouter.ts

Steps to reproduce the bug

  • Run the project and test its performance using ab or siege, then compare the results.
    • I tested with the command siege -t 10s -c 100 "http://localhost:3000/" to see the number of requests served within 10s by 100 concurrent users.
  • Change the dirCount variable to adjust the number of routes generated/used (default 100, which generates 1101 routes), and test again.

Expected behavior

Expected the number of routes to have an impact on performance comparable to router v3. The performance difference is especially noticeable for SSR, as a new router instance is created on each request.

Actual behavior

Router creation doesn't scale as well as it did in v3 (presumably; I have no clean reproduction with the v3 router alone). Please see the result comparison in nuxt/nuxt#25612, which compares Nuxt 2 and Nuxt 3.

If you have a clean/plain SSR project using router v3, I would be happy to include a reproduction with a performance comparison between v3 and v4; I tried making one myself but had no luck.

Additional information

While debugging performance issues for @nuxtjs/i18n, I narrowed them down to the difference in performance between Nuxt 2 and Nuxt 3, specifically the performance impact of having a high number of routes. Related issue: nuxt/nuxt#25612

Due to the way i18n routing works (the same as in i18n for Nuxt 2), the routes are multiplied by the number of configured languages. Of course, I will look into alternative ways to make routing more efficient/performant when using i18n, but it's worth investigating why the performance of router creation has gotten worse and how we can improve it.

My first guess would be that insertMatcher doesn't scale very well; I couldn't find its equivalent in v3.
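For context on why a linear-scan insertMatcher would scale poorly, here is a hypothetical simplification (illustrative only, not vue-router's actual code): finding the slot in a sorted array costs O(n) per route, so inserting n routes costs O(n²) overall.

```typescript
// Hypothetical simplification of a linear-scan insertMatcher (illustrative,
// not vue-router's actual implementation).
interface Matcher {
  path: string;
  score: number; // higher score = higher matching priority
}

const matchers: Matcher[] = [];

function insertMatcher(matcher: Matcher): void {
  // Linear scan: walk until we find the first entry ranked below the new
  // matcher, then splice it in (splice itself is also O(n)).
  let i = 0;
  while (i < matchers.length && matchers[i].score >= matcher.score) i++;
  matchers.splice(i, 0, matcher);
}

// Insert a few routes out of priority order.
insertMatcher({ path: '/users/:id', score: 50 });
insertMatcher({ path: '/users/profile', score: 80 });
insertMatcher({ path: '/:catchAll(.*)', score: 10 });
// matchers is now ordered: /users/profile, /users/:id, /:catchAll(.*)
```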

@posva
Member

posva commented Feb 6, 2024

Thanks for the investigation.

It would be interesting to see an actual perf benchmark (without Nuxt). The difference in creation shouldn't be that big. That said, Vue Router 3 had no path sorting like v4 does. A benchmark would also allow generating a flame graph, but I think you are right with your guess of insertMatcher 😄

Initially, I designed the router to accept a custom matcher implementation (e.g. a trie) that would be more performant but with fewer features than v3. I never got the time to actually work on it, though.

I also thought of (but never implemented) a way to pre-compile the routes so the matcher can be created instantly. I think this could be an interesting feature, especially for Nuxt where this can be automated. Combined with the ability to pass a custom route matcher, this should work out pretty well!
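As a rough sketch of the pre-compilation idea (all names here are illustrative, and the parsing/scoring is a stand-in for vue-router's real tokensToParser work): the expensive parsing and sorting happens once at build time, and each request only revives the serialized result.

```typescript
// Hypothetical sketch of pre-compiled routes: do the expensive work once at
// build time, then revive it instantly for each SSR request.
interface CompiledRoute {
  path: string;
  score: number;        // precomputed ranking, so no sorting at runtime
  regexpSource: string; // regex source string, revived with `new RegExp`
}

// Build time: parse and score each path once, then sort once.
function compileRoutes(paths: string[]): CompiledRoute[] {
  return paths
    .map((path) => ({
      path,
      // Toy scoring: more segments rank higher, static paths beat params.
      score: path.split('/').filter(Boolean).length * 10 + (path.includes(':') ? 0 : 5),
      regexpSource: '^' + path.replace(/:[^/]+/g, '([^/]+)') + '$',
    }))
    .sort((a, b) => b.score - a.score);
}

// Request time: no parsing or sorting, just regex revival.
function createMatcherFromCompiled(compiled: CompiledRoute[]) {
  const entries = compiled.map((c) => ({ ...c, re: new RegExp(c.regexpSource) }));
  return (location: string) => entries.find((e) => e.re.test(location))?.path ?? null;
}

const compiled = compileRoutes(['/users/:id', '/users/new', '/about']);
const match = createMatcherFromCompiled(compiled);
```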

@BobbieGoede
Author

I was overthinking how to create a perf comparison 😅. I created one here: https://github.com/BobbieGoede/vue-router-perf-bench. It tries to compare both router versions by measuring the time it takes to create the router 100 times with X routes. Maybe it can be changed to make the comparison fairer, but I think it demonstrates the performance difference.

I got these results on my machine (I also checked the flame graph and didn't see insertMatcher for some reason):

| version/routes | 11 | 110 | 550 | 1100 | 2200 |
| --- | --- | --- | --- | --- | --- |
| RouterV3 | 6.23ms | 22.64ms | 100.24ms | 197.53ms | 403.65ms |
| RouterV4 | 9.95ms | 42.49ms | 652.33ms | 2616.43ms | 10295.62ms |

Custom matchers and/or pre-compilation definitely sound like good features to allow optimizing router/matcher creation. I'm not sure if I have the skills (and time) to add such functionality by myself but if I can help and contribute let me know 😄

@posva
Member

posva commented Feb 6, 2024

I checked on https://lahmatiy.github.io/cpupro/ with the cpuprofile file, and insertMatcher is indeed what takes the most time!

(Screenshot 2024-02-06 at 22:07:53: cpupro profile)

Feel free to give the topic some thought and try to raise a PR. No rush at all!

@BobbieGoede
Author

BobbieGoede commented Feb 8, 2024

I did some research into what it would take to make a custom matcher for the router, and looked into tries as well. A custom matcher using a prefix (or radix) trie could definitely improve performance; hopefully I'll have time to look into making a matcher that fully matches the functionality/features of the current one. I came across https://github.com/unjs/radix3; if I can't wrap it in a custom matcher, at least I will be able to use it as a reference.

Since route sorting seems to be the current performance bottleneck (and after seeing unjs/radix3's export method), I tried making a sort cache (#2136). I figured this would be easier to implement as a short-term performance improvement.

It definitely speeds up router creation, though I'm not entirely sure whether this implementation holds up in terms of sorting accuracy.
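The sort-cache idea can be sketched roughly like this (a hypothetical simplification, not the actual PR code, and the comparator is a stand-in for vue-router's real score comparison): when the same route set is used for every SSR request, the sorted order can be looked up instead of recomputed.

```typescript
// Hypothetical sort cache: reuse the sorted order for identical route sets,
// as happens when a new router is created per SSR request.
const sortCache = new Map<string, string[]>();

// Stand-in comparator (not vue-router's comparePathParserScore): rank paths
// with more segments first, break ties alphabetically.
function compareScore(a: string, b: string): number {
  return b.split('/').length - a.split('/').length || a.localeCompare(b);
}

function sortedPaths(paths: string[]): string[] {
  const key = paths.join('|');
  const cached = sortCache.get(key);
  if (cached) return cached; // cache hit: skip the sort entirely
  const result = [...paths].sort(compareScore);
  sortCache.set(key, result);
  return result;
}
```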

| version/routes | 11 | 110 | 550 | 1100 | 2200 |
| --- | --- | --- | --- | --- | --- |
| RouterV3 | 5.88ms | 21.90ms | 100.11ms | 200.38ms | 405.75ms |
| RouterV4 | 10.72ms | 40.35ms | 601.14ms | 2386.12ms | 9649.10ms |
| RouterV4Cached | 5.26ms | 32.52ms | 141.46ms | 275.60ms | 553.46ms |

While it does speed up router creation in general, it looks like matcher creation remains a general bottleneck (tokensToParser mostly), and after creation it looks like resolve takes up some time as well (https://lahmatiy.github.io/cpupro/ is very useful 😄). I will look into expanding my benchmark to also test how resolve scales with the number of routes.

Member

posva commented Feb 8, 2024

Using radix3 for a custom matcher would be nice. Note that it doesn't cover all features (as a trie, it can't).
It's really great to see that there is no performance issue anymore with the cache 🙂. Thanks a lot for taking the time to do it. I haven't been able to check the PR code yet, but hopefully I can give you some feedback soon.

tokensToParser should be quite fast already. The difference you see might be because router v3 caches path-to-regexp results while v4 doesn't cache the results of tokensToParser. When I tested (a while ago), it was faster than path-to-regexp and caching wasn't necessary, so I wouldn't spend too much time on that part 😄

@BobbieGoede
Author

tokensToParser should be quite fast already. ... When I tested (a while ago), it was faster than path-to-regexp and caching wasn't necessary, so I wouldn't spend too much time on that part 😄

You're right, it's fast 😅. Maybe the regex creation could be cached, but it's easy to get carried away when looking for optimizations. As long as we can get the performance similar to that of router v3, I'm happy 😄.

Using radix3 for a custom matcher would be nice. Note that it doesn't cover all features (as a trie, it can't).

From the perspective of i18n, a trie sounds appealing, since most if not all routes are prefixed with a language. I suppose Nuxt I18n could use a separate matcher per language to achieve something similar (quick access to prefixed routes) while keeping routing functionally the same.
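A minimal illustration of why a trie appeals for locale-prefixed routes (a hand-rolled sketch, not radix3 itself): the locale segment is consumed in one map lookup, so each language's subtree is reached without scanning all generated routes.

```typescript
// Minimal static prefix trie (illustrative sketch, not radix3). Each path
// segment is one map lookup, so the locale prefix narrows the search to that
// language's subtree immediately.
interface TrieNode {
  children: Map<string, TrieNode>;
  route?: string; // the route name stored at this node, if any
}

function createNode(): TrieNode {
  return { children: new Map() };
}

function insert(root: TrieNode, path: string, route: string): void {
  let node = root;
  for (const segment of path.split('/').filter(Boolean)) {
    if (!node.children.has(segment)) node.children.set(segment, createNode());
    node = node.children.get(segment)!;
  }
  node.route = route;
}

function lookup(root: TrieNode, path: string): string | undefined {
  let node: TrieNode = root;
  for (const segment of path.split('/').filter(Boolean)) {
    const next = node.children.get(segment);
    if (!next) return undefined; // dead end: no route under this prefix
    node = next;
  }
  return node.route;
}

// Locale-prefixed routes, as Nuxt I18n generates them:
const root = createNode();
insert(root, '/en/about', 'about___en');
insert(root, '/nl/about', 'about___nl');
```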

@skirtles-code
Contributor

I've attempted a different approach to speeding up insertMatcher in #2137.

It uses a binary search to determine the insertion point. It yields a slightly different sort order, but I think it will be functionally equivalent. It's quite possible that I've failed to consider some of the edge cases, but the existing unit tests all pass, so I'm optimistic it can be made to work. If nothing else it may help to identify some missing test cases.
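The binary-search idea can be sketched like this (hypothetical code, not the PR itself): since the matcher array is kept sorted, the insertion index takes O(log n) comparisons instead of a linear scan.

```typescript
// Sketch of binary-search insertion into an already-sorted array
// (illustrative, not the actual PR code).
function findInsertionIndex<T>(sorted: T[], item: T, compare: (a: T, b: T) => number): number {
  let lower = 0;
  let upper = sorted.length;
  while (lower < upper) {
    const mid = (lower + upper) >>> 1;
    // compare < 0 means `item` ranks before sorted[mid]
    if (compare(item, sorted[mid]) < 0) upper = mid;
    else lower = mid + 1;
  }
  return lower;
}

// Usage with descending scores, like matcher priorities:
const scores = [90, 70, 40, 10];
const byScoreDesc = (a: number, b: number) => b - a;
scores.splice(findInsertionIndex(scores, 55, byScoreDesc), 0, 55);
// scores is now [90, 70, 55, 40, 10]
```

Note that equal elements are inserted after their peers (the search keeps moving right on ties), which is what makes the resulting order slightly different from a full re-sort while staying functionally equivalent.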

@BobbieGoede
Author

@skirtles-code
Your approach is much better! This essentially makes use of the matchers already being sorted, right?

In my tests it performs better than my caching approach, which probably means my caching is more expensive than it should/could be. I still think it's interesting to explore caching, but it would probably need to cover more of the router matcher creation (not just the sorting) to be worthwhile after your optimization.

@skirtles-code
Contributor

This essentially makes use of matchers already being sorted right?

Correct.

In my tests it performs better than my caching approach

Same for me, but my PR isn't currently handling all the edge cases correctly. Fixing that will likely slow it down a bit, which may account for some of the difference we're seeing.

I still think it's interesting to explore caching

Agreed. Caching should be able to improve various aspects of the performance, not just the insertion.


I've just opened another PR, #2148, which explores some other ideas. I've done a long write-up on the PR, but roughly speaking it uses some of the ideas already discussed earlier in this thread (not including caching and binary search) to try to squeeze out extra performance.

@skirtles-code
Contributor

In #2166, I've explored the idea of making the matcher pluggable. This allows a custom matcher to be passed when creating the router. More importantly, it allows most of the work of creating a matcher to be done upfront, rather than creating it afresh for each request.

This is essentially a version of the caching/pre-compilation idea that has already been discussed.

#2166 will require more work, but it should be complete enough to explore the viability of the ideas and decide which way to proceed.
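Roughly, the pluggable/upfront-matcher idea looks like this (names like createMatcher and createAppRouter are illustrative, not vue-router's actual API): the matcher is built once per process and shared, so per-request router creation becomes cheap.

```typescript
// Illustrative sketch of a pluggable, precomputed matcher (hypothetical
// names, not vue-router's real API).
interface RouteMatcher {
  resolve(path: string): string | undefined; // returns a route name
}

// All the expensive work happens here, once per process instead of per request.
function createMatcher(routes: Array<{ path: string; name: string }>): RouteMatcher {
  const table = new Map(routes.map((r): [string, string] => [r.path, r.name]));
  return { resolve: (path) => table.get(path) };
}

// Built once and shared across SSR requests:
const sharedMatcher = createMatcher([
  { path: '/', name: 'home' },
  { path: '/about', name: 'about' },
]);

// Per-request router creation is now trivially cheap: it just receives the
// already-built matcher instead of constructing one.
function createAppRouter(matcher: RouteMatcher) {
  let current: string | undefined;
  return {
    push(path: string) { current = matcher.resolve(path); },
    current: () => current,
  };
}
```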

@meteorlxy
Member

We also found that vue-router could be a perf bottleneck when the number of routes is large. We'll keep an eye on this issue 👀 cc @Mister-Hope

@FlorianWerndl

Is this topic still being worked on? I've experienced performance issues in a multi-language application with about 1100 routes as well.

Projects
Status: In progress

5 participants