This repository has been archived by the owner on Oct 16, 2022. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 6
scanRanges is extremely slow on dense match input #14
Comments
This might be a good thing for someone to look into for #hacktoberfest |
I'd be surprised if anyone seriously participates in that, seems like it's more of a spam-fest.. |
That's an interesting article, and it's probably true for much of the github ecosystem, but I have a feeling Haskell repos are likely shielded at least a bit from poor quality submissions (Haskell doesn't even show up in the list on the main site). A quick glance over closed hacktoberfest issues in Haskell shows a surprising amount of legitimate work happening 😄 Granted, this isn't so much of a "quick win" |
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Referenced in ChrisPenner/lens-regex-pcre#10 with benchmarks here: https://github.com/jamesdbrock/replace-benchmark#results
We can see that pcre-heavy is having a bit of trouble with substitutions with LOTS of matches, I did a quick profile on it and the VAST majority of the time is spent inside
rawMatch
inscanRanges0
. On large strings; it clocks in at OVER 10 minutes, whereas the next longest time is just over one second.After a cursory look, it seems it is serially calling out to C for each consecutive match, and just iterating on that, there's also a lot of fiddly math going on, but nothing that seems to be the "obvious" culprit. Perhaps there's some unknown copying going on too, but I didn't dig deep enough to see.
I can dig around a bit more, but if anyone has suggestions or tips on a good way to speed this up, or even ideas of where the cost centers might be I'm open to ideas!
I understand that the benchmark in question represents a sort of "perfect storm" in terms of triggering bad performance, but it would be nice if there's some way we can get this into at least the same ballpark :)
The text was updated successfully, but these errors were encountered: