Faster browsing in Explorer (obtain file metadata from parent directories) #214

Open
wants to merge 1 commit into
base: master

mehrdadn

@mehrdadn mehrdadn commented Mar 18, 2018

The goal of this change is to read file modification times from the parent directory's entries instead of querying each file individually via GetFileModifyTime, which should again make viewing directories in Explorer faster.

However, there are caveats:

  • Reparse point & symbolic link handling might need a review, as I may have missed something. Any other edge cases where parent directory entries and direct queries give different results would probably need addressing as well.

  • This may use considerably more memory for repositories with large work trees, so it might make sense to make this an option that the user can turn on rather than enabling it by default. Ideally this wouldn't be necessary—higher-level changes would allow us to enumerate directories in batch and only keep the entries that we need—but addressing that might require some less trivial surgery in the code.

  • I'm not entirely sure exactly when the cache should be invalidated, or whether there are any race conditions or threading issues that I'm unaware of, but I tried to be "safe" about these by clearing the cache at some point, just to make sure entries don't get too stale.

@csware
Member

csware commented Mar 18, 2018

Please have a look at how TGitCache does it. There we fetch a list with all the data and then check the files against that list. I don't think that we should have such a big cache inside explorer.exe; that's why one should use TGitCache.

@mehrdadn
Author

Right, I did take a quick glance, actually. I probably should have mentioned this, but I've found TGitCache to be unforgivably slow, so I've given up on it entirely. Even deleting a directory, which normally proceeds at 2000+ items/second with Shell Extended, slows down to < 80 entries/second with TGitCache (and dare I say I have seen it drop as low as 8 entries/second, but I can't reproduce that now). It seemed harder to debug, but maybe I'll look at that when I get the chance.

There's also another difference (which may be a good thing or a bad thing depending on how you look at it), which is that TGitCache is an actual (persistent) cache. I don't quite yet understand when exactly TGitCache invalidates its entries, but it seems to me that either it must serve stale results, or somehow monitor for changes (and I would not be surprised if the latter is the cause of the slowness above). By contrast, despite my naming here, what I have here isn't quite a cache—it's more like a batching mechanism, and it shouldn't really go stale at all, though this would probably need vouching from a reviewer because I don't know the codebase well enough to be very confident about it.

Signed-off-by: Mehrdad