Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix issue with bad characters in file paths #3386

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

tobyjwebb
Copy link
Contributor

@tobyjwebb tobyjwebb commented Mar 31, 2024

This PR fixes an issue where the app would be left in an inconsistent state when a filename in the repo is present with strange characters in it (for example "\n").

To reproduce the issue, run the following command from a bash terminal inside a repo that is being watched by GitButler:

touch oh_no$'\n'.txt

Should file names contain \n in them? Probably not. Can they? It would seem so. If they can, it would be better that the app doesn't error out.

Note that I haven't tested all the code that this PR modifies, and there may be some path.display() which needs replacing - the places I searched for were:

  • Any struct which is Serialized with serde and contains a PathBuf
  • Most places where file_path.display() where being used were changed

@@ -215,7 +215,7 @@ export class RemoteFile {
}

get filename(): string {
return this.path.replace(/^.*[\\/]/, '');
return this.path.replace(/^.*[/]/, '');
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that elsewhere in the code, to obtain the filename, the path is being split by "/", and the last item taken - this change is consistent with that - I haven't tested this on Windows.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like / is used as a separator on Windows also (at least the paths that are passed to this part of the code):

image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for the handling in the frontend, we should definitely be checking which platform we're running on and switching the regex based off that, not assuming. That's really the only 'proper' solution here since any character except for / and \0 is a valid pathname character on *nix.

@Qix- FYI :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, makes sense. Would it be possible if we go one step further and add a Tauri function that returns the path separator for the platform, and incorporate that into the path splitting function?

I take it you are referring to the change I introduced in this line (if not, please correct me 😄 ):
image
As far as I can tell from testing on Windows, \ is not used as a path separator in the UI. (If you check my screenshot from above, which was taken on a Windows VM, the file path that gets logged to the console is Sub/Folder/With space/hello world.txt)

If I were to add a patch to split by \ on Windows and on / on Linux, it would actually break things. Windows would start considering whole file paths a single file name.

Also note that there are more places already in the frontend codebase that use / for path separator, which do not differentiate by OS:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Qix- (sorry, forgot to mention you)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right but they're not really correct; the app was written with only MacOS and Linux in mind, we're in the process of making it cross platform. We should be doing the right thing here and doing proper path parsing that is platform-aware, instead of standardizing on forward slashes.

The biggest reason why is that there might be paths coming from the frontend (user input, etc.) that we need to parse correctly regardless of the separator(s) supported. On windows it's [\\/] and on *nix it's /, I don't really see any other way around it than to do a platform check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we're in the process of making it cross platform

Aaaah, right, I see now. I managed to run it on a Windows VM without WSL (I can't test with WSL because of virtualization), and it seemed to work pretty well. Occasionally I had to right-click and refresh, and it was a bit sluggish (I think that may be related to the VM, though, it took forever to compile).

that there might be paths coming from the frontend (user input, etc.)

Has anything like this already been implemented? I personally can't foresee having to input a path manually in a text box or something similar... (But then again, I don't know what the plan is for GitButler.)

On windows it's [\\/] and on *nix it's /, I don't really see any other way around it than to do a platform check.

My recommendation in this case would be to stick to the / separator - this is what is already working on Windows. Otherwise, you might have to rewrite a whole load of stuff which would be unneeded. So, when (if?) anyone enters a path manually in an input field, convert it to *nix format before passing it to the backend.

Anyway, if the need to use different path formats based on environment were required, I wouldn't add it in this PR, it's better to keep PRs small and related if possible (I only added that small change because it was actually fixing a visualization issue, it didn't break anything on Windows, and has small impact - it was only being used in two components). It would be better to change all the path handling logic in a separate PR all at once. However, I'm not 100% sure if this will be needed, TBH.

Adding support for detecting the OS from the frontend is trivial - the navigator.userAgent string can be queried:

image
image

Please let me know if there is an issue anywhere which requires OS detection so I can have a look into it (I can test on Linux and Windows inside VM without WSL).

@Qix-

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has anything like this already been implemented?

Yes, namely when setting up a new project.

My recommendation in this case would be to stick to the / separator - this is what is already working on Windows.

I'm less inclined to say this really solves the issue as IMO we should be matching the platform paths as closely as possible, despite what works in some shells (cmd.exe notably does not accept the / separator in all cases, and many Windows utilities do not either).

Adding support for detecting the OS from the frontend is trivial - the navigator.userAgent string can be queried:

Yes, that should be sufficient for now. If it ends up not being so then we can add a Tauri command for it later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, namely when setting up a new project.

That seems to be working fine in Windows and in Linux.

I'm less inclined to say this really solves the issue

What issue?

cmd.exe notably does not accept <...>

At any point in your code are you launching another process in cmd.exe and passing in paths?

@krlvi krlvi requested a review from Qix- April 1, 2024 16:17
@Qix-
Copy link
Contributor

Qix- commented Apr 1, 2024

I'm not fully understanding what the rust portions of this PR are fixing. Could you perhaps explain a bit why a custom serializer that wraps JSON's default Path formatting is necessary here?

As for the handling in the frontend, we should definitely be checking which platform we're running on and switching the regex based off that, not assuming. That's really the only 'proper' solution here since any character except for / and \0 is a valid pathname character on *nix.

@tobyjwebb
Copy link
Contributor Author

tobyjwebb commented Apr 1, 2024

@Qix- Sure:

The problem was that when the files in the repo contain \n characters (I imagine other characters could also be problematic, but \n was what was creating an issue for me), when the meta/ownership file is populated, we would end up with something like this:

ohno
.txt:1-17-fcb6d34f0431c0b8aff8066dc9a17cfd-1711990013953

Which would then cause the UI to break:

image

What this PR achieves is that the file path is escaped:

ohno\n.txt:1-17-fcb6d34f0431c0b8aff8066dc9a17cfd-1711990013953

and so the ownership lines can be properly parsed back and understood by the UI (and backend).

[edit] - lemme check how this behaves in Windows...

@Qix-
Copy link
Contributor

Qix- commented Apr 2, 2024

Thanks, makes sense. Would it be possible if we go one step further and add a Tauri function that returns the path separator for the platform, and incorporate that into the path splitting function?

@krlvi krlvi mentioned this pull request Apr 4, 2024
@Qix-
Copy link
Contributor

Qix- commented Apr 4, 2024

By the way, changes should probably be rebased on top of #3405 which is going to be merged here shortly to unblock some of the other changes we want to make this week. It's just a rename with minimal other changes so it should rebase cleanly.

Just wanted to give you a heads up.

@tobyjwebb
Copy link
Contributor Author

By the way, changes should probably be rebased on top of #3405 which is going to be merged here shortly to unblock some of the other changes we want to make this week. It's just a rename with minimal other changes so it should rebase cleanly.

Updated my branch (again).

@krlvi
Copy link
Member

krlvi commented Apr 16, 2024

@Qix- since the code moved quite a bit, would you mind doing the rebase / conflicts on this one? I can also attempt to do it tomorrow morning as I am catching up after my vacation

@Byron Byron force-pushed the Fix-issue-with-bad-characters-in-file-paths branch from 7450cdb to 84959e8 Compare April 25, 2024 13:52
Copy link
Collaborator

@Byron Byron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And it works again :)!

Screen.Recording.2024-04-25.at.15.53.41.mov

Something that came to mind is where the escaping happens. When communicating between backend and frontend, I'd think that JSON is good enough to handle this, while the frontend would probably be the one responsible for assuring it can display paths properly. Another part of me thinks that core::serde::hash_to_hex already knows that that binary hash digests should be in hex for the frontend, so it's not really an argument and it's probably OK for now that core types *when serializingusingserde` will have the frontend in mind.

Besides that, we'd only have to take care of the internal file formats which also are sensitive to newlines, so all file-paths need to be cleaned up. But there, we wouldn't necessarily have to use the json-based quoting, but for a lack of alternatives, it's probably totally fine.

@mtsgrd
Copy link
Contributor

mtsgrd commented Jun 4, 2024

Is this pr still relevant?

@Byron Byron added the feedback requested Feedback was requested to help resolve the issue label Jun 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feedback requested Feedback was requested to help resolve the issue
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants