Spaces in filename cause ReleaseEntry Regex to fail #1283

frederikolsen88 · 2018-04-11T06:37:28Z

I'm in a situation where Squirrel updating fails when iterating over one particular release entry

E6651B768DB82076E7113086558548B4DA4F8F6F Artifact - BoxHeader - e7ee795c-99e1-4ecb-a72c-9a41461d8198.xml.shasum 8669

using this Regex:

^([0-9a-fA-F]{40})\s+(\S+)\s+(\d+)[\r]*$

The culprit appears to be that (\S+) will only match non-whitespace characters. I've played a bit around with the Regex Tester at http://regexstorm.net/tester, and my suggestion would be to change the Regex to something similar to

^([0-9a-fA-F]{40})\s+(.*).shasum[\s+](\d+)[\r]*$

We don't really want to be forced to change the filenames or escape the spaces, so I'm interested in a solution where we can avoid that. Thanks in advance.

The text was updated successfully, but these errors were encountered:

frederikolsen88 · 2018-05-18T16:11:32Z

@paulcbetts, would you accept a PR incorporating the suggestion above, or would there be caveats to that approach?

anaisbetts · 2018-05-18T16:14:09Z

I'm confused, why is this file in your RELEASES file?

frederikolsen88 · 2018-05-18T16:50:49Z

Sorry, I realize now I was a bit unclear.

E6651B768DB82076E7113086558548B4DA4F8F6F Artifact - BoxHeader - e7ee795c-99e1-4ecb-a72c-9a41461d8198.xml.shasum 8669

are the contents of the .shasum corresponding with the file in question in one of our delta packages. It is when trying to update using a delta package that our users get an "Invalid release entry" exception. If there are changes to apply to a file with spaces in the filename, that is.

So while the RELEASES file isn't in play, I guess the error message is worded like that is because you're using the same Regex to parse both .shasum files and the RELEASES file.

JohnThomson · 2018-10-23T20:15:05Z

Having the same problem. Any time Squirrel decides to use a .diff (or .bsdiff) to represent the difference between old and new versions of a file, it generates a corresponding .shasum file along with the .diff. Here's an example of the shasum contents for one of our packages:

09E36653DEFCAFEDA55115AD81AC84E1B8D4A1FF Install Bloom Literacy Fonts.exe.shasum 12992

When applying the patch in DeltaPackage.applyDiffToFile, Squirrel calls VerifyPatchedFile(), which reads the .shasum file and passes the contents to ReleaseEntry.ParseReleaseEntry(), which fails to parse it because it can't match the spaces in the file name.

Squirrel then concludes that the patch is faulty, aborts the incremental update, and downloads the whole package, defeating the incremental update feature entirely.

I agree that Frederik's fix (or even just changing (\S+) to (.+) would fix the problem.

JohnThomson · 2018-10-23T20:19:02Z

I suppose to be sure there is some filename present, and that if there are multiple spaces at the end none of them are put in the filename group, (\S.*shasum) would be better.

JohnThomson · 2018-10-23T20:26:20Z

No, that won't work for the existing usages with .nupkg. We can however expect at least two non-space characters, since all usages involve either .shasum or .nupkg. (\S.*\S) should work, then. (At least, it passes all the existing ReleaseEntry unit tests.)

cbenard · 2018-10-23T20:31:45Z

Wouldn't (\w+)\s(.+)\s(\d+) be much simpler to match the hash at the beginning and the length at the end?

JohnThomson · 2018-10-23T20:40:23Z

(a) Paul's regex requires exactly 40 of the expected characters in the sha, a better validation.
(b) ^ and $ are needed to ensure it matches the whole file content for validation
(c) If there is more than one white-space character surrounding the file name, your version will include the extra white space in the filename, which is probably not good.
(d) Your version will match something with nothing but white space between the sha and the length, as long as there are at least three spaces
(e) A minimal change to fix the problem is usually wise if one is not fully familiar with a piece of code and all its purposes.

cbenard · 2018-10-23T20:44:10Z

I assumed we were talking about matching per-line. You're right about the 3 spaces thing. I didn't even see the original regex which is a bit more complicated than it needs to be.

It does seem like a pretty easy fix, especially if you know each line begins with 40 upper-case hex characters for the hash and ends with a set of numbers for length and has a filename in the middle delineated with spaces before/after. I'm surprised it's still open. I was subscribed to it for some reason, but I can't even remember if I had the issue.

cbenard · 2018-10-23T20:51:03Z

How about adding \S inside the filename group to ensure there's at least one non-space character and then allow multiple spaces before/after like this? Also, I made the filename match non-greedy so it won't match extra spaces between the filename and the hash/length.

^([0-9a-fA-F]{40})\s+(\S.+?)\s+(\d+)\r*$

I did a bunch of tests in The Regex Coach and it seems to work well, and it's very similar to the original.

JohnThomson · 2018-10-23T20:59:28Z

It will probably work for much the same reason as mine: filenames that will really occur here will have at least two characters. However, in the case of a single-character file name followed by two spaces, it will include one of the spaces in the filename. Using * after the dot would be even better.
Both your version and one with * instead also pass all the existing unit tests. But I think the * variation, which will allow a single character file name, is the best suggestion yet.

cbenard · 2018-10-23T21:07:51Z

\S.? means one non-space and zero or one single character. You probably meant .*? or something, like a non-greedy zero-to-any length match.

This one allows single character filenames, at the expense of requiring a single space before/after the filename: ^([0-9a-fA-F]{40})\s(\S[\S ]*?)\s(\d+)\r*$. It also stays closer to the original's definition of what a filename is (\S) plus spaces.

I don't have a dev env setup for this project, so I'm just trying to help with regex. :) I'll probably let this be my last comment on the matter.

JohnThomson · 2018-10-23T21:11:58Z

No, I tried to write openparen-backslash-S-dot-star-questionmark-closeparen and markdown interpreted it as a request for italics. So I edited the comment, but unfortunately you saw the original. Sorry.

cbenard · 2018-10-23T21:20:32Z

If you surround your "code" with backticks, you don't have to deal with the escapes. It'll also do a mono-space font with a gray background instead of italics like you're using.

Like this `some \code here` makes it look like: some \code here

JohnThomson · 2018-10-23T21:22:32Z

OK, so what I intended to suggest was (\S.*?), or in full ^([0-9a-fA-F]{40})\s+(\S.*?)\s+(\d+)\r*$
Thanks.

JohnThomson · 2018-10-23T21:27:39Z

Your idea of allowing only spaces, not other whitespace, is also a good one. Could still use \s+ before and after, counting on the ? to prevent a trailing space being included in the filename. So ^([0-9a-fA-F]{40})\s+(\S.[\S ]*?)\s+(\d+)\r*$ would result...which also passes unit tests.

cbenard · 2018-10-23T21:30:58Z

Yours has an error that makes it not work for single character filenames. You have an errant period inside the filename match after (\S. That requires at least 2 characters for a filename and the 2nd character can be anything, including whitespace like tabs/vertical tab/etc.

JohnThomson · 2018-10-23T21:38:35Z

Bother. A careless mistake. The version I actually unit-tested is ^([0-9a-fA-F]{40})\s+(\S[\S ]+?)\s+(\d+)[\r]*$

cbenard · 2018-10-23T21:53:29Z

That one doesn't match single character filenames either. Your [\S ]+ in the middle should be *, not +. If you use +, it requires at least one character, in addition to the \S which is another character to begin.

JohnThomson · 2018-10-24T13:26:02Z

Must have been really sleepy yesterday. ^([0-9a-fA-F]{40})\s+(\S[\S ]*?)\s+(\d+)[\r]*$.

cbenard · 2018-10-24T13:39:32Z

That one looks good. Consider it co-signed.

^([0-9a-fA-F]{40})\s+(\S[\S ]*?)\s+(\d+)[\r]*$

Now if we can only get @paulcbetts to look at this issue still open from April... :)

cbenard · 2018-10-24T14:22:16Z

Got the PR ready for you @paulcbetts after @JohnThomson and I worked through the regex. I hope you're able to accept the PR.

Thieum · 2019-05-03T14:15:26Z

@shiftkey duplicate of #713, this can be closed.

frederikolsen88 changed the title ~~Spaces in filename causes ReleaseEntry Regex to fail~~ Spaces in filename cause ReleaseEntry Regex to fail Apr 11, 2018

cbenard added a commit to cbenard/Squirrel.Windows that referenced this issue Oct 24, 2018

Fix Squirrel#1283 (Spaces in filename cause ReleaseEntry Regex to fail)

0019d35

cbenard added a commit to cbenard/Squirrel.Windows that referenced this issue Oct 24, 2018

Fix Squirrel#1283 (Spaces in filename cause ReleaseEntry Regex to fail)

d6b581e

cbenard mentioned this issue Oct 24, 2018

Fix #1283 (Spaces in filename cause ReleaseEntry Regex to fail) #1401

Open

frederikolsen88 mentioned this issue Jan 9, 2019

entryRegex throws Exception and update fails when filename contains spaces #1420

Closed

shiftkey closed this as completed May 4, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Spaces in filename cause ReleaseEntry Regex to fail #1283

Spaces in filename cause ReleaseEntry Regex to fail #1283

frederikolsen88 commented Apr 11, 2018 •

edited

frederikolsen88 commented May 18, 2018

anaisbetts commented May 18, 2018

frederikolsen88 commented May 18, 2018

JohnThomson commented Oct 23, 2018

JohnThomson commented Oct 23, 2018

JohnThomson commented Oct 23, 2018

cbenard commented Oct 23, 2018

JohnThomson commented Oct 23, 2018

cbenard commented Oct 23, 2018

cbenard commented Oct 23, 2018

JohnThomson commented Oct 23, 2018 •

edited

cbenard commented Oct 23, 2018

JohnThomson commented Oct 23, 2018

cbenard commented Oct 23, 2018

JohnThomson commented Oct 23, 2018

JohnThomson commented Oct 23, 2018

cbenard commented Oct 23, 2018

JohnThomson commented Oct 23, 2018

cbenard commented Oct 23, 2018

JohnThomson commented Oct 24, 2018

cbenard commented Oct 24, 2018

cbenard commented Oct 24, 2018

Thieum commented May 3, 2019

Spaces in filename cause ReleaseEntry Regex to fail #1283

Spaces in filename cause ReleaseEntry Regex to fail #1283

Comments

frederikolsen88 commented Apr 11, 2018 • edited

frederikolsen88 commented May 18, 2018

anaisbetts commented May 18, 2018

frederikolsen88 commented May 18, 2018

JohnThomson commented Oct 23, 2018

JohnThomson commented Oct 23, 2018

JohnThomson commented Oct 23, 2018

cbenard commented Oct 23, 2018

JohnThomson commented Oct 23, 2018

cbenard commented Oct 23, 2018

cbenard commented Oct 23, 2018

JohnThomson commented Oct 23, 2018 • edited

cbenard commented Oct 23, 2018

JohnThomson commented Oct 23, 2018

cbenard commented Oct 23, 2018

JohnThomson commented Oct 23, 2018

JohnThomson commented Oct 23, 2018

cbenard commented Oct 23, 2018

JohnThomson commented Oct 23, 2018

cbenard commented Oct 23, 2018

JohnThomson commented Oct 24, 2018

cbenard commented Oct 24, 2018

cbenard commented Oct 24, 2018

Thieum commented May 3, 2019

frederikolsen88 commented Apr 11, 2018 •

edited

JohnThomson commented Oct 23, 2018 •

edited