-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix whitespace in number parsing #3195
base: master
Are you sure you want to change the base?
Conversation
7671872
to
f6fdfee
Compare
Handle malformed and overflowing arguments. Also, forbid leading and trailing spaces to match the behavior of tonumber from commit ce0e788 (improve tonumber/0 performance by parsing input as number literal, 2024-03-02).
`jvp_strtod` skips leading whitespace, but its decnum counterpart `decNumberFromString` (called within `jv_number_with_literal`) does not. Those two are called interchangeably, so it leads to inconsistent behavior depending on whether the decnum feature is enabled. Additionally, `classify`, used in the token scanner, only considers [ \t\n\r] to be whitespace, but `jvp_strtod` consumes the larger set [ \t\n\v\f\r], so those extra characters are considered literals. Changing this deviates from the behavior of `strdod` from <stdlib.h> and is technically a breaking API change, since it is a public symbol.
f6fdfee
to
f1bfe95
Compare
["1", "2a", "3", " 4 ", "5.67", ".89", "-876", "+5.43", 21] | ||
[1, 3, 5.67, 0.89, -876, 5.43, 21] | ||
["1", "2a", "3", " 4", "5 ", "6.7", ".89", "-876", "+5.43", 21] | ||
[1, 3, 6.7, 0.89, -876, 5.43, 21] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there some tests testing the error case somewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's right here. The inputs " 4"
and "5 "
test the whitespace behavior (tonumber
fails, so they are not in the output array). The test is run with and without decnum. Before, it was " 4 "
, which masked the bug since it would always fail for leading whitespace and trailing whitespace wasn't tested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah doh, how did i miss that 👍
This is two separate number parsing bug fixes:
Do not skip leading whitespace in
jvp_strtod
jvp_strtod
skips leading whitespace, but its decnum counterpartdecNumberFromString
(called withinjv_number_with_literal
) does not. Those two are called interchangeably, so it leads to inconsistent behavior depending on whether the decnum feature is enabled. Additionally,classify
, used in the token scanner, only considers[ \t\n\r]
to be whitespace, butjvp_strtod
consumes the larger set[ \t\n\v\f\r]
, so those extra characters are considered literals.Commit ce0e788 (improve tonumber/0 performance by parsing input as number literal, 2024-03-02) changed
tonumber
to not accept leading or trailing whitespace. However, this case was missed.Changing this deviates from the behavior of
strdod
from <stdlib.h> and is technically a breaking API change, since it is a public symbol.Handle input errors for --indent
Handle malformed and overflowing arguments. Also, forbid leading and trailing spaces to match the behavior of
tonumber
.This will produce a merge conflict in conjunction with PR#3194 Parse short options in order given, because they modify the same area. However, they are logically independent and can be merged in either order. After the first merge, I will rebase the second onto the first.Done