Escaped whitespace is treated as whitespace, rather than literals #284

Open
Xophmeister opened this issue Jan 9, 2025 · 1 comment

@Xophmeister

The grammar, as of d1a1a3f, swallows any escaped horizontal whitespace, treating it as actual whitespace rather than as a literal. For example:

# This should output:
# <x>
# < >
# <	>
# <x>
printf "<%s>\n" x \  \	 x

The parse tree for the above looks like this:

program [0, 0] - [1, 0]
  command [0, 0] - [0, 25]
    name: command_name [0, 0] - [0, 6]
      word [0, 0] - [0, 6]
    argument: string [0, 7] - [0, 15]
      string_content [0, 8] - [0, 14]
    argument: word [0, 16] - [0, 17]
    argument: word [0, 24] - [0, 25]

We see only three arguments -- the printf format string and the two xs -- rather than five: the escaped space and the escaped tab between the xs are missing.
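
For comparison, here is a minimal sketch of how the shell itself splits a similar line. `show_args` is a hypothetical helper written only for this illustration, and the sketch uses two escaped spaces rather than a space and a tab, so that it contains no literal tab character. An escaped tab is quoted in the same way.

```shell
# show_args is a hypothetical helper used only for this illustration.
# It prints the number of arguments, then each argument in angle brackets.
show_args() {
  printf '%d arguments\n' "$#"
  printf '<%s>\n' "$@"
}

# Two escaped spaces, each separated from its neighbours by an
# unescaped space. Each escaped space is its own one-character word.
show_args x \  \  x
```

The shell reports four arguments here, so a parse tree for the equivalent command should contain an argument node for each escaped space as well as for the two xs.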

Whitespace literals are semantically different from whitespace used in tokenisation. For example:

echo \ # this is not a comment
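
The difference can be seen by counting arguments in the shell. This is a rough sketch, with `count_args` being a hypothetical helper that is not part of the grammar or its tests. In the first call the escaped space starts a word, so the `#` lands in the middle of that word and cannot begin a comment. In the second call the `#` starts a token, so the rest of the line is discarded.

```shell
# count_args is a hypothetical helper used only for this illustration.
count_args() {
  printf '%d\n' "$#"
}

# The escaped space starts a word, so the "#" that follows is part of
# that word and the rest of the line is five more arguments.
count_args \ # this is not a comment

# With an unescaped space, "#" begins a token and starts a comment.
count_args # this is a comment
```

The first call prints 6 and the second prints 0, which is the distinction the grammar loses when it treats the escaped space as a separator.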
@Xophmeister
Author

Note: This only seems to occur when the escaped whitespace literal is at the beginning of a token. This parses correctly, for example:

# This should output:
# <a b>
# <c >
printf "<%s>\n" a\ b c\ 
