Get the value of a string literal #101

Closed
verhovsky opened this issue Sep 18, 2021 · 2 comments

Comments

@verhovsky
Collaborator

If I try this

import Parser from 'web-tree-sitter';

await Parser.init();
const Bash = await Parser.Language.load('./tree-sitter-bash.wasm');
const parser = new Parser();
parser.setLanguage(Bash);

const result = parser.parse("echo 'hello world'");
console.log(result.rootNode.toString())
console.log(result.rootNode.firstChild.children.map(x => x.text));

I see

(program (command name: (command_name (word)) argument: (raw_string)))
[ 'echo', "'hello world'" ]

.text returns "'hello world'": the string still has its single quotes. I would like a way to get the actual value of the text, i.e. "hello world" from "'hello world'". This is a trivial example, but a solution also needs to handle escaped newlines and quotes (and probably other things I haven't thought of):

// returns
// [ 'echo', `'hello \\\nwor\\"ld'` ]
// instead of 
// [ 'echo', `'hello wor"ld'` ]
const result = parser.parse("echo 'hello \\\nwor\\\"ld'");

and the parsing logic for ANSI-C strings is even more complicated.
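
A minimal sketch of what such decoding could look like when done on top of .text, outside the grammar. The helper names rawStringValue and ansiCStringValue are hypothetical and not part of web-tree-sitter or tree-sitter-bash. It assumes Bash's own quoting rules: nothing is escaped inside single quotes, so backslashes there are kept literally, and only a subset of the $'...' escapes is decoded (\u, \U and \c are left untouched).

```javascript
// Sketch only: these helpers are hypothetical, not a web-tree-sitter API.

// Value of a single-quoted Bash string such as 'hello world'.
// Bash does no escape processing inside single quotes, so the value is
// exactly the text between the two quotes (backslashes included).
function rawStringValue(text) {
  if (text.length < 2 || !text.startsWith("'") || !text.endsWith("'")) {
    throw new Error(`not a single-quoted string: ${text}`);
  }
  return text.slice(1, -1);
}

// Single-character escapes handled inside $'...' (a subset of Bash's list).
const SIMPLE_ESCAPES = new Map([
  ['a', '\x07'], ['b', '\b'], ['e', '\x1b'], ['E', '\x1b'], ['f', '\f'],
  ['n', '\n'], ['r', '\r'], ['t', '\t'], ['v', '\v'],
  ['\\', '\\'], ["'", "'"], ['"', '"'], ['?', '?'],
]);

// Value of an ANSI-C quoted string such as $'a\tb'.
// Assumption: unknown escapes are kept as written, backslash included.
function ansiCStringValue(text) {
  if (text.length < 3 || !text.startsWith("$'") || !text.endsWith("'")) {
    throw new Error(`not an ANSI-C quoted string: ${text}`);
  }
  const body = text.slice(2, -1);
  return body.replace(/\\(x[0-9a-fA-F]{1,2}|[0-7]{1,3}|[\s\S])/g, (match, esc) => {
    if (esc[0] === 'x' && esc.length > 1) {
      // \xHH: one or two hex digits
      return String.fromCharCode(parseInt(esc.slice(1), 16));
    }
    if (/^[0-7]/.test(esc)) {
      // \nnn: one to three octal digits, truncated to a single byte value
      return String.fromCharCode(parseInt(esc, 8) & 0xff);
    }
    return SIMPLE_ESCAPES.has(esc) ? SIMPLE_ESCAPES.get(esc) : match;
  });
}
```

With the parser from the example above, this would be called as rawStringValue(node.text) for nodes whose type is raw_string. Decoded values are JavaScript strings, so escapes that denote raw bytes above 0x7f are only approximated.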

ahlinc added a commit to ahlinc/tree-sitter-bash that referenced this issue Sep 22, 2021
@ahlinc
Contributor

ahlinc commented Sep 22, 2021

@verhovsky I've opened PR #103, which should help with your case, but I'm not sure it will be merged. If it isn't, you can copy grammar.js into your repo and maintain a fork of the grammar that fits your needs. It's not hard to incorporate tree-sitter grammar building into CI.

[Image: Screenshot from 2021-09-22 04-09-02]

@Freed-Wu

Can I get the string content from single-quoted strings, not only from double-quoted ones?

echo "a"
echo 'b'
# "a"
>>> tree.root_node.children[0].children[1].children
[<Node type=""", start_point=(0, 5), end_point=(0, 6)>, <Node type=string_content, start_point=(0, 6), end_point=(0, 7)>, <Node type=""", start_point=(0, 7), end_point=(0, 8)>]
# 'b'
>>> tree.root_node.children[1].children[1].children
[]

Why doesn't it return this instead?

# 'b'
>>> tree.root_node.children[1].children[1].children
[<Node type="'", start_point=(1, 5), end_point=(1, 6)>, <Node type=string_content, start_point=(1, 6), end_point=(1, 7)>, <Node type="'", start_point=(1, 7), end_point=(1, 8)>]
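
Until the grammar exposes such children, one possible workaround is to derive the content range from the quoted node itself. The sketch below is in JavaScript to match the first example in this thread; quotedContent is a hypothetical helper, and it only assumes that the node has the type, text, startIndex and endIndex properties that web-tree-sitter syntax nodes expose.

```javascript
// Hypothetical helper, not part of web-tree-sitter or tree-sitter-bash.
// Returns the text and index range between the quotes of a `string` or
// `raw_string` node, or null for any other node type.
function quotedContent(node) {
  if (node.type !== 'raw_string' && node.type !== 'string') return null;
  if (node.text.length < 2) return null;
  return {
    text: node.text.slice(1, -1),      // drop the opening and closing quote
    startIndex: node.startIndex + 1,   // first character after the opening quote
    endIndex: node.endIndex - 1,       // position of the closing quote (exclusive end)
  };
}
```

For a double-quoted string that contains expansions, the returned text still includes them verbatim (for example $var), since this only strips the delimiters.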
