Position of the parse result offset in case of status_bad_start_element #606

Open
lo-asys opened this issue Jan 23, 2024 · 0 comments

lo-asys commented Jan 23, 2024

Hi,
according to the docs, the offset field of a parse result points to the last successfully parsed character in the input data. With status_bad_start_element, however, it seems to be the parser's scan position at the moment the error was raised.
I'd like to suggest a change here: The offset should point to the position of the opening '<' of the bad tag instead.
The specific use case where this would help is receiving a stream of XML messages over the network, where a single message may be split across multiple network packets, like so:

P1: '<a x="y" /><b foo="bar" '
P2: ' baz="blob"></b><c />'

In this case, the receiver wants to store the substring of packet 1 that contains the incomplete element b and prepend it to the content of packet 2 on the next iteration, so that the element can be parsed in full there.
Doing this would be much easier if pugixml reported the offset of the opening '<' here.
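For illustration, here is a minimal sketch of the carry-over step as it could be done today. It assumes the reported offset (for example, the `offset` member of `pugi::xml_parse_result`) lies somewhere inside the incomplete tag. `split_at_incomplete_tag` is a hypothetical helper and not part of pugixml; it searches backwards from the offset for the nearest '<' and does not account for comments or CDATA sections that may contain that character.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper, not part of pugixml.
// Splits `buffer` into {text before the incomplete tag, the incomplete tag}.
// `error_offset` is assumed to point somewhere inside the incomplete tag.
std::pair<std::string, std::string>
split_at_incomplete_tag(const std::string& buffer, std::ptrdiff_t error_offset)
{
    if (buffer.empty())
        return {buffer, std::string()};

    // Clamp the offset into the valid index range of the buffer.
    std::size_t pos = error_offset < 0 ? 0 : static_cast<std::size_t>(error_offset);
    if (pos >= buffer.size())
        pos = buffer.size() - 1;

    // Search backwards for the nearest '<' at or before the offset.
    const std::size_t start = buffer.rfind('<', pos);
    if (start == std::string::npos)
        return {buffer, std::string()}; // no tag start found, nothing to carry over

    return {buffer.substr(0, start), buffer.substr(start)};
}
```

The second part of the result would be kept and prepended to the next packet before parsing again. If the offset already pointed at the opening '<', the backwards search would not be needed.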

I'm not currently aware of other common use cases for the offset value in this error scenario (it's my first project using this library 😉), but if other users might find this helpful too, I'd be glad if you considered it.

Greetings, and thanks for your good work!
