XML to JSON - IOException when parsing XML with XMLMapper #226
Comments
The exception does not make sense, and I'll see what can be done. But beyond that, this usage cannot be supported all that well, since it is so-called "mixed content" -- that is, it contains both text ("cdata" in XML lingo) and elements. Such a structure cannot be expressed in a useful sense with [...]. But I don't think there should be such a low-level exception. |
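To make the "mixed content" condition concrete, here is a minimal sketch that checks for it using only the JDK's DOM parser, before the document is handed to any data-binding layer. The class and method names (`MixedContentCheck`, `hasMixedContent`) are hypothetical and not part of Jackson; the check treats only the four XML whitespace characters as ignorable, which is an assumption about what "looks empty" should mean.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper (not a Jackson API): reports whether any element
// has both child elements and non-whitespace text.
final class MixedContentCheck {

    // Parses the XML string and checks the whole tree.
    static boolean hasMixedContent(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return hasMixedContent(doc.getDocumentElement());
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // True if this element, or any descendant, mixes elements and text.
    static boolean hasMixedContent(Element element) {
        boolean sawElement = false;
        boolean sawText = false;
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            short type = child.getNodeType();
            if (type == Node.ELEMENT_NODE) {
                sawElement = true;
                if (hasMixedContent((Element) child)) {
                    return true;
                }
            } else if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                if (!isXmlWhitespaceOnly(child.getNodeValue())) {
                    sawText = true;
                }
            }
        }
        return sawElement && sawText;
    }

    // XML whitespace is limited to space, tab, CR and LF.
    static boolean isXmlWhitespaceOnly(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c != ' ' && c != '\t' && c != '\r' && c != '\n') {
                return false;
            }
        }
        return true;
    }
}
```

A check like this only diagnoses the input; it does not make mixed content readable by the module.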
After upgrading from 2.7.8 to 2.8.8 I am also seeing this issue that didn't exist previously. This is the xml that fails for me
|
This comment doesn't add much toward resolving the issue, but it might save someone time. At first glance, the input does not contain mixed-type elements. However, the XML contained a zero-width character. Whether Jackson should interpret it as whitespace is arguable. Probably not worth the effort. |
I'm not sure I understand. In which XML did you find this character? The problem in this thread was related to "mixed content" in XML. |
In the XML that was being parsed, i.e. the input. It contained a zero-width character, which got interpreted as CDATA rather than whitespace. So Jackson saw an element containing other elements plus CDATA (that invisible character), and interpreted it as a mixed-content element. |
The way you commented I thought you were speaking about my initial problem and got confused because I have absolutely no zero-width characters. So you mean you found a case where, if some hypothetical XML contains a zero-width character, it is considered as CDATA and we get the same exception? Maybe that's another problem, worth its own thread. |
@pijusn For what it is worth, a zero-width character is not considered whitespace from the XML specification's perspective. Jackson's handling could choose to interpret it differently, but the challenge is that there are probably a few similar non-letter/number code points, so the logic can get unwieldy. @dbories Yes, this is a bit of a different issue, although I can see similarities (i.e. the same challenge, but only once you realize that what looks like nothing/whitespace is actually considered character data). |
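For readers hitting the zero-width variant of this problem, one possible pre-processing step is to strip invisible code points from the input before parsing. The sketch below is an assumption-laden illustration, not something Jackson does or recommends: the class name `ZeroWidthCleaner` is hypothetical, and the list of code points is illustrative rather than exhaustive, which is exactly the "unwieldy" part mentioned above.

```java
// Hypothetical helper (not a Jackson API): removes a few invisible
// code points that XML does not treat as whitespace.
final class ZeroWidthCleaner {

    // Illustrative, non-exhaustive list (an assumption): zero-width space,
    // zero-width non-joiner, zero-width joiner, word joiner, BOM.
    private static final String INVISIBLE = "\u200B\u200C\u200D\u2060\uFEFF";

    // Returns a copy of the input without the listed invisible characters.
    static String stripInvisible(String xml) {
        StringBuilder out = new StringBuilder(xml.length());
        for (int i = 0; i < xml.length(); i++) {
            char c = xml.charAt(i);
            if (INVISIBLE.indexOf(c) < 0) {
                out.append(c);
            }
        }
        return out.toString();
    }

    // XML whitespace is limited to space, tab, CR and LF.
    static boolean isXmlWhitespace(char c) {
        return c == ' ' || c == '\t' || c == '\r' || c == '\n';
    }
}
```

Note that this removes the characters everywhere, including inside legitimate text where joiners can be meaningful, so it is only reasonable when the data is known not to need them.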
I am still experiencing this error with 2.8.10. |
Still an issue with 2.9.6. Does anyone have a workaround? |
@kchelluri Mixed content can not be handled with this module at this point; there is no simple fix for that, nor active plans for adding something to work around it. |
Created #402 as catch-all for ideas to improve mixed content handling. |
# Conflicts: # release-notes/VERSION-2.x # src/test/java/com/fasterxml/jackson/dataformat/xml/failing/MixedContentTreeRead226Test.java
Hello,
I found a weird error when trying to parse XML documents with a com.fasterxml.jackson.dataformat.xml.XmlMapper. If an XML element has no attribute and contains both text nodes and sub-elements, the XML parser fails with the following exception:
Example of an XML document that produces the error:
If the parent element has at least one attribute, the parsing is OK.
Example of an XML document that does not produce any error:
The Java code used:
Reproduced with 2.8.0 and 2.8.7
Not reproduced with 2.7.8