...
- tunable name: xmlOutputStyle
- value: a whitespace-separated list of tokens drawn from this set:
- "default" - the current behavior. This is acceptable if the data is not being pretty printed, will not be read back in, or if whitespace is fungible in the actual data format.
- "prettyPrintSafe" - preserves the XML Infoset exactly, including whitespace characters. The resulting XML can be pretty printed without indentation changes modifying element values.
- other values are reserved for future use.
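A minimal sketch of how such a tunable value could be split into tokens and validated. The function name parse_xml_output_style and the decision to reject unknown tokens are assumptions made for illustration; this is not Daffodil's actual tunable-handling code.

```python
# Hypothetical helper: not part of Daffodil's API.
KNOWN_TOKENS = {"default", "prettyPrintSafe"}


def parse_xml_output_style(value):
    """Split a whitespace-separated tunable value into a set of tokens.

    Unknown tokens are rejected here because other values are reserved
    for future use (rejecting them is an assumption of this sketch).
    """
    tokens = value.split()  # splits on any run of whitespace
    unknown = [t for t in tokens if t not in KNOWN_TOKENS]
    if unknown:
        raise ValueError("unsupported xmlOutputStyle token(s): " + ", ".join(unknown))
    return set(tokens)


style = parse_xml_output_style("  prettyPrintSafe \n")
print("prettyPrintSafe" in style)  # True
```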
Assumptions & Limitations
We assume pretty printers must obey only a small set of constraints on how they inject whitespace for indenting or line breaking:
...
It follows that if all significant whitespace is within CDATA regions, the data can be pretty printed without affecting the significant whitespace.
For example, the following reformatting is not allowed. The two forms are not equivalent, because the reformatted one adds line breaks to the value of foo outside the CDATA region.
<foo><![CDATA[some stuff]]></foo>
<!-- reformatted to -->
<foo>
<![CDATA[some stuff]]>
</foo>
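The difference between the two forms can be checked with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree purely for illustration (it is not part of Daffodil) and compares the element value each form yields.

```python
import xml.etree.ElementTree as ET

original = "<foo><![CDATA[some stuff]]></foo>"
reformatted = "<foo>\n<![CDATA[some stuff]]>\n</foo>"

# The parser merges CDATA content with any adjacent character data,
# so line breaks added outside the CDATA region become part of the value.
original_value = ET.fromstring(original).text
reformatted_value = ET.fromstring(reformatted).text

print(repr(original_value))     # 'some stuff'
print(repr(reformatted_value))  # '\nsome stuff\n'
```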
Algorithm
- assumes text is all XML-legal characters
- so remapping of things like NUL -> E000 and Ctrl-A -> E001 is already done.
- see: https://daffodil.apache.org/infoset/ section "XML Illegal Characters"
- see also: Daffodil source code object XMLUtils.remapXMLIllegalCharToPUA and other methods that invert this conversion.
- assumes we know what is a string and what is not a string, where whitespace around the value can be fungible.
- requires the infoset outputter to have access to the primitive type at the time it is outputting the string.
- ex: <someHexBinary xsi:type="xs:hexBinary"> AF29B3 </someHexBinary> where the surrounding whitespace does not, and should not, matter.
- ex: <someDouble xsi:type="xs:double"> 6.847 </someDouble> again the whitespace does not matter.
- NOTE: should verify that infoset inputters do not trip over such whitespace around non-string simple values.
- NOTE: DAFFODIL-182 could also be addressed in this same change set by adding another token, 'addXSITypes', to xmlOutputStyle, in which case the infoset outputter would also add the xsi:type attributes to the simple elements.
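The two assumptions above can be sketched as follows. This is an illustration under stated assumptions, not the algorithm itself and not Daffodil code: remap_illegal_to_pua is a simplified stand-in for XMLUtils.remapXMLIllegalCharToPUA that handles only C0 control characters (matching the NUL -> E000 and Ctrl-A -> E001 examples), and emit_simple, with its is_string flag, is a hypothetical helper that wraps the entire string value in CDATA as one possible way to protect significant whitespace. Carriage returns, surrogates, and other special cases are not handled.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def remap_illegal_to_pua(text):
    """Simplified stand-in: map C0 controls (except tab, LF, CR) to U+E000 + code point."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x20 and ch not in "\t\n\r":
            out.append(chr(0xE000 + cp))
        else:
            out.append(ch)
    return "".join(out)


def cdata(text):
    """Wrap text in a CDATA region, splitting any ']]>' so it cannot end the region early."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


def emit_simple(name, value, is_string):
    """Hypothetical outputter step: strings keep all whitespace inside CDATA;
    for non-string values the surrounding whitespace is treated as fungible."""
    if is_string:
        body = cdata(remap_illegal_to_pua(value))
    else:
        body = escape(value.strip())
    return f"<{name}>{body}</{name}>"


# Round trip: the string value, including its whitespace, survives parsing.
xml_text = emit_simple("s", "  a ]]> b\n ", True)
assert ET.fromstring(xml_text).text == "  a ]]> b\n "
```

The is_string flag is where the outputter needs the primitive type: without it, there is no way to tell a string whose padding is data from a number whose padding is noise.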
...