Details
Type: Bug
Status: Open
Priority: Minor
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Labels: None
Description
When implementing WSTX-167, I discovered that:
(a) to be able to fix invalid characters, it is sometimes necessary to enable P_OUTPUT_CHECK_CONTENT
(b) even with that enabled, there are cases where no checking is done.
Specifically, when using BufferingXmlWriter (which is used for encodings other than Latin1/ASCII/EBCDIC/UTF-32/UTF-16, most notably UTF-8), no checking is done for COMMENT or CDATA contents, or for PROCESSING_INSTRUCTION data. Only invalid character combinations (such as the CDATA end marker) are checked, not individual character codes.
On a positive note, CHARACTERS and ATTRIBUTE content are handled properly, so this is not a major issue. Nonetheless, it is a deficiency.
So proper handling should be added, and this should not be extraordinarily difficult. It would probably make sense to start using character code validity tables instead of hand-coded checks.
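As a rough illustration of the table-based idea, here is a minimal sketch of a validity check following the XML 1.0 Char production. The class XmlCharTable and its methods are hypothetical names made up for this example; they are not part of Woodstox, and the actual fix may well look different.

```java
/**
 * Hypothetical helper (not a Woodstox class): table-driven check for
 * characters that are legal in XML 1.0 content.
 */
final class XmlCharTable {
    // Lookup table for code points below 0x20. Under XML 1.0 only
    // tab, line feed and carriage return are legal in this range.
    private static final boolean[] VALID_LOW = new boolean[0x20];
    static {
        VALID_LOW['\t'] = true;
        VALID_LOW['\n'] = true;
        VALID_LOW['\r'] = true;
    }

    private XmlCharTable() { }

    /** Returns true if the code point matches the XML 1.0 Char production. */
    static boolean isValidXml10Char(int c) {
        if (c < 0x20) {
            return c >= 0 && VALID_LOW[c];
        }
        if (c < 0xD800) {
            return true;
        }
        if (c < 0xE000) {
            return false; // surrogate range, never a valid code point
        }
        if (c < 0xFFFE) {
            return true;
        }
        if (c < 0x10000) {
            return false; // 0xFFFE and 0xFFFF are excluded
        }
        return c <= 0x10FFFF;
    }

    /**
     * Returns the index of the first invalid character in the text
     * (for example COMMENT, CDATA or PROCESSING_INSTRUCTION data),
     * or -1 if every character is valid.
     */
    static int findInvalid(CharSequence text) {
        int i = 0;
        final int len = text.length();
        while (i < len) {
            // An unpaired surrogate is returned as-is and then rejected.
            int cp = Character.codePointAt(text, i);
            if (!isValidXml10Char(cp)) {
                return i;
            }
            i += Character.charCount(cp);
        }
        return -1;
    }
}
```

A writer could call something like findInvalid() on comment, CDATA and processing instruction data before output, and then either report the offending index or hand the character to whatever invalid-character handling is configured. Only the low range uses a real table here; the higher ranges are few enough that plain comparisons are simpler.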