Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 0.9.5
Fix Version/s: None
Component/s: XML
Labels: None
Environment: Operating System: Windows 2000, Platform: PC
Bugzilla Id: 1373
Number of attachments: 5
Description
When handling mixed types such as
<a>foo<b>value</b>bar</a>
Castor will return
<a>foobar<b>value</b></a>
which is correct from a data-binding perspective but not from a usability
perspective: the text is concatenated and its position relative to the child
element is lost.
Since more and more people use Castor as an XML serializer, this feature seems
needed.
- patch-C513-20060620.txt (20/Jun/06 5:05 PM, 102 kB, Ralf Joachim)
- patch-C513-20061015.txt (15/Oct/06 12:45 PM, 101 kB, Ralf Joachim)
- patch-C513-20070120.txt (19/Jan/07 5:47 PM, 206 kB, Ralf Joachim)
- patch-C513-castorMixedTypeFix.zip (19/Jun/06 9:22 AM, 73 kB, Ralf Joachim), containing:
  - castorMixedTypeFix/build.xml 5 kB
  - castorMixedTypeFix/.../DescriptorSourceFactory.java 22 kB
  - castorMixedTypeFix/.../FieldInfo.java 21 kB
  - castorMixedTypeFix/.../MixedCollectionInfo.java 32 kB
  - castorMixedTypeFix/.../SourceFactory.java 77 kB
  - castorMixedTypeFix/.../Marshaller.java 74 kB
  - castorMixedTypeFix/.../MixedXMLFieldHandler.java 3 kB
  - castorMixedTypeFix/.../UnmarshalHandler.java 102 kB
  - castorMixedTypeFix/.../placeholder.ignoreme 0.0 kB
  - castorMixedTypeFix/m.xsd 0.8 kB
  - castorMixedTypeFix/m3.xml 0.3 kB
  - castorMixedTypeFix/src/.../Untitled1.java 1 kB
- patch-C513-castorMixedTypeFixDiff.zip (19/Jun/06 9:22 AM, 12 kB, Ralf Joachim), containing:
  - builder.DescriptorSourceFactory.txt 2 kB
  - builder.FieldInfo.txt 3 kB
  - builder.MixedCollectionInfo-(newFile).txt 32 kB
  - builder.SourceFactory.txt 3 kB
  - xml.Marshaller.txt 5 kB
  - xml.MixedXMLFieldHandler-(newFile).txt 3 kB
  - xml.UnmarshalHandler.txt 3 kB
Issue Links
- is duplicated by CASTOR-498: Castor can't (properly) handle mixed types
Activity
Hello,
I'm surprised to see this issue relegated to an "enhancement request" rather
than a full-fledged bug. One (of many) uses that we have for Castor is a
processor of XML-formatted input from external editors. We read in the XML
using Castor and traverse the Java structures to update an online database. The
problem, however, is that some of the embedded documentation is in the form of
HTML snippets, and the current incarnation of Castor mauls it. Given input in
the form:
<revision>
  <entry code="A11175">
    <designation>A code for something fancy</designation>
    <documentation>This code is a really <i>neato</i> example of something
    useful</documentation>
  </entry>
</revision>
The documentation ends up in the database as
"This code is a really example of something useful<i>neato</i>"
This problem is serious enough that we would have had to abandon Castor had we
not had access to the source. A solution (albeit imperfect) was submitted along
with the problem report, and I strongly urge you to reconsider the decision not
to use it. Castor is a really great tool, but it will be crippled as long as it
can't process something as simple as an italicized word in a block of text.
(BTW - we'd be happy to show off some of the tools that we've built using Castor
if anyone is interested - we are quite pleased with what we've been able to do
with it)
Harold Solbrig
Mayo Clinic
I agree with those sentiments. I've changed this to bug status.
Dan, do you have a "diff -u" of your changes instead of the actual files?
I'll try to get one generated today. Should I generate them off of a current
daily build of Castor, or off of the source that I originally modified? I'm
guessing I may have some new issues to work out if I go off of the current
build, since part of the mixed type issue has been checked in since we modified
the code.
Hi Daniel,
The best is to give a diff -u against the current CVS.
Some notes on how to fix the issue: instead of being stored in a single
string, the content could be stored in an array of strings. The order of
appearance would determine the position in the array.
Created an attachment (id=306)
Here are the diffs for 5 changed classes, and the contents of the two new classes in a zip file.
Created an attachment (id=307)
Here is a copy of the fully patched sources with an ant build script as a zip file
I posted the diffs (hopefully in the right format, I'm new to posting diffs) for
the changes we made to castor. These were generated off of a daily build from
yesterday (or maybe the day before?). I also posted an updated version (works
with the current daily build) of the zip file that I originally posted in bug
1358 that contains the classes as I intended them to look after patching, and
runs a basic test on the changes.
We may have taken the long way around in patching this bug, but what we ended up
doing was holding the contents of a mixed type in a vector, where each
element of the vector is either a piece of text or the next item.
Some of the code that was added to the Marshaller and UnmarshalHandler to fix
bug 1358 may now be unnecessary, but I did not remove it.
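The vector approach described above can be sketched roughly as follows. This is only an illustration, not the code from the attached patch: the class MixedContent, its nested Child type, and both marshal methods are hypothetical names invented for this sketch, and XML escaping is left out for brevity.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical holder for mixed content; not part of Castor or the patch. */
final class MixedContent {

    /** A child element with a name and a plain text value (hypothetical type). */
    static final class Child {
        final String name;
        final String value;

        Child(String name, String value) {
            this.name = name;
            this.value = value;
        }

        String toXml() {
            return "<" + name + ">" + value + "</" + name + ">";
        }
    }

    // Each item is either a String (text) or a Child (element), in document order.
    private final List<Object> items = new ArrayList<>();

    MixedContent text(String s) {
        items.add(s);
        return this;
    }

    MixedContent child(String name, String value) {
        items.add(new Child(name, value));
        return this;
    }

    /** Writes the items in the order they were added (no XML escaping). */
    String marshalInOrder(String root) {
        StringBuilder sb = new StringBuilder("<" + root + ">");
        for (Object item : items) {
            sb.append(item instanceof Child ? ((Child) item).toXml() : (String) item);
        }
        return sb.append("</").append(root).append(">").toString();
    }

    /** Mimics the reported behaviour: all text first, then all child elements. */
    String marshalFlattened(String root) {
        StringBuilder text = new StringBuilder();
        StringBuilder children = new StringBuilder();
        for (Object item : items) {
            if (item instanceof Child) {
                children.append(((Child) item).toXml());
            } else {
                text.append((String) item);
            }
        }
        return "<" + root + ">" + text + children + "</" + root + ">";
    }

    public static void main(String[] args) {
        MixedContent a = new MixedContent().text("foo").child("b", "value").text("bar");
        System.out.println(a.marshalInOrder("a"));   // <a>foo<b>value</b>bar</a>
        System.out.println(a.marshalFlattened("a")); // <a>foobar<b>value</b></a>
    }
}
```

Keeping text and elements in one ordered list is what lets the marshalled output match the input; storing the text in a single separate field loses the positions.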
I've just discovered that I have a bug in the changes I made trying to come up
to date with the current daily build. It will probably be tomorrow before I get
it tracked down and fixed. It works for the test case I posted, but it's
failing for a more advanced use case. I'll post new diffs after I get it
figured out...
Created an attachment (id=310)
Updated ant script fix package (zip file)
The new bug that I ran into was that Castor was misbehaving (in the daily build)
with Groups. It simply wasn't generating all of the fields and methods in the
Java files when it ran into something that was in a group.
Someone added the if statement denoted in capital letters dealing with the
ParticleCount to two places in builder.SourceFactory.java.
The two cases it was added to were Structure.GROUP and Structure.MODELGROUP.
This is also all in the diffs.
//-- create class member, if necessary
if (!((contentModel instanceof ComplexType)
        || (contentModel instanceof ModelGroup))) {
    if (contentModel.getParticleCount() > 0)
} else {
    //-- else we just flatten the group
    processContentModel(group, state);
}
break;
I submitted new diffs, and a new ant script fix. This one passes all of our
(current) use cases.
According to comments on CASTOR-498 it has been closed because it duplicates CASTOR-513. As CASTOR-498 is closed we should continue discussion here and leave the closed one as is. Will add comments from the other one here.
Example from Jessica, posted as a comment on CASTOR-498:
Here are the files:
1) the schema:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="product">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="description" type="descriptionType"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:complexType name="descriptionType" mixed="true">
    <xs:choice minOccurs="0" maxOccurs="unbounded">
      <xs:element ref="sub" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element ref="sup" minOccurs="0" maxOccurs="unbounded"/>
    </xs:choice>
  </xs:complexType>
  <xs:element name="sub" type="xs:string">
    <xs:annotation>
      <xs:documentation>html tag</xs:documentation>
    </xs:annotation>
  </xs:element>
  <xs:element name="sup" type="xs:string">
    <xs:annotation>
      <xs:documentation>html tag</xs:documentation>
    </xs:annotation>
  </xs:element>
</xs:schema>
2) the original xml file to be unmarshalled:
<?xml version="1.0" encoding="iso-8859-1"?>
<product xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="castortest.xsd">
  <description>Containing less than 65 percent available diphosphorus pentaoxide (P<sub>2</sub>O<sub>5</sub>) equivalents. </description>
</product>
3) the marshalled file without any operation after marshalling the above file:
<?xml version="1.0" encoding="ISO-8859-1"?>
<product xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="usa.xsd">
  <description>Containing less than 65 percent available diphosphorus pentaoxide (PO) equivalents. <sub>2</sub><sub>5</sub></description>
</product>
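A round-trip check for this kind of reordering could look roughly like the sketch below. It uses only the JDK's DOM parser, not Castor; the class MixedOrderCheck and the method childSequence are hypothetical names, and the two XML strings in main are shortened versions of the description elements above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper that compares the child order of two XML fragments. */
final class MixedOrderCheck {

    /** Lists the root element's text and element children in document order. */
    static List<String> childSequence(String xml) {
        Node root;
        try {
            root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        root.normalize(); // merge adjacent text nodes so the sequence is stable

        List<String> sequence = new ArrayList<>();
        NodeList children = root.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node n = children.item(i);
            if (n.getNodeType() == Node.TEXT_NODE) {
                sequence.add("text:" + n.getNodeValue());
            } else if (n.getNodeType() == Node.ELEMENT_NODE) {
                sequence.add("elem:" + n.getNodeName() + "=" + n.getTextContent());
            }
        }
        return sequence;
    }

    public static void main(String[] args) {
        // Shortened versions of the original and the marshalled description.
        String original =
                "<description>(P<sub>2</sub>O<sub>5</sub>) equivalents.</description>";
        String marshalled =
                "<description>(PO) equivalents.<sub>2</sub><sub>5</sub></description>";

        System.out.println(childSequence(original));
        System.out.println(childSequence(marshalled));
        System.out.println("order preserved: "
                + childSequence(original).equals(childSequence(marshalled)));
    }
}
```

For the two fragments in main the sequences differ, so the last line reports that the order was not preserved; a fixed marshaller would be expected to produce identical sequences for input and output.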
Will try to look at this but am not in a position to promise anything yet.
Many thanks to Daniel Armbrust for sending me his original patch against 0.9.5 after such a long time. I'll try to prepare an updated one against the current codebase. Will keep anyone interested in this informed of the progress.
I've tried to update the patch to current SVN. Having said that, I have only run a basic test so far and did not fix any of the issues that appeared:
- sourcegen runs, but the generated source contained errors, which I could easily fix by hand in the generated source
- I could unmarshal the attached example, but parts of the internal data structure seem to be wrong
- I did not try to marshal anything yet
Updated patch to current SVN. The problems mentioned in my previous comment still exist. Hopefully I have not introduced new ones when merging with the latest changes from SVN.
Again updated patch to current SVN. This still isn't working, but it seems to me to be the right approach.
Having said that, this will be the last update from my side: I will drop this project, as it is too much work to keep merging with ongoing changes, and I will focus on other things.
There is also an implementation of this feature request in bug number 1358.