CASTOR-2228

Wrong charset produces a bad regular expression

    Details

    • Type: Bug
    • Status: Open
    • Priority: Blocker
    • Resolution: Unresolved
    • Affects Version/s: 1.1.2.1
    • Fix Version/s: None
    • Component/s: XML code generator
    • Labels:
      None
    • Environment:
      Eclipse 3.2.2
      Castor 1.1.2.1
      Java 1.4.2_14
    • Testcase included:
      yes
    • Number of attachments :
      2

      Description

      I have an XSD in which some field restrictions are based on a regular expression that uses special characters, such as "([-a-zA-Z\- '])*"

      In this case, the object descriptors generated in the descriptors package contain the following line:

      typeValidator.addPattern("([À-ža-zA-Z\\- '])*");

      where the pattern is not the same as the one in the XSD.

      What am I doing wrong?

      Thanks in advance.

      The attached file contains two samples.
      The Ant build file castor_xml_saia_build.1.1.xml contains two targets: GeneratorTestCastor and GeneratorRichiesta.
      The first is a small sample.
      The second is a complex sample that has another problem: with version 0.9.5.4 the generated classes compiled, but with the new version they do not.
      With the old version there were problems with name collisions, but the source code was compilable.
      Thanks in advance.
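
One way a pattern mismatch like the one described above can arise is when UTF-8 bytes are decoded with a different charset. The following is a minimal sketch of that effect, not a confirmed diagnosis of this issue: it assumes the character range À-ž is stored as UTF-8 and then decoded as ISO-8859-1. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetMismatchDemo {

    /** Encodes the text as UTF-8, then decodes those bytes with another charset. */
    static String decodeUtf8BytesAs(String text, Charset other) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, other);
    }

    public static void main(String[] args) {
        // The pattern with the range À-ž written as Unicode escapes.
        String pattern = "([\u00C0-\u017Ea-zA-Z\\- '])*";
        String misread = decodeUtf8BytesAs(pattern, StandardCharsets.ISO_8859_1);

        // Each of the two non-ASCII characters takes two bytes in UTF-8,
        // so the misread string is two characters longer than the original.
        System.out.println(pattern.length() + " -> " + misread.length());
        System.out.println(pattern.equals(misread));
    }
}
```

Decoding with the same charset that was used for encoding returns the original pattern unchanged.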

        Activity

        Werner Guttmann added a comment -

        Raffaele, can you please create a separate issue for the second problem? We'd like to keep issues as separate and self-contained as possible. Alternatively, let's first discuss this problem on the mailing lists.

        Werner Guttmann added a comment -

        In addition, when posting an issue, please have a look at http://castor.org/how-to-submit-an-xml-bug.html first. Also, please avoid attaching complete projects when I have asked you (on the mailing lists) to attach a minimal XML schema that highlights the problem at hand, and nothing else. Can you please delete the ZIP archive from this issue and attach a small XML schema that exposes the issue? Thanks in advance.

        Raffaele Fabbri added a comment -

        Excuse me.

        I hope that this file is what you want.

        Thanks

        Raffaele Fabbri added a comment -

        I downloaded the Castor source code and found that the problem is in the class
        org.exolab.castor.builder.SourceGenerator,

        in the method invoked by the main of SourceGeneratorMain:
        sgen.generateSource(schemaFilename, options.getProperty(ARGUMENT_PACKAGE));

        public final void generateSource(
                final String filename, final String packageName) throws IOException {
            final File schemaFile;
            if (filename.startsWith("./")) {
                schemaFile = new File(filename.substring(2));
            } else {
                schemaFile = new File(filename);
            }

            FileReader reader = new FileReader(schemaFile);

            try {
                InputSource source = new InputSource(reader);
                source.setEncoding("UTF-8"); // added by Raffaele Fabbri
                source.setSystemId(toURIRepresentation(schemaFile.getAbsolutePath()));
                generateSource(source, packageName);
            } finally {
                try {
                    reader.close();
                } catch (java.io.IOException iox) {
                    // ignore
                }
            }
        } //-- generateSource

        This is a specific fix that solves our situation. It would be better
        if a general solution could be found.
        I would like to ask whether you could include this fix.
        If you have any doubts, please call me or send me an email.

        Thanks very much.

        Raffaele Fabbri
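
For context on the change above: the SAX documentation states that InputSource.setEncoding has no effect when the source wraps a character stream, and FileReader on JVMs of that era decodes with the platform default charset. One possible alternative, shown here as a sketch under those assumptions and not as the Castor fix, is to decode the file with an explicit charset. The class and helper names are hypothetical.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.file.Files;

final class ExplicitCharsetReaderSketch {

    /** Reads the whole file, decoding its bytes with the given charset. */
    static String readAll(File file, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(new FileInputStream(file), charset)) {
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    /** Writes text to a temporary file in one charset and reads it back in another. */
    static String roundTrip(String text, Charset writeAs, Charset readAs) {
        try {
            File tmp = File.createTempFile("schema", ".xsd");
            try {
                Files.write(tmp.toPath(), text.getBytes(writeAs));
                return readAll(tmp, readAs);
            } finally {
                tmp.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A Reader opened this way could be passed to new InputSource(reader) in place of the FileReader; the text survives only when the read charset matches the one the file was written in.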

        Raffaele Fabbri added a comment -

        The previous solution works only if the pattern is in the root XSD.

        If the pattern is defined in an included file, it doesn't work.

        I am trying to find the problem, but I need some help.

        Edward Kuns added a comment -

        I expect there is more than one location in the source generator that needs to set a charset. The general solution would be to allow providing a charset in the configuration file, and in the absence of one being provided, either not set a charset or set a default charset.
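
The configuration-driven approach described above could look roughly like the following sketch. The property name is hypothetical and is not an existing Castor builder property; the helper only sets an encoding on the InputSource when one is configured and otherwise leaves the source untouched.

```java
import java.nio.charset.Charset;
import java.util.Properties;
import org.xml.sax.InputSource;

final class SchemaEncodingConfig {

    /** Hypothetical property name, used for illustration only. */
    static final String ENCODING_PROPERTY = "example.builder.schemaEncoding";

    /** Returns the configured charset name, or null when none is configured. */
    static String configuredEncoding(Properties config) {
        String name = config.getProperty(ENCODING_PROPERTY);
        if (name == null || name.trim().isEmpty()) {
            return null;
        }
        name = name.trim();
        if (!Charset.isSupported(name)) {
            throw new IllegalArgumentException("Unsupported charset: " + name);
        }
        return name;
    }

    /** Sets the encoding on the source only if the configuration provides one. */
    static InputSource applyEncoding(InputSource source, Properties config) {
        String encoding = configuredEncoding(config);
        if (encoding != null) {
            source.setEncoding(encoding);
        }
        return source;
    }
}
```

Routing every place that creates an InputSource through one helper like applyEncoding would keep the charset decision in a single location.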

        Edward Kuns added a comment -

        Hmm, are you sure that the xsd is a valid UTF-8 file? It looks like it may be ISO-8859-1 or something else.
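
One way to check the question above programmatically is a strict UTF-8 decode that reports malformed input instead of silently replacing it. This is a minimal sketch; the class name is hypothetical, and passing the check does not prove that the file was meant to be UTF-8, only that its bytes form valid UTF-8.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class Utf8Check {

    /** Returns true if the bytes decode as UTF-8 without any malformed sequence. */
    static boolean isValidUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

The bytes of a schema file could be obtained with java.nio.file.Files.readAllBytes and passed to isValidUtf8.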

        Edward Kuns added a comment -

        Raffaele, try marking the XSD with

        <?xml version="1.0" encoding="ISO-8859-1"?>

        or some other appropriate character set. When I load your XSD in a text editor, I don't see the proper characters. If I cut and paste from a browser into the XSD file, re-save it, and then run everything, it works for me. When the input XSD is read in, the encoding specified in the <?xml ...?> declaration is used to determine the encoding.
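
The declaration-based detection described above applies when the parser is handed raw bytes. The following sketch uses the JDK's own DOM parser, not Castor's schema reader, to show a byte-stream InputSource preserving a non-ASCII pattern from a UTF-8 file. The class name, element name, and attribute name are hypothetical.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class ByteStreamSourceSketch {

    /** Parses the file from a byte stream and returns the root element's "value" attribute. */
    static String readPatternValue(File xmlFile) throws Exception {
        try (InputStream in = new FileInputStream(xmlFile)) {
            // A byte stream lets the parser pick the encoding from the XML declaration.
            InputSource source = new InputSource(in);
            source.setSystemId(xmlFile.toURI().toString());
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().parse(source);
            return doc.getDocumentElement().getAttribute("value");
        }
    }

    /** Writes a tiny UTF-8 document containing a non-ASCII pattern and reads it back. */
    static String roundTrip() {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<pattern value=\"([\u00C0-\u017Ea-zA-Z\\- '])*\"/>";
        try {
            File tmp = File.createTempFile("prova", ".xml");
            try {
                Files.write(tmp.toPath(), xml.getBytes(StandardCharsets.UTF_8));
                return readPatternValue(tmp);
            } finally {
                tmp.delete();
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Whether the same approach resolves the included-schema case in the source generator is not established in this thread.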

        Raffaele Fabbri added a comment -

        I can try, but the XSD I use is not mine. It is a specification delivered by the Italian Ministero Interni and I can't change its encoding.
        The default Eclipse settings don't use the correct encoding to open the XSD file.
        I added a file type under Window->Preferences->General->Content Types->Text->XML: File Associations *.xsd, Default encoding UTF-8,
        and with these settings Amateras or the default editor loads the file correctly.
        Question: if the encoding is wrong, why does it work fine with a minimal XSD schema once I change the encoding as I explained?

        Sorry for my bad English, but I'm Italian.

        Raffaele Fabbri added a comment -

        If I change the character set to ISO-8859-1, my editor tells me: "Some characters cannot be mapped using "ISO-8859-1" character encoding. Either change the encoding or remove the characters which are not supported by the "ISO-8859-1" character encoding."
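
An editor message of this kind is what one would expect if the file contains characters outside ISO-8859-1. As a sketch, assuming the pattern range À-ž from the description is the content in question, a CharsetEncoder shows that À can be encoded in ISO-8859-1 while ž cannot. The class name is hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class Latin1Check {

    /** Returns true if the character can be represented in ISO-8859-1. */
    static boolean fitsLatin1(char c) {
        return StandardCharsets.ISO_8859_1.newEncoder().canEncode(c);
    }

    public static void main(String[] args) {
        System.out.println(fitsLatin1('\u00C0')); // À (U+00C0) is within ISO-8859-1
        System.out.println(fitsLatin1('\u017E')); // ž (U+017E) is outside ISO-8859-1
    }
}
```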

        Raffaele Fabbri added a comment -

        I am attaching another sample, like prova.xsd but with the type definitions in a separate file named tipi.xsd.

        Edward Kuns added a comment -

        I am still looking into this. You're right – I looked further and this file is not ISO-8859-1.

        Werner Guttmann added a comment -

        So what is the current understanding of this issue? Is it a genuine bug?

        Raffaele Fabbri added a comment -

        I think it is a genuine bug, but at this time I am not able to find the right fix. Can somebody help me?

        Thanks in advance

        Edward Kuns added a comment -

        I still need to do some research to understand more fully the interaction of locales and input streams and parsers, but I am still looking into this.

        Edward Kuns added a comment -

        If you add the following line

        source.setEncoding("UTF-8");

        to both places in the source file that make "new InputSource()", does that help the import problem? If I do this, the CTF master test suite runs successfully.

        Raffaele Fabbri added a comment -

        I added the line in all places in the source file "SourceGenerator.java", but when I run the utility with the sample that includes tipi.xsd, the generated regular expression is still wrong.
        Could you please send me your modified source?
        Thanks in advance


          People

          • Assignee:
            Unassigned
            Reporter:
            Raffaele Fabbri
          • Votes:
            0
            Watchers:
            0

              Time Tracking

              Estimated:
              3d
              Remaining:
              3d
              Logged:
              Not Specified