Jetty / JETTY-1340

Utf8StringBuffer incorrectly handles characters outside of the basic multilingual plane

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 6.1.25
    • Fix Version/s: 6.1.27
    • Component/s: None
    • Labels: None
    • Testcase included: Yes
    • Patch Submitted: Yes
    • Number of attachments: 3

      Description

      In the Apache Solr project, we had some problems with Jetty's support for Unicode characters
      outside of the basic multilingual plane (code points greater than 0xFFFF).

      The problem boils down to Utf8StringBuffer. All of the UTF-8 logic looks generally correct, but at
      the end, when it is time to append to the underlying StringBuffer, it needs to append the two
      UTF-16 surrogates for characters in this range. It's a rather simple fix.
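
The append step described above can be sketched as follows. This is an illustration only, not the attached patch: the class name `SurrogateAppendSketch` and the method `appendCodePoint` are hypothetical, and the sketch assumes the UTF-8 decoder has already produced a valid code point.

```java
class SurrogateAppendSketch {
    // Hypothetical helper: appends an already-decoded code point to the buffer.
    // Code points above 0xFFFF do not fit in one Java char, so they are split
    // into a UTF-16 high/low surrogate pair before being appended.
    static void appendCodePoint(StringBuffer buffer, int codePoint) {
        if (codePoint <= 0xFFFF) {
            buffer.append((char) codePoint);
        } else {
            int offset = codePoint - 0x10000;                   // 20 significant bits
            buffer.append((char) (0xD800 + (offset >> 10)));    // high surrogate: top 10 bits
            buffer.append((char) (0xDC00 + (offset & 0x3FF)));  // low surrogate: bottom 10 bits
        }
    }

    public static void main(String[] args) {
        StringBuffer sb = new StringBuffer();
        appendCodePoint(sb, 0x1D11E); // MUSICAL SYMBOL G CLEF, outside the BMP
        // One code point, stored as two chars.
        System.out.println(sb.length() + " chars, "
                + sb.codePointCount(0, sb.length()) + " code point");
    }
}
```

The JDK also provides `StringBuffer.appendCodePoint(int)` and `Character.toChars(int)`, which perform the same split.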

      The patch is against http://svn.codehaus.org/jetty/jetty/branches/jetty-6.1 (I hope this is correct?)

      1. AbstractGenerator.patch
        10 kB
        Bernd Fehling
      2. JETTY-1340.patch
        4 kB
        Robert Muir
      3. jettyUnicode.patch
        2 kB
        Robert Muir

          Activity

          Robert Muir added a comment -

          Updated patch: there was a similar bug in the UTF-8 writer as well (I added a new test for this one).

          Now everything appears to work with all of Unicode in Jetty.
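
For the writer direction mentioned in this comment, a minimal sketch of encoding that treats a surrogate pair as a single code point might look like the following. It is not the code from the patch: `Utf8EncodeSketch` and `encode` are hypothetical names, and unpaired surrogates are deliberately not handled.

```java
import java.io.ByteArrayOutputStream;

class Utf8EncodeSketch {
    // Hypothetical encoder: walks the string by code point rather than by char,
    // so a surrogate pair becomes one 4-byte UTF-8 sequence instead of two
    // separately encoded 3-byte sequences.
    // Assumption: the input has no unpaired surrogates; they are not rejected here.
    static byte[] encode(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);        // joins a valid high/low surrogate pair
            i += Character.charCount(cp);     // advances by 1 or 2 chars
            if (cp < 0x80) {
                out.write(cp);
            } else if (cp < 0x800) {
                out.write(0xC0 | (cp >> 6));
                out.write(0x80 | (cp & 0x3F));
            } else if (cp < 0x10000) {
                out.write(0xE0 | (cp >> 12));
                out.write(0x80 | ((cp >> 6) & 0x3F));
                out.write(0x80 | (cp & 0x3F));
            } else {
                out.write(0xF0 | (cp >> 18));
                out.write(0x80 | ((cp >> 12) & 0x3F));
                out.write(0x80 | ((cp >> 6) & 0x3F));
                out.write(0x80 | (cp & 0x3F));
            }
        }
        return out.toByteArray();
    }
}
```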

          Greg Wilkins added a comment -

          Note that this has already been fixed in jetty-7.
          I will leave this open until we backport to jetty 6.

          Robert Muir added a comment -

          Thanks Greg, you are right: after the fact, I checked jetty 7 out of curiosity and, at a glance, it looks like all of this stuff is correct there.

          Bernd Fehling added a comment -

          The patch for AbstractGenerator.java has to be used with the patched version of AbstractGenerator.java
          from the JETTY-1340 patch supplied by Robert.
          This will fix UTF-8 handling for Unicode above the BMP.

          Greg Wilkins added a comment -

          backported jetty-7 UTF handling


            People

            • Assignee: Greg Wilkins
            • Reporter: Robert Muir
            • Votes: 3
            • Watchers: 4
