Jetty

HttpTester to handle charsets

Details

  • Type: Improvement
  • Status: Resolved
  • Priority: Major
  • Resolution: Fixed
  • Affects Version/s: 7.0.0.pre5, 6.1.14
  • Fix Version/s: 6.1.15.pre0
  • Component/s: None
  • Labels: None
  • Number of attachments: 2

Description

from the mailing list:

Hensley, Richard wrote:
> I ran into an interesting situation using the HttpTester class. The
> basic problem is that the response from my servlet is encoded in UTF-8,
> and the HttpTester.parse(String rawHTTP) uses ISO-8859 and platform
> encoding for the string character encoding, which may not contain all
> the needed code points. Also, HttpTester.getContent() uses platform
> encoding to interpret the raw bytes of the parsed content, which may not
> be right at all.
>
> In order to accommodate my needs, I had to add the following two methods
> to a derived class. They might be useful in the HttpTester class.
>
> /**
>  * Ensure that the parsed content is decoded using the correct character
>  * set. The HTTP parser just dumps raw bytes into _parsedContent. This
>  * method simply interprets the bytes using the specified character
>  * encoding.
>  *
>  * @param charsetName name of the charset used to decode the content
>  * @return the decoded content
>  * @throws UnsupportedEncodingException if the named charset is not supported
>  */
> public String getContent(final String charsetName)
>     throws UnsupportedEncodingException
> {
>     if (_parsedContent != null)
>     {
>         return new String(_parsedContent.toByteArray(), charsetName);
>     }
>     return super.getContent();
> }
>
> /**
>  * This method is used to ensure that the response is parsed as bytes.
>  * This is in contrast to the parse(String rawHttp) method, which converts
>  * the passed-in string into bytes using ISO-8859 encoding.
>  *
>  * @param bytes the raw HTTP message
>  * @return the view's contents after parsing, as a string
>  * @throws IOException if parsing fails
>  */
> public String parseBytes(final byte[] bytes)
>     throws IOException
> {
>     ByteArrayBuffer buf = new ByteArrayBuffer(bytes);
>     View view = new View(buf);
>     HttpParser parser = new HttpParser(view, new PH());
>     parser.parse();
>     return view.toString();
> }
>
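
The mismatch described in the email can be reproduced without Jetty. The following is a minimal sketch, assuming a modern JDK (java.nio.charset.StandardCharsets did not exist in the JDKs Jetty 6 targeted); the class and method names are hypothetical and are not part of HttpTester.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Jetty.
final class CharsetMismatchDemo {

    /** Decodes raw body bytes with an explicitly chosen charset. */
    static String decode(byte[] body, Charset charset) {
        return new String(body, charset);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9 \u20ac"; // "café €", 6 characters
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);

        // Decoding with the charset the servlet actually used restores the text.
        String right = decode(utf8, StandardCharsets.UTF_8);

        // Decoding the same bytes as ISO-8859-1 maps every byte to one
        // character, so each multi-byte UTF-8 sequence turns into garbage.
        String wrong = decode(utf8, StandardCharsets.ISO_8859_1);

        System.out.println(right.equals(original)); // true
        System.out.println(wrong.length());         // 9, one char per UTF-8 byte
    }
}
```

This is why a getContent(String charsetName) overload helps: the caller, not the platform default, decides how the body bytes are interpreted.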

Activity

Greg Wilkins added a comment -

David,

I think the patch is mostly right, however:

+ Can you make the default encoding UTF-8 when the no-argument constructor is used?
+ Can you make _charset only a default, used when no charset is specified in the
request and/or response Content-Type headers?

cheers

David Yu added a comment -

Is this (JETTY-807_2008-12-15.patch) what you meant, gregw?

Greg Wilkins added a comment -

David,

that's much closer, but there are still some problems.

I think you will need a _charset and _defaultCharset, because a tester can be reused and we don't want the charset from
one request/response being used for the next.

So the default charset should be used if no charset has been found in a header. You also have to make sure that the
request charset is not used for the response. So a charset set for a generate should be cleared when you do a parse
to get a response.

Also make sure your patch updates the VERSION.txt

If you fix these things, commit rather than attach a patch and I'll look in the repo.
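
The bookkeeping requested here could look roughly like the sketch below. It is not the committed patch: the class and its methods are hypothetical, only the _charset/_defaultCharset split and the UTF-8 default come from the comments above, and the Content-Type parsing is deliberately simplified (it does not handle whitespace around "=").

```java
import java.util.Locale;

// Hypothetical helper, not Jetty code: keeps a per-message charset
// separate from the tester-wide default.
final class CharsetTrackerSketch {
    private final String defaultCharset; // plays the role of _defaultCharset
    private String charset;              // plays the role of _charset; null = none seen

    CharsetTrackerSketch() { this("UTF-8"); }

    CharsetTrackerSketch(String defaultCharset) { this.defaultCharset = defaultCharset; }

    /** Call at the start of every parse/generate so nothing leaks between messages. */
    void reset() { charset = null; }

    /** Records the charset parameter of a Content-Type header value, if present. */
    void onContentType(String headerValue) {
        String found = extractCharset(headerValue);
        if (found != null) {
            charset = found;
        }
    }

    /** Header charset if one was seen for this message, otherwise the default. */
    String effectiveCharset() { return charset != null ? charset : defaultCharset; }

    /** Simplified extraction of "charset=..." from a Content-Type value. */
    static String extractCharset(String contentType) {
        if (contentType == null) {
            return null;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String v = p.substring("charset=".length()).trim();
                if (v.length() >= 2 && v.startsWith("\"") && v.endsWith("\"")) {
                    v = v.substring(1, v.length() - 1); // strip optional quotes
                }
                return v.isEmpty() ? null : v;
            }
        }
        return null;
    }
}
```

Calling reset() before parsing a response is what keeps a charset set while generating the request from being applied to the response.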

David Yu added a comment -

Alrighty ... it's now in trunk.

Thanks,
David

Greg Wilkins added a comment -

Great,

that looks about right.
However, I think parse(String) should also use the charset when getting bytes and remaking the string. It will be a little strange if the charset changes in the headers... and something more may be needed?

Also, can you backport to Jetty 6?
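
The concern about parse(String) can be shown with a plain string-to-bytes-to-string round trip. This is only a sketch under assumptions: the class and method names are hypothetical, no Jetty parser is involved, and it assumes a JDK that provides StandardCharsets.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Jetty.
final class ParseRoundTripSketch {

    /**
     * Mimics the two conversions inside a parse(String)-style method:
     * the string is turned into bytes for the parser, and leftover bytes
     * are turned back into a string, both with the same charset.
     */
    static String roundTrip(String raw, Charset charset) {
        byte[] bytes = raw.getBytes(charset); // unmappable chars become the replacement byte
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String euro = "\u20ac"; // the euro sign, which ISO-8859-1 cannot represent

        System.out.println(roundTrip(euro, StandardCharsets.UTF_8).equals(euro)); // true
        System.out.println(roundTrip(euro, StandardCharsets.ISO_8859_1));         // ?
    }
}
```

With a fixed ISO-8859-1 conversion the character is lost before the parser ever sees it, which is why using the tester's charset for both conversions matters.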

