Plexus Components

Manifest entries containing line feeds are invalid

Details

  • Number of attachments :
    3

Description

The attached test case demonstrates, that the plexus-archiver creates invalid archives, if the values of Manifest attributes contain line feeds. The archiver should either throw an exception and refuse such attributes or convert the values silently.

This issue is related to MJAR-38 : The maven-archiver creates such entries (thus invalid jar files).

  1. BlanksTest.java
    14/Jun/06 4:04 PM
    4 kB
    Jerome Lacoste
  2. BlanksTest.java
    14/Jun/06 5:55 AM
    3 kB
    Jochen Wiedmann
  3. PLX-225.patch
    15/Jun/06 8:31 AM
    12 kB
    Jochen Wiedmann

Issue Links

Activity

Hide
Jerome Lacoste added a comment -

Aren't multiline attributes ('continuations') allowed in Manifest files? See http://java.sun.com/j2se/1.4.2/docs/guide/jar/jar.html
See also PLX-185

Show
Jerome Lacoste added a comment - Aren't multiline attributes ('continuations') allowed in Manifest files? See http://java.sun.com/j2se/1.4.2/docs/guide/jar/jar.html See also PLX-185
Hide
Jochen Wiedmann added a comment -

The quoted specification defines "otherchar" as "any UTF-8 character except NUL, CR and LF". Also note that the specification quotes examples of binary data (digests, and signatures) where white space may very well be converted to blanks.

plexus-archiver does ensure the 72 bytes limit. Nevertheless, the created jar files are invalid.

Show
Jochen Wiedmann added a comment - The quoted specification defines "otherchar" as "any UTF-8 character except NUL, CR and LF". Also note that the specification quotes examples of binary data (digests, and signatures) where white space may very well be converted to blanks. plexus-archiver does ensure the 72 bytes limit. Nevertheless, the created jar files are invalid.
Hide
Jerome Lacoste added a comment -

I am still not convinced (or I misunderstood you):

value: SPACE *otherchar newline *continuation
newline: CR LF | LF | CR (not followed by LF)
continuation: SPACE *otherchar newline
otherchar: any UTF-8 character except NUL, CR and LF

seems to indicate that a value may be coded on multiple lines using continuations. Your test case would imply plexus-archiver should convert \r\n elements to one or more spaces, and I don't see that in the spec.

Show
Jerome Lacoste added a comment - I am still not convinced (or I misunderstood you): value: SPACE *otherchar newline *continuation newline: CR LF | LF | CR (not followed by LF) continuation: SPACE *otherchar newline otherchar: any UTF-8 character except NUL, CR and LF seems to indicate that a value may be coded on multiple lines using continuations. Your test case would imply plexus-archiver should convert \r\n elements to one or more spaces, and I don't see that in the spec.
Hide
Jochen Wiedmann added a comment -

My position is, that when converting a set of continuation lines into the original value, then I have to drop the initial SPACE as well as any sequence
of (newline SPACE) and the terminating newline. In other words, the original value consists of otherchar*, which cannot contain any CR, NUL, or LF.

Show
Jochen Wiedmann added a comment - My position is, that when converting a set of continuation lines into the original value, then I have to drop the initial SPACE as well as any sequence of (newline SPACE) and the terminating newline. In other words, the original value consists of otherchar*, which cannot contain any CR, NUL, or LF.
Hide
Jochen Wiedmann added a comment -

See also the following quote from http://java.sun.com/j2se/1.4.2/docs/guide/jar/jar.html

  1. Limitations:

Because header names cannot be continued, the maximum length of a header name is 70 bytes (there must be a colon and a SPACE after the name).

NUL, CR, and LF can't be embedded in header values, and NUL, CR, LF and ":" can't be embedded in header names.

Implementations should support 65535-byte (not character) header values, and 65535 headers per file. They might run out of memory, but there should not be hard-coded limits below these values.

Show
Jochen Wiedmann added a comment - See also the following quote from http://java.sun.com/j2se/1.4.2/docs/guide/jar/jar.html
  1. Limitations:
Because header names cannot be continued, the maximum length of a header name is 70 bytes (there must be a colon and a SPACE after the name). NUL, CR, and LF can't be embedded in header values, and NUL, CR, LF and ":" can't be embedded in header names. Implementations should support 65535-byte (not character) header values, and 65535 headers per file. They might run out of memory, but there should not be hard-coded limits below these values.
Hide
Jerome Lacoste added a comment -

Fixed test cases that demonstrates that the jar manifest created by plexus can be reread by the offical SDK.

There are 2 issues in your test case:

  • you have a '.' in the attribute value
  • you expect the attribute value to be stored in one line

In a manifest file,

name: value

is the same as

name: v
a
l
u
e

Show
Jerome Lacoste added a comment - Fixed test cases that demonstrates that the jar manifest created by plexus can be reread by the offical SDK. There are 2 issues in your test case:
  • you have a '.' in the attribute value
  • you expect the attribute value to be stored in one line
In a manifest file, name: value is the same as name: v a l u e
Hide
Steven Coco added a comment -

Hi all.

I posted a fixed-up test case to the related MJAR-39 issue, which you might also like to double-check with. (My original posted test was a mixed-up file upload that wasn't pertinent. This one should be useful; and there's an easy README showing what to do with it.)

Thanks...
Steven Coco.

Show
Steven Coco added a comment - Hi all. I posted a fixed-up test case to the related MJAR-39 issue, which you might also like to double-check with. (My original posted test was a mixed-up file upload that wasn't pertinent. This one should be useful; and there's an easy README showing what to do with it.) Thanks... Steven Coco.
Hide
Jochen Wiedmann added a comment -

Ok, after studying the writeValue code, I draw back and admit that the plexus-archiver produces valid manifest values.

Nevertheless, I disagree with the implementation. Look at it this way: A value of

xyz\r\nzyx

is converted to a continuation line. We agree, that the user wanted to have a newline between xyz and zyx. Well we can't give him. But if he can't have that, then he will almost definitely want at least a blank between xyz and zyx, and not xyzzyx, which is what he gets now. I hope we can agree on that.

I change my request in that sense and will soon apply a patch, which does exactly that. I would also recommend to lower the requests priority from major to minor. (Which I cannot do.)

Show
Jochen Wiedmann added a comment - Ok, after studying the writeValue code, I draw back and admit that the plexus-archiver produces valid manifest values. Nevertheless, I disagree with the implementation. Look at it this way: A value of xyz\r\nzyx is converted to a continuation line. We agree, that the user wanted to have a newline between xyz and zyx. Well we can't give him. But if he can't have that, then he will almost definitely want at least a blank between xyz and zyx, and not xyzzyx, which is what he gets now. I hope we can agree on that. I change my request in that sense and will soon apply a patch, which does exactly that. I would also recommend to lower the requests priority from major to minor. (Which I cannot do.)
Hide
Jochen Wiedmann added a comment -

Patch, which changes the current way of eliminating all CR's and LF's to merging consecutive CR's and LF's into a single blank.

Show
Jochen Wiedmann added a comment - Patch, which changes the current way of eliminating all CR's and LF's to merging consecutive CR's and LF's into a single blank.
Hide
Jerome Lacoste added a comment -

I don't think the archiver should try to interpret what the user wanted to do. Example: if the user specifies a value of

xyz\r\abdcef.....abcdef\r\n

with the UTF representation of the abdcef.....abcdef string long of 70 bytes, the 'smart' archiver will replace the \r\n by a space, then cut the abcd...abcdef string, because the full length will be over 70 characters, which may not be what the user wanted...

Show
Jerome Lacoste added a comment - I don't think the archiver should try to interpret what the user wanted to do. Example: if the user specifies a value of xyz\r\abdcef.....abcdef\r\n with the UTF representation of the abdcef.....abcdef string long of 70 bytes, the 'smart' archiver will replace the \r\n by a space, then cut the abcd...abcdef string, because the full length will be over 70 characters, which may not be what the user wanted...
Hide
Jerome Lacoste added a comment -

Or maybe not as the continuation will only appear in the raw Manifest and not in the Manifest attribute values when read by a program.
Still converting characters is strange to me. Why can't the user write "xyz\r\n zyx" instead if he really wants a space?

Show
Jerome Lacoste added a comment - Or maybe not as the continuation will only appear in the raw Manifest and not in the Manifest attribute values when read by a program. Still converting characters is strange to me. Why can't the user write "xyz\r\n zyx" instead if he really wants a space?
Hide
Jochen Wiedmann added a comment -

The user can write xyz\r\n zyx, if he is aware that a conversion occurs. However, most probably, he is unaware. In which case it seems sensible to convert to a value which comes as close to the users input as possible, (And, please note: I am talking about the logical value, which is returned by Attributes.getValue(entryName), as opposed to the physical value, which is written into the Manifest. Obviously, the logical value matters.)

Show
Jochen Wiedmann added a comment - The user can write xyz\r\n zyx, if he is aware that a conversion occurs. However, most probably, he is unaware. In which case it seems sensible to convert to a value which comes as close to the users input as possible, (And, please note: I am talking about the logical value, which is returned by Attributes.getValue(entryName), as opposed to the physical value, which is written into the Manifest. Obviously, the logical value matters.)
Hide
Kristian Rosenvold added a comment -

Plexus archiver now uses jdk manifest class and I believe this problem is gone

Show
Kristian Rosenvold added a comment - Plexus archiver now uses jdk manifest class and I believe this problem is gone

People

Vote (1)
Watch (3)

Dates

  • Created:
    Updated:
    Resolved: