Looking at the profile data for 0.95 and 0.96, the top methods for 0.95 (listed in ascending order of sample count) are:
61.0 (1.9457735247208932%) Lorg/jikesrvm/jni/VM_JNIFunctions;.ReleaseStringUTFChars (Lorg/jikesrvm/jni/VM_JNIEnvironment;ILorg/vmmagic/unboxed/Address;)V BOOT
67.0 (2.1371610845295055%) Lorg/jikesrvm/runtime/VM_Runtime;.clone (Ljava/lang/Object;)Ljava/lang/Object; BOOT
82.0 (2.615629984051037%) Lorg/jikesrvm/runtime/VM_Runtime;.resolvedNewArray (III[Ljava/lang/Object;IIII)Ljava/lang/Object; BOOT
89.0 (2.8389154704944177%) Lorg/apache/xalan/serialize/SerializerToXML;.characters ([CII)V
114.0 (3.6363636363636362%) Lorg/jikesrvm/scheduler/greenthreads/VM_FileSystem;.writeBytes (I[BII)I BOOT
151.0 (4.81658692185008%) Lorg/apache/xalan/templates/ElemApplyTemplates;.transformSelectedNodes (Lorg/apache/xalan/transformer/TransformerImpl;)V
and the top 5 for 0.96 are:
72.0 (2.187784867821331%) Lorg/jikesrvm/runtime/VM_Runtime;.clone (Ljava/lang/Object;)Ljava/lang/Object; BOOT
94.0 (2.8562746885445156%) Lorg/jikesrvm/scheduler/greenthreads/VM_FileSystem;.writeBytes (I[BII)I BOOT
107.0 (3.2512914007900338%) Lgnu/java/nio/charset/ISO_8859_1$Encoder;.encodeLoop (Ljava/nio/CharBuffer;Ljava/nio/ByteBuffer;)Ljava/nio/charset/CoderResult; BOOT
109.0 (3.3120632026739596%) Lorg/apache/xalan/serialize/SerializerToXML;.characters ([CII)V
159.0 (4.831358249772106%) Lorg/apache/xalan/templates/ElemApplyTemplates;.transformSelectedNodes (Lorg/apache/xalan/transformer/TransformerImpl;)V
My eye is drawn to Lgnu/java/nio/charset/ISO_8859_1$Encoder;.encodeLoop, a loop that reads a char from one buffer and then writes a byte to another. It appears in the 0.96 trace but not in the 0.95 trace. There is a TODO item to make this use array copies (I did a patch for this a while ago but never replicated it for other charsets). I'm going to repeat the profiling to see if it's possible to clearly identify the culprit.
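
To illustrate the difference the TODO is getting at, here is a rough sketch of the two styles of loop. This is not the GNU Classpath source and not the patch mentioned above; the class name Latin1EncodeSketch, the Result enum and both method names are made up for illustration. It assumes "use array copies" means working directly on the buffers' backing arrays when they have them, instead of paying for a buffer get/put call per character.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;

/** Hypothetical sketch only; names are illustrative, not from GNU Classpath. */
public class Latin1EncodeSketch {
    /** Simplified stand-in for java.nio.charset.CoderResult. */
    enum Result { UNDERFLOW, OVERFLOW, UNMAPPABLE }

    /** Per-element style: one buffer read and one buffer write per char. */
    static Result encodeElementwise(CharBuffer in, ByteBuffer out) {
        while (in.hasRemaining()) {
            if (!out.hasRemaining()) return Result.OVERFLOW;
            char c = in.get(in.position());      // peek without consuming
            if (c > 0xFF) return Result.UNMAPPABLE;
            in.position(in.position() + 1);      // consume the char
            out.put((byte) c);
        }
        return Result.UNDERFLOW;
    }

    /** Array style: loop over the backing arrays, update positions once. */
    static Result encodeArrays(CharBuffer in, ByteBuffer out) {
        // Read-only or direct buffers expose no array; fall back.
        if (!in.hasArray() || !out.hasArray()) return encodeElementwise(in, out);

        char[] src = in.array();
        byte[] dst = out.array();
        int sp = in.arrayOffset() + in.position();
        int dp = out.arrayOffset() + out.position();
        int n = Math.min(in.remaining(), out.remaining());
        Result result = in.remaining() > out.remaining()
                ? Result.OVERFLOW : Result.UNDERFLOW;

        int i = 0;
        for (; i < n; i++) {
            char c = src[sp + i];
            if (c > 0xFF) { result = Result.UNMAPPABLE; break; }
            dst[dp + i] = (byte) c;              // ISO-8859-1: low byte of the char
        }
        // Advance both buffers by the number of chars actually encoded.
        in.position(in.position() + i);
        out.position(out.position() + i);
        return result;
    }
}
```

Both methods should leave the buffers in the same state for the same input; whether the array version is actually faster on Jikes RVM is exactly what the profiling needs to show.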
The problem has repeated across a couple of performance runs, so it looks real.