AMD have announced SSE5 with new compare instructions that should allow us to more easily match Java's semantics:
http://developer.amd.com/assets/sse5_43479_BDAPMU_3-00_8-27-07.pdf
I think this strengthens the case for staging the compare instructions we generate according to the SSE version number:
SSE=0 => use x87 fcomi
SSE=1 or 2 => use comiss for singles and fcomi for doubles
SSE=3 or 4 => use comiss and comisd
SSE=5 => use comss and comsd
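
As a rough sketch of what that staging could look like, here is the table above as a small Java helper. The class and method names (FpCompareSelector, select) are made up for illustration and are not from any existing code. It also assumes that a level above 5 should behave like 5, which the list above does not say.

```java
// Hypothetical helper; the names here are illustrative only.
final class FpCompareSelector {
    private FpCompareSelector() {}

    /**
     * Returns the mnemonic of the floating-point compare instruction to emit
     * for the given SSE level (0 = x87 only) and operand width.
     */
    static String select(int sseLevel, boolean isDouble) {
        if (sseLevel < 0) {
            throw new IllegalArgumentException("negative SSE level: " + sseLevel);
        }
        if (sseLevel == 0) {
            return "fcomi";                        // x87 for both widths
        }
        if (sseLevel <= 2) {
            return isDouble ? "fcomi" : "comiss";  // doubles stay on x87
        }
        if (sseLevel <= 4) {
            return isDouble ? "comisd" : "comiss";
        }
        // Assumption: anything above 5 is treated the same as SSE5.
        return isDouble ? "comsd" : "comss";
    }
}
```

For example, FpCompareSelector.select(2, true) gives "fcomi" and FpCompareSelector.select(5, false) gives "comss". Returning a string keeps the sketch self-contained; a real code generator would presumably return an opcode or emit the instruction directly.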