Details
Description
The identity hashcode of an object is its initial address. This means the bottom bits of the hashcode will always be 0, and with a generational GC hashcode collisions are quite likely. We should probably do a better job and use a more random identity hashcode that results in fewer hashtable collisions and the like.
This is inspired by analysis performed by Jimmy, Jing Lv on the use of the Harmony HashMap:
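To illustrate why always-zero low bits hurt, here is a minimal sketch. It assumes a simplified power-of-two table that picks a bucket with `hash & (tableSize - 1)` and applies no extra bit mixing. The class name, helper names, base address and stride are all hypothetical and are not taken from the RVM or Harmony sources.

```java
import java.util.HashSet;
import java.util.Set;

public class AddressHashBuckets {
    // Counts how many distinct buckets of a power-of-two table are hit when
    // the bucket index is simply (hash & (tableSize - 1)).
    static int bucketsUsed(int[] hashes, int tableSize) {
        Set<Integer> used = new HashSet<>();
        for (int h : hashes) {
            used.add(h & (tableSize - 1));
        }
        return used.size();
    }

    // Hypothetical object addresses: 'count' objects placed every 'stride'
    // bytes starting at 'base'.
    static int[] addresses(int base, int stride, int count) {
        int[] result = new int[count];
        for (int i = 0; i < count; i++) {
            result[i] = base + i * stride;
        }
        return result;
    }

    public static void main(String[] args) {
        // Raw addresses used directly as hashcodes, objects 8 bytes apart.
        int[] raw = addresses(0x10000, 8, 64);
        // 64 objects land in only 8 of the 64 buckets.
        System.out.println(bucketsUsed(raw, 64)); // prints 8
    }
}
```

Real hash maps often mix the bits of the hashcode before indexing, which reduces the effect but does not excuse a poorly distributed identity hash.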
http://mail-archives.apache.org/mod_mbox/harmony-dev/200707.mbox/raw/%3c5c8e69f0707032218i159b99cbu84b29a5d469469eb@mail.gmail.com%3e
Why anyone would return a raw address as a hashcode I can only guess.
Our address based hashing function is
ADDRESS >>> LOG_BYTES_IN_ADDRESS
As we sometimes 8-byte align objects, the distribution of our hash will be uneven and give more even numbers than odd.
We should improve it by making it:
ADDRESS >>> LOG_MAX_ALIGNMENT
where the maximum alignment is currently 8 bytes for Intel, I believe (so LOG_MAX_ALIGNMENT would be 3).
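The parity skew and the effect of a larger shift can be checked with a small sketch. It assumes 32-bit addresses (so LOG_BYTES_IN_ADDRESS is 2) and a maximum alignment of 8 bytes (so the larger shift is 3). The class, the helpers and the base address are hypothetical and only stand in for the real header code.

```java
public class AddressShiftParity {
    static final int LOG_BYTES_IN_ADDRESS = 2; // assumption: 4-byte addresses
    static final int LOG_MAX_ALIGNMENT = 3;    // assumption: 8-byte maximum alignment

    // Address-based hash with a configurable shift.
    static int hash(int address, int shift) {
        return address >>> shift;
    }

    // Hypothetical addresses: 'count' objects aligned to 'alignment' bytes.
    static int[] alignedAddresses(int base, int alignment, int count) {
        int[] result = new int[count];
        for (int i = 0; i < count; i++) {
            result[i] = base + i * alignment;
        }
        return result;
    }

    // Number of addresses whose hash is odd for the given shift.
    static int oddCount(int[] addresses, int shift) {
        int odd = 0;
        for (int a : addresses) {
            if ((hash(a, shift) & 1) == 1) {
                odd++;
            }
        }
        return odd;
    }

    public static void main(String[] args) {
        int[] eightByteAligned = alignedAddresses(0x10000, 8, 16);
        // Current shift: every hash of an 8-byte-aligned object is even.
        System.out.println(oddCount(eightByteAligned, LOG_BYTES_IN_ADDRESS)); // prints 0
        // Larger shift: odd and even hashes are balanced again.
        System.out.println(oddCount(eightByteAligned, LOG_MAX_ALIGNMENT));    // prints 8
    }
}
```

One trade-off of the larger shift is that two addresses inside the same 8-byte window map to the same hash. This should only matter if two distinct objects can start that close together.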
J9 appears to have made a similar mistake in that all values are even. Perhaps they didn't keep their hash in sync with their minimum alignment?