What is the difference between hashmap implementation in Java 8 and pre-Java 8?
The difference comes with the type of storage used for clashing buckets. The HashMap has always worked in a way where the table contains several buckets.
When a hash is calculated from the key, it then points to which bucket it needs to go to.
However, it is possible that two keys calculate to the same hash or even more likely points to the same bucket.
This is referred to as a hash clash. And this is why a bucket is used - to hold more than just one item in that spot.
Now, prior to J8, this bucket was just a linked list. I.e. if you did have some clashes in your data, then looking between them was a normal linear search on that bucket.
What J8 changed was that this became adaptive - e.g. if there’s only a handful of clashing items in a bucket it’s still the same.
But once there are more than a predetermined amount of clashing items in the same bucket - it is turned into a balanced binary tree instead.
The benefit of this is that if the hashing formula isn’t great and tends to cause many clashes, your average seeks time is still not as bad as an O(N) linear search.
It’s closer to the O(log N) of a binary search instead.
That’s the major difference since J8. There are also some other ancillary changes affecting this.
For E.g. the default hash formula on strings was optimized to be both faster and allow for fewer clashes on average.
Thus if your key is a string (some someone’s name in a contacts list),
it should calculate that hash faster than before and the likelihood that it would calculate a hash pointing to the same bucket is less.
No comments:
Post a Comment