Pre-Allocated Methods for HashMap and HashSet - Sip of Java
Billy Korando on October 24, 2022The classes HashMap
and HashSet
have been part of the JDK since JDK 1.2; however, for much of this time, these classes, and child classes of them, have had a less-than-optimal usage pattern. In this article, we will explore what common mistake developers have been making when using these classes, the impact of this mistake, and how a new factory method added to these classes in JDK 19 aims to address this. Let’s take a look.
Setting Initial Capacity
The classes HashMap
and HashSet
, and child classes, LinkedHashMap
, LinkedHashSet
, and WeakHashMap
, provide constructors where a “capacity” value can be set. In fact, this value is the internal table size for the Map or Set, which needs to be larger than the number of entries the Map or Set is expected to contain. Specifying the wrong value can lead to costly reallocations while the Map or Set is being populated.
Setting Pre-Allocated Maps and Sets
To address this issue in JDK 19 the static factory method new<TypeName>(int numMappings)
(e.g. newHashMap(int numMappings)
) was added to HashMap
, HashSet
and child classes. The new factory method will return a Map or Set with its capacity set based on the default load factory of 0.75
. The result of this change is a small but noticeable improvement in performance, as seen in the below table:
Method used: Elements: Execution time (avg):
Constructor 10 380.079 ns/op
Factory 10 378.929 ns/op
Constructor 100 4135.819 ns/op
Factory 100 3847.426 ns/op
Constructor 1000 39970.521 ns/op
Factory 1000 37568.595 ns/op
Note: See additional reading for a link to the repo of the benchmarking tests the above results were taken. Performance testing is difficult and time-consuming, and results will vary based on usage and system architecture.
Using Pre-Allocated Maps and Sets
When developers use a Map or Set, the number of elements might not be known. In this case, the benefit of using a pre-allocated factory method is minimal or might result in excessive memory usage. The pre-allocated factory method should only be used when you have a very good understanding of the expected collection capacity.
Additional Reading
- HashMap Javadoc
- Need method to create pre-sized HashMap
- Pre-allocated Benchmark tests
- OpenJDK: Where the Magic Happens (Presentation that features the change introducing pre-allocated methods as an example of contributing to OpenJDK)
Happy coding!