How do 'hot and cold' objects behave?


The work presented here is performed as part of the joint research project between Oracle, Uppsala University and KTH. Follow the blog series here at inside.java to read more about the JVM research performed at the Oracle Development office in Stockholm.


My name is Hanna Nyblom. I wrote my master’s thesis at Oracle during the latter half of 2019, concluding my Master of Science in Computer Science at KTH (Royal Institute of Technology, Stockholm, Sweden).

My thesis ‘An Experimental Study on the Behavioural Tendencies of Objects Classified As Hot and Cold by a Java Virtual Machine Garbage Collector’ examines the behaviour of objects classified as hot (recently referenced) or cold by the ThinGC garbage collector. The main behavioural tendencies I set out to examine were whether objects tend to stay cold once they have become cold, whether cold streaks tend to be longer than hot streaks, and whether objects tend to be predominantly hot or cold during their lifetime. These behaviours were examined for objects in general but also for objects of different classes, to determine whether some classes behave distinctively.

I started out learning about ZGC and ThinGC. I got a great introduction from Albert Mingkung Yang, a PhD student from Uppsala University with whom I got the opportunity to collaborate. Albert is the architect of ThinGC, an ‘extension’ of ZGC which separates hot and cold objects into different memory spaces and collects the two spaces using two different garbage collectors.

Once I had a better understanding of ZGC and ThinGC, I was able to modify the ThinGC code slightly to log information (address, hotness, and class) about each encountered object, as well as every forwarding of an object from one address to another. Since ThinGC is developed from ZGC, I could also learn from the ZGC developers at the office and make sure the correct information was logged at the right phases of the GC cycle. I then worked on a script to summarize the logged information and collect all information pertaining to the same object. This collected object information could be used to calculate metrics that answered my research questions. Here I spent time understanding which metrics would best answer my research questions and choosing which metrics to present. Some examples of chosen metrics are the longest hot/cold streak and the number of ‘reheats’ (transitions from cold storage to hot storage). This again was a collaboration, and I took input from Albert.
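To make the metrics a little more concrete, here is a minimal Python sketch of how streaks and reheats could be computed for a single object. It is not the script from the thesis: the function name `summarize_hotness`, the `"hot"`/`"cold"` labels, and the assumption that each object's history is a list with one classification per GC cycle are all hypothetical choices made for illustration.

```python
from itertools import groupby


def summarize_hotness(history):
    """Summarize one object's hot/cold history.

    `history` is assumed (hypothetically) to hold one "hot" or "cold"
    label per GC cycle in which the object was observed, in order.
    """
    history = list(history)

    # Longest run of consecutive identical labels, per label.
    longest = {"hot": 0, "cold": 0}
    for state, run in groupby(history):
        longest[state] = max(longest[state], len(list(run)))

    # A reheat is a transition from cold to hot between adjacent cycles.
    reheats = sum(
        1
        for prev, cur in zip(history, history[1:])
        if prev == "cold" and cur == "hot"
    )

    # Share of observed cycles in which the object was hot.
    hot_fraction = history.count("hot") / len(history) if history else 0.0

    return {
        "longest_hot_streak": longest["hot"],
        "longest_cold_streak": longest["cold"],
        "reheats": reheats,
        "hot_fraction": hot_fraction,
    }


if __name__ == "__main__":
    example = ["hot", "hot", "cold", "cold", "cold", "hot", "cold"]
    summary = summarize_hotness(example)
    print(summary["longest_hot_streak"])   # 2
    print(summary["longest_cold_streak"])  # 3
    print(summary["reheats"])              # 1
```

In this sketch an object would count as predominantly hot when `hot_fraction` is above 0.5. Building each per-object history in the first place requires following the logged forwardings, which is left out here.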

The results of the thesis showed that most objects did tend to stay cold once they had become cold, and that cold streaks did tend to be longer than hot streaks. The results also showed that most objects are actually predominantly hot during their lifetime, due to the prevalence of short-lived objects, which are consistently hot. When disregarding consistently hot or consistently cold objects, however, objects were more likely to be mostly cold than mostly hot. The results also showed distinct behaviour for different classes.


These results could be used to justify the ‘ThinGC’ approach. They could also possibly be used to improve GC tuning and, if the metrics were calculated concurrently, to support freeze/reheat decisions and anticipate reheats.

If you are interested in learning more, my thesis can be found here.

I think collaborating with a PhD student, Albert Mingkung Yang, really elevated my thesis, and I’m thankful for the experience. It also gave me a much better chance to actually contribute to research. I’m glad that my results were ultimately useful to Albert, who will be presenting his paper ‘ThinGC: Complete Isolation With Marginal Overhead’ at the International Symposium on Memory Management 2020.

Thanks to the people at Oracle Stockholm, my experience was very educational. The team was very helpful, and I’m pleased I got to dive into garbage collection, an area that was relatively new to me, and focus my master’s thesis on such an interesting topic.