JVM Tuning with Machine Learning on Garbage Collection Logs
Yagmur Eren on January 13, 2025
The work presented here was performed as part of a joint research project between Oracle, Uppsala University, and KTH.
My name is Yagmur, and I recently completed my master’s degree in ICT Innovation—Embedded Systems at KTH. I had a great thesis opportunity at Oracle Stockholm during the Spring 2024 semester. I am truly grateful for this opportunity and to all my colleagues at Oracle, especially my supervisors, Poonam Parhar and Marina Shimchenko. In this blog, I highlight the key points of my thesis. I hope you find it engaging!
As a developer or system administrator, you know how crucial the performance of Java applications can be. It’s not just about writing efficient code; the runtime performance, governed by the Java Virtual Machine (JVM), plays a significant role, too. Optimizing the JVM can be difficult, often requiring extensive expertise and effort. But what if machine learning (ML) could help us in this process?
In my thesis project, I explored a novel approach to JVM auto-tuning using ML algorithms by training them with the information gathered from Garbage Collection (GC) logs. This blog will walk you through my thesis project’s key concepts, methodology, and exciting outcomes.
JVM Basics
The JVM is the engine that executes compiled Java bytecode, allowing the same program to run on any operating system. Within the JVM, memory management is handled by the garbage collector, which reclaims the memory used by objects that are no longer needed. This process is critical for maintaining application performance. The JVM exposes many memory-related options that we can set for optimal performance, such as the heap size and the ratio between the space allocated to young and old objects. We can also obtain detailed records of the JVM’s memory management activities, printed to a file known as the GC log. By analyzing these logs, we can gain valuable insights into an application’s behavior and performance. This information can help ML algorithms find the connection between memory allocation in the JVM and application performance.
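For example, a detailed GC log can be produced with unified JVM logging (-Xlog:gc*, available since JDK 9). The short sketch below launches a Java application with the G1 collector and GC logging enabled; the jar name, heap size, and log file name are placeholders, not the exact setup from the thesis:

```python
import subprocess

# Run a Java application with the G1 collector and detailed GC logging.
# "benchmark.jar", the 2 GB heap, and "gc.log" are placeholder values.
subprocess.run([
    "java",
    "-Xmx2g",                   # fixed maximum heap size
    "-XX:+UseG1GC",             # G1 garbage collector
    "-Xlog:gc*:file=gc.log",    # write detailed GC events to gc.log
    "-jar", "benchmark.jar",
])
```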
The Problem and Motivation
The main challenge we addressed was whether ML algorithms could suggest optimal JVM flag values, particularly memory-related flags, to improve application performance based on information extracted from GC logs. Specifically, we focused on tuning two JVM flags related to the young generation size, which significantly impacts garbage collection performance. The research question we aimed to answer was: to what extent can the performance of a Java application be improved by using JVM flag values suggested by a machine learning model trained on GC log data from that application?
The Approach
Our approach involved several steps:
- Collecting GC Logs: We started by running Java benchmarks with the G1 garbage collector to generate GC logs, which served as our raw data for analysis.
- Feature Engineering: From these logs, we extracted features such as heap usage, GC pause times, and memory allocation rates. We also engineered new features, such as the allocation rate (the speed at which the application allocates data), to better understand memory utilization patterns.
- Data Processing: We processed the data using a sliding time window approach, calculating performance-related metrics for each window, such as net application time (time spent only in application threads), GC frequency, and maximum GC pause time (a parsing and windowing sketch follows this list).
- Training ML Models: We selected five ML models: Support Vector Regression (SVR), Gradient Boosting (GB), Random Forest Regression (RFR), Extra Trees (ET), and Gaussian Process Regressor (GPR). These models were trained on the processed data to predict the optimal maximum and minimum young generation size within a given heap size (see the training sketch after this list).
- Prediction: We prepared data samples with the poorest performance-related metrics within each time window and used the models to predict the maximum and minimum young generation sizes, set through the JVM flags -XX:G1MaxNewSizePercent and -XX:G1NewSizePercent, that would maximize these metrics and thereby improve overall application performance.
- Applying the Tuned Flags: Finally, we ran the Java application with the suggested flag values and measured the resulting performance improvements (see the last sketch after this list).
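To make the first three steps more concrete, here is a minimal Python sketch of the kind of GC log parsing and windowing involved. It assumes pause lines in the unified-logging format produced by -Xlog:gc; the window length, the regex, and the exact metrics are illustrative assumptions, not the pipeline used in the thesis:

```python
import re
from collections import defaultdict

# Matches G1 pause lines in unified-logging format, e.g.:
# [12.345s][info][gc] GC(42) Pause Young (Normal) (G1 Evacuation Pause) 512M->128M(2048M) 7.123ms
PAUSE_RE = re.compile(
    r"\[(?P<ts>\d+\.\d+)s\].*GC\(\d+\) Pause.*\s(?P<ms>\d+\.\d+)ms"
)

WINDOW_SEC = 60.0  # hypothetical sliding-window length


def window_metrics(log_path):
    """Group GC pauses into fixed time windows and compute per-window metrics."""
    windows = defaultdict(list)
    with open(log_path) as f:
        for line in f:
            m = PAUSE_RE.search(line)
            if m:
                ts, pause_ms = float(m.group("ts")), float(m.group("ms"))
                windows[int(ts // WINDOW_SEC)].append(pause_ms)

    rows = []
    for w, pauses in sorted(windows.items()):
        gc_time = sum(pauses) / 1000.0          # seconds spent paused for GC
        rows.append({
            "window": w,
            "gc_frequency": len(pauses),        # number of pauses in the window
            "max_pause_ms": max(pauses),
            "net_app_time": WINDOW_SEC - gc_time,  # time left for application threads
        })
    return rows
```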
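The five model families map directly onto estimators available in, for example, scikit-learn (the thesis’s actual tooling and feature/target layout are more involved than this). The snippet below is only a stand-in showing the models being fit and queried, with random arrays in place of the real per-window features and young-generation size targets:

```python
import numpy as np
from sklearn.ensemble import (ExtraTreesRegressor, GradientBoostingRegressor,
                              RandomForestRegressor)
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.svm import SVR

# One candidate estimator per model family evaluated in the project.
models = {
    "SVR": SVR(),
    "GB": GradientBoostingRegressor(),
    "RFR": RandomForestRegressor(),
    "ET": ExtraTreesRegressor(),
    "GPR": GaussianProcessRegressor(),
}

# X: per-window features (e.g. GC frequency, max pause, net application time,
# allocation rate); y: young-generation size percentage. Random placeholders here.
rng = np.random.default_rng(0)
X = rng.random((200, 4))
y = rng.integers(5, 60, size=200).astype(float)

for name, model in models.items():
    model.fit(X, y)

# For the worst-performing windows, ask a trained model which size it suggests.
worst_windows = X[:5]
suggested = models["GPR"].predict(worst_windows)
print(np.round(suggested, 1))
```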
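Applying the suggested values then amounts to restarting the application with the predicted percentages. Both G1 flags are experimental HotSpot options, so they have to be unlocked first; the jar name and the numbers below are placeholders:

```python
import subprocess

# Percentages suggested by the ML model; the numbers here are placeholders.
new_size_pct, max_new_size_pct = 30, 60

subprocess.run([
    "java",
    "-XX:+UseG1GC",
    "-XX:+UnlockExperimentalVMOptions",        # required for the two G1 flags below
    f"-XX:G1NewSizePercent={new_size_pct}",
    f"-XX:G1MaxNewSizePercent={max_new_size_pct}",
    "-jar", "benchmark.jar",                   # placeholder application
])
```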
Key Findings
Our findings were quite promising. The ML models demonstrated considerable potential, improving throughput by up to 20% compared to the default JVM flag settings while maintaining acceptable latency levels. First, we significantly improved the application’s throughput by adjusting the minimum young generation size (-XX:G1NewSizePercent). Then, by auto-tuning the maximum young generation size (-XX:G1MaxNewSizePercent), we optimized latency without compromising the improved throughput. For the benchmarks used throughout the project, GPR appeared to be the most promising of the ML models.
Conclusion and Future Work
This project was a significant initial step in exploring how ML can automate the tedious and complex task of JVM tuning, making it accessible even to those without deep expertise in JVM internals. In conclusion, it demonstrates that using ML models to auto-tune JVM options based on what they learn from GC logs can lead to notable performance improvements. However, this is just the beginning. Future work could expand this approach to cover more JVM flags with the same method and explore other types of garbage collectors to further increase its applicability and effectiveness. Thank you for joining me on this journey into JVM auto-tuning with ML! I hope you found my thesis project insightful and inspiring for your own performance optimization endeavors. Here you can find the thesis and read all the details summarized in this post.