Performance Improvements in JDK 26

by Ana-Maria Mihalceanu, Per-Ake Minborg on June 9, 2026

JDK 26 is the latest feature release of the Java platform and comes with more than 2500 issues fixed, of which more than a thousand were enhancements. To give you a clearer view of the performance work happening across the Java platform, this article highlights a selection of notable performance-related improvements in JDK 26, grouped into four major areas: JDK Libraries, Garbage Collectors, Compiler, and Runtime.

Enhancements in JDK Libraries

JEP 526: Lazy Constants (Second Preview)

The preview of the Lazy Constants API introduces java.lang.LazyConstant, an object that holds a single, unmodifiable value which is initialized on demand. After object initialization, the JVM can treat the value as constant, enabling optimizations similar to those available for final fields (constant folding), without the need to perform eager initialization in constructors or class initializers. You keep at-most-once initialization and thread safety, but you can initialize the object later, only if the value is actually needed, improving startup time and reducing unnecessary work.

The API first appeared in JDK 25 as Stable Values (JEP 502), and the version revised in JDK 26 incorporates substantial community feedback gathered from early adopters. The redesign renames StableValue to LazyConstant, removes the low-level methods (orElseSet, setOrThrow, trySet) in favor of factories that take value-computing functions, and disallows null as a computed value to simplify and speed up the runtime model (aligning it with the semantics of unmodifiable collections and ScopedValue). Discoverability also improves because the factories for lazy collections now live in List and Map (List.ofLazy, Map.ofLazy).

As shown in the snippet below, you can replace "mutable + null-check + synchronization" patterns with LazyConstant.of(() -> compute()), and call get() wherever you need the value.

import java.lang.LazyConstant;

final class Application {
    private static final LazyConstant<Service> SERVICE = LazyConstant.of(Service::new);

    static Service service() {
        return SERVICE.get();
    }
}

Initialization is guaranteed to occur at most once, even under concurrent access: if multiple threads race, one wins and publishes the value safely. Under the hood, the mechanism still relies on JVM support for "stable" fields, so once the lazy value is set, repeated accesses can be optimized aggressively, provided that the LazyConstant itself is stored in a final field. This feature serves as a middle ground between eager initialization and peak performance: you can defer work out of startup, but once the value exists the JVM may optimize access similarly to a final constant.
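For contrast, here is a minimal sketch of the "mutable + null-check + synchronization" pattern that a lazy constant is meant to replace, written as the classic double-checked locking idiom. The names LegacyService and LegacyApplication are hypothetical and exist only for this illustration. Note that the field has to be volatile and cannot be final, so the JVM has no basis for treating the value as a constant.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical service type, used only for illustration.
final class LegacyService {
    // Counts constructions so the at-most-once behavior can be observed.
    static final AtomicInteger CREATED = new AtomicInteger();

    LegacyService() {
        CREATED.incrementAndGet();
    }
}

// Hand-written lazy initialization using double-checked locking.
final class LegacyApplication {
    // Must be volatile for safe publication; it cannot be final,
    // so reads of this field are not constant-foldable.
    private static volatile LegacyService service;

    static LegacyService service() {
        LegacyService local = service;           // first check, without the lock
        if (local == null) {
            synchronized (LegacyApplication.class) {
                local = service;                 // second check, under the lock
                if (local == null) {
                    local = new LegacyService();
                    service = local;             // publish to other threads
                }
            }
        }
        return local;
    }
}
```

This version runs on any recent JDK and gives the same at-most-once guarantee, but every caller pays for a volatile read and the boilerplate has to be repeated for each lazily initialized field.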

JDK-8362893: Faster MemorySegment::getString

In performance discussions, allocation elision usually refers to the JIT compiler proving that a temporary object or array does not need to exist as a real heap allocation, so it removes the allocation or replaces it with cheaper operations. Some edge-case tests revealed that MemorySegment::getString creates a temporary array internally and that this allocation is not optimized away.

Starting with JDK 26, string extraction from a MemorySegment benefits from an implementation that reduces intermediate allocation and copying when creating Java strings from memory segments. Early benchmark results showed lower latency across the tested string sizes and a particularly large improvement for short strings.

From a performance perspective, this enhancement makes MemorySegment::getString better suited to hot paths where native or off-heap data is frequently converted into Java strings. Reducing temporary allocation and copying overhead can lower per-call latency, reduce allocation pressure, and indirectly reduce garbage collection activity in workloads that perform many such conversions.
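As a point of reference, the call in question looks like the following minimal sketch, which assumes JDK 22 or later (where the Foreign Function & Memory API is final). The class and method names SegmentStrings and roundTrip are made up for this example; the sketch only shows where getString sits in typical code and does not measure anything.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;

final class SegmentStrings {
    // Writes a string to off-heap memory and reads it back as a Java String.
    static String roundTrip(String text) {
        try (Arena arena = Arena.ofConfined()) {
            // allocateFrom stores the text as null-terminated UTF-8 bytes.
            MemorySegment segment = arena.allocateFrom(text);
            // getString decodes UTF-8 from offset 0 up to the terminator.
            // The returned String is a heap copy, so it outlives the arena.
            return segment.getString(0);
        }
    }
}
```

Code that calls getString once per record, row, or message from native memory is the kind of hot path where the reduced allocation and copying would be expected to matter.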

JDK-8366424: Missing Type Profiling in Generated Record Object Methods

JDK 26 also improves the performance of automatically generated hashCode() methods for record classes. Since records are commonly used as Map keys or Set elements, hashCode() performance can have a direct impact on application throughput.

With this change, record hashing is optimized to perform as efficiently as manually written implementations. As a result, code that relies heavily on records for frequent lookups, grouping, indexing, or deduplication can benefit from better throughput.
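The kind of code that exercises the generated hashCode() is sketched below, assuming a hypothetical GridCell record used as a HashMap key. Every put and get hashes the key through the method the compiler and runtime generate for the record; nothing in the source needs to change to pick up the improvement.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical record; hashCode() and equals() are generated automatically.
record GridCell(int row, int column) {}

final class RecordKeys {
    // Builds a small index keyed by records.
    static Map<GridCell, String> index() {
        Map<GridCell, String> labels = new HashMap<>();
        labels.put(new GridCell(0, 0), "origin");
        labels.put(new GridCell(1, 2), "target");
        return labels;
    }

    // Each lookup calls the generated hashCode() on the key.
    static String lookup(Map<GridCell, String> labels, int row, int column) {
        return labels.get(new GridCell(row, column));
    }
}
```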

Cryptography Performance Improvements

JDK 26 includes several targeted performance improvements in cryptographic algorithms, including AES, ML-DSA, and Elliptic Curve P-256. These changes reduce unnecessary work in key setup, improve low-level arithmetic, and add or enhance CPU-specific intrinsics so that supported platforms can execute common cryptographic operations more efficiently. For more details, see issues JDK-8371820, JDK-8371450, JDK-8371259, and JDK-8365581.

Other JDK Library Performance Enhancements and Bug Fixes

JDK-8374644 improves GZIPInputStream performance when reading single compressed streams, such as data received from a byte array or socket.
JDK-8359119 updates Charset to use the lazy constant approach, replacing older initialization patterns with the newer API and carrying explicit performance labeling in the issue.
JDK-8371319 optimizes java.lang.reflect.Method::equals to immediately return true when passed the same instance. The benefit of this implementation is noticeable in dynamic proxy implementations, where equality checks may happen frequently during method dispatch.
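To make the first item concrete, here is a minimal sketch of reading a single GZIP stream from a byte array, the usage shape that JDK-8374644 describes. The GzipRoundTrip class and its method names are invented for this example, and the code demonstrates only the API usage, not the speedup.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class GzipRoundTrip {
    // Compresses text into a single GZIP member held in memory.
    static byte[] compress(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(bytes)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads one compressed stream back from a byte array.
    static String decompress(byte[] compressed) {
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return new String(gzip.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```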

Garbage Collection Improvements

JEP 522: G1 GC Improves Throughput by Reducing Synchronization

G1 tracks cross-region pointer updates in a card table, which is maintained by write barriers injected into the application code. Some workloads update references so frequently that the card table becomes costly to scan during pauses; G1 therefore optimizes it in the background. That optimization requires synchronization with application threads, making write barriers and optimization logic more complex and slower, which hurts throughput and can also affect latency.

The change from JEP 522 introduces a second card table. This improves throughput (and slightly latency) by cutting synchronization between application threads and G1's background card-table optimizer threads without changing G1's overall design or user-facing behavior. Application threads always update the "active" table without synchronization, simplifying and speeding up write barriers. Optimizer threads work independently on the other table (which is initially empty). When G1 predicts that scanning the active table would exceed the pause-time target, it atomically swaps the tables.

As mentioned in JEP 522, reference-heavy workloads showed throughput gains of 5–15%, and workloads with light reference updates still gained up to approximately 5%, helped by simpler write barriers (e.g., on x64 they shrink from ~50 to ~12 instructions). Pause times drop slightly. The memory cost is one extra card table, which takes ~0.2% of the heap (~2MB of native memory per 1GB of heap).
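The memory figure can be sanity-checked with a little arithmetic. The sketch below assumes G1's default card size of 512 bytes with one table byte per card; that assumption is not stated in this article, and the card size is tunable, so treat the result as an estimate. The class name CardTableCost is made up for this example.

```java
final class CardTableCost {
    // Assumption: default G1 card size (512 bytes), one table byte per card.
    static final long CARD_SIZE_BYTES = 512;

    // Estimated native memory needed for one additional card table.
    static long extraCardTableBytes(long heapBytes) {
        return heapBytes / CARD_SIZE_BYTES;
    }

    // The same cost expressed as a percentage of the heap.
    static double percentOfHeap() {
        return 100.0 / CARD_SIZE_BYTES;
    }
}
```

Under that assumption a 1GB heap needs 2^30 / 2^9 = 2^21 bytes, or 2MB, for the second table, which is 1/512 of the heap and rounds to the ~0.2% quoted above.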

JEP 516: Ahead-of-Time Object Caching with Any GC

Starting with JDK 26, the HotSpot ahead-of-time (AOT) cache works with any garbage collector, including ZGC, to improve startup and warmup without forcing a trade-off against low-latency GCs. The AOT cache stores Java heap objects (e.g., Class objects and their referenced Strings and byte arrays) in a GC-specific, in-memory format. That allows the JVM to memory-map cached objects directly into the heap for fast startup.

However, garbage collectors represent object references differently: compressed versus uncompressed pointers, depending on the heap size, and different region/large-object placement rules (e.g., G1 vs. ZGC). To work around these incompatible reference formats, JEP 516 adds an optional GC-agnostic object format. There are performance trade-offs between the two formats:

GC-specific (mappable) objects can be nearly instant on warm starts because the cache is likely already in the filesystem cache.
GC-agnostic (streamable) objects can better hide disk latency on cold starts, but typically need an extra CPU core for streaming/materialization work.

The JDK ships two baseline AOT caches (one of each type) so the JVM can choose between mapping and streaming even when an application doesn’t provide its own cache. The JVM applies a heuristic to determine which format to generate after training:

Choose the streamable/GC-agnostic format if training used ZGC, the -XX:-UseCompressedOops option (compressed oops disabled), or a heap > 32GB (which signals a less constrained system).
Prefer the mappable/GC-specific format if training used compressed oops (which signals a more constrained system). To enforce GC-agnostic streaming, add the -XX:+AOTStreamableObjects option, even if you also specify -XX:+UseCompressedOops.
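The heuristic above can be read as a simple decision function. The sketch below is only an illustration of the rule as described here, not the JVM's implementation: the type and parameter names are invented, and it treats disabled compressed oops (-XX:-UseCompressedOops) as the signal for the streamable format.

```java
final class AotObjectFormat {
    enum Format { MAPPABLE, STREAMABLE }

    static final long THIRTY_TWO_GB = 32L * 1024 * 1024 * 1024;

    // Illustrative model of the format choice made after training.
    static Format choose(boolean trainedWithZgc,
                         boolean compressedOops,
                         long heapBytes,
                         boolean forceStreamable) {
        // forceStreamable models -XX:+AOTStreamableObjects, which wins
        // even when compressed oops are enabled.
        if (forceStreamable) {
            return Format.STREAMABLE;
        }
        // Signals of a less constrained system select the GC-agnostic format.
        if (trainedWithZgc || !compressedOops || heapBytes > THIRTY_TWO_GB) {
            return Format.STREAMABLE;
        }
        // Compressed oops on a smaller heap: prefer the GC-specific format.
        return Format.MAPPABLE;
    }
}
```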

The performance benefit from this change is broader access for applications to achieve faster startup and warmup without changing their garbage-collection strategy.

JDK-8371986: Default Initial Heap Size Now Set to MinHeapSize

JDK 26 improves JVM startup performance by making the default initial Java heap smaller when you do not configure an explicit heap size.

Previously, when users did not set the initial heap size with -Xms or -XX:InitialHeapSize, the JVM derived it from InitialRAMPercentage, whose default value was 1.5625% of physical memory, or roughly 1/64 of system RAM. On machines with large amounts of memory, this could lead to a relatively large heap being prepared at startup.

With this change, the JVM no longer applies a default InitialRAMPercentage. Instead, when you do not specify an initial heap size, the JVM starts with the minimum possible heap size, MinHeapSize. The performance impact of this change is most visible for applications using the default JVM configuration. By initializing less heap metadata up front, the JVM can begin execution sooner while still allowing the heap to grow later as the application needs more memory.
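If you want to see what your own JVM starts with, the standard MemoryMXBean reports the initial, committed, and maximum heap sizes. The sketch below uses a made-up HeapSizes class; the oldDefaultInitialHeap helper simply applies the 1/64 rule described above and is not a JVM API.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

final class HeapSizes {
    // Current heap figures as reported by the running JVM.
    static MemoryUsage heap() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    }

    // Previous default: 1.5625% (1/64) of physical memory.
    static long oldDefaultInitialHeap(long physicalMemoryBytes) {
        return physicalMemoryBytes / 64;
    }

    public static void main(String[] args) {
        MemoryUsage usage = heap();
        // getInit() may be -1 if the initial size is undefined.
        System.out.printf("init=%d committed=%d max=%d%n",
                usage.getInit(), usage.getCommitted(), usage.getMax());
    }
}
```

On a machine with 64GB of RAM, for instance, the old rule works out to an initial heap of 1GB. Running the program without -Xms on your current JDK and on JDK 26 is one way to compare the reported values.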

Compiler Improvements

JDK-8325467: Support C2 Compilation of Methods with Large Numbers of Parameters

In HotSpot’s tiered compilation model, code usually starts in the interpreter, may be compiled quickly by C1 for faster warmup, and can later be optimized more aggressively by C2 once it becomes hot. Starting with JDK 26, the C2 JIT compiler can now handle methods with very large parameter lists. This allows more frequently executed code to be compiled with C2 rather than remaining on less-optimized execution paths such as C1 or the interpreter.

As more methods gain access to C2 optimizations, this can improve throughput and reduce CPU overhead without requiring any application changes. If you would like to know more about how the C2 JIT compiler works, we recommend reading Emanuel Peter's blog.

JDK-8340093: C2 SuperWord Implement Cost Model

JDK 26 continues to improve the JIT compiler’s cost modeling for loop vectorization, helping it make better decisions about when SIMD-style execution is actually beneficial. Vectorization can speed up loops by processing multiple values at once, but it can also introduce extra work, such as shuffling and packing data.

The work delivered in JDK 26 enhances the cost model used to decide whether transforming scalar loop operations into vector operations is worthwhile: vectorization pays off only when the extra work needed to prepare or combine vector data does not outweigh the gain.
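For orientation, the loops below are the typical shapes that auto-vectorization targets: an element-wise operation over arrays and a reduction. This is a plain-Java sketch with made-up names; whether C2 actually vectorizes a given loop depends on the hardware, the JDK build, and the cost model's verdict, so nothing here guarantees SIMD code.

```java
final class VectorCandidates {
    // Element-wise loop: each iteration is independent of the others.
    // Assumes all three arrays have the same length.
    static void add(int[] a, int[] b, int[] out) {
        for (int i = 0; i < out.length; i++) {
            out[i] = a[i] + b[i];
        }
    }

    // Reduction loop: partial results must be combined at the end,
    // which is the kind of extra work a cost model has to weigh.
    static int sum(int[] a) {
        int total = 0;
        for (int i = 0; i < a.length; i++) {
            total += a[i];
        }
        return total;
    }
}
```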

To explore these C2 improvements in more depth, see Emanuel Peter's C2 AutoVectorizer Improvement Ideas, which outlines the broader direction of C2 work, and Vectorizing Reductions: from JDK 9 to JDK 26 and beyond, which provides more background on how reduction vectorization has evolved over time.

Runtime Improvements

JDK-8369238: Virtual Threads Unmount from Carrier When Waiting for Class Initialization

When multiple virtual threads encounter a class that is still being initialized, they may need to wait until that initialization completes. Before JDK 26, such waiting could keep the virtual thread attached to its carrier thread, reducing the number of carriers available to run other virtual threads.

By allowing waiting virtual threads to be preempted during common class-initialization paths, JDK 26 reduces unnecessary carrier blocking, improves scalability for applications with many virtual threads, and lowers the risk of throughput loss or carrier starvation during bursts of class loading or initialization.
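The scenario is easy to reproduce in a small program. The sketch below, which assumes JDK 21 or later for virtual threads, starts many virtual threads that all touch a class whose static initializer is deliberately slow. SlowConfig and the other names are hypothetical. The result is the same on every JDK; what differs is whether the waiting threads hold on to their carriers in the meantime.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ClassInitContention {
    // Hypothetical class with a deliberately slow static initializer.
    static final class SlowConfig {
        static final int VALUE;
        static {
            try {
                Thread.sleep(200);   // simulate expensive initialization
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            VALUE = 42;              // assigned here, so not a compile-time constant
        }
    }

    // Runs the given number of virtual threads; each one reads SlowConfig.VALUE.
    // One thread runs the initializer while the others wait for it to finish.
    static int runTasks(int tasks) {
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<Future<Integer>> results = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                results.add(executor.submit(() -> SlowConfig.VALUE));
            }
            int sum = 0;
            for (Future<Integer> result : results) {
                sum += result.get();
            }
            return sum;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```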

Closing Thoughts

JDK 26 has been generally available since March 2026, so now is a great time to try it with your own applications and workloads.

As you test or plan your migration, make sure to measure how your application performs on JDK 26 compared with the JDK version you use today. If you notice behavior that looks like a regression, please get involved or raise the issue on the relevant mailing list. Feedback from production-like workloads helps the Java platform to continue to improve.

The JDK performance work continues as we see a healthy set of improvements taking shape for JDK 27, and we look forward to exploring those in more detail in a future update.

Until then… stay on the fast path!