Internships
Java is one of the most popular programming languages in the world, and it runs on billions of devices ranging from credit cards to multi-machine servers. Oracle is the main contributor to the Java programming language, which is developed through the OpenJDK project. The Java Virtual Machine (JVM) is the core piece of technology that enables Java’s “write once, run anywhere” promise: the ability to run the same Java program on multiple hardware architectures and operating systems without having to recompile the code. The JVM also implements Java’s memory management, with a garbage collector that handles all the details for you, and just-in-time (JIT) compilers that can deliver performance beyond what statically compiled languages achieve.
The Oracle development office in Stockholm hosts a large part of the JVM development team. We have world-leading expertise in areas such as garbage collection, managed runtimes, and compilers. The internship will be performed on-site in the Stockholm office.
To apply, or for more information, contact jesper.wilhelmsson@oracle.com.
Current Internship Proposals
JVM Compiler Team: Template-Based Testing
To test the JVM's Just-In-Time (JIT) compilers, we use an internal test framework that allows writing template-based tests, i.e., tests with "holes" that are then filled in by the framework. The goal of this project is to extend the framework with new functionality and write more tests that cover the major optimization phases of the JITs.
As experience and research show, testing optimizing compilers is challenging. While testing our JVM's JIT compilers, we made the following key observations:
- Black box testing such as fuzzing, which generates random programs to feed into the compiler, is useful but does not scale well. Many of the generated tests are not useful and execute slowly.
- Writing efficient tests requires expert knowledge of the compiler.
- Most of our hand-written and targeted regression tests are fairly simple, consisting of only a few lines of Java code.
- Even slight modifications to existing tests often uncover new bugs.
- There are common code patterns, such as synchronized blocks or volatile field access, that are beneficial to optionally include in tests as they stress optimizations.
- Hand-written tests do not provide sufficient coverage since they often miss boundary values and conditions.
Based on these observations, we developed a test framework that enables generalization of hand-written tests into templates, significantly enhancing their coverage. The framework is still experimental but the goal is to be able to write tests with "holes" that are then filled in by the framework.
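To make the idea of "holes" concrete, here is a minimal sketch in plain Java. It is not the API of the internal framework: `TemplateSketch`, the `#{...}` hole syntax, and `expand` are hypothetical names chosen for illustration. Each hole is filled with every candidate value, so one hand-written pattern expands into many concrete test bodies.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: not the API of the internal test framework.
final class TemplateSketch {

    // Fills every "#{name}" hole in the template with each candidate value,
    // producing the cross product of all hole fillings.
    static List<String> expand(String template, Map<String, List<String>> holes) {
        List<String> results = new ArrayList<>();
        results.add(template);
        for (Map.Entry<String, List<String>> hole : holes.entrySet()) {
            String marker = "#{" + hole.getKey() + "}";
            List<String> next = new ArrayList<>();
            for (String partial : results) {
                for (String value : hole.getValue()) {
                    // replace() substitutes all occurrences of the same hole.
                    next.add(partial.replace(marker, value));
                }
            }
            results = next;
        }
        return results;
    }

    public static void main(String[] args) {
        // One simple hand-written pattern, generalized over type, operator,
        // and constant (including boundary-like constants).
        String template = "static #{type} test(#{type} x) { return x #{op} #{con}; }";
        Map<String, List<String>> holes = new LinkedHashMap<>();
        holes.put("type", List.of("int", "long"));
        holes.put("op", List.of("+", ">>>"));
        holes.put("con", List.of("0", "1", "-1", "31"));
        for (String test : expand(template, holes)) {
            System.out.println(test);
        }
    }
}
```

A real framework would also compile and run the generated code and check results across execution modes; this sketch only shows the expansion step.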
The goal of this project is to extend the framework with new functionality (we have a long list of ideas) and to write more tests that cover the major optimization phases of the JITs. Depending on the interest and skills of the intern, the focus can be either on Java features or on major optimizations of the JITs. In the first case, the project is an opportunity to learn more about new Java features and projects (e.g., the Vector API, the Foreign Function & Memory API, and Valhalla), and to develop templated tests that provide high test coverage (e.g., combinations of all input types, input values, and API calls). This would be a great contribution to the quality and maturity of these Java features and the corresponding JIT implementations. Alternatively, we can focus on major optimization phases of the JITs and extract Java code patterns that trigger these optimizations. We can then generalize (i.e., templatize) these code patterns to increase test coverage of the JIT optimizations. This would be a great contribution to the quality and stability of the JIT compilers more generally.
Additionally, we could use code coverage tools to assess the extent of coverage these tests achieve on the targeted optimizations.
Requirements
- Must be able to work in Java
- An understanding of C++ is a plus but not required
- Familiarity with the Java ecosystem is a plus but not required
- Compiler design knowledge is a plus but not required
JVM Garbage Collection Team: Partial Compaction in the G1 Collector
Memory fragmentation is an issue in programs that use dynamic memory allocation. Java uses garbage collection to mitigate this problem, with G1 being the current default collector. This internship will explore the implementation and integration of a free-list-based allocator for local allocation buffers (LABs) in G1.
G1 addresses fragmentation by relocating objects in memory to create regions densely packed with live data, alongside new, free, unfragmented regions. These free regions are then used to allocate new LABs using "bump-pointer" allocation.
Allocating LABs from unfragmented memory is fast and keeps LABs together, which helps exploit the weak generational hypothesis. However, it leaves known free memory outside the unfragmented regions unused. This can cause unnecessary garbage collection effort, because new, unfragmented memory is scarce and must be regularly produced by garbage collection.
The allocator implemented in this project will be used both for memory areas where evacuation is currently prohibited and for medium to highly occupied areas that are not prime targets for evacuation for other reasons. This reduces the need to generate new, unfragmented memory and so decreases the overall effort spent in garbage collection.
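The contrast between the two allocation styles can be sketched as a toy model. This is Java operating on integer offsets rather than HotSpot's C++ operating on real memory, and `BumpRegion`, `FreeListRegion`, and the first-fit policy are assumptions made for illustration, not G1's design or the design this project is expected to arrive at.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: offsets stand in for addresses; names and policy are hypothetical.
final class LabAllocationSketch {

    // Bump-pointer allocation: hand out the next `size` units of a fully free region.
    static final class BumpRegion {
        private final int end;
        private int top;

        BumpRegion(int start, int end) {
            this.top = start;
            this.end = end;
        }

        // Returns the start offset of the new buffer, or -1 if it does not fit.
        int allocate(int size) {
            if (end - top < size) return -1;
            int result = top;
            top += size;
            return result;
        }
    }

    // Free-list allocation: reuse known gaps between live objects (first fit).
    static final class FreeListRegion {
        // Each entry is {start, length} of one free gap.
        private final List<int[]> gaps = new ArrayList<>();

        void addGap(int start, int length) {
            gaps.add(new int[] {start, length});
        }

        // Returns the start offset of the new buffer, or -1 if no gap is large enough.
        int allocate(int size) {
            for (int i = 0; i < gaps.size(); i++) {
                int[] gap = gaps.get(i);
                if (gap[1] >= size) {
                    int result = gap[0];
                    gap[0] += size;   // shrink the gap from the front
                    gap[1] -= size;
                    if (gap[1] == 0) gaps.remove(i); // gap fully consumed
                    return result;
                }
            }
            return -1;
        }
    }

    public static void main(String[] args) {
        BumpRegion fresh = new BumpRegion(0, 1024);
        System.out.println(fresh.allocate(256));
        System.out.println(fresh.allocate(256));

        FreeListRegion occupied = new FreeListRegion();
        occupied.addGap(100, 64);
        occupied.addGap(400, 300);
        System.out.println(occupied.allocate(256)); // skips the too-small first gap
        System.out.println(occupied.allocate(64));  // fits exactly in the first gap
    }
}
```

The bump-pointer path is a single comparison and addition, while the free-list path has to search; that trade-off between allocation speed and reuse of fragmented memory is what the evaluation would need to quantify.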
This LAB allocator should be evaluated through benchmarking, focusing on allocation throughput, fragmentation, and overall performance.
Requirements
- Must be able to work in C++
- Basic understanding of GC and allocation techniques is a plus
JVM Compiler Team: Translation Validation in C2
The C2 optimizing JIT compiler is crucial for the performance of the HotSpot JVM. To achieve good performance, C2 applies a large number of often complex optimizations, and it is therefore challenging to justify C2's correctness. This internship concerns further developing and practically applying a technique that formally verifies the correctness of individual compilations, called translation validation.
The traditional approach for verifying compiler correctness is to check for compiler crashes or miscompilations when executing hand-written or randomly generated tests in the compiler's source language. This approach is powerful and simple to apply, but can never provide complete formal guarantees of correctness. More recent approaches provide formal guarantees of compiler correctness, either by mechanized full formal compiler correctness proofs (see, e.g., CompCert and CakeML), or by applying a technique called translation validation. Translation validation formally proves the correctness of individual compilations, and therefore provides a powerful middle ground between testing and full formal verification.
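The core idea, checking that one specific rewrite preserves behavior rather than proving the whole compiler correct, can be illustrated with a deliberately small stand-in. The sketch below replaces the symbolic reasoning a real validator would use with exhaustive enumeration over a bounded input range, so it only establishes equivalence on that range. `ValidationSketch` and `findCounterexample` are hypothetical names, and nothing here reflects how the existing C2 tool works.

```java
import java.util.function.IntUnaryOperator;

// Toy stand-in only: exhaustive enumeration over a bounded range replaces the
// symbolic reasoning a real validator would use. All names are hypothetical.
final class ValidationSketch {

    // Returns the first input in [lo, hi] on which the two functions disagree,
    // or null if they agree on the whole range.
    static Integer findCounterexample(IntUnaryOperator before, IntUnaryOperator after,
                                      int lo, int hi) {
        // A long counter avoids overflow when hi is Integer.MAX_VALUE.
        for (long x = lo; x <= hi; x++) {
            int v = (int) x;
            if (before.applyAsInt(v) != after.applyAsInt(v)) return v;
        }
        return null;
    }

    public static void main(String[] args) {
        // "Source" semantics: Java integer division rounds toward zero.
        IntUnaryOperator source = x -> x / 2;
        // Incorrect strength reduction: an arithmetic shift rounds toward negative infinity.
        IntUnaryOperator badRewrite = x -> x >> 1;
        // Corrected rewrite: add the sign bit before shifting.
        IntUnaryOperator goodRewrite = x -> (x + (x >>> 31)) >> 1;

        System.out.println(findCounterexample(source, badRewrite, -1000, 1000));
        System.out.println(findCounterexample(source, goodRewrite, -1000, 1000));
    }
}
```

The first call reports an input where the rewrite miscompiles; the second reports none on the checked range. A formal validator would aim for the same kind of verdict over all inputs, per compilation.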
Within the scope of this internship, the intern will extend an existing translation validation tool for C2 and develop fuzzing techniques to generate random inputs to the tool. The overall goal of the internship is to identify C2 compiler bugs.
Requirements
- Must be able to understand and work in Java, C++, and functional programming languages (e.g., Haskell, OCaml, or Lean)
- Must be familiar with basic compiler design
- Must have an interest in and familiarity with formal methods
JVM Sustaining Engineering: Automated Backporting
Making sure that bug fixes from the development release of the JDK are also brought back to earlier, still supported releases is an important part of keeping the JDKs used in production systems stable. Today, this is largely a manual job. This project investigates a technique to automate this work.
Sustaining Engineering's role is to ensure that the long-term support (LTS) releases of Java continue to work reliably. Java users rely on us to resolve critical issues when they appear. This team specializes in root cause analysis and responsible stewardship of the Java releases that customers currently rely on. You will have the chance to see how real-world software deployment is done in an environment where the stakes are high.
Often, a bug discovered in an earlier release has already been fixed in the latest development of the OpenJDK, so the ability to apply these fixes to earlier releases is an important tool. This process is called _backporting_. Backporting presents several challenges, as the earlier release might have diverged substantially from the latest development over many years. First, code patches might not apply syntactically: files and methods might have been renamed, moved, merged, or split up. Second, even if a patch applies, it might not compile, and even if it compiles, it must still be evaluated for correctness, as context and assumptions might have changed between the older release and mainline development. Backporting can be done reactively, to remediate an observed issue, or proactively, in the hope of preventing issues in the future. Backporting always carries the risk of introducing new bugs.
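The first challenge, a patch that no longer applies textually, can be shown with a deliberately naive patch applier. This is an illustrative sketch, not how FixMorph or any real patch tool is implemented; `BackportSketch`, `applyHunk`, and the sample source lines are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical illustration; not how FixMorph or any real patch tool works.
final class BackportSketch {

    // Replaces the first exact occurrence of `oldLines` in `file` with `newLines`.
    // Returns empty when the expected lines are not present, i.e. the patch
    // does not apply textually to this version of the file.
    static Optional<List<String>> applyHunk(List<String> file,
                                            List<String> oldLines,
                                            List<String> newLines) {
        int at = Collections.indexOfSubList(file, oldLines);
        if (at < 0) return Optional.empty();
        List<String> patched = new ArrayList<>(file.subList(0, at));
        patched.addAll(newLines);
        patched.addAll(file.subList(at + oldLines.size(), file.size()));
        return Optional.of(patched);
    }

    public static void main(String[] args) {
        // The fix changes "<=" to "<"; the hunk carries one line of context.
        List<String> oldLines = List.of("int len = length();", "if (i <= len) {");
        List<String> newLines = List.of("int len = length();", "if (i < len) {");

        List<String> mainline = List.of(
            "int len = length();", "if (i <= len) {", "  read(i);", "}");
        // In the older release the method was still called size().
        List<String> olderRelease = List.of(
            "int len = size();", "if (i <= len) {", "  read(i);", "}");

        System.out.println(applyHunk(mainline, oldLines, newLines).isPresent());
        System.out.println(applyHunk(olderRelease, oldLines, newLines).isPresent());
    }
}
```

The same bug is present in both versions, but a single rename in the context is enough to make the textual patch fail on the older release. Automated backporting techniques try to bridge exactly this kind of gap.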
In this project, you will evaluate [FixMorph](https://dl.acm.org/doi/abs/10.1145/3460319.3464821), an implementation of techniques for automated backporting that have shown promise in the Linux C codebase, and you will attempt to apply it to the HotSpot codebase. HotSpot is the OpenJDK's Java Virtual Machine, and it is mostly written in a subset of C++ that is close to C. For example, the C++ standard library is not used. You will explore whether FixMorph can be used to assist in backporting HotSpot fixes. At every step of the way, you will have the assistance of JVM Sustaining engineers.
Requirements
- Must be able to understand and work with C++ code
- Familiarity with the Java ecosystem or other open source projects is a plus but not required
- A problem-solving mindset and proven experience are a plus