Finalizing the Foreign APIs
Maurizio Cimadamore on September 16, 2021
Now that the Foreign Memory Access API and the Foreign Linker API have been around for some time, it is time to take a more holistic look at how these APIs are structured and used, and see if there are some final opportunities for simplification, before we take further steps towards finalizing these APIs. In this document, we will focus on the outstanding issues in the current iteration of the APIs and pave a path forward. I’d like to thank Paul Sandoz, John Rose and Brian Goetz, who have provided many useful insights on the matters discussed throughout this document.
Memory dereference
When looking at how clients interact with the Foreign Memory Access API (especially when it comes to jextract-generated code), we noted an asymmetry between how memory is allocated and how memory is dereferenced. The code snippet below summarizes the issue:
MemorySegment c_sizet_array = allocator.allocateArray(SIZE_T, new long[] { 1L, 2L, 3L });
// print contents
for (int i = 0; i < 3; i++) {
    System.out.println(MemoryAccess.getLongAtIndex(c_sizet_array, i));
}
Above, we can see that the API for allocating a segment (SegmentAllocator::allocateArray) takes both a layout (namely, SIZE_T) and a long[] array. This idiom provides dynamic safety: if there is a mismatch between the size of the array component type and the size of the provided layout, an exception will be thrown. Perhaps surprisingly, the same doesn’t happen for the dereference API (MemoryAccess::getLongAtIndex), which only takes a segment and an index; there is no layout argument here which the runtime can use to enforce additional validation.
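To make the "dynamic safety" point concrete, here is a minimal sketch of the kind of check described above. It does not use the foreign API at all: the class and method names are hypothetical, a ByteBuffer stands in for the memory segment, and the choice of IllegalArgumentException is an assumption (the text only says that an exception is thrown).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical stand-in for SegmentAllocator::allocateArray; not the real API.
final class AllocateArrayCheckSketch {

    /** Copies a long[] into fresh memory, but only if the element layout size matches. */
    static ByteBuffer allocateArray(int elementByteSize, long[] values) {
        // The layout-vs-carrier check: a 4-byte layout cannot describe long elements.
        if (elementByteSize != Long.BYTES) {
            throw new IllegalArgumentException(
                    "Layout size " + elementByteSize + " does not match long[] component size");
        }
        ByteBuffer buffer = ByteBuffer.allocate(values.length * Long.BYTES)
                                      .order(ByteOrder.nativeOrder());
        buffer.asLongBuffer().put(values); // the view shares the byte order set above
        return buffer;
    }
}
```

The dereference side has no equivalent argument to check, which is the asymmetry the rest of this section sets out to remove.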
This inconsistency is not merely a cosmetic issue: it reflects the way in which the Foreign Memory Access API has evolved over time. In the first iterations of the API, the only way to dereference a memory segment was through a memory access var handle. While var handles still play a central role in our dereference story, especially when it comes to structured access (think of C structs and tensors), in subsequent releases of the API we have made some usability concessions, and ended up adding a full set of dereference methods in a side class (MemoryAccess) and, more recently, another set of methods to copy from Java arrays to memory segments and back (MemoryCopy). But there are problems with this approach:
- These static methods are not consistent with the rest of the API; as seen above, they do not accept a layout parameter, and instead only accept an optional ByteOrder parameter. This is not very general, as endianness is merely one dimension which can affect how memory dereference should behave (what about e.g. alignment?)
- Adding methods on side classes keeps the MemorySegment API simple, but creates a discoverability problem: when using an IDE it might not be obvious that the way to dereference a memory segment is to call a static method on a separate class.
In other words, it is time we look at these ancillary classes again, and see if a better solution is possible.
Attaching carriers to value layouts
A promising move, which we will discuss in the remainder of this document, is that of attaching carrier types to value layouts. That is, if we could express types such as ValueLayout<int> and ValueLayout<double>, then our dereference API would look something like this:
interface MemorySegment {
    ...
    <Z> Z get(ValueLayout<Z> layout, long offset);
    <Z> void set(ValueLayout<Z> layout, long offset, Z value);
}
Note how this is nicely symmetric: assuming that we had constants like JAVA_INT (whose type would be ValueLayout<int>), we could now read an int value from a segment in a more straightforward way, as follows:
MemorySegment segment = ...
int i = segment.get(JAVA_INT, 0);
Here, the layout information (alignment, endianness) flows naturally into the dereference operation, thus making it unnecessary to support ByteOrder-based overloads. The dereference API shown here is also much more discoverable (only one code completion away, when using an IDE) [1].
This seems like a win; not only is the API more usable and succinct, but it is also more extensible: should we add another carrier (Float16 or Long128), we would only need to define its layout, and no extra API would be required. Finally, attaching carriers to value layouts allows us to significantly simplify the Foreign Linker API (more on that later).
Since we do not have specialized generics yet, how do we approximate the above API with the language we have today? One trick that is available to us is to introduce additional value layout leaf classes, one for each carrier (e.g. ValueLayout.OfInt, ValueLayout.OfFloat, etc.), and then define many dereference overloads, one per layout carrier:
byte get(ValueLayout.OfByte layout, long offset)
short get(ValueLayout.OfShort layout, long offset)
int get(ValueLayout.OfInt layout, long offset)
...
This works remarkably well in practice: it gives us type safety (it is no longer possible for users to pair a layout with the wrong carrier) and, when Valhalla is ready, we can rewire these classes to be parameterized subclasses of ValueLayout, and, eventually, deprecate them (as ValueLayout<Z> would be enough). With this API in place, the problematic code snippet with which we started this section would become something like this:
MemorySegment c_sizet_array = allocator.allocateArray(SIZE_T, new long[] { 1L, 2L, 3L });
// print contents
for (int i = 0; i < 3; i++) {
    System.out.println(c_sizet_array.get(SIZE_T, i * SIZE_T.byteSize()));
}
If SIZE_T has type ValueLayout.OfLong, then clients will be forced (by the static compiler) to use a long[] array when initializing the memory segment. Moreover, the dereference operation now allows clients to specify a layout, whose static type will influence which dereference overload will be selected, meaning that passing SIZE_T to MemorySegment::get will be guaranteed to return a long.
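The overload trick can be tried out without the foreign API. The sketch below is only an approximation under stated assumptions: OfInt, OfLong, Segment, JAVA_INT and SIZE_T are hypothetical stand-ins declared locally (they are not the proposed classes), and a ByteBuffer plays the role of the memory segment. What it shows is that the static type of the layout argument selects the overload, and therefore the carrier.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical stand-ins for the proposed API; none of these are the real types.
final class TypedLayoutSketch {

    /** One layout class per carrier, mimicking ValueLayout.OfInt / ValueLayout.OfLong. */
    static abstract class Layout {
        final int byteSize;
        final ByteOrder order;
        Layout(int byteSize, ByteOrder order) { this.byteSize = byteSize; this.order = order; }
    }
    static final class OfInt extends Layout {
        OfInt(ByteOrder order) { super(Integer.BYTES, order); }
    }
    static final class OfLong extends Layout {
        OfLong(ByteOrder order) { super(Long.BYTES, order); }
    }

    static final OfInt JAVA_INT = new OfInt(ByteOrder.nativeOrder());
    static final OfLong SIZE_T = new OfLong(ByteOrder.nativeOrder());

    /** A toy segment backed by a heap ByteBuffer. */
    static final class Segment {
        private final ByteBuffer buffer;
        Segment(int byteSize) { this.buffer = ByteBuffer.allocate(byteSize); }

        // The layout carries the byte order, so no ByteOrder overloads are needed.
        private ByteBuffer view(Layout layout) { return buffer.duplicate().order(layout.order); }

        int get(OfInt layout, long offset) { return view(layout).getInt(Math.toIntExact(offset)); }
        long get(OfLong layout, long offset) { return view(layout).getLong(Math.toIntExact(offset)); }
        void set(OfInt layout, long offset, int value) { view(layout).putInt(Math.toIntExact(offset), value); }
        void set(OfLong layout, long offset, long value) { view(layout).putLong(Math.toIntExact(offset), value); }
    }

    /** Writes the values as SIZE_T elements, then reads them back and sums them. */
    static long roundTripSum(long... values) {
        Segment segment = new Segment(values.length * SIZE_T.byteSize);
        for (int i = 0; i < values.length; i++) {
            segment.set(SIZE_T, (long) i * SIZE_T.byteSize, values[i]);
        }
        long sum = 0;
        for (int i = 0; i < values.length; i++) {
            sum += segment.get(SIZE_T, (long) i * SIZE_T.byteSize); // statically typed as long
        }
        return sum;
    }
}
```

In this sketch, a call such as segment.set(JAVA_INT, 0, 5L) is rejected by the compiler, since no overload pairs an OfInt layout with a long value.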
Unsafe dereference
In some cases it would be nice to have dereference helpers for unsafe access too - consider the following case:
MemoryAddress addr = ...
int v = MemorySegment.globalNativeSegment().get(JAVA_INT, addr.toRawLongOffset());
While this code works well, it is also very verbose. In a way, this is by design: clients should dereference memory segments, not plain addresses, as the former are safer (e.g. memory segments feature both spatial and temporal bounds). So, a safer alternative would be to do this:
MemoryAddress addr = ...
int v = addr.asSegment(100).get(JAVA_INT, 0);
But, for casual native off-heap access (especially for one-time upcall stubs), it would be nice for clients to have convenience unsafe dereference routines which work directly on MemoryAddress instances:
MemoryAddress addr = ...
int v = addr.get(JAVA_INT, 0);
Unlike their counterparts in MemorySegment, the dereference methods in MemoryAddress would be restricted methods, and using them would require clients to provide the --enable-native-access flag on the command line.
Linker classification
If carriers are pushed down to value layouts, we can simplify other areas of the foreign API as well. CLinker provides two main abstractions, to create downcall method handles (method handles targeting native functions) and upcall stubs (native function pointers targeting Java method handles). When linking, users have to provide both a Java MethodType and a FunctionDescriptor; the former describes the Java signature that callsites will be dealing with, while the latter describes the classification information that is required by the linker runtime to make it all work:
MethodHandle strlen = CLinker.getInstance().downcallHandle(
    strLenAddr, // obtained with SymbolLookup
    MethodType.methodType(long.class, MemoryAddress.class),
    FunctionDescriptor.of(C_LONG, C_POINTER)
);
If carriers are attached to value layouts, it is fairly easy to see how the linking process would only require one set of information, namely the function descriptor: in fact we could always derive a Java MethodType from the set of layouts associated with the function descriptor, using the following simple rules:
- if the layout is a value layout with carrier C, then C will be the carrier associated with that layout
- if the layout is a group layout, then MemorySegment.class will be used as a carrier
In other words, the additional carrier information attached to value layouts would allow the linker runtime to distinguish between similarly-sized layouts (e.g. a 32-bit value layout which can be either a C int or a C float). Moreover, we can always add new carriers to add as much classification as required by the linker runtime. This means that the above linkage request can be expressed more succinctly as follows:
MethodHandle strlen = CLinker.getInstance().downcallHandle(
    strLenAddr, // obtained with SymbolLookup
    FunctionDescriptor.of(C_LONG, C_POINTER)
);
That is, only a function descriptor parameter is required and the Java type of the downcall method handle will be derived accordingly.
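The derivation rules above are simple enough to sketch in a few lines. The following is an illustration only: Layout, Value, Group, Segment and Address are hypothetical local types standing in for the layout and carrier classes, and toMethodType is an invented helper name; only java.lang.invoke.MethodType is a real JDK class.

```java
import java.lang.invoke.MethodType;
import java.util.List;

// Hypothetical model of "derive the Java MethodType from the function descriptor".
final class CarrierDerivationSketch {
    /** Marker standing in for MemorySegment. */
    static final class Segment {}
    /** Marker standing in for MemoryAddress. */
    static final class Address {}

    interface Layout { Class<?> carrier(); }

    /** Rule 1: a value layout with carrier C maps to C. */
    static final class Value implements Layout {
        private final Class<?> carrier;
        Value(Class<?> carrier) { this.carrier = carrier; }
        @Override public Class<?> carrier() { return carrier; }
    }

    /** Rule 2: a group layout always maps to the segment carrier. */
    static final class Group implements Layout {
        @Override public Class<?> carrier() { return Segment.class; }
    }

    static final Value C_LONG = new Value(long.class);
    static final Value C_POINTER = new Value(Address.class);

    /** A null result layout models a void-returning function (an assumption of this sketch). */
    static MethodType toMethodType(Layout result, List<Layout> arguments) {
        Class<?> returnType = (result == null) ? void.class : result.carrier();
        Class<?>[] parameterTypes = new Class<?>[arguments.size()];
        for (int i = 0; i < parameterTypes.length; i++) {
            parameterTypes[i] = arguments.get(i).carrier();
        }
        return MethodType.methodType(returnType, parameterTypes);
    }
}
```

With these definitions, toMethodType(C_LONG, List.of(C_POINTER)) yields a method type with a long return and a single Address parameter, mirroring the strlen example.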
Layout attributes and constants
One immediate consequence of doing ABI classification this way is that the linker runtime is no longer reliant on the layout attribute mechanism to distinguish between similarly-sized value layouts; in fact, we propose to completely drop support for layout attributes from the layout API. While we do not expect this functionality to be widely used, we could always decide, at a later point, to allow users to attach custom Map instances to layouts. Our implementation would not use this metadata, but would merely pass it along (e.g. when altering a ValueLayout with one of the wither methods provided by the API).
Another important thing to note: since value layouts are sharply typed, typing of certain C layout constants, such as C_LONG, becomes ambiguous (it would be ValueLayout.OfInt on Windows/x64 and ValueLayout.OfLong on Linux/x64). Instead of defining these constants with a less sharp type, we will opt to completely remove platform-dependent C layout constants from CLinker: after all, it is the job of extraction tools, not the linker, to come up with a set of layout constants which work for a given extraction unit. Clients not using jextract can either define custom C layouts as static constants, or they can simply use JAVA_INT, JAVA_LONG, etc., which is not too different from using types such as jint and jdouble in JNI code. This observation allows us to remove most of the clutter from the CLinker API, and to return to a much simpler interface.
Linker Safety
Another issue that we wanted to address more explicitly in the Foreign Linker API is the safety of foreign calls: in other words, when passing structs by-reference to native calls, what happens if the scope associated with the struct is closed before the native call has completed? This can happen both in the confined and in the shared case, although reproducing the issue with a confined scope requires at least an upcall (e.g. closing the scope from a Java upcall).
The issue here is that the linker API forces clients of downcall method handles to erase by-reference parameters down to MemoryAddress instances, and then pass those instances instead. This creates some tension in the API: either we also make MemoryAddress a scoped abstraction (so that they keep track of the scope from which they originated), or we lose safety. But making MemoryAddress a scoped abstraction (as we did in 17) has drawbacks: often MemoryAddress is used when interacting with native code, to model native pointers coming out of downcall method handles; as such, it is attractive to think of MemoryAddress as a simple wrapper around a long value (a machine address), which can be converted, at the user request, to a fuller segment (by providing custom size and scope). But if MemoryAddress already has a scope, things get murkier, and we have to define what happens when clients happen to (maybe accidentally) override the existing scope.
We propose to address this issue with the following moves:
- CLinker no longer erases by-reference parameters to MemoryAddress; the Addressable carrier is used instead;
- The Addressable interface also gets a resource scope accessor; this scope will be used by the linker runtime to keep the by-reference parameter alive throughout the call;
- MemoryAddress is an Addressable implementation whose scope is always the global scope.
With these changes, when we link strlen as above, the type of the resulting downcall method handle won’t be (MemoryAddress)long but (Addressable)long. This means that clients can pass memory segments directly, and have the linker runtime pass them by-reference, as follows:
MemorySegment str = ...
long length = (long) strlen.invokeExact((Addressable) str);
Or, without invokeExact:
MemorySegment str = ...
long length = (long) strlen.invoke(str);
The presence of the additional cast with the invokeExact semantics is unfortunate, but, after evaluating many alternatives, it also seems the lesser evil. In most cases, tools will just be happy with the Addressable type; in fact that’s exactly what jextract needs to generate its wrappers:
long strlen(Addressable x1) {
    try {
        return (long) strlen_handle.invokeExact(x1);
    } ...
}
Note that no cast is required in the above code, as the jextract wrapper is already generic. When not using jextract, the user has a choice: either to add a cast, like above (which is not much more verbose than to add a trailing .address() call), or to convert the method handle type, as follows:
MethodHandle strlen_segment = CLinker.getInstance().downcallHandle(
    strLenAddr, // obtained with SymbolLookup
    FunctionDescriptor.of(C_LONG, C_POINTER)
).asType(MethodType.methodType(long.class, MemorySegment.class));
...
MemorySegment str = ...
long length = (long) strlen_segment.invokeExact(str);
Since we can tweak the method type associated with the downcall method handle with MethodHandle::asType, it is easy to inject sharper types into the downcall method handle, and drop the cast at the callsite, even when using invokeExact.
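The interplay between invokeExact and asType is plain java.lang.invoke behaviour, so it can be demonstrated without any native code. In the sketch below, an ordinary static method taking a CharSequence stands in for the downcall handle of type (Addressable)long, and String stands in for MemorySegment; the class and method names are made up for the example.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Stand-in demo: CharSequence plays the role of Addressable, String that of MemorySegment.
final class AsTypeSketch {
    static long length(CharSequence text) { return text.length(); }

    static final MethodHandle GENERAL;   // type: (CharSequence)long
    static final MethodHandle SHARPENED; // type: (String)long
    static {
        try {
            GENERAL = MethodHandles.lookup().findStatic(AsTypeSketch.class, "length",
                    MethodType.methodType(long.class, CharSequence.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
        SHARPENED = GENERAL.asType(MethodType.methodType(long.class, String.class));
    }

    static long callWithCast(String s) {
        try {
            // invokeExact requires the call site to match (CharSequence)long exactly,
            // hence the cast on the argument.
            return (long) GENERAL.invokeExact((CharSequence) s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    static long callSharpened(String s) {
        try {
            // The adapted handle has type (String)long, so no argument cast is needed.
            return (long) SHARPENED.invokeExact(s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

Dropping the (CharSequence) cast in callWithCast would still compile, but the call would fail at run time with a WrongMethodTypeException, which is why the sharper handle type is attractive at hand-written call sites.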
Resource scopes
There are currently different kinds of resource scopes, partially overlapping with each other. Looking at the ResourceScope class we find three main factories, to create confined, shared and implicit scopes. The first two are said to be explicit scopes: clients can (deterministically) close such scopes using the close() method. Implicit scopes, on the other hand, cannot be closed; attempting to do so will result in an exception. As such, the only way to dispose of resources associated with an implicit scope is to let the scope become unreachable.
In reality the picture is a bit more convoluted, since the API also allows creating explicit scopes that are associated with cleaner objects; such scopes can be closed via the close() method (as any other explicit scope), but they also allow the scope to be cleaned up when it becomes unreachable. In some way, these scopes are both implicit and explicit.
While the resource scope API itself is relatively simple, the number of different, and subtly overlapping, factories it provides can be jarring. We propose to address this issue by always registering a resource scope against a cleaner; after all, scopes are long-lived entities, and the overhead for registering scopes with an internal cleaner is minimal. Since all scopes now feature both explicit and implicit deallocation, the API can provide only two kinds of scopes, namely confined and shared, and drop implicit scopes. The resulting API is safer, because forgetting to call close() no longer leaks resources (the cleaner will kick in, and perform the associated cleanup). The API is also more uniform, since now all scopes (but the global scope, which is a singleton) can be closed, and used in a try-with-resources statement [2].
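As an illustration of a scope that supports both explicit and implicit deallocation, here is a minimal sketch built on java.lang.ref.Cleaner. It is not the ResourceScope implementation: the class name is hypothetical, and the "resource" is just a flag supplied by the caller so that the release can be observed.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical scope that is always registered with a cleaner.
final class CleanerScopeSketch implements AutoCloseable {
    private static final Cleaner CLEANER = Cleaner.create();

    /** The cleanup action must not reference the scope, or the scope could never become unreachable. */
    private static final class Release implements Runnable {
        private final AtomicBoolean released;
        Release(AtomicBoolean released) { this.released = released; }
        @Override public void run() { released.set(true); } // stands in for freeing memory
    }

    private final Cleaner.Cleanable cleanable;

    CleanerScopeSketch(AtomicBoolean released) {
        // Implicit path: the action runs once this scope becomes phantom reachable.
        this.cleanable = CLEANER.register(this, new Release(released));
    }

    /** Explicit path: runs the same action now; Cleanable guarantees it runs at most once. */
    @Override public void close() { cleanable.clean(); }
}
```

Because the explicit and implicit paths share one Cleanable, a scope closed through try-with-resources is never released a second time by the cleaner.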
A last simplification we propose was first discussed here, and replaces the resource scope handle mechanism with a more direct way to express dependencies between scopes. With this mechanism in place, the following code:
void accept(MemorySegment segment1, MemorySegment segment2) {
    // handles are declared outside the try block so that they are visible in finally
    var handle1 = segment1.scope().acquire();
    var handle2 = segment2.scope().acquire();
    try {
        <critical section>
    } finally {
        segment1.scope().release(handle1);
        segment2.scope().release(handle2);
    }
}
can be expressed, more succinctly, as follows:
void accept(MemorySegment segment1, MemorySegment segment2) {
    try (ResourceScope scope = ResourceScope.newConfinedScope()) {
        scope.keepAlive(segment1.scope());
        scope.keepAlive(segment2.scope());
        <critical section>
    }
}
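One way to picture the keepAlive dependency is as a counter on the target scope that is released when the dependent scope closes. The sketch below is a single-threaded toy model under that assumption: the class name, the counter, and the choice to throw IllegalStateException when closing a scope that is still kept alive are all illustrative, not taken from the proposed API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, confined (single-threaded) model of scope dependencies.
final class KeepAliveSketch implements AutoCloseable {
    private int keptAliveBy;   // number of open scopes that depend on this one
    private boolean closed;
    private final List<KeepAliveSketch> dependencies = new ArrayList<>();

    /** Makes the target scope stay alive at least until this scope is closed. */
    void keepAlive(KeepAliveSketch target) {
        if (closed || target.closed) {
            throw new IllegalStateException("scope already closed");
        }
        target.keptAliveBy++;
        dependencies.add(target);
    }

    boolean isAlive() { return !closed; }

    @Override public void close() {
        if (closed) {
            return;
        }
        if (keptAliveBy > 0) {
            throw new IllegalStateException("scope is kept alive by another scope");
        }
        closed = true;
        // Closing the dependent scope releases everything it was keeping alive.
        for (KeepAliveSketch dependency : dependencies) {
            dependency.keptAliveBy--;
        }
        dependencies.clear();
    }
}
```

Compared with the acquire/release idiom, the release step cannot be forgotten here, because it is tied to closing the temporary scope in the try-with-resources block.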
Finally, we would like to make ResourceScope implement the SegmentAllocator interface. It is not uncommon to have to call a method which requires a segment allocator from a context in which only a scope is available. The implementation of the ResourceScope interface already implements SegmentAllocator, but this implementation is not exposed in the public API, which instead allows clients to convert from scopes to allocators using the SegmentAllocator::ofScope method. We believe that making the relationship between resource scope and allocators public would help to reduce the number of conversions required between the different abstractions provided by the foreign API.
Preview reshuffling
In preparation for the API to become a preview API, we plan to move all the classes in the jdk.incubator.foreign package under the java.lang.foreign package [3] in the java.base module. Additionally, we plan to make the following changes (this work might take place on a separate branch, to avoid conflicts):
- The MemoryHandles class will be dropped and all its contents will be moved under MethodHandles; this makes sense since this class contains a general factory for memory access var handles, plus a set of general var handle combinators.
- Remove the SymbolLookup abstraction; to look up loader symbols, we plan to add a lookup method in the ClassLoader class. Removing SymbolLookup now does not prevent us from adding a more powerful lookup mechanism in the future; neither does it prevent clients from defining custom chained lookups, e.g. using a Function<String, MemoryAddress>.
- Rename ResourceScope. It has been noted that the ResourceScope name is slightly misleading, as the word scope is sometimes interpreted in the context of lexical scopes. While it is true that ResourceScope can provide, via the try-with-resources construct, a lexical scope within which allocation occurs, some uses of the ResourceScope abstraction have nothing to do with lexical scopes (e.g. shared segments stored in fields). For this reason, a more specific name might be chosen.
Since the changes described in the previous sections already lead to the removal of many of the ancillary classes such as MemoryAccess, MemoryCopy and MemoryLayouts, no further adjustment will be necessary.
Summing up
Overall, the changes described here make the Foreign APIs much tighter, simpler and safer too. Attaching carriers to value layouts allows dereference operations to be more general, uniform and statically safe; it also allows us to simplify the linker classification story, as there’s no need to redundantly provide the same information using a separate MethodType argument when constructing downcall method handles. And, since downcall method handles no longer require clients to erase by-reference parameters to MemoryAddress, clients can just pass any subtype of Addressable (most notably memory segments), and the linker API will keep the scope of by-reference parameters alive for the duration of the call. The role of MemoryAddress becomes much simpler, as MemoryAddress now becomes a simple wrapper around a long, which is used to model native pointers (in other words, obtaining a MemoryAddress from an on-heap segment is no longer allowed). Finally, associating scopes with cleaners by default allows us to greatly simplify the API and to make it safer when it comes to preventing accidental memory leaks.
A javadoc which summarizes the proposed API changes can be found here; the corresponding code changes can be found in this experimental branch which also contains the required adjustments for the jextract tool to work with the new API.
[1] A similar idiom can also be used to enhance the usability and static safety of bulk memory operations (not shown here).
[2] We will likely provide overloaded scope factories which allow clients to opt out of cleaners, in case scope allocation performance is critical. That said, this should be an advanced option, and we do expect most clients to be happy with the defaults provided by the simpler factories.
[3] We might decide to split functionality into different packages, e.g. java.lang.foreign for the memory access API, and java.lang.foreign.invoke for the foreign linker API.