Foreign Memory Access - Pulling all the threads

TL;DR

The restacking described in this doc enhances the Foreign Memory Access API in many different ways, and allows clients to approach the API with increasing degrees of complexity (depending on their needs):

  • for a smoother transition coming from the ByteBuffer API, users can simply swap ByteBuffer::allocateDirect with MemorySegment::allocateNative - not much else changes, no need to think about lifecycles (and ResourceScope); the GC is still in charge of deallocation
  • users who want tighter control over resources can dive deeper and learn how segments (and other resources) are attached to a resource scope (which can be closed safely, if needed)
  • for the native interop case, the NativeScope abstraction is retconned to be both a ResourceScope and a NativeAllocator - so it can be used whenever an API needs to know how to allocate or which lifecycle should be used for a newly created resource
  • scopes can be locked, which allows clients to write critical sections in which a segment has to be operated upon w/o fear of it being closed
  • the idiom described here can be used elsewhere too - e.g. to enhance the ByteBuffer API with close capabilities

All of the above requires very few changes to clients of the memory access API. The biggest change is that a MemorySegment no longer supports the AutoCloseable interface, which is instead moved to ResourceScope. While this can get a little more verbose when you need a single segment, the code scales a lot better when you need multiple segments/resources. Existing clients using jextract-generated APIs, on the other hand, are not affected much, since they mostly depend on the NativeScope API, which this proposal does not alter (although the role of a NativeScope is now retconned to be allocator + scope).

In detail…

As you know, I’ve been looking at both internal and external feedback on usage of the memory access API, in an attempt to understand what the problems with the API are, and how to move forward. As discussed here [1], there are some things which work well, such as structured access, or the recent addition of shared segment support (the latter seems to have enabled a wide variety of experiments which allowed us to gather more feedback - thanks!). But there are still some issues to be resolved - which could be summarized as “the MemorySegment abstraction is trying to do too many things at once” (again, please refer to [1] for a more detailed description of the problems involved).

In [1] I described a possible approach where every allocation method (MemorySegment::allocateNative and MemorySegment::mapFile) returns an “allocation handle”, not a segment directly. The handle is the closeable entity, while the segment is just a view. While this approach is workable (and something very similar has indeed been explored here [2]), after implementing some parts of it, I was not satisfied with how this approach integrates with the foreign linker support. For instance, defining the behavior of methods such as CLinker::toCString becomes quite convoluted: where does the allocation handle associated with the returned string come from? If the segment has no pointer to the handle, how can the memory associated with the string be closed? What is the relationship between an allocation handle and a NativeScope? All these questions led me to conclude that the proposed approach was not enough, and that we needed to try harder.

The above approach does one thing right: it splits memory segments from the entity managing allocation/closure of memory resources, thus turning memory segments into dumb views. But it doesn’t go far enough in this direction; as it turns out, what we really want here is a way to capture the concept of a lifecycle that is associated with one or more (logically related) resources - which, unsurprisingly, is part of what NativeScope does too. So, let’s try to model this abstraction:

interface ResourceScope extends AutoCloseable {
   void addOnClose(Runnable action);  // adds a new cleanup action to this scope
   void close();                      // closes the scope

   static ResourceScope ofConfined() { … }                // creates a confined resource scope
   static ResourceScope ofShared() { … }                  // creates a shared resource scope
   static ResourceScope ofConfined(Cleaner cleaner) { … } // creates a confined resource scope - managed by cleaner
   static ResourceScope ofShared(Cleaner cleaner) { … }   // creates a shared resource scope - managed by cleaner
}

It’s a very simple interface - you can basically add new cleanup actions to it, which will be called when the scope is closed; note that ResourceScope supports implicit close (via a Cleaner), or explicit close (via the close method) - it can even support both (not shown here).
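
To make the intended usage concrete, here is a minimal sketch (using only the factories above; the println is just a placeholder cleanup action):

ResourceScope scope = ResourceScope.ofConfined();
scope.addOnClose(() -> System.out.println("cleaned up"));  // runs when the scope is closed
// ... attach segments and other resources to the scope ...
scope.close();  // triggers the cleanup action

ResourceScope implicit = ResourceScope.ofShared(Cleaner.create());
// no explicit close() needed here - cleanup runs when the scope becomes unreachable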

Armed with this new abstraction, let’s try to see if we can shine new light onto some of the existing API methods and abstractions.

Let’s start with heap segments - these are allocated using one of the MemorySegment::ofArray factories; one of the issues with heap segments is that it doesn’t make much sense to close them. In the proposed approach, this can be handled gracefully: heap segments are associated with a global scope that cannot be closed - a scope that is always alive. This clarifies the role of heap segments (and also of buffer segments) nicely.
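
For instance (a sketch, assuming the scope accessor used later in this email), a heap segment would simply report the global scope:

MemorySegment heap = MemorySegment.ofArray(new int[8]);
ResourceScope global = heap.scope();  // the global, always-alive scope
// global.close() would fail - this scope cannot be closed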

Let’s proceed to MemorySegment::allocateNative/mapFile - what should these factories do? Under the new proposal, these methods should accept a ResourceScope parameter, which defines the lifecycle to which the newly created segment is attached. If we still want to provide ResourceScope-less overloads (as the API does now) we can pick a useful default: a shared, non-closeable, cleaner-backed scope. This choice gives us essentially the same semantics as a byte buffer, so it would be an ideal starting point for developers coming from the ByteBuffer API who are trying to familiarize themselves with the new memory access API. Note that, when using these more compact factories, scopes are almost entirely hidden from the client - so no extra complexity is added (compared e.g. to the ByteBuffer API).
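
In code, the two flavors would look roughly as follows (a sketch of the proposed overloads; sizes and alignments are arbitrary):

// explicit lifecycle: the segment is attached to a confined scope we control
MemorySegment explicit = MemorySegment.allocateNative(100, 8, ResourceScope.ofConfined());

// implicit lifecycle: the scope-less overload picks the default shared, non-closeable,
// cleaner-backed scope - deallocation is up to the GC, as with ByteBuffer::allocateDirect
MemorySegment implicit = MemorySegment.allocateNative(100);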

As it turns out, ResourceScope is not only useful for segments, but it is also useful for a number of entities which need to be attached to some lifecycle, such as:

  • upcall stubs
  • va lists
  • loaded libraries

The upcall stub case is particularly telling: in that case, we have decided to model an upcall stub as a MemorySegment not because it makes sense to dereference an upcall stub - but simply because we need a way to release the upcall stub once we’re done using it. Under the new proposal, we have a new, powerful option: the upcall stub API point can accept a user-provided ResourceScope which will be responsible for managing the lifecycle of the upcall stub entity. That is, we are now free to turn the result of a call to upcallStub into something other than a MemorySegment (e.g. a FunctionPointer?) w/o loss of functionality.

Resource scopes are very useful to manage groups of resources - there are in fact cases where one or more segments share the same lifecycle - that is, they all need to be alive at the same time; to handle some of these use cases, the status quo adds the NativeScope abstraction, which can accept registration of external memory segments (via the MemorySegment::handoff API). This use case is naturally handled by the ResourceScope API:

try (ResourceScope scope = ResourceScope.ofConfined()) {
    MemorySegment.allocateNative(layout, scope);
    MemorySegment.mapFile(… , scope);
    CLinker.upcallStub(… , scope);
} // release all resources

Does this remove the need for NativeScope? Not so fast: NativeScope is used to group logically related resources, yes, but it is also used as a faster, arena-based allocator - one which attempts to minimize the number of system calls (e.g. to malloc) by allocating bigger memory blocks and then handing out slices to clients. Let’s try to model the allocation nature of a NativeScope with a separate interface, as follows:

@FunctionalInterface
interface NativeAllocator {
    MemorySegment allocate(long size, long align);
    default MemorySegment allocateInt(MemoryLayout intLayout, int value) { … }
    default MemorySegment allocateLong(MemoryLayout longLayout, long value) { … }
    … // all the allocation helpers in NativeScope
}

At first, it seems this interface doesn’t add much. But it is quite powerful - for instance, a client can create a simple, malloc-like allocator, as follows:

NativeAllocator malloc = (size, align) -> 
     MemorySegment.allocateNative(size, align, ResourceScope.ofConfined());

This is an allocator which allocates a new region of memory on each allocation request, backed by a fresh confined scope (which can be closed independently). This idiom is in fact so common that the API allows clients to create these allocators in a more compact fashion:

NativeAllocator confinedMalloc = NativeAllocator.ofMalloc(ResourceScope::ofConfined);
NativeAllocator sharedMalloc = NativeAllocator.ofMalloc(ResourceScope::ofShared);
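
Usage is then just a matter of calling allocate; each segment handed out by the confined flavor lives in its own scope, so it can be freed independently (a sketch, again assuming the scope accessor used later in this email):

MemorySegment seg = confinedMalloc.allocate(64, 8);  // fresh confined scope per allocation
// ... use seg ...
seg.scope().close();  // frees this segment only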

But other strategies are also possible:

  • arena allocation (e.g. the allocation strategy currently used by NativeScope)
  • recycling allocation (a single segment, with a given layout, is allocated, and allocation requests are served by repeatedly slicing that very segment) - this is a critical optimization e.g. in loops (see the sketch after this list)
  • interop with custom allocators
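
For instance, here is a minimal sketch of the recycling strategy mentioned above (the pool size is arbitrary, and bookkeeping such as bounds and alignment checks is omitted): a single segment is allocated up front, and every allocation request is served by slicing it, so a loop can reuse the same memory on each iteration.

MemorySegment pool = MemorySegment.allocateNative(1024, 8, ResourceScope.ofConfined());
NativeAllocator recycling = (size, align) -> pool.asSlice(0, size);  // always re-slice from the start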

So, where would we accept a NativeAllocator in our API? It turns out that accepting an allocator is handy whenever an API point needs to allocate some native memory - so, instead of

MemorySegment toCString(String)

This is better:

MemorySegment toCString(String, NativeAllocator)
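
With this shape, the caller decides both where the C string lives and when it is freed - a sketch, reusing the malloc-like allocator shown earlier (and the scope accessor used later in this email):

NativeAllocator malloc = NativeAllocator.ofMalloc(ResourceScope::ofConfined);
MemorySegment cString = CLinker.toCString("hello", malloc);  // allocated through the supplied allocator
// ... pass cString to native code ...
cString.scope().close();  // free the string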

Of course, we also need to tweak the foreign linker: in all foreign calls returning a struct by value (which require some allocation), a NativeAllocator prefix argument is added to the method handle, so that the user can specify which allocator should be used by the call. This is a straightforward change which greatly enhances the expressive power of the linker API.

So, we are in a place where some methods (e.g. factories which create some resource) take an additional ResourceScope argument - and some other methods (e.g. methods that need to allocate native segments) take an additional NativeAllocator argument. Now, it would be inconvenient for the user to have to create both, at least in simple use cases - but, since these are interfaces, nothing prevents us from creating a new abstraction which implements both ResourceScope and NativeAllocator - in fact this is exactly what the role of the already existing NativeScope is!

interface NativeScope extends NativeAllocator, ResourceScope { … }

In other words, we have retconned the existing NativeScope abstraction, explaining its behavior in terms of more primitive abstractions (scopes and allocators). This means that clients can, for the most part, just create a NativeScope and then pass it wherever a ResourceScope or a NativeAllocator is required (which is what is already happening in all of our jextract examples).
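
To illustrate, a single NativeScope can play both roles at once (a sketch - the NativeScope.ofConfined factory is assumed here just for the sake of the example):

try (NativeScope scope = NativeScope.ofConfined()) {        // assumed factory, for illustration
    MemorySegment str  = CLinker.toCString("hello", scope); // scope used as a NativeAllocator
    MemorySegment stub = CLinker.upcallStub(… , scope);     // scope used as a ResourceScope
} // all resources released here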

There are some additional bonus points to this approach.

First, ResourceScope features some locking capabilities - e.g. you can do things like:

try (ResourceScope.Lock lock = segment.scope().lock()) {
    <critical operation on segment>
}

This allows clients to perform critical operations on a segment w/o worrying that the segment’s memory will be reclaimed in the middle of the operation. It solves the problem with async operations on byte buffers derived from shared segments (see [3]).

Another bonus point is that the ResourceScope interface is completely segment-agnostic - in fact, we now have a way to describe APIs which return resources that must be cleaned up by the user (or, implicitly, by the GC). For instance, it would be entirely reasonable to imagine, one day, the ByteBuffer API providing an additional factory - e.g. allocateDirect(int size, ResourceScope scope) - which gives you a direct buffer attached to a given (closeable) scope. The same trick can probably be used in other APIs as well, where implicit cleanup has been preferred for performance and/or safety reasons.
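
Purely hypothetically - such a factory does not exist today, it is the kind of addition imagined above - usage could look like this:

try (ResourceScope scope = ResourceScope.ofConfined()) {
    ByteBuffer bb = ByteBuffer.allocateDirect(1024, scope);  // hypothetical overload
    // ... use bb ...
} // buffer memory released deterministically here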

Resources

You can find a branch which implements some of the changes described above (except the changes to the foreign linker API) here, while an initial javadoc of the API described in this email can be found here.