Security and Sandboxing Post SecurityManager

Disclaimer: This post represents the author’s views only.

Last week, JEP 411 proposed deprecating Java’s Security Manager for eventual removal through a process of gradual functional degradation. The Security Manager should be removed because the high cost of maintaining it is no longer justified by its benefits, which have dropped drastically over time as the deployment and threat environment changed. In this post I’d like to go over some of the Security Manager’s use cases and present superior alternatives. It is a happy irony that even though few use the Security Manager, the proposal to remove it has made people consider Java’s security, and the increased attention will hopefully make it easier to understand our current security focus, which relies, at least in part, on the module system.

Security measures are deployed to defend against a certain threat landscape. As a security measure, the Security Manager was designed to defend against threats posed by untrusted code — code you believe might be malicious. This is a primary threat that browsers running JavaScript on web sites face, as did Java Applets, the technology for which the Security Manager was originally designed as a sandbox. A sandbox is a mechanism that restricts the operations available to a piece of code, and we’ll discuss it at length later, but the threat landscape facing server-side applications is quite different. Servers tend to mostly run trusted code — code assumed to be benign — and the threats facing them are exploits of vulnerabilities that cause trusted code to perform unintended operations through carefully manipulated inputs.

Java’s security mechanisms for trusted code include a suite of cryptographic protocols, secure XML processing, JAR signing, and serialization filters, but also inherent VM properties, like memory safety — which precludes array overflows and use-after-free — and will increasingly rely on the module system’s encapsulation. Not every feature that assists in security looks like a dedicated security feature; even Loom’s virtual threads help prevent vulnerabilities caused by secrets leaked through ThreadLocals accidentally shared by multiple unrelated tasks. The Security Manager is not a central component for securing trusted server-side code, a good thing, too, because few systems use it and it doesn’t defend against some of the most common and dangerous exploits. In fact, the Security Manager is already crippled on the common pool (and on Loom’s virtual threads) because setting up appropriate security contexts would defeat the performance requirements of those constructs, and using it with CompletableFutures or any asynchronous (or “reactive”) context requires the developer to carefully capture and reestablish security contexts as operations travel from one thread to another. Nevertheless, a sandbox can serve as an additional effective protection layer for trusted code by blocking unintended operations triggered by exploits, but the Security Manager, despite its powerful theoretical capabilities, has been found over the years to be an ineffective sandbox for trusted code.
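To illustrate that last burden, here is a minimal sketch of manually propagating the security context across threads with CompletableFuture; the readConfig method is hypothetical, and real code would have to repeat this dance at every asynchronous boundary:

```java
import java.security.AccessControlContext;
import java.security.AccessController;
import java.security.PrivilegedAction;
import java.util.concurrent.CompletableFuture;

public class ContextPropagation {
    public static CompletableFuture<String> readConfigAsync() {
        // Capture the security context on the submitting thread...
        AccessControlContext ctx = AccessController.getContext();
        return CompletableFuture.supplyAsync(() ->
                // ...and reestablish it on whichever pool thread runs the
                // task, so permission checks see the submitter's context
                // rather than the worker thread's.
                AccessController.doPrivileged(
                        (PrivilegedAction<String>) ContextPropagation::readConfig, ctx));
    }

    private static String readConfig() { // hypothetical privileged operation
        return System.getProperty("app.config", "default");
    }
}
```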

To better understand why, I’d like to introduce a taxonomy of sandboxes. A sandbox could restrict which API elements are directly available to sandboxed code; I’ll call such a sandbox shallow, because it performs access checks close in the call stack to the sandboxed code. In contrast, a deep sandbox blocks operations further away in the call stack, close to where the operation is actually performed, perhaps when interacting with the OS or with the hardware. Say a call to foo or to bar might result in writing to a certain file. A shallow sandbox might disallow calling foo, bar, or both, while a deep sandbox will allow calling them but might block the actual file writing operation. Within the category of deep sandboxes, let’s introduce a further distinction. A simple deep sandbox blocks certain operations, like writing to a particular file, regardless of how they’re performed; in contrast, a path-dependent (or stack-dependent) deep sandbox might block or allow a particular operation depending on the code path taken to perform it, by combining the different permissions granted to different layers of the call stack. For example, if foo and bar write to a system configuration file, a simple deep sandbox will either allow both of them to perform the write or block both, while a path-dependent sandbox might block foo’s attempt but allow bar’s, presumably because it trusts bar to perform the write operation in a safe manner. We can also envision sandboxes that are somewhere between simple and path-dependent, such as a thread-dependent sandbox that allows or blocks a controlled operation depending on the identity of the thread performing it.
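To make the path-dependence concrete, here is a sketch of foo and bar in the Security Manager’s terms; the class name and file path are invented for illustration. foo performs a plain write, so every frame on the stack needs the permission, while bar uses doPrivileged to assert responsibility for the operation, so only its own protection domain needs it:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.AccessController;
import java.security.PrivilegedActionException;
import java.security.PrivilegedExceptionAction;

public class ConfigWriters {
    private static final Path CONFIG = Path.of("/etc/app/system.conf"); // invented path

    // Every protection domain on the call stack must hold the matching
    // FilePermission, so an unprivileged caller makes this write fail.
    static void foo(byte[] data) throws IOException {
        Files.write(CONFIG, data);
    }

    // doPrivileged truncates the stack walk at this frame: only this class's
    // own protection domain needs the permission, so a path-dependent sandbox
    // can allow bar's write while still blocking foo's.
    static void bar(byte[] data) throws IOException {
        try {
            AccessController.doPrivileged((PrivilegedExceptionAction<Void>) () -> {
                Files.write(CONFIG, data);
                return null;
            });
        } catch (PrivilegedActionException e) {
            throw (IOException) e.getException();
        }
    }
}
```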

Some have suggested that treating third-party libraries as untrusted or semi-trusted code could be an effective security measure against hidden malicious code or inadvertent vulnerabilities. While I’m skeptical of the proposal’s implicit assumption that the application itself — possibly comprising millions of lines of code — is any safer, the notion of sandboxing the application, along with its dependencies, as a defence against exploits has merit.

For applications and libraries that make use of a large set of complex APIs, a deep sandbox can provide better security because only specific operations — such as interaction with the file system — need to be analysed for their security implications and then restricted, rather than possibly hundreds of thousands of API elements. But herein lies the problem with employing the Security Manager for that purpose: it is a path-dependent deep sandbox, which means it is very complex, and complexity is an enemy of security. For one, the set of permissions a complex application requires can be very large, and it is hard to evaluate whether it truly provides the requisite measure of security; Amazon uses formal methods to analyse policy files even for their simple sandboxes. For another, that set depends on internal implementation details of both the application and its dependencies, so it needs to be recomputed and re-analysed with every update to the application or any of its dependencies, greatly increasing the maintenance burden. Finally, the Security Manager’s path-dependence complicates things further, requiring judicious use of AccessController.doPrivileged; if a library doesn’t make use of doPrivileged, the permissions need to be granted to all of its callers on the call stack as well. The result is that the Security Manager is so complex that few applications use it, and those that do, more often than not do it incorrectly. There is little doubt that the Security Manager is one of the most sophisticated sandboxes in existence, but its theoretically powerful flexibility is what renders it ineffective in practice, either by making it difficult to use correctly or by deterring people from using it in the first place; a security device that goes unused is not secure, and one that is used incorrectly is even worse — it gives a false sense of security. And its sophistication is also what makes its continued maintenance so costly.
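To give a sense of that maintenance burden, here is a hypothetical policy-file fragment of the kind such a deployment requires; the codebase URLs and file path are invented, and a real application would accumulate many such grants, each to be revisited on every update:

```
// Grant the write only to the library that wraps it in doPrivileged:
grant codeBase "file:/app/lib/bar-lib-2.3.jar" {
    permission java.io.FilePermission "/etc/app/system.conf", "write";
};

// If the library did not use doPrivileged, every caller on the stack
// would need the same grant, too:
grant codeBase "file:/app/app.jar" {
    permission java.io.FilePermission "/etc/app/system.conf", "write";
};
```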

A superior alternative for applications is a simple deep sandbox, such as the ones provided by OS-level containers and virtual machines, which also have the added benefit of restricting operations performed by native code. Such sandboxes should be combined with deep monitoring that can detect suspicious activity by the application or its dependencies and alert on it, achieved by streaming appropriate JFR events to a watchdog service. This isn’t straightforward today (without resorting to bytecode instrumentation), as the JDK is not yet instrumented to emit many interesting JFR events, such as on socket connect and accept, but it will be, certainly by the time the Security Manager is degraded or removed.
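As a rough sketch of the idea, the JFR event-streaming API added in JDK 14 can subscribe to events the JDK already emits, such as jdk.FileWrite and jdk.SocketRead; the alerting rule below is just a placeholder, and a production watchdog would run out of process:

```java
import java.time.Duration;
import jdk.jfr.consumer.RecordingStream;

public class Watchdog {
    public static void main(String[] args) {
        // Subscribe, in-process, to built-in JFR I/O events and flag
        // activity the application is not expected to perform.
        try (RecordingStream rs = new RecordingStream()) {
            rs.enable("jdk.FileWrite").withThreshold(Duration.ZERO);
            rs.enable("jdk.SocketRead").withThreshold(Duration.ZERO);
            rs.onEvent("jdk.FileWrite", e -> {
                String path = e.getString("path");
                if (path != null && path.startsWith("/etc")) // placeholder rule
                    System.err.println("ALERT: unexpected write to " + path);
            });
            rs.onEvent("jdk.SocketRead", e ->
                    System.err.println("socket read from " + e.getString("host")));
            rs.start(); // blocks; use startAsync() to run alongside the app
        }
    }
}
```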

Another use case for the Security Manager — this one completely unrelated to security, merely exploiting the Security Manager as a set of instrumentation callbacks in JDK code — is testing code behaviour as part of unit testing; for example, asserting that a certain code unit does or does not perform some IO operation. This, too, can be better achieved with JFR, and there is even a library, JfrUnit, especially made for that purpose. JFR allows observing more kinds of interesting code behaviour, including memory allocation, and it is a component much more suited to this purpose than the Security Manager.
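JfrUnit packages this pattern into convenient assertions; to show just the underlying idea, here is a bare sketch using only the JDK’s own JFR API (the unit under test and the assertion rule are hypothetical):

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.util.List;
import jdk.jfr.Recording;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

public class IoBehaviourTest {
    public static void main(String[] args) throws Exception {
        Path dump = Files.createTempFile("io-test", ".jfr");
        try (Recording recording = new Recording()) {
            recording.enable("jdk.FileWrite").withThreshold(Duration.ZERO);
            recording.start();

            runCodeUnderTest(); // hypothetical unit under test

            recording.stop();
            recording.dump(dump);
        }
        // Assert that the unit performed no file writes at all.
        List<RecordedEvent> writes = RecordingFile.readAllEvents(dump).stream()
                .filter(e -> e.getEventType().getName().equals("jdk.FileWrite"))
                .toList();
        if (!writes.isEmpty())
            throw new AssertionError("unexpected file writes: " + writes);
    }

    static void runCodeUnderTest() { /* ... */ }
}
```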

Shallow Java Sandboxes

Some have pointed out another use case, that of sandboxing server-side plugins. Plugins are normally trusted (even a popular IDE like VSCode treats plugins as trusted code), and sandboxing them is not intended to defend against malicious code so much as to protect the functional integrity of the application by restricting the APIs available to a plugin as a means of controlling its operation. This use case is too narrow and too rare to justify the high cost of the Security Manager’s continued maintenance, and mixing trusted and untrusted code in the same process is a hard problem in any case; I would argue that it is better served by a shallow sandbox anyway, one that allows the plugin to interact with its environment only through a very limited set of APIs.
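For example, rather than letting a plugin reach the whole JDK, the host could hand it a single narrow capability object, so that everything the plugin may do is visible in one small interface. The interface below is, of course, a made-up illustration:

```java
public interface Plugin {
    void start(Context context);

    // A deliberately narrow, host-controlled surface: the plugin receives
    // this object and no other capabilities.
    interface Context {
        void log(String message);                   // no arbitrary file/net IO
        String getConfig(String key);               // read-only configuration
        void publish(String topic, byte[] payload); // host-mediated messaging
    }
}
```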

Shallow sandboxes can be supplied by relatively simple third-party libraries. They can rely on the module system, whose strong encapsulation can be thought of as a very simple shallow sandbox. The plugin can be loaded into a module layer using a service loader, but that alone is insufficient to create the kind of restrictive sandbox we want, because the base module (java.base) already grants far more power than we’d like the plugin to have. This can be refined by assigning the layer a custom class loader that blocks the plugin from loading certain classes. It can be refined further by having the class loader transform the classes it loads — the plugin’s classes — using, say, the ASM library, to replace calls to dangerous methods on classes it does not block with stubs that either throw some illegal-access exception or filter the method’s arguments, allowing only some values through. Finally, the StackWalker can be used to inspect the stub caller’s class as part of the decision whether the call should be allowed.
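Putting some of those pieces together, here is a rough sketch of loading a plugin (using the Plugin interface sketched earlier) into its own module layer via the service loader, along with the kind of caller check a generated stub could perform with StackWalker; the module name is invented and the bytecode-rewriting step is elided:

```java
import java.lang.module.Configuration;
import java.lang.module.ModuleFinder;
import java.nio.file.Path;
import java.util.ServiceLoader;
import java.util.Set;

public class PluginLoader {
    public static Plugin loadPlugin(Path pluginDir) {
        // Resolve the plugin's modules into a new layer above the boot layer.
        ModuleFinder finder = ModuleFinder.of(pluginDir);
        ModuleLayer boot = ModuleLayer.boot();
        Configuration cf = boot.configuration()
                .resolve(finder, ModuleFinder.of(), Set.of("com.example.plugin")); // invented name
        // A custom, restrictive class loader could be passed here instead of
        // the system loader, to block or rewrite classes as described above.
        ModuleLayer layer = boot.defineModulesWithOneLoader(cf, ClassLoader.getSystemClassLoader());
        return ServiceLoader.load(layer, Plugin.class).findFirst()
                .orElseThrow(() -> new IllegalStateException("no plugin found"));
    }
}

// A stub that replaces a dangerous method can inspect its immediate caller
// to decide whether to let the call through:
class Stubs {
    private static final StackWalker WALKER =
            StackWalker.getInstance(StackWalker.Option.RETAIN_CLASS_REFERENCE);

    public static void exitStub(int status) {
        Class<?> caller = WALKER.getCallerClass();
        if (caller.getModule().getLayer() != ModuleLayer.boot()) // e.g. the plugin layer
            throw new SecurityException("exit not permitted from " + caller);
        System.exit(status);
    }
}
```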