Model Data, the Whole Data, and Nothing but the Data - Data-Oriented Programming v1.1

It should come as no surprise that data-oriented programming (DOP) centers on modeling data as closely as possible, so one of its core principles is to model the data, the whole data, and nothing but the data. This goal is best achieved with a mix of records and sealed types, as well as some programming practices that may seem odd to the object-oriented developer.

We’ll explore all that in this article, the third in a series that refines the four DOP principles in a version 1.1. After the previous article discussed how to model data immutably and transparently with records, we can now focus on sealed types and the aforementioned programming practices. We’ll continue to use the example of a simple sales platform that sells books, furniture, and electronic devices, which are each modeled by a simple record.

Sealed Types

Once we have created the records Book, Furniture, and ElectronicItem, the central domain data is modeled. Not completely, though, because there exists a relationship between them that has not yet been captured: Every item in our shop is either a Book, a (piece of) Furniture, or an ElectronicItem. To represent that relationship, we use sealed types.

Sealed types were finalized in Java 17. A class or an interface is marked as sealed with the keyword sealed, and then only the types listed in the permits clause can inherit from it - other types are forbidden to do so under penalty of a compilation error. This mechanism is perfect for modeling alternatives. The sentence “an item is either a book, a piece of furniture, or an electronic device” becomes:

sealed interface Item permits Book, Furniture, ElectronicItem {
	// ...
}

Sealed types are particularly useful when the system cannot be expected to simply work when a new implementation is added. Another List implementation? No problem, this will work seamlessly. Another Item implementation? Now VAT rates have to be checked, dedicated views such as the apartment planner or the display of the table of contents have to be adjusted and maybe new delivery methods have to be introduced.

There are many other situations where just adding an interface implementation isn’t going to work. Authentication providers or payment methods, for example: It doesn’t suffice to just code up CreditCardPayment implements Payment because at least the associated payment system must also be implemented, plus probably a mechanism that collects the payment in the right place in the code and ferries it to the suitable payment system. We’ll see how this works elegantly with sealed types in the article on operations.

First, a few properties of sealed types:

  • Allowed subtypes must be in the same module or (if the code is not compiled as a module) in the same package as the sealed type.
  • If the sealed type and permitted subtypes are contained in the same source code file, the permits clause can be omitted.
  • Allowed subtypes must inherit directly from the sealed type.
  • Allowed subtypes must be final, sealed or explicitly non-sealed (Java’s first hyphenated keyword!).
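To make these properties concrete, here is a minimal sketch built around the payment example mentioned above. Only Payment and CreditCardPayment are named in this article; BankPayment, WireTransfer, VoucherPayment, and all record components are hypothetical and exist purely for illustration. All types are assumed to live in one source file, which is why no permits clause appears.

```java
// Hypothetical payment hierarchy; only Payment and CreditCardPayment are named in the article.
// Everything is declared in one source file, so the permits clauses can be omitted.
sealed interface Payment { }

// Records are implicitly final, which satisfies the "final, sealed, or non-sealed" rule.
record CreditCardPayment(String cardNumber) implements Payment { }

// A permitted subtype may itself be sealed and continue the closed hierarchy ...
sealed interface BankPayment extends Payment { }

// ... in which case its own subtypes must inherit directly from it.
record WireTransfer(String iban) implements BankPayment { }

// Or it is explicitly non-sealed, which reopens this branch to arbitrary implementations.
non-sealed interface VoucherPayment extends Payment { }
```

With this layout, Payment permits exactly three direct subtypes (CreditCardPayment, BankPayment, VoucherPayment), while WireTransfer is only an indirect subtype via BankPayment.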

While it is certainly possible and sometimes useful to seal classes, dealing with sealed interfaces is much more pleasant in one very specific aspect, which we’ll get to when we discuss operations. That is why I generally recommend focusing on sealed interfaces, and this article series does so as well.

Model Nothing but the Data

Records make it easy to aggregate data, while sealed types make it easy to express alternatives. In combination, these two mechanisms are very powerful and allow even complicated structures to be modeled well.

Tailored Aggregates and Alternatives

The easy definition of records invites us to create tailor-made and possibly numerous types. Instead of User getting components for street, ZIP code, city, and country, these are probably better stored in an Address record, of which the user then has an instance.

And if the address is optional and a user can also optionally store an email address and a telephone number, then instead of having a possibly-absent field for each kind of contact information, you can give the type a List<ContactInfo> contacts field with sealed interface ContactInfo permits Address, Email, Phone. Is at least one piece of contact information required? Have a ContactInfo primaryContact field and rename the list to additionalContacts.
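The following sketch shows what such a tailored model could look like. The type names and the fields primaryContact and additionalContacts come from the text above; all other component names (name, street, zipCode, and so on) are assumptions made for illustration.

```java
import java.util.List;
import java.util.Objects;

// The alternatives: a contact is either an address, an email, or a phone number.
sealed interface ContactInfo permits Address, Email, Phone { }

// Component names are assumptions; the article only names the types.
record Address(String street, String zipCode, String city, String country) implements ContactInfo { }
record Email(String address) implements ContactInfo { }
record Phone(String number) implements ContactInfo { }

// The aggregate: one required contact plus any number of optional ones.
record User(String name, ContactInfo primaryContact, List<ContactInfo> additionalContacts) {

	User {
		// the primary contact is required, so a missing one is rejected right away
		Objects.requireNonNull(primaryContact, "primaryContact");
		// an unmodifiable copy keeps the record immutable
		additionalContacts = List.copyOf(additionalContacts);
	}

}
```

A user with only an email address is then simply new User("Jane", new Email("jane@example.org"), List.of()), without any possibly-absent fields.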

The goal is to use these features to tailor the types to the actual domain data. This makes the code easier to understand for developers as it closely resembles the data they need to know anyway and it also makes it easier to maintain because illegal data is more easily rejected - more on that when we examine how to represent only legal states.

Equality (and Type Patterns)

A central part of modeling data is the definition of equality. As described in the article on records, records come with an equals (and hashCode) implementation that uses all components. This is fine in many cases, but especially in systems that work with users and items, IDs are ubiquitous and most objects that have one should probably use it to determine equality. That’s one of many reasons why it’s common to override equals (and hashCode).

In our example, it makes sense to define the equality of Book based on the ISBN. We can do this very elegantly with the help of a feature that will become much more important later: type patterns, standardized in Java 16, in this case with instanceof.

import java.util.List;
import java.util.Objects;

record Book(String title, ISBN isbn, List<Author> authors) {

	@Override
	public boolean equals(Object other) {
		return this == other
			|| other instanceof Book book
			&& Objects.equals(isbn, book.isbn);
	}

	@Override
	public int hashCode() {
		return Objects.hash(isbn);
	}

}

The type pattern is found in other instanceof Book book. It accomplishes three tasks:

  • checks whether other is an instance of type Book
  • defines a new variable Book book that is visible (“in scope”) wherever the test returns true
  • assigns book = (Book) other

Since the book variable is visible exactly where the type check was positive, you can use it directly after the && to compare the desired fields.

(Note: Implementing equals with instanceof is not always correct, but it is no problem here because Book, like every record, is implicitly final.)
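To see the effect of this definition, consider the following self-contained sketch. ISBN and Author are assumed to be simple records with a single component each, and BookEqualityDemo as well as the ISBN string are placeholders invented for this example.

```java
import java.util.List;
import java.util.Objects;

// Assumed shapes of the component types; the article does not show them here.
record ISBN(String value) { }
record Author(String name) { }

record Book(String title, ISBN isbn, List<Author> authors) {

	@Override
	public boolean equals(Object other) {
		// `book` is only in scope to the right of &&, where the type test succeeded
		return this == other
			|| other instanceof Book book
			&& Objects.equals(isbn, book.isbn);
	}

	@Override
	public int hashCode() {
		return Objects.hash(isbn);
	}

}

// Hypothetical helper that exercises the ISBN-based equality.
class BookEqualityDemo {

	static boolean sameBookDespiteDifferentTitle() {
		var isbn = new ISBN("placeholder-isbn");
		var first = new Book("First Edition", isbn, List.of(new Author("A. Writer")));
		var second = new Book("Second Printing", isbn, List.of());
		// title and authors differ, but only the ISBN is compared
		return first.equals(second) && first.hashCode() == second.hashCode();
	}

}
```

Because title and authors no longer take part in the comparison, two Book instances with the same ISBN also collapse into a single entry in a HashSet or a HashMap key set.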

Methods

You can implement arbitrary methods on records, but as transparent carriers of data, they prefer some methods over others:

  • Methods without parameters are best because they can’t do anything other than return the record’s data (unless they reference global variables, which is extremely rarely a good idea). For example, email.tld() could identify and return the top level domain of the email address or book.byline() could combine the book title and authors into a string.
  • Methods that accept the type itself as the only parameter are also welcome. For example, this could be compareTo if you implement Comparable, or Book could have a method commonAuthors(Book) that returns a list of authors who were involved in both books.
  • Methods that accept other records (preferably those that are already used as a component type) are usually OK as well: Because they’re also supposed to be immutable data carriers, it can be assumed that no states are changed and all results are communicated via the return value. However, in this situation it becomes important to avoid implementing non-trivial domain logic. According to the principle separate operations from data, such operations should be reserved for external systems.
  • Methods with arbitrary parameters, particularly mutable ones, have a high chance of turning the record from data that is being processed as part of an operation into the executor of these operations, which should generally be avoided.

Note that these aren’t hard-and-fast rules but rather guidelines. They can be suspended if the situation demands it, but then you should have a good reason for doing so.
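The first two kinds of methods could look roughly like this. The method names tld, byline, and commonAuthors come from the list above; their implementations, the component names, and the omission of the ISBN component are simplifying assumptions for this sketch.

```java
import java.util.List;

record Author(String name) { }

record Email(String address) {

	// no parameters: can only derive a value from the record's own data
	String tld() {
		return address.substring(address.lastIndexOf('.') + 1);
	}

}

// Simplified Book without the ISBN component to keep the sketch short.
record Book(String title, List<Author> authors) {

	// no parameters: presents existing data in a different form
	String byline() {
		var names = authors.stream().map(Author::name).toList();
		return title + " by " + String.join(", ", names);
	}

	// own type as the only parameter: the result depends solely on the two records
	List<Author> commonAuthors(Book other) {
		return authors.stream().filter(other.authors::contains).toList();
	}

}
```

Neither method changes any state, and everything they compute is communicated via the return value.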

Interface Contracts

If, in data-oriented programming, records mostly just offer access to data with little to no additional operations, you may ask yourself how to use interfaces in such a design - after all, we mostly employ them to model contracts for behavior. And indeed, this role is much less important here. (Sealed) interfaces implemented by records do not primarily define what a type does but rather what it is:

  • Books, electronic devices, and furniture are items.
  • Addresses, emails, and telephone numbers are contact information.

As can be seen in these examples, the types united under an interface often have very little overlap. While items probably at least all have an item number, the different kinds of contact information are entirely distinct. Accordingly, interfaces like ContactInfo may end up without a single method. This is unusual and “looks wrong”, but that’s just a matter of familiarity. The contract that is defined here does not describe behavior (which is not a meaningful category for data) but grouping (which data are alternatives to each other in the context of the interface) and no methods are needed for that.
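As a rough sketch of both cases: an Item interface that declares the one thing all items share, next to a ContactInfo interface with no methods at all. The itemNumber component and every other component name are assumptions made for illustration.

```java
// Items share an item number, so the interface declares exactly that.
sealed interface Item permits Book, Furniture, ElectronicItem {
	String itemNumber();
}

// The generated record accessor itemNumber() implements the interface method.
record Book(String itemNumber, String title) implements Item { }
record Furniture(String itemNumber, String name) implements Item { }
record ElectronicItem(String itemNumber, String model) implements Item { }

// No shared data, so no methods: the interface only states which types are alternatives.
sealed interface ContactInfo permits Address, Email, Phone { }

record Address(String street, String city) implements ContactInfo { }
record Email(String address) implements ContactInfo { }
record Phone(String number) implements ContactInfo { }
```

Note that the records need no explicit method to fulfill Item: declaring a component named itemNumber is enough.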

Summary

Use records to aggregate data into meaningful, tailored types and sealed interfaces to express alternatives between such types. Because data doesn’t come with behavior, such records usually declare few or no methods that don’t just return the data in a different form. Consequently, the sealed interfaces they implement may declare few or no methods, which can be novel but is expected as the contract they describe is about what the data is (not what it does).
