Chapter 4. Immutability

Dealing with data structures—​constructs dedicated to storing and organizing data values—​is a core task of almost any program. In OOP, this usually means dealing with a mutable program state, often encapsulated in objects. For a functional approach, however, immutability is the preferred way to handle data and is a prerequisite for many of its concepts.

In functional programming languages like Haskell or even multiparadigm but more functionally inclined ones like Scala, immutability is treated as a prevalent feature. In those languages, immutability is a necessity and often strictly enforced, not just an afterthought to their design. Like most other principles introduced in this book, immutability isn’t restricted to functional programming and provides many benefits, regardless of your chosen paradigm.

In this chapter, you will learn about immutable types already available in the JDK and how to make your data structures immutable to avoid side effects, either with the tools provided by the JDK or with the help of third-party libraries.

Note

The term “data structure” used in this chapter represents any construct that stores and organizes data, like Collections or custom types.

Mutability and Data Structures in OOP

Java is an object-oriented inclined language, so typical Java code encapsulates an object’s state in a mutable form, usually changed through “setter” methods. This approach makes the program state ephemeral: any change to an existing data structure updates its current state in place, which also affects anyone else who references it, and the previous state is lost.

Let’s take a look at the most common forms used to handle mutable state in OOP Java code as discussed in Chapter 2: JavaBeans and plain old Java objects (POJO). A lot of confusion exists about those two data structures and their distinct properties. In a sense, they are both ordinary Java objects supposed to create reusability between components by encapsulating all relevant states. They have similar goals, although their design philosophies and rules differ.

POJOs don’t have any restrictions regarding their design. They are supposed to “just” encapsulate the business logic state, and you can even design them to be immutable. How you implement them is up to you and what matches your environment best. They usually provide “getters” and “setters” for their fields to be more flexible in an object-oriented context with a mutable state.

JavaBeans, on the other hand, are a special kind of POJO that allows easier introspection and reusability, which requires them to follow certain rules. These rules are necessary because JavaBeans were initially designed to be a standardized shareable machine-readable state between components, like a UI widget in your IDE1. The differences between POJOs and JavaBeans are listed in Table 4-1.

Table 4-1. POJOs versus JavaBeans

|                      | POJO                           | JavaBean                                |
|----------------------|--------------------------------|-----------------------------------------|
| General restrictions | Imposed by Java language rules | Imposed by JavaBean API specification   |
| Serialization        | Optional                       | Must implement java.io.Serializable     |
| Field visibility     | No restrictions                | private only                            |
| Field access         | No restrictions                | Accessible only via getters and setters |
| Constructors         | No restrictions                | No-arg constructor must exist           |

Many of the available data structures in the JDK, like the Collections framework,2 are mostly built around the concept of mutable state and in-place changes. Take List<E> for an example. Its mutating methods, like add(E e) or remove(Object o), only return a boolean to indicate whether a change occurred, and they change the Collection in place, so the previous state is lost. You might not need to think much about it in a local context, but as soon as a data structure leaves your direct sphere of influence, it’s no longer guaranteed to remain in its current state as long as you hold a reference to it.
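The loss of control over a shared mutable Collection can be sketched in a few lines. The class SharedListDemo and its methods are hypothetical names used only for illustration:

```java
import java.util.ArrayList;
import java.util.List;

class SharedListDemo {

    // Hypothetical method outside of your control that mutates its argument.
    static void process(List<String> values) {
        values.remove("red");  // removes in place, returns only a boolean
        values.add("green");   // appends in place, returns only a boolean
    }

    static List<String> run() {
        List<String> colors = new ArrayList<>();
        colors.add("blue");
        colors.add("red");

        // The reference leaves the local scope ...
        process(colors);

        // ... and the previous state [blue, red] is lost.
        return colors;
    }

    public static void main(String[] args) {
        System.out.println(run());
        // OUTPUT:
        // [blue, green]
    }
}
```

Nothing in the signature of process hints at the modification, so the caller has to read its implementation to know what state the list is in afterward.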

Mutable state breeds complexity and uncertainty. You must include all possible state changes in your mental model at any time to understand and reason about your code. This isn’t restricted to a single component, though. Sharing mutable state increases the complexity to cover the lifetime of any components having access to such shared state. Concurrent programming especially suffers under the complexities of shared state, where many problems originate in mutability and require intricate and often misused solutions like access synchronization and atomic references.

Ensuring the correctness of your code and shared state becomes a Sisyphean task of endless unit tests and state validation. And the required additional work multiplies as soon as mutable state interacts with more mutable components, resulting in even more verification of their behavior.

That’s where immutability provides another approach to handling data structures, one that makes your code easier to reason about again.

Immutability (Not Only) in FP

The core idea of immutability is simple: data structures can no longer change after their creation. Many functional programming languages support it by design at their core. The concept isn’t bound to functional programming per se, and it has many advantages in any paradigm.

Note

Immutability provides elegant solutions to many problems, even outside of programming languages. For example, the distributed version control system Git3 essentially uses a tree of pointers to immutable blobs and diffs to provide a robust representation of historical changes.

Immutable data structures are persistent views of their data without a direct option to change it. To “mutate” such a data structure, you must create a new copy with the intended changes. Not being able to mutate data “in place” can feel weird in Java at first. Compared to the usually mutable nature of object-oriented code, why should you take the extra steps necessary to simply change a value? Such creation of new instances by copying data incurs a particular overhead that accumulates quickly for naive implementations of immutability.
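A minimal sketch of such a “mutation by copy” could look like the following. The type Point and its method withX are hypothetical and not part of the JDK:

```java
final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int x() { return x; }
    int y() { return y; }

    // "Mutating" x creates a new instance; the receiver stays untouched.
    Point withX(int newX) {
        return new Point(newX, this.y);
    }
}

class PointDemo {
    public static void main(String[] args) {
        Point original = new Point(1, 2);
        Point moved = original.withX(5);

        System.out.println(original.x()); // 1
        System.out.println(moved.x());    // 5
    }
}
```

Each call to withX allocates a new object, which is exactly the overhead mentioned above. For a small type like this one it is usually negligible, but it adds up for large structures that are copied naively.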

Despite the overhead and initial weirdness of not being able to change data in place, the benefits of immutability can make it worthwhile even without a more functional approach to Java:

Predictability

Data structures won’t change without you noticing because they simply can’t. As long as you reference a data structure, you know it is the same as at the time of its creation. Even if you share that reference or use it in a concurrent fashion, no one can change your copy of it.

Validity

After initialization, a data structure is complete. It needs to be verified only once and stays valid (or invalid) indefinitely. If you need to build a data structure in multiple steps, the builder pattern, shown later in “Step-by-step creation”, decouples the building and initialization of a data structure.

No hidden side effects

Dealing with side effects is a really tough problem in programming—​besides naming and cache invalidation4. A byproduct of immutable data structures is the elimination of side effects; they’re always as-is. Even if moved around a lot through different parts of your code or using them in a third-party library out of your control, they won’t change their values or surprise you with an unintended side effect.

Thread safety

Without side effects, immutable data structures can move freely between thread boundaries. No thread can change them, so reasoning about your program becomes more straightforward because there are no unexpected changes or race conditions.

Cacheability and optimization

Because they are as-is right after creation, you can cache immutable data structures with peace of mind. Optimization techniques, like memoization, are possible only with immutable data structures, as discussed in Chapter 2.

Change tracking

If every change results in a whole new data structure, you can track their history by storing the previous references. You no longer need to intricately track single property changes to support an undo feature. Restoring a previous state is as simple as using a prior reference to the data structure.

Remember, all these benefits are independent of the chosen programming paradigm. Even if you decide that a functional approach might not be the right solution for your codebase, your data handling can still benefit immensely from immutability.
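To illustrate the change-tracking benefit, here is a minimal sketch of an undo feature built on immutable snapshots. The class UndoHistory is a hypothetical helper, not a JDK type. Note that the holder itself is mutable because it swaps references, but every snapshot it stores is an immutable List:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class UndoHistory {
    // Previous states; storing the reference is enough because snapshots never change.
    private final Deque<List<String>> history = new ArrayDeque<>();
    private List<String> current = List.of();

    void add(String item) {
        history.push(current);                 // remember the previous state
        List<String> next = new ArrayList<>(current);
        next.add(item);
        current = List.copyOf(next);           // new immutable snapshot
    }

    void undo() {
        if (!history.isEmpty()) {
            current = history.pop();           // restore a prior reference
        }
    }

    List<String> current() {
        return current;
    }

    public static void main(String[] args) {
        UndoHistory state = new UndoHistory();
        state.add("blue");
        state.add("red");
        state.undo();
        System.out.println(state.current());
        // OUTPUT:
        // [blue]
    }
}
```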

The State of Java Immutability

Java’s initial design didn’t include immutability as a deeply integrated language feature or provide a variety of immutable data structures. Certain aspects of the language and its types were always immutable, but it was nowhere close to the level of support offered in other more functional languages. This all changed when Java 14 introduced Records, a built-in language-level immutable data structure, as a preview feature that was finalized in Java 16.

Even if you might not know it yet, you’re already using immutable types in all your Java programs. The reasons behind their immutability might differ, like runtime optimizations or ensuring their correct usage, but regardless of their intentions, they’ll make your code safer and less error-prone.

Let’s take a look at all the different immutable parts available in the JDK today.

java.lang.String

One of the first types every Java developer learns about is the String type. Strings are everywhere! That’s why it needs to be a highly optimized and safe type. One of these optimizations is that it’s immutable.

String is not a primitive value-based type, like int or char. Still, it supports the + (plus) operator to concatenate a String with another value:

String first = "hello, ";
String second = "world!";

String result = first + second;
// => "hello, world!"

Like any other expression, concatenating strings creates a result and, in this case, a new String object. That’s why Java developers are taught early not to overuse manual String concatenation. Each time you concatenate strings by using the + (plus) operator, a new String instance is created on the heap, occupying memory, as depicted in Figure 4-1. These newly created instances can add up quickly, especially if concatenation is done in a loop statement like for or while.

Figure 4-1. String memory allocation

Even though the JVM will garbage-collect instances that are no longer needed, the memory overhead of endless String creation can be a real burden on the runtime. That’s why the JVM uses multiple optimization techniques behind the scenes to reduce String creation, like replacing concatenations with a java.lang.StringBuilder, or even using the opcode invokedynamic to support multiple optimization strategies.5
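The difference between repeated concatenation and an explicit StringBuilder in a loop can be sketched as follows. The class and method names are hypothetical, and how much of the first variant gets optimized depends on the compiler and runtime you use:

```java
class ConcatDemo {

    // Each iteration produces a new intermediate String instance.
    static String joinWithPlus(int n) {
        String result = "";
        for (int i = 0; i < n; i++) {
            result = result + i;
        }
        return result;
    }

    // A single mutable buffer is filled and converted to a String once.
    static String joinWithBuilder(int n) {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < n; i++) {
            builder.append(i);
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinWithPlus(5));
        // OUTPUT:
        // 01234
        System.out.println(joinWithBuilder(5));
        // OUTPUT:
        // 01234
    }
}
```

Both methods return the same immutable result. The StringBuilder variant simply keeps the mutable intermediate state local and invisible to the outside.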

Because String is such a fundamental type, it is sensible to make it immutable for multiple reasons. Having such a base type being thread-safe by design solves issues associated with concurrency, like synchronization, before they even exist. Concurrency is hard enough without worrying about a String changing without notice. Immutability removes the risk of race conditions, side effects, or a simple unintended change.

String literals also get special treatment from the JVM. Thanks to string pooling, identical literals are stored only once and reused to save precious heap space. If a String could change, it would change for everyone using a reference to it in the pool. It’s possible to allocate a new String by explicitly calling one of its constructors instead of creating a literal to circumvent pooling. The other way around is possible, too, by calling the intern method on any instance, which returns a String with the same content from the string pool.

String equality

The specialized handling of String instances and literals is why you should never use the equality operator == (double-equal) to compare Strings. Instead, always use either the equals or equalsIgnoreCase method to test for equality.
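The effects of string pooling on reference comparison can be observed with a short sketch. The helper sameReference is a hypothetical method that only wraps the == operator:

```java
class StringPoolDemo {

    // Compares references, not content.
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "hello";
        String sameLiteral = "hello";
        String constructed = new String("hello"); // explicitly circumvents pooling

        System.out.println(sameReference(literal, sameLiteral));          // true
        System.out.println(sameReference(literal, constructed));          // false
        System.out.println(literal.equals(constructed));                  // true
        System.out.println(sameReference(literal, constructed.intern())); // true
    }
}
```

Only equals gives an answer that depends purely on the content, which is why it is the right tool for comparing Strings.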

However, the String type isn’t “completely” immutable, at least from a technical point of view. It calculates its hashCode lazily due to performance considerations because it needs to read the whole String to do so. Still, it’s a pure function: the same String will always result in the same hashCode.

Using lazy evaluation to hide expensive just-in-time calculations to achieve logical immutability requires extra care during the design and implementation of a type to ensure it remains thread-safe and predictable.
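The following sketch shows one way to apply the same idea to a custom type. PersonName is a hypothetical class, and the caching approach is modeled loosely on the lazily calculated hashCode described above, not copied from the JDK sources. It assumes that both fields are non-null:

```java
final class PersonName {
    private final String first;
    private final String last;

    // Lazily calculated cache; 0 means "not calculated yet".
    // The field is not final, so the type is only logically immutable.
    private int hash;

    PersonName(String first, String last) {
        this.first = first;
        this.last = last;
    }

    @Override
    public int hashCode() {
        int h = hash;
        if (h == 0) {
            // Pure calculation: the same fields always give the same result,
            // so two threads racing here would store the same value.
            h = 31 * first.hashCode() + last.hashCode();
            hash = h;
        }
        return h;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof PersonName)) return false;
        PersonName other = (PersonName) o;
        return first.equals(other.first) && last.equals(other.last);
    }
}
```

The observable behavior never changes, even though a private field is written after construction. If the calculated value happens to be 0, it is simply recalculated on every call, which costs time but doesn’t affect correctness.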

All these properties make String something between a primitive and an object type, at least from a usability standpoint. Performance optimization possibilities and safety might have been the main reasons for its immutability, but the implicit advantages of immutability are still a welcome addition to such a fundamental type.

Immutable Collections

Another fundamental and ubiquitous group of types that benefit significantly from immutability is Collections, like Set, List, Map, etc.

Although Java’s Collection framework wasn’t designed with immutability as a core principle, it still has a way of providing a certain degree of immutability with three options:

  • Unmodifiable Collections

  • Immutable Collection factory methods (Java 9+)

  • Immutable copies (Java 10+)

None of these options are public types that you can simply instantiate using the new keyword. Instead, the relevant types have static convenience methods to create the necessary instances. Also, they’re only shallowly immutable, meaning that you cannot add or remove any elements, but the elements themselves aren’t guaranteed to be immutable. Anyone holding a reference to an element can change it without the knowledge of the Collection it currently resides in.

Shallow immutability

Shallowly immutable data structures provide immutability only at their topmost level. This means that the references held by the data structure itself (in the case of a Collection, the references to its elements) can’t be changed. The referenced objects themselves, however, can still be mutated.

To have a fully immutable Collection, you need to have only fully immutable elements, too. Nevertheless, the three options still provide you with helpful tools to use against unintended modification.
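Shallow immutability is easy to demonstrate with a mutable element type such as StringBuilder. The class ShallowDemo is a hypothetical name for this sketch:

```java
import java.util.List;

class ShallowDemo {

    static List<StringBuilder> run() {
        StringBuilder element = new StringBuilder("blue");

        // The List itself rejects add, remove, and set ...
        List<StringBuilder> frozen = List.of(element);

        // ... but the element is still mutable through its own reference.
        element.append("-ish");

        return frozen;
    }

    public static void main(String[] args) {
        System.out.println(run());
        // OUTPUT:
        // [blue-ish]
    }
}
```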

Unmodifiable Collections

The first option, unmodifiable Collections, is created from an existing Collection by calling one of the following generic static methods of java.util.Collections:

  • Collection<T> unmodifiableCollection(Collection<? extends T> c)

  • Set<T> unmodifiableSet(Set<? extends T> s)

  • List<T> unmodifiableList(List<? extends T> list)

  • Map<K, V> unmodifiableMap(Map<? extends K, ? extends V> m)

  • SortedSet<T> unmodifiableSortedSet(SortedSet<T> s)

  • SortedMap<K, V> unmodifiableSortedMap(SortedMap<K, ? extends V> m)

  • NavigableSet<T> unmodifiableNavigableSet(NavigableSet<T> s)

  • NavigableMap<K, V> unmodifiableNavigableMap(NavigableMap<K, ? extends V> m)

Each method returns the same type as the one provided for the method’s single argument. The difference between the original and the returned instance is that any attempt to modify the returned instance will throw an UnsupportedOperationException, as demonstrated in the following code:

List<String> modifiable = new ArrayList<>();
modifiable.add("blue");
modifiable.add("red");

List<String> unmodifiable = Collections.unmodifiableList(modifiable);

unmodifiable.clear();
// throws UnsupportedOperationException

The obvious downside of an “unmodifiable view” is that it’s only an abstraction over an existing Collection. The following code shows how the underlying Collection is still modifiable and affects the unmodifiable view:

List<String> original = new ArrayList<>();
original.add("blue");
original.add("red");

List<String> unmodifiable = Collections.unmodifiableList(original);

original.add("green");

System.out.println(unmodifiable.size());
// OUTPUT:
// 3

The reason for still being modifiable via the original reference is how the data structure is stored in memory, as illustrated in Figure 4-2. The unmodifiable version is only a view of the original list, so any changes directly to the original circumvent the intended unmodifiable nature of the view.

Figure 4-2. Memory layout of unmodifiable Collections

The common use for unmodifiable views is to freeze Collections against unwanted modification before using them as a return value.

Immutable Collection factory methods

The second option—immutable Collection factory methods—has been available since Java 9 and isn’t based on preexisting Collections. Instead, the elements must be provided directly to the static convenience methods available on the following Collection types:

  • List<E> of(E e1, …​)

  • Set<E> of(E e1, …​)

  • Map<K, V> of(K k1, V v1, …​)

Each factory method exists with zero or more elements and uses an optimized internal Collection type based on the number of elements used.
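A short sketch of the factory methods in use, with made-up example values and the hypothetical class name FactoryDemo:

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

class FactoryDemo {

    static List<String> colors() {
        return List.of("blue", "red");
    }

    public static void main(String[] args) {
        List<String> colors = colors();
        Set<Integer> numbers = Set.of(1, 2, 3);
        Map<String, Integer> ranks = Map.of("blue", 1, "red", 2);

        System.out.println(colors.size());       // 2
        System.out.println(numbers.contains(2)); // true
        System.out.println(ranks.get("red"));    // 2

        try {
            colors.add("green");
        } catch (UnsupportedOperationException e) {
            System.out.println("List.of(...) cannot be modified");
        }

        // The factory methods are also stricter than most mutable Collections:
        // List.of("blue", null) throws a NullPointerException, and
        // Set.of(1, 1) throws an IllegalArgumentException.
    }
}
```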

Immutable copies

The third option, immutable copies, is available in Java 10+ and provides a deeper level of immutability by calling the static method copyOf on these three types:

  • Set<E> copyOf(Collection<? extends E> coll)

  • List<E> copyOf(Collection<? extends E> coll)

  • Map<K, V> copyOf(Map<? extends K, ? extends V> map)

Instead of being a mere view, copyOf creates a new container, holding its own references to the elements:

// SETUP ORIGINAL LIST
List<String> original = new ArrayList<>();
original.add("blue");
original.add("red");

// CREATE COPY
List<String> copiedList = List.copyOf(original);

// ADD NEW ITEM TO ORIGINAL LIST
original.add("green");

// CHECK CONTENT
System.out.println(original);
// [blue, red, green]
System.out.println(copiedList);
// [blue, red]

The copied Collection prevents any addition or removal of elements through the original list, but the actual elements are still shared, as illustrated in Figure 4-3, and open to changes.

Figure 4-3. Memory layout of copied Collections

Which option of immutable Collections you choose depends on your context and intentions. If a Collection can’t be created in a single call, like in a for-loop, an unmodifiable view or immutable copy is a sensible approach. Use a mutable Collection locally and “freeze” it by returning an unmodifiable view or copy it when the data leaves your current scope. Immutable Collection factory methods don’t support an intermediary Collection that might get modified; they require you to know all the elements beforehand.
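The “build locally, freeze on return” approach might look like this minimal sketch. The method squares is a hypothetical example:

```java
import java.util.ArrayList;
import java.util.List;

class SquaresDemo {

    static List<Integer> squares(int n) {
        // Mutable Collection, visible only inside this method.
        List<Integer> result = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            result.add(i * i);
        }

        // Freeze before the data leaves the current scope.
        // Collections.unmodifiableList(result) would work here, too,
        // because no other reference to result escapes.
        return List.copyOf(result);
    }

    public static void main(String[] args) {
        System.out.println(squares(3));
        // OUTPUT:
        // [1, 4, 9]
    }
}
```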

Primitives and Primitive Wrappers

So far, you’ve learned mostly about immutable object types, but not everything in Java is an object. Java’s primitive types—byte, char, short, int, long, float, double, boolean—are handled differently from object types. They are simple values that are initialized by either a literal or an expression. Representing only a single value, they are practically immutable.

Besides the primitive types themselves, Java provides corresponding object wrapper types, like Byte or Integer. They encapsulate their respective primitives in a concrete object type to make them usable in scenarios where primitives aren’t allowed (yet), like generics. The wrapper types are immutable, too. Otherwise, autoboxing (the automatic conversion between the object wrapper types and their corresponding primitive type) could lead to inconsistent behavior.

Immutable Math

Most simple calculations in Java rely on primitives like int or long for whole numbers, and float or double for floating-point calculations. The package java.math, however, has two immutable alternatives for safer and more precise integer and decimal calculations: java.math.BigInteger and java.math.BigDecimal.

Note

In this context, “integer” means a number without a fractional component and not Java’s int or Integer type. The word integer comes from Latin and is used in mathematics as a colloquial term to represent whole numbers in the range from −∞ to +∞, including zero.

Just like with String, why should you burden your code with the overhead of immutability? Because they allow side-effect-free calculations in a greater range with higher precision.

The pitfall of using immutable math objects, though, is the possibility of simply forgetting to use the actual result of a calculation. Even though method names like add or subtract suggest modification, at least in an OO context, the java.math types return a new object with the result, as follows:

BigDecimal theAnswer = new BigDecimal(42);

BigDecimal result = theAnswer.add(BigDecimal.ONE);

// RESULT OF THE CALCULATION
System.out.println(result);
// OUTPUT:
// 43

// UNCHANGED ORIGINAL VALUE
System.out.println(theAnswer);
// OUTPUT:
// 42

The immutable math types are still objects with the usual overhead and use more memory to achieve their precision. Nevertheless, if calculation speed is not your limiting factor, you should always prefer the BigDecimal type for floating-point arithmetic due to its arbitrary precision.6
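The precision difference is easiest to see with a classic example. The class DecimalDemo is a hypothetical name. Note that the sketch uses the String constructor, because a BigDecimal created from a double literal would inherit the double’s binary rounding:

```java
import java.math.BigDecimal;

class DecimalDemo {

    static double doubleSum() {
        return 0.1 + 0.2;
    }

    static BigDecimal exactSum() {
        // add returns a new BigDecimal; neither operand is modified.
        return new BigDecimal("0.1").add(new BigDecimal("0.2"));
    }

    public static void main(String[] args) {
        System.out.println(doubleSum());
        // OUTPUT:
        // 0.30000000000000004

        System.out.println(exactSum());
        // OUTPUT:
        // 0.3
    }
}
```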

The BigInteger type is the integer equivalent to BigDecimal, also with built-in immutability. Another advantage is the extended range of at least7 from -2^2,147,483,647 up to 2^2,147,483,647 (both exclusive), compared to the range of int from -2^31 to 2^31 - 1 (both inclusive).
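A small sketch of what the extended range means in practice, using the hypothetical class BigRangeDemo. Where a long silently overflows, BigInteger keeps growing:

```java
import java.math.BigInteger;

class BigRangeDemo {

    // Primitive arithmetic wraps around without any warning.
    static long overflowed() {
        return Long.MAX_VALUE + 1;
    }

    // BigInteger returns a new, larger value instead.
    static BigInteger beyondLong() {
        return BigInteger.valueOf(Long.MAX_VALUE).add(BigInteger.ONE);
    }

    public static void main(String[] args) {
        System.out.println(overflowed());
        // OUTPUT:
        // -9223372036854775808

        System.out.println(beyondLong());
        // OUTPUT:
        // 9223372036854775808
    }
}
```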

Java Time API (JSR-310)

Java 8 introduced the Java Time API (JSR-310), which was designed with immutability as a core tenet. Before its release, you only had three8 types in the package java.util at your disposal for all your date- and time-related needs: Date, Calendar, and TimeZone. Performing calculations was a chore and error-prone. That’s why the Joda Time library became the de facto standard for date and time classes before Java 8 and subsequently became the conceptual foundation for JSR-310.

Note

Like with immutable math, any calculation with methods such as plus or minus won’t affect the object they’re called on. Instead, you have to use the return value.

Rather than the previous three types in java.util, there now are multiple date- and time-related types with different precisions, with and without timezones, available in the java.time package. They are all immutable, giving them all the related advantages like no side effects and safe use in concurrent environments.
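As with the immutable math types, calculations on java.time types return new instances. A minimal sketch with LocalDate, using the hypothetical class TimeDemo and an arbitrary example date:

```java
import java.time.LocalDate;

class TimeDemo {

    // plusDays returns a new LocalDate; the argument is never modified.
    static LocalDate nextDay(LocalDate date) {
        return date.plusDays(1);
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2023, 1, 31);
        LocalDate next = nextDay(date);

        System.out.println(next);
        // OUTPUT:
        // 2023-02-01

        System.out.println(date);
        // OUTPUT:
        // 2023-01-31
    }
}
```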

Enums

Java enums are special types consisting of constants. And constants are, well, constant, and therefore immutable. Besides the constant values, an enum can contain additional fields which aren’t implicitly constant.

Usually, final primitives or Strings are used for these fields, but no one stops you from using a mutable object type or a setter for a primitive. It will most likely lead to problems, and I strongly advise against it. Also, it’s considered a code smell.⁠9
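A sketch of an enum that stays immutable by using only final fields of immutable types and no setters. The enum SignalColor and its values are made up for illustration:

```java
enum SignalColor {
    RED("#ff0000"),
    GREEN("#00ff00");

    // final field with an immutable type: assigned once in the constructor.
    private final String hex;

    SignalColor(String hex) {
        this.hex = hex;
    }

    String hex() {
        return hex;
    }
}

class SignalColorDemo {
    public static void main(String[] args) {
        System.out.println(SignalColor.RED.hex());
        // OUTPUT:
        // #ff0000
    }
}
```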

The final Keyword

Since Java’s inception, the final keyword has provided a certain form of immutability depending on its context, but it’s not a magic keyword to make any data structure immutable. So, what exactly does it mean for a reference, method, or class to be final?

The final keyword is similar to the const keyword of the programming language C. It has several implications if applied to classes, methods, fields, or references:

  • final classes cannot be subclassed.

  • final methods cannot be overridden.

  • final fields must be assigned exactly once—either by the constructors or on declaration—and can never be reassigned.

  • final variable references behave like fields by being assignable exactly once, typically at declaration. The keyword affects only the reference itself, not the content of the referenced object.

The final keyword grants a particular form of immutability for fields and variables. However, their immutability might not be what you expect because the reference itself becomes immutable but not the underlying data structure. That means you can’t reassign the reference but can still change the data structure, as shown in Example 4-1.

Example 4-1. Collections and final references
final List<String> fruits = new ArrayList<>(); // (1)

System.out.println(fruits.isEmpty());
// OUTPUT:
// true

fruits.add("Apple"); // (2)

System.out.println(fruits.isEmpty());
// OUTPUT:
// false

fruits = List.of("Mango", "Melon"); // (3)
// => WON'T COMPILE

(1) The final keyword affects only the reference fruits, not the actually referenced ArrayList.

(2) The ArrayList itself doesn’t have any concept of immutability, so you can freely add new items to it, even if its reference is final.

(3) Reassigning a final reference is prohibited.

As discussed in “Effectively final”, having effectively final references is a necessity for lambda expressions. Making every reference in your code final is an option; however, I wouldn’t recommend it. The compiler detects automatically if a reference behaves like a final reference even without adding an explicit keyword. Most problems created by the lack of immutability come from the underlying data structure itself and not reassigned references anyway. To ensure a data structure won’t change unexpectedly as long as it’s in active use, you must choose an immutable data structure from the get-go. The newest addition to Java to achieve this goal is Records.

Records

In 2020, Java 14 introduced a new type of class with its own keyword to complement or even replace POJOs and JavaBeans in certain instances: Records. They started as a preview feature and were finalized in Java 16.

Records are “plain data” aggregates with less ceremony than POJOs or JavaBeans. Their feature set is reduced to an absolute minimum to serve that purpose, making them as concise as they are:

public record Address(String name,
                      String street,
                      String state,
                      String zipCode,
                      Country country) {
  // NO BODY
}

Records are shallowly immutable data carriers primarily consisting of their state’s declaration. Without any additional code, the Address record provides automatically generated getters for the named components, equality comparison, toString and hashCode methods, and more.
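To show the generated members in action without depending on the Country type, here is a sketch with a smaller, hypothetical Record called Coordinate. It requires a JDK in which Records are a final feature (Java 16 or later):

```java
record Coordinate(int x, int y) {
    // NO BODY
}

class CoordinateDemo {
    public static void main(String[] args) {
        Coordinate first = new Coordinate(1, 2);
        Coordinate second = new Coordinate(1, 2);

        // Generated accessors are named after the components.
        System.out.println(first.x());
        // OUTPUT:
        // 1

        // Generated equals compares the components, not the references.
        System.out.println(first.equals(second));
        // OUTPUT:
        // true

        // Generated toString lists all components.
        System.out.println(first);
        // OUTPUT:
        // Coordinate[x=1, y=2]
    }
}
```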

Chapter 5 will do a deep dive into Records on how to create and use them in different scenarios.

How to Achieve Immutability

Now that you know about the immutable parts the JVM provides, it’s time to look at how to combine them to achieve immutability for your program state.

The easiest way to make a type immutable is by not giving it a chance to change in the first place. Without any setters, a data structure with final fields won’t change after creation because it can’t. For real-world code, though, the solution might not be as simple as that.
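One reason it isn’t always that simple: a final field holding a mutable Collection still lets the caller change the content through its own reference. A minimal sketch of a hand-written immutable type that guards against this with a defensive copy follows. Playlist is a hypothetical class used only for illustration:

```java
import java.util.ArrayList;
import java.util.List;

final class Playlist {
    private final String title;
    private final List<String> tracks;

    Playlist(String title, List<String> tracks) {
        this.title = title;
        // Defensive copy: later changes to the caller's list don't leak in.
        this.tracks = List.copyOf(tracks);
    }

    String title() { return title; }

    // Safe to hand out directly because the copy is unmodifiable.
    List<String> tracks() { return tracks; }
}

class PlaylistDemo {
    public static void main(String[] args) {
        List<String> source = new ArrayList<>();
        source.add("Intro");

        Playlist playlist = new Playlist("Favorites", source);
        source.add("Outro");

        System.out.println(playlist.tracks());
        // OUTPUT:
        // [Intro]
    }
}
```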

Immutability requires a new way of thinking about data creation because shared data structures are seldom created in one fell swoop. Instead of mutating a single data structure over time, you should work with immutable constructs along the way, if possible, and compose a “final” and immutable data structure in the end. Figure 4-4 depicts the general idea of different data components contributing to a “final” immutable Record. Even if the individual components aren’t immutable, you should always strive to wrap them in an immutable shell, Record or otherwise.

Figure 4-4. Records as data aggregators

Keeping track of the required components and their validation might be challenging in more complicated data structures. In Chapter 5, I’ll discuss tools and techniques that improve data structure creation and reduce the required cognitive complexity.

Common Practices

Like the functional approach in general, immutability doesn’t have to be an all-or-nothing approach. Due to their advantages, having only immutable data structures sounds intriguing, and your key goal should be to use them and immutable references as your default approach. Converting existing mutable data structures to immutable ones, though, is often a pretty complex task requiring a lot of refactoring or conceptual redesign. Instead, you could introduce immutability gradually by following common practices like those listed below and treating your data as if it were already immutable:

Immutability by default

Any new data structure, like data transfer objects, value objects, or any kind of state, should be designed as immutable. If the JDK or another framework or library you’re using provides an immutable alternative, you should consider it over a mutable type. Dealing with immutability right from the start with a new type will influence and shape any code that will use it.

Always expect immutability

Assume all data structures are immutable unless you created them or it’s stated explicitly otherwise, especially when dealing with Collection-like types. If you need to change one, it’s safer to create a new one based on the existing one.

Modifying existing types

Even if a preexisting type isn’t immutable, new additions should be, if possible. There might be reasons for making it mutable, but unnecessary mutability increases the bug surface, and all the advantages of immutability vanish instantly.

Break immutability if necessary

If it doesn’t fit, don’t force it, especially in legacy codebases. The main goal of immutability is providing safer, more reasonable data structures, which requires their environment to support them accordingly.

Treat foreign data structures as immutable

Always treat any data structure not under your scope’s control as immutable. For example, Collection-like types received as method arguments should be considered immutable. Instead of manipulating them directly, create a mutable copy for any changes, and return an unmodifiable Collection type. This approach keeps the method pure and prevents any unintended changes the caller hasn’t expected.

Following these common practices will make it easier to create immutable data structures from the start or gradually transition to a more immutable program state along the way.
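The practice of treating foreign data structures as immutable can be sketched as follows. The method uppercased is a hypothetical example of a method that needs a changed version of its argument:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ForeignDataDemo {

    static List<String> uppercased(List<String> input) {
        // Work on a local mutable copy instead of the received argument.
        List<String> copy = new ArrayList<>(input);
        copy.replaceAll(String::toUpperCase);

        // Return an unmodifiable view; no other reference to copy exists.
        return Collections.unmodifiableList(copy);
    }

    public static void main(String[] args) {
        List<String> colors = new ArrayList<>();
        colors.add("blue");
        colors.add("red");

        System.out.println(uppercased(colors));
        // OUTPUT:
        // [BLUE, RED]

        System.out.println(colors);
        // OUTPUT:
        // [blue, red]
    }
}
```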

Takeaways

  • Immutability is a simple concept, but it requires a new mindset and approach to handling data and change.

  • Lots of JDK types are already designed with immutability in mind.

  • Records provide a new and concise way to reduce boilerplate for creating immutable data structures but deliberately lack a certain amount of flexibility in order to be as transparent and straightforward as possible.

  • You can achieve immutability with the built-in tools of the JDK, and third-party libraries can provide simple solutions to the missing pieces.

  • Introducing immutability into your code doesn’t have to be an all-or-nothing approach. You can gradually apply common immutability practices to your existing code to reduce state-related bugs and ease refactoring efforts.

1 JavaBeans are specified in the official JavaBeans API Specification 1.01, which is over a hundred pages long. For the scope of this book, however, you don’t need to know all of it, but you should be familiar with the mentioned differences to other data structures.

2 Since Java 1.2, the Java Collections framework provides a multitude of common reusable data structures, like List<E>, Set<E>, etc. The Oracle Java documentation has an overview of the available types included in the framework.

3 Git is a free and open source distributed version control system. Its website provides ample documentation about its inner workings.

4 Phil Karlton, an accomplished software engineer who for many years was a principal developer at Xerox PARC, Digital, Silicon Graphics, and Netscape, said, “There are only two hard things in Computer Science: cache invalidation and naming things.” It became a mainstream joke in the software community over the years and is often amended by adding “off-by-one errors” without changing the count of two.

5 The JDK Enhancement Proposal (JEP) 280, “Indify String Concatenation”, describes the reasoning behind using invokedynamic in more detail.

6 Arbitrary-precision arithmetic—also known as bignum arithmetic, multiple-precision arithmetic, or sometimes infinite-precision arithmetic—performs calculations on numbers whose digits of precision are limited only by the available memory, not a fixed number.

7 The actual range of BigInteger depends on the actual implementation of the used JDK, as stated in an implementation note in the official documentation.

8 Technically there’s a fourth type, java.sql.Date, which is a thin wrapper to improve JDBC support.

9 A code smell is a known code characteristic that might indicate a deeper problem. It’s not a bug or error per se, but it might cause trouble in the long run. These smells are subjective and vary by programming language, developer, and paradigms. Sonar, the well-known company that develops open source software for continuous code quality and security, lists mutable enums as rule RSPEC-3066.
