Composing Objects – Java Concurrency in Practice

Hi, Java dudes! In my previous post, Sharing Objects – Java Concurrency in Practice, I reviewed how objects can be published and accessed from multiple threads in a safe manner. Today’s topic is how thread-safe components can be combined and enriched with new functionality.

While it is possible to write a thread-safe application that stores all its state in public static fields, but it is much easier if we do it by combining thread-safe components. By doing so we can delegate every thread-safety issue to the right application tier where it needs to be handled. Let the thread-safe class deal with how access to its inner state has to be synchronized. While on a higher level, we can think about how these thread-safe components can be used to represent the application state properly.

In order to prevent concurrency issues, we have to think about state ownership. This is something that is not part of the language itself but it is defined by class design. Therefore programmers have a great autonomy to make the right choices – and the bad ones too. A class usually does not own the objects passed to its methods. In many cases, ownership and encapsulation go together – the object encapsulates the state it owns and owns the state it encapsulates. Collections often share ownership of the contained objects with the client code that inserted them into the collection.

The Java monitor pattern is an example for ownership by encapsulation.

An object following the Java monitor pattern encapsulates all its mutable state and guards it with the object’s own intrinsic lock [JCiP]

Synchronized collections offered by the standard JDK follow the Java monitor pattern. Synchronization is implemented as a wrapper layer that controls all access to the underlying collection. A synchronized list of Strings can be created like this:

import java.util.Collections;
...
List syncStrings = Collections.synchronizedList(new ArrayList());

It is critical that the reference to the backing collection must not escape otherwise it can undermine thread-safety. Operations over a synchronized collection are mutually exclusive – only one thread at a time can work with the collection. In most cases, this is too restrictive and result in poor application performance. There are other alternatives but more on that later.

Now let’s suppose we have a class that is already thread-safe but misses an operation that we need. What can we do in such a situation? We have multiple options, but not all of them are correct:

  • put new synchronized code in a helper class, ☠
  • extend the original class, ☠
  • modify the original class to support the desired operation, or
  • add new functionality in a class encapsulating the original one.

Using a synchronized method of a helper class is probably the worst possible approach. Whatever lock the parameter object uses it definitely won’t be the intrinsic lock of the helper method. It only gives us the illusion of safety while other threads can still modify the state of the passed in object.

Extending the original class – if possible – could work, but it’s fragile because the underlying class might silently change its synchronization policy and jeopardize thread-safety of the extending class. Another problem is that not all classes expose enough of their state to make this approach possible (private lock).

The safest way to add a new atomic operation is to modify the original class consistently with its original design. In this case, all code that implements the synchronization policy is contained in a single source file – easier to understand and maintain.

If changing the existing class is not an option then using composition is the best alternative. This is how standard synchronized collections are implemented. And just as with synchronized collections, it’s crucial that all access to the underlying object must go through the wrapper class.

Don’t force clients or other developers to make risky guesses about the thread-safety of your code.

Document a class’s thread safety guarantees for its clients; document its synchronization policy for its maintainers. [JCiP]

Did you know that java.text.SimpleDateFormat is not thread-safe? Okay, maybe you did, but did you know that it wasn’t explicitly documented until JDK 1.4? And this is the worst that can happen: you assume that something is thread-safe that is not. This is why you should always document your code.

...
@ThreadSafe
public class ImprovedList<T> {

    @GuardedBy("this")
    private final List<T> innerList;

    public ImprovedList(List<T> list) {
        this.innerList = list;
    }

    public synchronized void putIfAbsent(T elem) {
        if (!innerList.contains(elem)) {
            innerList.add(elem);
        }
    }
}

Annotations @ThreadSafe and @GuardedBy are not part of the standard JDK.

That was all for now. Next week we are going to look at what building blocks Java provides that we can use as Lego™ pieces to build our application. Best wishes until we meet again here 😉

P.S.: instead of java.text.SimpleDateFormat use its thread-safe Java 8 alternative the java.time.format.DateTimeFormatter – where applicable.

Resource
[JCiP] Java Concurrency in Practice by Brian Goetz, ISBN-10: 0321349601

Sharing Objects – Java Concurrency in Practice

Welcome, Java enthusiast! Today I’m going to deal with issues around sharing objects in a multi-threaded environment. In last week’s post, Java Concurrency in Practice – Thread Safety, I was focusing on preventing multiple threads from accessing a shared state at the same time. In this post, I’m going to deal with a more subtle aspect of synchronization: visibility. How to publish changes by one thread so they can be safely read by other threads.

public class NotVisible {

  private static class WorkerThread extends Thread {
    public int value = 1;
    public boolean finished = false;
    public void run() {
      while (!finished)
        Thread.yield();
      System.out.println(value);
    }
  }
  
  public static void main(String[] args) {
    WorkerThread t = new WorkerThread();
    t.start();
    t.value = 2;
    t.finished = true;
  }

}

The class NotVisible demonstrates what can go wrong without proper synchronization. While it seems reasonable to assume that the code above will print 2. Actually, it can loop forever because there is no guarantee that the values set by the main thread will be visible in the “worker” thread. Nor, that they will become visible in the same order. Therefore it can happen that it will finish, but print 1 instead of 2.

“in the absence of synchronization, the Java Memory Model permits the compiler to reorder operations and cache values in registers, and permits CPUs to reorder operations and cache values in processor-specific caches.” [JCiP]

Therefore:

“Attempts to reason about the order in which memory actions ‘must’ happen in insufficiently synchronized multithreaded programs will almost certainly be incorrect.” [JCiP]

In insufficiently synchronized programs reader threads can see out-of-date values. Stale data can cause serious issues and failures like unexpected exceptions, broken computations, corrupted data, or infinite loops.

Intrinsic locking is one of the mechanisms that can guarantee that changes made by one thread will be visible to other threads in a predictable manner.

“Locking is not just about mutual exclusion; it is also about memory visibility. To ensure that all threads see the most up-to-date values of shared mutable variables, the reading and writing threads must synchronize on a common lock.” [JCiP]

Meaning that:

“When thread A executes a synchronized block, and subsequently thread B enters a synchronized block guarded by the same lock, the values of variables that were visible to A prior to releasing the lock are guaranteed to be visible to B upon acquiring the lock.” [JCiP]

There is an alternative way to propagate changes predictably: using volatile variables. When a field is declared volatile the compiler and the JVM is put on notice that this variable is shared and should not be cached nor should be access to it re-ordered with other memory operations. Volatile variables provide visibility but not atomicity. Therefore we cannot use them when a write to the variable depends on its current value.

“Locking can guarantee both visibility and atomicity; volatile variables can only guarantee visibility.” [JCiP]

Publishing an object means to make it available outside of its current scope. Using good OOP practices like encapsulation is not necessary for writing safe concurrent programs. However, it certainly makes easier to reason about correctness. Publishing internal state variables can compromise not just encapsulation but the thread-safety of your application. Other classes or threads can intentionally or carelessly misuse the published state and break your design. Sometimes, publishing is obvious e.g. storing a reference in a public static field. In other cases, it can be more subtle like passing an object to an overrideable – neither private or final – method.

An object is in a predictable, consistent state only after its constructor returns. This is why:

“Do not allow the this reference to escape during construction.” [JCiP]

We can let the this reference escape during construction either explicitly (by passing it) or implicitly by instantiating an inner class of the owning object.

No synchronization is needed if the data is accessed only from a single thread. What an ingenious idea! Actually, it is so important it has its own name: thread-confinement. For example JDBC connection pools use thread confinement to ensure correct program behaviour. The Connection object itself is not thread-safe but the pool won’t dispense the same connection to another thread until it is returned from the owning thread – which acquired the connection in a thread-safe manner from the pool.

ThreadLocal class allows us to maintain a separate copy of a value on a per-thread basis. It provides getter and setter methods to return or set the value for the currently executing thread respectively. When a thread calls get for the first time the initialValue() method is executed to construct the new value.

Stack confinement is a special form of thread confinement. Because local variables only live on the executing thread’s stack. If the data is only reachable through local variables thread-safety is not an issue.

An immutable object is one whose state is cannot be changed after construction.

“Immutable objects are always thread-safe.” [JCiP]

It’s technically possible to have an immutable object without all fields being final. The String class lazily computes the hash code the first time it is actually needed but because this is derived deterministically from an immutable state String is still considered immutable.

So far, we have been focusing how not to share objects among multiple threads. Of course, sometimes we simply have to and then we have t do it safely. If the object we would like to share is not immutable we have to follow the safe publication idioms listed below.

To publish an object safely, both the reference to the object and the object’s state must be made visible to other threads at the same time. A properly constructed object can be safely published by:

  • Initializing an object reference from a static initializer
  • Storing a reference to it into a volatile field or AtomicReference
  • Storing a reference to it into a final field of a properly constructed object
  • Storing a reference to it into a field that is properly guarded by a lock
[JCiP]

The Java Memory Model model offers a special guarantee for immutable objects: they don’t have to be published safely. But, they have to be constructed properly: all fields have to be final and their state must not change. This guarantee does not hold for effectively immutable objects. They have to be published safely but can be later accessed without any further synchronization.

Whoa, this was a lot of information. I’m so glad that I decided to write my own study notes because it helps me a lot to better understand what I’ve read. I also hope that I can help others as well. I recommend to read the book and use my notes to recapitulate 😉

Resource
[JCiP] Java Concurrency in Practice by Brian Goetz, ISBN-10: 0321349601