Object Reuse

As we saw in the last section, objects are expensive to create. Where it is reasonable to reuse the same object, you should do so. You need to be aware of when not to call new. One fairly obvious situation is when you have already used an object and can discard it before you are about to create another object of the same class. You should look at the object and consider whether it is possible to reset the fields and then reuse the object, rather than throw it away and create another. This can be particularly important for objects that are constantly used and discarded: for example, in graphics processing, objects such as Rectangle s, Points, Colors, and Fonts are used and discarded all the time. Recycling these types of objects can certainly improve performance.

Recycling can also apply to the internal elements of structures. For example, a linked list has nodes added to it as it grows, and as it shrinks, the nodes are discarded. Holding on to the discarded nodes is an obvious way to recycle these objects and reduce the cost of object creation.

Pool Management

Most container objects (e.g., Vectors, Hashtables) can be reused rather than created and thrown away. Of course, while you are not using the retained objects, you are holding on to more memory than if you simply discarded those objects, and this reduces the memory available to create other objects. You need to balance the need to have some free memory available against the need to improve performance by reusing objects. But generally, the space taken by retaining objects for later reuse is significant only for very large collections, and you should certainly know which ones these are in your application.

Note that when recycling container objects, you need to dereference all the elements previously in the container so that you don’t prevent them from being garbage collected. Because there is this extra overhead in recycling, it may not always be worth recycling containers. As usual for tuning, this technique is best applied to ameliorate an object-creation bottleneck that has already been identified.

A good strategy for reusing container objects is to use your own container classes, possibly wrapping other containers. This gives you a high degree of control over each collection object, and you can design these specifically for reuse. You can still use a pool manager to manage your requirements, even without reuse-designed classes. Reusing classes requires extra work when you’ve finished with a collection object, but the effort is worth it when reuse is possible. The code fragment here shows how you could use a vector pool manager:

//An instance of the vector pool manager.
public static VectorPoolManager vectorPoolManager =
    new VectorPoolManager(25);

...

public void someMethod( )
{
  //Get a new Vector. We only use the vector to do some stuff
  //within this method, and then we dump the vector (i.e. it
  //is not returned or assigned to a state variable)
  //so this is a perfect candidate for reusing Vectors.
  //Use a factory method instead of 'new Vector( )'
  Vector v = vectorPoolManager.getVector( );

  ... //do vector manipulation stuff

  //and the extra work is that we have to explicitly tell the
  //pool manager that we have finished with the vector
  vectorPoolManager.returnVector(v);
}

Note that nothing stops the application from retaining a handle on a vector after it has been returned to the pool, and obviously that could lead to a classic “inadvertent reuse of memory” bug. You need to ensure that handles to vectors are not held anywhere: these Vector s should be used only internally within an application, not externally in third-party classes where a handle may be retained. The following class manages a pool of Vectors:

package tuning.reuse;

import java.util.Vector;

public class VectorPoolManager
{

  Vector[] pool;
  boolean[] inUse;
  public VectorPoolManager(int initialPoolSize)
  {
    pool = new Vector[initialPoolSize];
    inUse = new boolean[initialPoolSize];
    for (int i = pool.length-1; i>=0; i--)
    {
      pool[i] = new Vector( );
      inUse[i] = false;
    }
  }

  public synchronized Vector getVector( )
  {
    for (int i = inUse.length-1; i >= 0; i--)
      if (!inUse[i])
      {
        inUse[i] = true;
        return pool[i];
      }

    //If we got here, then all the Vectors are in use. We will
    //increase the number in our pool by 10 (arbitrary value for
    //illustration purposes).
    boolean[] old_inUse = inUse;
    inUse = new boolean[old_inUse.length+10];
    System.arraycopy(old_inUse, 0, inUse, 0, old_inUse.length);

    Vector[] old_pool = pool;
    pool = new Vector[old_pool.length+10];
    System.arraycopy(old_pool, 0, pool, 0, old_pool.length);

    for (int i = old_pool.length; i < pool.length; i++)
    {
      pool[i] = new Vector( );
      inUse[i] = false;
    }

    //and allocate the last Vector
    inUse[pool.length-1] = true;
    return pool[pool.length-1];
  }

  public synchronized void returnVector(Vector v)
  {
    for (int i = inUse.length-1; i >= 0; i--)
      if (pool[i] == v)
      {
        inUse[i] = false;
        //Can use clear( ) for java.util.Collection objects
        //Note that setSize( ) nulls out all elements
        v.setSize(0);
        return;
      }
    throw new RuntimeException("Vector was not obtained from the pool: " + v);
  }
}

Because you reset the Vector size to 0 when it is returned to the pool, all objects previously referenced from the vector are no longer referenced (the Vector.setSize( ) method nulls out all internal index entries beyond the new size to ensure no reference is retained). However, at the same time, you do not return any memory allocated to the Vector itself, because the Vector’s current capacity is retained. A lazily initialized version of this class simply starts with zero items in the pool and sets the pool to grow by one or more each time.

(Many JDK collection classes, including java.util.Vector, have both a size and a capacity. The capacity is the number of elements the collection can hold before that collection needs to resize its internal memory to be larger. The size is the number of externally accessible elements the collection is actually holding. The capacity is always greater than or equal to the size. By holding spare capacity, elements can be added to collections without having to continually resize the underlying memory. This makes element addition faster and more efficient.)

ThreadLocals

The previous example of a pool manager can be used by multiple threads in a multithreaded application, although the getVector( ) and returnVector() methods first need to be defined as synchronized . This may be all you need to ensure that you reuse a set of objects in a multithreaded application. Sometimes though, there are objects you need to use in a more complicated way. It may be that the objects are used in such a way that you can guarantee you need only one object per thread, but any one thread must consistently use the same object. Singletons (see Section 4.2.4) that maintain some state information are a prime example of this sort of object.

In this case, you might want to use a ThreadLocal object. ThreadLocals have accessors that return an object local to the current thread. ThreadLocal use is best illustrated using an example; this one produces:

[This is thread 0, This is thread 0, This is thread 0]
[This is thread 1, This is thread 1, This is thread 1]
[This is thread 2, This is thread 2, This is thread 2]
[This is thread 3, This is thread 3, This is thread 3]
[This is thread 4, This is thread 4, This is thread 4]

Each thread uses the same access method to obtain a vector to add some elements. The vector obtained by each thread is always the same vector for that thread: the ThreadLocal object always returns the thread-specific vector. As the following code shows, each vector has the same string added to it repeatedly, showing that it is always obtaining the same thread-specific vector from the vector access method. (Note that ThreadLocals are only available from Java 2, but it is easy to create the equivalent functionality using a Hashtable: see the getVectorPriorToJDK12() method.)

package tuning.reuse;

import java.util.*;

public class ThreadedAccess
  implements Runnable
{
  static int ThreadCount = 0;

  public void run( )
  {
    //simple test just accesses the thread local vector, adds the
    //thread specific string to it, and sleeps for two seconds before
    //again accessing the thread local and printing out the value.
    String s = "This is thread " + ThreadCount++;
    Vector v = getVector( );
    v.addElement(s);
    v = getVector( );
    v.addElement(s);
    try{Thread.sleep(2000);}catch(Exception e){}
    v = getVector( );
    v.addElement(s);
    System.out.println(v);
  }

  public static void main(String[] args)
  {
    try
    {
      //Four threads to see the multithreaded nature at work
      for (int i = 0; i < 5; i++)
      {
        (new Thread(new ThreadedAccess( ))).start( );
        try{Thread.sleep(200);}catch(Exception e){}
      }
    }
    catch(Exception e){e.printStackTrace( );}
  }

  private static ThreadLocal vectors = new ThreadLocal( );
  public static Vector getVector( )
  {
     //Lazily initialized version. Get the thread local object
     Vector v = (Vector) vectors.get( );
     if (v == null)
     {
       //First time. So create a vector and set the ThreadLocal
       v = new Vector( );
       vectors.set(v);
     }
     return v;
  }

  private static Hashtable hvectors = new Hashtable( );
  /* This method is equivalent to the getVector( ) method, 
   * but works prior to JDK 1.2 (as well as after).
   */
  public static Vector getVectorPriorToJDK12( )
  {
     //Lazily initialized version. Get the thread local object
     Vector v = (Vector) hvectors.get(Thread.currentThread( ));
     if (v == null)
     {
       //First time. So create a vector and set the thread local
       v = new Vector( );
       hvectors.put(Thread.currentThread( ), v);
     }
     return v;
  }
}

Reusable Parameters

Reuse also applies when a constant object is returned for information. For example, the preferredSize( ) of a customized widget returns a Dimension object that is normally one particular dimension. But to ensure that the stored unchanging Dimension value does not get altered, you need to return a copy of the stored Dimension. Otherwise, the calling method accesses the original Dimension object and can change the Dimension values, thus affecting the original Dimension object itself.

Java provides a final modifier to fields that allows you to provide fixed values for the Dimension fields. Unfortunately, you cannot redefine an already existing class, so Dimension cannot be redefined to have final fields. The best solution in this case is that a separate class, FixedDimension, be defined with final fields (this cannot be a subclass of Dimension, as the fields can’t be redefined in the subclass). This extra class allows methods to return the same FixedDimension object if applicable, or a new FixedDimension is returned (as happens with Dimension) if the method requires different values to be returned for different states. Of course, it is too late now for java.awt to be changed in this way, but the principle remains.

Note that making a field final does not make an object unchangeable. It only disallows changes to the field:

public class FixedDimension {
  final int height;
  final int width;
  ...
}

//Both the following fields are defined as final
public static final Dimension dim = new Dimension(3,4);
public static final FixedDimension fixedDim = new FixedDimension(3,4);

dim.width = 5;           //reassignment allowed
dim = new Dimension(3,5);//reassignment disallowed
fixedDim.width = 5;      //reassignment disallowed
fixedDim = new FixedDimension(3,5); //reassignment disallowed

An alternative to defining preferredSize() to return a fixed object is to provide a method that accepts an object whose values will be set, e.g., preferredSize(Dimension). The caller can then pass in a Dimension object, which would have its values filled in by the preferredSize(Dimension)method. The calling method can then access the values in the Dimension object. This same Dimension object can be reused for multiple components. This design pattern is beginning to be used extensively within the JDK. Many methods developed with JDK 1.2 and onward accept a parameter that is filled in, rather than returning a copy of the master value of some object. If necessary, backward compatibility can be retained by adding this method as extra, rather than replacing an existing method:

public static final Dimension someSize = new Dimension(10,5);
//original definition returns a new Dimension.
public Dimension someSize( ) {
  Dimension dim = new Dimension(0,0);
  someSize(dim);
  return dim;
}
//New method which fills in the Dimension details in a passed parameter.
public void someSize(Dimension dim) {
  dim.width = someSize.width;
  dim.width = someSize.height;
}

Canonicalizing Objects

Wherever possible, you should replace multiple objects with a single object (or just a few). For example, if you need only one VectorPoolManager object, it makes sense to provide a static variable somewhere that holds this. You can even enforce this by making the constructor private and holding the singleton in the class itself; e.g., change the definition of VectorPoolManager to:

public class VectorPoolManager
{
  public static final VectorPoolManager SINGLETON =
    new VectorPoolManager(10);
  Vector[] pool;
  boolean[] inUse;

  //Make the constructor private to enforce that
  //no other objects can be created.
  private VectorPoolManager(int initialPoolSize)
  {
  ...
}

An alternative implementation is to make everything static (all methods and both the instance variables in the VectorPoolManager class). This also ensures that only one pool manager can be used. My preference is to have a SINGLETON object for design reasons.^[29]

This activity of replacing multiple copies of an object with just a few objects is often referred to as canonicalizing objects. The Boolean s provide an existing example of objects that should have been canonicalized in the JDK. They were not, and no longer can be without breaking backward compatibility. For Booleans, only two objects need to exist, but by allowing a new Boolean object to be created (by providing public constructors), you lose canonicalization. The JDK should have enforced the existence of only two objects by keeping the constructors private. Note that canonical objects have another advantage in addition to reducing the number of objects created: they also allow comparison by identity . For example:

Boolean t1 = new Boolean(true);
System.out.println(t1==Boolean.TRUE);
System.out.println(t1.equals(Boolean.TRUE));

produces the output:

false
true

If Booleans had been canonicalized, all Boolean comparisons could be done by identity: comparison by identity is always faster than comparison by equality, because identity comparisons are simply pointer comparisons.^[30]

You are probably better off not canonicalizing all objects that could be canonicalized. For example, the Integer class can (theoretically) have its instances canonicalized, but you need a map of some sort, and it is more efficient to allow multiple instances, rather than to manage a potential pool of four billion objects. However, the situation is different for particular applications. If you use just a few Integer objects in some defined way, you may find you are repeatedly creating the Integer objects with values 1, 2, 3, etc., and also have to access the integerValue( ) to compare them. In this case, you can canonicalize a few integer objects, improving performance in several ways: eliminating the extra Integer creations and the garbage collections of these objects when they are discarded, and allowing comparison by identity. For example:

public class IntegerManager
{
  public static final Integer ZERO = new Integer(0);
  public static final Integer ONE = new Integer(1);
  public static final Integer TWO = new Integer(2);
  public static final Integer THREE = new Integer(3);
  public static final Integer FOUR = new Integer(4);
  public static final Integer FIVE = new Integer(5);
  public static final Integer SIX = new Integer(6);
  public static final Integer SEVEN = new Integer(7);
  public static final Integer EIGHT = new Integer(8);
  public static final Integer NINE = new Integer(9);
  public static final Integer TEN = new Integer(10);
}

public class SomeClass
{
  public void doSomething(Integer i)
  {
    //Assume that we are passed a canonicalized Integer
    if (i == IntegerManager.ONE)
     xxx( );
   else if(i == IntegerManager.FIVE)
     yyy( );
   else ...
  }
  ...
}

There are various other frequently used objects throughout an application that should be canonicalized. A few that spring to mind are the empty string, empty arrays of various types, and some dates.

String canonicalization

There can be some confusion about whether Strings are already canonicalized. There is no guarantee that they are, although the compiler can canonicalize Strings that are equal and are compiled in the same pass. The String.intern() method canonicalizes strings in an internal table. This is supposed to be, and usually is, the same table used by strings canonicalized at compile time, but in some earlier JDK versions (e.g., 1.0), it was not the same table. In any case, there is no particular reason to use the internal string table to canonicalize your strings unless you want to compare Strings by identity (see Section 5.5). Using your own table gives you more control and allows you to inspect the table when necessary. To see the difference between identity and equality comparisons for Strings, including the difference that String.intern( ) makes, you can run the following class:

public class Test
{
  public static void main(String[] args)
  {
    System.out.println(args[0]); 	//see that we have the empty string

    //should be true
    System.out.println(args[0].equals(""));

    //should be false since they are not identical objects
    System.out.println(args[0] == "");

    //should be true unless there are two internal string tables
    System.out.println(args[0].intern( ) == ""); 
  }
}

This Test class, when run with the command line:

java Test ""

gives the output:

true
false
true

Changeable objects

Canonicalizing objects is best for read-only objects and can be troublesome for objects that change. If you canonicalize a changeable object and then change its state, then all objects that have a reference to the canonicalized object are still pointing to that object, but with the object’s new state. For example, suppose you canonicalize a special Date value. If that object has its date value changed, all objects pointing to that Date object now see a different date value. This result may be desired, but more often it is a bug.

If you want to canonicalize changeable objects, one technique to make it slightly safer is to wrap the object with another one, or use your own (sub)class.^[31] Then all accesses and updates are controlled by you. If the object is not supposed to be changed, you can throw an exception on any update method. Alternatively, if you want some objects to be canonicalized but with copy-on-write behavior, you can allow the updater to return a noncanonicalized copy of the canonical object.

Note that it makes no sense to build a table of millions or even thousands of strings (or other objects) if the time taken to test for, access, and update objects in the table is longer than the time you are saving canonicalizing them.

Weak references

One technique for maintaining collections of objects that can grow too large is the use of WeakReference s (from the java.lang.ref package in Java 2). If you need to maintain one or more pools of objects with a large number of objects being held, you may start coming up against memory limits of the VM. In this case, you should consider using WeakReference objects to hold on to your pool elements. Objects referred to by WeakReferences can be automatically garbage-collected if memory gets low enough (see Reference Objects).

Reference Objects

In many ways, you can think of Reference objects as normal objects that have a private Object instance variable. You can access the private object (termed the referent) using the Reference.get( ) method. However, Reference objects differ from normal objects in one hugely important way. The garbage collector may be allowed to clear Reference objects when it decides space is low enough. Clearing the Reference object sets the referent to null. For example, say you assign an object to a Reference. Later you test to see if the referent is null. It could be null if, between the assignment and the test, the garbage collector kicked in and decided to reclaim space:

Reference ref = new WeakReference(someObject);
//ref.get( ) is someObject at the moment
//Now do something that creates lots of objects, making
//the garbage collector try to find more memory space
doSomething( );

//now test if ref is null
if (ref.get( ) == null)
  System.out.println("The garbage collector deleted my ref");
else
  System.out.println("ref object is still here");

Note that the referent can be garbage-collected at any time, as long as there are no other strong references referring to it. (In the example, ref.get( ) can become null only if there are no other non-Reference objects referring to someObject.)

The advantage of References is that you can use them to hang on to objects that you want to reuse but are not needed immediately. If memory space gets too low, those objects not currently being used are automatically reclaimed by the garbage collector. This means that you subsequently need to create objects instead of reusing them, but that is preferable to having the program crash from lack of memory. (To delete the reference object itself when the referent is nulled, you need to create the reference with a ReferenceQueue instance. When the reference object is cleared, it is added to the ReferenceQueue instance and can then be processed by the application, e.g., explicitly deleted from a hash table in which it may be a key.)

There are three Reference types in Java 2. WeakReferences and SoftReferences differ essentially in the order in whcih the garbage collector clears them. Basically, the garbage collector does not clear WeakReference objects until all SoftReferences have been cleared. PhantomReferences (not addressed here) are not cleared automatically by the garbage collector and are intended for use in a different way.

The concept behind this differentiation is that SoftReferences are intended to be used for caches that may need to have memory automatically freed, and WeakReferences are intended for canonical tables that may need to have memory automatically freed.

The rationale is that caches normally take up more space and are the first to be reclaimed when memory gets low. Canonical tables are normally smaller, and developers prefer them not to be garbage-collected unless memory gets really low. This differentiation between the two reference types allows cache memory to be freed up first if memory gets low; only when there is no more cache memory to be freed does the garbage collector start looking at canonical table memory.

Java 2 comes with a java.util.WeakHashMap class that implements a hash table with keys held by weak references.

A WeakReference normally maintains references to elements in a table of canonicalized objects. If memory gets low, any of the objects referred to by the table and not referred to anywhere else in the application (except by other weak references) are garbage-collected . This does not affect the canonicalization because only those objects not referenced anywhere else are removed. The canonical object can be re-created when required, and this new instance is now the new canonical object: remember that no other references to the object exist, or the original could not have been garbage-collected.

For example, a table of canonical Integer objects can be maintained using WeakReferences. This example is not particularly useful: unlike the earlier example, in which Integer objects from 1 to 10 can be referenced directly with no overhead, thus providing a definite speedup for tests, the next example has overheads that would probably swamp any benefits of having canonical Integers. I present it only as a clear and simple example to illustrate the use of WeakReferences.

The example has two iterations: one sets an array of canonical Integer objects up to a value set by the command-line argument; a second loops through to access the first 10 canonical Integers. If the first loop is large enough (or the VM memory is constrained low enough), the garbage collector kicks in and starts reclaiming some of the Integer objects that are all being held by WeakReferences. The second loop then reaccesses the first 10 Integer objects. Earlier, I explicitly held on to five of these Integer objects (integers 3 to 7 inclusive) in variables so that they could not be garbage-collected, and so that the second loop would reset only the five reclaimed Integers. When running this test with the VM constrained to 4 MB:

java -Xmx4M  tuning.reuse.Test 100000

you get the following output:

Resetting integer 0
Resetting integer 1
Resetting integer 2
Resetting integer 8
Resetting integer 9

The example is defined here. Note the overheads. Even if the reference has not been garbage-collected, you have to access the underlying object and cast it to the desired type:

package tuning.reuse;

import java.util.*;
import java.lang.ref.*;

public class Test
{
  public static void main(String[] args)
  {
    try
    {
      Integer ic = null;
      int REPEAT = args.length > 0 ? Integer.parseInt(args[0]) : 10000000;

      //Hang on to the Integer objects from 3 to 7
      //so that they cannot be garbage collected
      Integer i3 = getCanonicalInteger(3);
      Integer i4 = getCanonicalInteger(4);
      Integer i5 = getCanonicalInteger(5);
      Integer i6 = getCanonicalInteger(6);
      Integer i7 = getCanonicalInteger(7);

      //Loop through getting canonical integers until there is not
      //enough space, and the garbage collector reclaims some.
      for (int i = 0; i < REPEAT; i++)
        ic = getCanonicalInteger(i);

      //Now just re-access the first 10 integers (0 to 9) and
      //the 0, 1, 2, 8, and 9 integers will need to be reset in
      //the access method since they will have been reclaimed
      for (int i = 0; i < 10; i++)
        ic = getCanonicalInteger(i);
      System.out.println(ic);
    }
    catch(Exception e){e.printStackTrace( );}
  }

  private static Vector canonicalIntegers = new Vector( );
  public static Integer getCanonicalInteger(int i)
  {
    //First make sure our collection is big enough
    if (i >= canonicalIntegers.size( ))
      canonicalIntegers.setSize(i+1);

    //Now access the canonical value.
    //This element contains null if the the value has never been set
    //or a weak reference that may have been garbage collected
    WeakReference ref = (WeakReference) canonicalIntegers.elementAt(i);
    Integer canonical_i;

    if (ref == null)
    {
      //never been set, so create and set it now
      canonical_i = new Integer(i);
      canonicalIntegers.setElementAt(new WeakReference(canonical_i), i);
    }
    else if( (canonical_i = (Integer) ref.get( )) == null)
    {
      //has been set, but was garbage collected, so recreate and set it now
      //Include a print to see that we are resetting the Integer
      System.out.println("Resetting integer " + i);
      canonical_i = new Integer(i);
      canonicalIntegers.setElementAt(new WeakReference(canonical_i), i);
    }
    //else clause not needed, since the alternative is that the weak ref was
    //present and not garbage collected, so we now have our canonical integer
    return canonical_i;
  }

}

Enumerating constants

Another canonicalization technique often used is replacing constant objects with integers. For example, rather than use the strings “female” and “male”, you should use a constant defined in an interface:

public interface GENDER
{
  public static final int FEMALE=1;
  public static final int MALE=2;
}

Used consistently, this enumeration can provide both speed and memory advantages. The enumeration requires less memory than the equivalent strings and makes network transfers faster. Comparisons are faster too, as the identity comparison can be used instead of the equality comparison. For example, you can use:

this.gender == FEMALE;

instead of:

this.gender.equals("female");

^[29]The VectorPoolManager is really an object with behavior and state. It is not just a related group of functions (which is what class static methods are equivalent to). My colleague Kirk Pepperdine insists that this choice is more than just a preference. He states that holding on to an object as opposed to using statics provides more flexibility should you need to alter the use of the VectorPoolManager or provide multiple pools. I agree.

^[30]Deserializing Booleans would have required special handling to return the canonical Boolean. All canonicalized objects similarly require special handling to manage serialization. Java serialization supports the ability, when deserializing, to return specific objects in place of the object that is normally created by the default deserialization mechanism.

^[31]Beware that using a subclass may break the superclass semantics.

Get Java Performance Tuning now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.

Start your free trial