ArrayLists in Java, and introducing Generics

ArrayLists as Arrays

You have seen arrays in Java, where for any type, for example Thing, adding [] to the end of the type creates an array type, so Thing[] is the type "array of Thing". Then, if i is an integer value, and a a variable of type Thing[], we can change the value of the ith component of the array referred to by a using the assignment a[i]=expr where expr evaluates to a value of type Thing and the component is set to that value. Any usage of a[i] in an expression gives the current value of the ith component of the array which a is currently referring to.

ArrayLists work in a similar way. The class ArrayList occurs in the package java.util, so if it is used in a file, there must be an import statement for it. The type ArrayList<Thing> is the type "arrayList of Thing", and in general ArrayList<X> is the type "arrayList of X" where X can be any type except a primitive type. So ArrayList<int> and ArrayList<char> are not allowed, this is where the equivalent wrapper classes need to be used as you can have ArrayList<Integer> and ArrayList<Character>. Then, if a is of type ArrayList<Thing>, the statement a.set(i,expr) is equivalent to the previous a[i]=expr when a referred to an array. The method call a.get(i) is the equivalent of a[i] in an expression. So note that, unlike arrays, there are no special symbols for referring to the components of an arrayList, it is done using ordinary method calls. Note that a.get(i)=expr does not make sense as a.get(i) is a method call, and you cannot assign a value to a method call.

Here is a static method which takes an arrayList of Integers and two integer values and destructively changes all occurrences of the first integer value to the second in the arrayList:

 public static void change(ArrayList<Integer> a,int m,int n)
 {
  for(int i=0; i<a.size(); i++)
    if(a.get(i).equals(m))
       a.set(i,n);
 }

You can compare this to the equivalent method for arrays which we saw previously:

public static void change(int[] a,int m,int n)
{
 for(int i=0; i<a.length; i++)
   if(a[i]==m)
      a[i]=n;
}

In the arrayList example, a.get(i).equals(m) calls get(i) on a to get the value of the ith component of the arrayList referred to by a, then equals(m) is called on that component to see if it stores a value equal to m. The method equals can be called on an object of type Integer, it takes an argument of type Integer and returns true if the argument stores an integer of the same value as its own integer, false otherwise. With the int variable as the argument, what actually happens is that m is "boxed" into the equivalent Integer object. Also in a.set(i,n) the int value n is boxed into the equivalent Integer object. You can see also that the number of components in an arrayList referenced by variable a is given by the method call a.size(), which is the equivalent of a.length for an array.

Here is a method which copies an arrayList of integers:

public static ArrayList<Integer> copy(ArrayList<Integer> a)
{
 ArrayList<Integer> b = makeArrayListInt(a.size());
 for(int i=0; i<a.size(); i++)
    b.set(i,a.get(i));
 return b;
}

This is equivalent to the method for copying arrays:

public static int[] copy(int[] a)
{
 int[] b = new int[a.length];
 for(int i=0; i<a.length; i++)
    b[i]=a[i];
 return b;
}

So you can use ArrayList<Integer> as a type for arguments to methods and for the return type of methods, just as you can use int[] in these ways. Note that in our example b.set(i,a.get(i)) is equivalent to

   {
    int n=a.get(i);
    b.set(i,n);
   }

that is, it gets the ith component of a and sets the ith component of b to it.

In this example, makeArrayListInt(a.size()) is used as the equivalent of new int[a.length], that is, it creates a new arrayList with the same number of components as a but with those components set to 0. However, Java does not provide us with this method, because arrayLists aren't intended to work like this. Here is the code for the static method makeArrayListInt:

public static ArrayList<Integer> makeArrayListInt(int n)
{
 ArrayList<Integer> a = new ArrayList<Integer>();
 for(int i=0; i<n; i++)
    a.add(0);
 return a;
}

This works using some of the additional flexibility arrayLists have over arrays which we describe below.

You can find the code for this and the other examples from this set of notes in the directory ~mmh/DCS128/code/arrayLists. The file UseArrayLists1.java gives a destructive change method, and a copy method as given above, together with a main method whichh enables it to be demonstrated.

ArrayLists as flexible sized arrays

Once an array object has been created, what each individual component refers to can be changed, but the size of the array cannot be changed, as we saw previously. ArrayList objects, however, can have their size changed. If a refers to an object of type ArrayList<Thing>, then the statement a.add(expr) where expr evaluates to a value of type Thing causes an extra component to be added to the end of the arrayList object referred to by a which is set to refer to the object returned by the evaluation of expr. The value returned by a.size() is increased by 1, and the value returned by a.get(a.size()-1) with the new value of a.size() is the value returned by expr. This is a destructive change, so any variable which aliases a will see the same change in the object it refers to as it is the same object.

So in our code for makeArrayListInt above, the statement

ArrayList<Integer> a = new ArrayList<Integer>();

declares a variable of type ArrayList<Integer> called a, and sets it to a new arrayList of integers whose size is initially 0. Then every time a.add(0) is called in the loop, an extra component is added to this arrayList object, every component added referring to an Integer of value 0.

In fact, there was no need for our copy method to create an initial arrayList of the same size as the argument arrayList. It could just have created an initial arrayList of size 0, and added the new components to make the copy:

public static ArrayList<Integer> copy(ArrayList<Integer> a)
{
 ArrayList<Integer> b =  new ArrayList<Integer>();
 for(int i=0; i<a.size(); i++)
    b.add(a.get(i));
 return b;
}

The code for this is found in the file UseArrayLists2.java.

But there is actually no need to write a copy method at all, as the class ArrayList provides a constructor which takes an arrayList as an argument and constructs a new arrayList which is the same length as the arrayList argument and for every i from 0 to one less than the size of the arrayList, the ith component of the new arrayList is an alias of the ith component of the argument arrayList. So, for example

ArrayList<Integer> b = new ArrayList<Integer>(a);

declares variable b of type ArrayList<Integer> and sets it to refer to a new ArrayList<Integer> object which has the same components as the ArrayList<Integer> object referred to be a. It's important to note that the components within the arrayList are not copied, this is referred to as a shallow copy. Since objects of type Integer are immutable, this is not an issue in this case, but it would be if we had an arrayList of mutable objects. Since the base type of an arrayList could be any object type, then it could be a mutable type such as DrinksMachine as covered previously. If a refers to an arrayList of DrinksMachine objects of size 4, then the result of

ArrayList<DrinksMachine> b = new ArrayList<DrinksMachine>(a);

could be illustrated diagrammatically as:

Code which uses a constructor to make a shallow copy of an arrayList and then uses the destructive change method can be found in UseArrayLists3.java. Note that although there is a constructor for the type ArrayList which takes an argument of type int, using it does not create a new arrayList of a particular length, it actually creates an arrayList of length 0 just as the zero-argument constructor does. The int argument is actually to do with the internal structure of an arrayList, which you need to be concerned with at this point. So in this way arrayLists work differently from arrays: with arrays you always create an array of a particular size with its elements intialised to 0 or null, but this cannot be done with arrayLists, instead the usual technique is to create an arrayList of length 0 and expand it by adding new elements.

You can find the complete documentation for the class ArrayList here. As we have previously noted with the official Java documentation, it is complex because it has to cover every aspect of the class, which includes plenty of things you have not covered yet. For instance, there's a large section at the beginning of the documentation noting that the implementation is not "synchronized", and then stating what to do "If multiple threads access an ArrayList instance concurrently". Concurrency and multiple threads is not something we shall cover at all in this course, you will need to become much more advanced Java programmers before this becomes as issue so you can ignore this section, but the problem when you are just starting out and faced with complex documentation like this is that it's hard for you to know what aspects you can safely ignore because they go beyond what you need to know. However, from the documentation you can see that there is a variety of methods provided in class ArrayList which perform useful operations changing both the size and content of the ArrayList object they are called on, so making arrayLists much more flexible than arrays.

The class is named as ArrayList<E> where E stands for whichever base type you use. So the method get, which is the method we described above, is documented as having argument type int and return type E meaning that if get is called on an object of type ArrayList<Integer> it returns an Integer, if it is called on an object of type ArrayList<String> it returns a String, and so on.

We have already seen the single-argument add method which adds its argument to the end of the arrayList object it is called on, but there is also a two-argument add which takes an integer and an object of the base type of the arrayList and adds the object at the position in the arrayList specified by the integer. The object which was at this position and all the objects in higher-indexed positions are moved up one place, and the size of the array increases by one. So this is a different thing from set which also takes an integer representing a position and an object of the base type, but replaces the object at the given position, so this object is no longer in the arrayList, and the size of the arrayList remains the same. Note that set does actually have a return value, given as type E, the base type of the arrayList object it is called on. Where we have used it above, we have done nothing with the value it returns, so used it as if it has return type void. This is fine, mostly this is how set is used. But the value it returns is the object that was replaced, just in case that was something you needed to keep a reference to. There are also two remove methods. One takes an argument of type int and removes the item from the arrayList it was called on at the position given by the int argument, moving everything following it down one place in the arrayList and causing the arrayList's size to be reduced by one.

The class arrayList has two single-argument methods called remove. We saw a method called remove previously with arrays, which was static and constructive, so a call remove(a,n) where a referred to an array of integers and n held an integer returned a reference to a new array which was like the one referenced by a but had the first occurrence of n removed and everything following it moved down one place. The remove methods in arrayList are not static and are destructive, so a.remove(n) changes the arrayList referred to by a to remove an item, all items in the arrayList following it are moved down one place, and the size of the arrayList is reduced by one. However, if n is of type int, the call a.remove(n) removes the item in the component of the arrayList indexed by n, otherwise it removes the first item in the arrayList which is equal to n. So method remove in class ArrayList is like method removePos we saw previously with arrays when it has an argument of type int, and like the previous remove otherwise. This is why we said there are two different single-argument methods called remove in class ArrayList: which one is used depends on the type of the argument. The situation where methods have the same name but can be distinguished by the number and/or the types of their arguments in known as overloading.

You might wonder how a.remove(n) works when a refers to an arrayList of integers. However, remember the base type of an arrayList of integers must be Integer, not int. So if a refers to an arrayList of integers, then n must be of type Integer to have the effect of removing an item equal to n rather than the item at position n. Note that both remove methods in class arrayList have return types. The method with argument of type int which removes the item at the position given by its argument returns a reference to the item that was removed, and throws an IndexOutOfBoundsException if the int argument is a value below 0 or equal to or great than the size of the arrayList it is called on. Otherwise, a call of remove on an arrayList returns true if the item given as an argument can be found and is removed, and it returns false if no item equal to the argument can be found.

The class arrayList also has static methods indexOf and lastIndexOf, which work like the position methods we gave previously, except again they are not static, so a.indexOf(n) gives the index of the first occurrence of an item equal to n in the arrayList referenced by a, while a.lastIndexOf(n) gives the position of the last occurrence; both of these return -1 if no item equal to n can be found in the arrayList.

Methods on ArrayLists

The class arrayList gives us some of the methods we would most likely need to use on arrayLists, but if there is not something there already to do a task we require, we need to write our own method. You have already seen a method which takes an arrayList of integers and two integers, and changes all occurrences of one integer argument to the other. Now here is a method which takes an arrayList of strings and two strings and changes all occurrences of the first string to the second:

 public static void change(ArrayList<String> a,String m,String n)
 {
  for(int i=0; i<a.size(); i++)
    if(a.get(i).equals(m))
       a.set(i,n);
 }

As you can see, it is identical to the method that deals with arrayLists of integers, except for the type String rather than Integer. It seems wasteful that we should have to write a separate method called change identical to other methods every time we want a method to do this destructive change on an arrayList for a different base type. We shall show below how this is unnecessary.

First, however, let us consider another operation, this time we want to take an arrayList and two integers, and insert the second integer after every occurrence of the first. So if the array is the one that can be represented by [2,5,8,3,2,4,2,7] and the numbers are 2 and 10, the result will be the array [2,10,5,8,3,2,10,4,2,10,7], that is, 10 has been added after every 2. With arrayLists, we could do the change either constructively or destructively. Here is a destructive method to perform the operation:

 public static void addAfter(ArrayList<Integer> a,Integer m,Integer n)
 // Adds n after every occurrence of m in a
 {
  for(int i=0; i<a.size(); i++)
    if(a.get(i).equals(m))
       a.add(i+1,n);
 }

Here a call addAfter(a,p,q) will add q after every p in a, and the arrayList referred to by a and by any other variable which aliases it will be changed. We have used the type Integer for its arguments, but it is passed int values they will be converted into Integer values automatically.

Now here is a constructive method performing the same operation:

 public static ArrayList<Integer> addAfter(ArrayList<Integer> a,int m,int n)
 // Adds n after every occurrence of n in a constructively
 {
  ArrayList<Integer> b = new ArrayList<Integer>();
  for(int i=0; i<a.size(); i++)
    {
     b.add(a.get(i));
     if(a.get(i).equals(m))
       b.add(n);
    }
  return b
 }

Here, addAfter(a,p,q) does not change the arrayList referred to by a, but constructs a new one with the desired change and returns it, so a1=addAfter(a,p,q) will leave the arrayList referred to by a unchanged but set a1 (assuming it is declared of type ArrayList<Integer>) to refer to the new one.

There is a problem with the destructive version. Suppose we want to add a 2 after every occurrence of 2, so that starting with [2,5,8,3,2,4,2,7] we change it to [2,2,5,8,3,2,2,4,2,2,7]. If we call the method with the call addAfter(a,p,q) where a refers to the arrayList written [2,5,8,3,2,4,2,7] and p and q are set to 2. You will find the code runs for much longer than you'd expect and terminates with an error message. The reason for this is that our code does not take into account the case where both integer arguments are the same value. In our example, having added a 2 after the first 2, it looks at the next integer and finding it is a 2 adds another 2 after it. Then it does the same and in fact keeps on adding 2 and expanding the size of the arrayList until it exceeds the maximum size the system can handle. The error message you will get here is:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

If you get an error message like this, unless you are doing something which you know for sure involves creating a huge arrayList or some other collection type, you should conclude your code somehow gets into an infinite loop which just carries on taking more and more store until it reaches the system limit. Always look for that infinite loop rather than ask "how can I increase the amount of store the system allows me to use?". The other thing to learn here is always check special cases, and the case where two arguments are the same where normally they'd be expected to be different is a good example. It's best that your code takes into account every possible special case, don't just assume "Oh, that will never really happen, so I don't need to bother adjusting the code to deal with it". Here is code for the destructive addAfter method which deals properly with the two integer arguments being the same and gives the result you would expect:

 public static void addAfter(ArrayList<Integer> a,int m,int n)
 // Adds n after every occurrence of m in a
 {
  for(int i=0; i<a.size(); i++)
    if(a.get(i).equals(m))
       {
        a.add(i+1,n);
        i++;
       }
 }

You can see that when a new integer is added to the arrayList, the variable i is increased by 1 in addition to its normal increase by 1 in the update part of the for loop. This is a good illustration of the way in which for loops do not always execute their body a fixed number of times, it can vary if the body contains code which can alter the value of the variables used in the test and update part of the for loop.

There is an issue we need to consider in our first version of a constructive method for the operation addAfter. In this case, it is simply a matter of good coding practice, it is not something that would cause it to give the wrong or an unexpected result. The code we gave calls a.get(i) twice in circumstances where it is always going to return the same value. If you find your code calls a method twice with the same arguments and you know nothing will change by calling the method and nothing in between the two calls will change the result returned, it's better to call it once, record that value using a variable, and then refer to that variable each time you want the result of the method call. This gives the following revised code for constructive addAfter:

 public static ArrayList<Integer> addAfter(ArrayList<Integer> a,Integer m,Integer n)
 // Adds n after every occurrence of m in a constructively.
 {
  ArrayList<Integer> b = new ArrayList<Integer>();
  for(int i=0; i<a.size(); i++)
    {
     Integer k=a.get(i);
     b.add(k);
     if(k.equals(m))
       b.add(n);
    }
  return b;
 }

So here the variable of type Integer called k holds the value of a.get(i). The reason for doing this is that if it took a significant amount of calculation to find the value of a.get(i), we would then avoid doing this calculation twice.

The file UseArrayLists4.java shows the first destructive version of the addAfter method, and the file UseArrayLists5.java shows the amended version. The file UseArrayLists6.java shows the first constructive version of the addAfter method, and the file UseArrayLists7.java shows the amended version.

Generic methods

Finally, here's how to get round the problem of having to write a separate method for each base type. Here's the amended version of constructive addAfter for dealing with arrayLists of strings rather than arrayLists of integers:

 public static ArrayList<String> addAfter(ArrayList<String> a,String m,String n)
 // Adds n after every occurrence of m in a constructively.
 {
  ArrayList<String> b = new ArrayList<String>();
  for(int i=0; i<a.size(); i++)
    {
     String k=a.get(i);
     b.add(k);
     if(k.equals(m))
       b.add(n);
    }
  return b;
 }

You can find this in the file UseArrayLists8.java. As you can see, the only way it differs is the use of String rather than Integer. The version of Java introduced in 2004, called Java 5, introduced the idea of type variables to deal with situations like this. When a method is written, any type variables it has come in a list before the return type, starting with the symbol < and ending with the symbol >. Here is the generic version of constructive addAfter:

 public static <T> ArrayList<T> addAfter(ArrayList<T> a,T m,T n)
 // Adds n after every occurrence of m in a constructively.
 {
  ArrayList<T> b = new ArrayList<T>();
  for(int i=0; i<a.size(); i++)
    {
     T k=a.get(i);
     b.add(k);
     if(k.equals(m))
       b.add(n);
    }
  return b;
 }

The type variable here is T. When the method is called, Java works out what the type variable is from the arguments so calling b=addAfter(a,p,q) will set T to Integer if a and b are of type ArrayList<Integer> and p and q are of type Integer. If a and b are of type ArrayList<String> and p and q are of type String, it will set T to String. The generic version of the addAfter method, together with code that shows it used for both an arrayList of integers and an arrayList of strings can be found in the file UseArrayLists10.java.

In the code for the generic version of addAfter you can see that the type variable may be used in the arguments to the method, the return type, and also to declare local variables, either as a type on its own or as the base type for ArrayList. However, we can assume nothing about objects whose type is given by the type variable. We can call the method equals on them, as we do on the example where we have k.equals(m)) where k and m are declared as of type T where T is a type variable. This is because the method equals can be called on objects of any type. We could not call a method on these variables of a sort which can only be called on objects of certain types. Later we shall see how type variables may be specified in a way which enables a wider range of methods to be called on them.

Going back to methods to take an arrayList and two values, and change all occurrences of one value to the other in the arrayList, here is a destructive generic version:

public static <T> void destChange(ArrayList<T> a,T m,T n)
{
 for(int i=0; i<a.size(); i++)
   if(a.get(i).equals(m))
      a.set(i,n);
}

This goes through the arrayList, and if the component of it indexed by i is equal to the argument m changes it to n by calling a.set(i,n). Here is a constructive version:

public static <T> ArrayList<T> constChange(ArrayList<T> a,T m,T n)
{
 ArrayList<T> b = new ArrayList<T>();
 for(int i=0; i<a.size(); i++)
   {
    T k = a.get(i);
    if(k.equals(m))
       b.add(n);
    else
       b.add(k);
   }
 return b;
}

This works by starting with variable b set to refer to an arrayList of size 0, then going through the components of a one at a time in order, adding them to the arrayList referred to by b, except in the case where the component is equal to the argument given by m in which case the value given by the argument n is added to the new arrayList. When this process is completed, the new arrayList is returned as the result of the method.

These methods can be found in a file called ArrayListOps.java. If a static method is declared as public in one class file, a call to it may be made in another class file by attaching the call to the name of the class the method is in. Examples of this are shown in the file UseArrayList11.java. For example, ArrayListOps.destReverse(a) calls a destructive reverse method, which takes a reference to an arrayList object and changes the object so that its components are placed in the reverse of their original order. The call ArrayListOps.destReverse(a) causes this to be done to the arrayList object referenced by the variable a.

Matthew Huntbach

Last modified: 7 February 2006