Algorithms and Data Structures in an Object-Oriented Framework (“ADSOOF”)

More array alteration

Changing the size of arrays

In Java once arrays have been created you cannot change their size. Note this does not mean that a variable which refers to an array of one size cannot be assigned to refer to an array of another size, but you cannot change the size of an actual array object.

For example, suppose we have a variable a of type int[] which refers to the array we can show diagrammatically as:

and we want to change it by adding the number 17 to the end so it refers to the array we can show diagrammatically as:

In more general terms we want an operation that takes an integer and an array of integers and adds the integer to the end of the array. When writing code and thinking about writing code, we should always try and put it in general terms. If you have a particular problem, see if it is an example of a more general problem. Then code to solve the more general problem can be used to solve the specific problem by setting variables to the specific values of the problem.

If the array a refers to the first array above, we might wonder if the statement

a[8]=17;
would do the trick. To generalise this, suppose a refers to the array, and n holds an integer. We know the components of the array are indexed from 0 to one less than the length of the array which is given by a.length-1, so would referring to a[a.length] automatically add an extra component to the end of the array? We could do this by the code fragment:
a[a.length]=n;
and we could make this code fragment into a separate method we could call whenever we needed to add an integer to the end of an array of integers:
public static void add(int[] a,int n)
// THIS IS SILLY CODE!
{
 a[a.length]=n;
}
It would not work, in Java referring to an array cell beyond the size of the array causes a run time error. Technically this means it throws an exception, which in this case is of the type ArrayIndexOutOfBoundsException. Note that even though you can tell from the code (if you knew how Java works) that it could not possibly execute as intended, it is a run-time error, it would not be picked up when the .java file is compiled, only when the .class file is run.

You also cannot get round this by re-assigning a value to a.length. For example

a.length=a.length+1;
a[a.length-1]=n;
This time, this is a compile time error. When you attempt to compile the .java file to produce the .class file you get an error message and the .class file is never produced. The technical reason is that a.length counts as a final variable, and final variables cannot be given new values in assignments.

The reason why referring to a[a.length] is a run-time error not a compile time error is that when Java code is compiled and something of the form a[...] is encountered, the compiler only tests that the ... part is something that is of type int. It doesn't check to see if it is the case that the code is such that the ... cannot possibly evaluate to a value which is greater than or equal to 0 and less than a.length.

To generalise further, as in Java it is not possible to change the size of an array while the code is executing, destructive methods, which work by changing an array, can only be employed if the the desired change is to the contents but not to the size of the array.

A constructive method which has the effect of adding something to the end of an array can easily be written:

public static int[] add(int[] a,int n)
{
 int[] b = new int[a.length+1];
 for(int i=0; i<a.length; i++)
    b[i]=a[i];
 b[a.length]=n;
 return b;
}
With this, a=add(a,n) will have the desired effect of changing a to refer to an array which is like the old a with n added to the end. There is, of course, no requirement that the variable names used in the method call correspond to the variable names used in the method declaration, so equally c=add(c,m), assuming c is of type int[] and m is of type int, will have the effect of changing c to refer to an array which is like the old c with m added to the end. But note calling this constructive method in this way has its effect by creating a whole new array and changing the original array variable to refer to it, it does not actually change the old array object itself.

Don't think that something like:

public static void add(int[] a,int n)
// THIS IS SILLY CODE!
{
 int[] b = new int[a.length+1];
 for(int i=0; i<a.length; i++)
    b[i]=a[i];
 b[a.length]=n;
 a=b;
}
will work so that a call on its own
add(a,n);
will have the effect of changing a to add n to the end. You will find it has no effect at all. You may think “but changes to arrays passed into a method are made permanently, and a is changed inside this method”. Changes to the array referred to by a are carried on outside the method, but if a is changed to refer to another array inside the method, which is what happens here, that has no effect on a outside the method. The reason for this is that a inside the method and a outside the method are separate variables which are aliases to the same array. Sounds tricky, perhaps, so lets's look at it in terms of the sort of diagrams we used before. Here is the situation just after the method call add(a,n), using our first example values:

Now you can see here that since the two a variables are aliases for the one array object, changing a[i] inside the method will cause a change to a[i] outside the method. But if a inside the method is assigned to refer to the array b inside the method refers to, which is the case with the a=b here, that will have no effect on a outside the method. This can be shown by the diagram below which represents the situation when the method above is just about to terminate after having executed a=b.

When the method terminates, its variables a and b (and n) go out of scope, and since no variable then refers to the array which its a and b were referring to, that array is “garbage collected”. The variable a outside the method continues to refer to what it referred to before the method was called. It is not possible to have a method which takes a variable as an argument and causes that variable to refer to a different object. It is only possible for a method to change the content of an object to which a variable it takes as an argument is referring.

Removing an item from an array

Now let's consider the case where we want to remove an item from an array. There are several variations on the general theme of removing an item.

Removing the last item

Suppose we have the array we had before:

If our remove operation works so that it always removes the last item of the array in this case we would get:

We cannot have a destructive method which does this because it involves a change to the size of the array, but we can do it with a constructive method:

public static int[] removeLast(int[] a)
{
 int[] b = new int[a.length-1];
 for(int i=0; i<a.length-1; i++)
    b[i]=a[i];
 return b;
}
As before, we create an array of the size needed, and copy the items in. So with this, a call:
a2 = removeLast(a1);
(assuming a1 and a2 are declared as of type int[]) will cause a2 to refer to an array with the length one less than the one a1 refers to and with the same integers in it except the last one. A call
a1 = removeLast(a1);
will appear to have the effect of removing the last integer from the array referred to by a1, but you should be aware it does that by creating a new array and making a1 refer to it, if another variable was an alias for a1 that variable will continue to refer to the old array with its last item in place.

Note that Java allows arrays of length 0, so the above remove method would work fine if its argument were an array of length 1, it would just return an array of length 0. If it were actually passed a reference to an array of length 0 as its argument, however, there would be a problem - you can't create an array of length -1 (attempting to do so causes an exception of type NegativeArraySizeException to be thrown). A full specification for a method will state any cases where there are valid arguments that would cause the method to throw an exception. How to deal with an array of length 0 is a good example of a case it's easy to forget could be possible. An important thing to be aware of is that an array of length zero is not the same thing as null. An array of length zero is an actual object but if a variable is set to null that means it does not refer to any object. See below for more details.

Removing a particular item

Rather than a method that just removes the last item of an array, as above, we might want one where you specify the item that is to be removed from an array. Suppose the variable a refers to the same example array we used before, and we want remove(a,12) to change the array to:

or in general remove(a,n) to remove from the array referenced by a the integer n. A naive approach to this might be to find where n is stored and replace it by 0. This could be done by the destructive method

public static void remove(int[] a, int n)
// THIS IS SILLY CODE!
{
 int i=0;
 for(; i<a.length(); i++)
    if(a[i]==n)
       break;
 if(i<a.length)
    a[i]=0;
}
The code is similar to the code we looked at previously for finding the position of an integer in an array of integers, but in this case having found the position we set the integer in that position to 0. We have a loop which increments a loop index variable and exits either when it indexes the value we are searching for, or it has gone through the whole array and not found it. Only if it exits if it found the position do we assign 0 to that position, so the effect is that our remove method does nothing if the integer to be removed does not occur in the array - again, what happens in this circumstance is something we should have specified when we started the problem.

This code would result in

for our example, which is not the same as what was required. It is not possible to cause a component of an array to disappear by assigning it any sort of special value, as we noted before once an array is created its size is fixed, you can change what it stores but not how many things it stores.

So we have no choice but to implement our method constructively, since what we want is for it to return an array of length one less than the argument array. But as we saw above, there is the alternative possibility that the integer being removed does not occur in the array. In this case, we need to return a copy of the original array. Why a copy? Because if we execute

b = remove(a,n);
and we have said that it works constructively, we don't want it to result in b unexpectedly being an alias of a in the case where n happens not to occur in the array referenced by a. If this did happen, later we might change the array referred to by b and not realise this would cause a change to the array referred to by a.

So we need a method which returns an array whose size is either one less than the size of the argument array, or the same size, but we don't know the size until we have checked whether the integer to be removed occurs in the array or not. This leads to the following code:

public static int[] remove(int[] a, int n)
// Returns a new array which is the result of removing the 
// lowest indexed occurrence of integer n from array a, moving 
// all subsequent integers down one place in the array.  
// Returns a copy of array a if n does not occur in it.
{
 int i=0;
 for(; i<a.length; i++)
    if(a[i]==n)
       break;
 if(i<a.length)
    {
     // This is the case where i gives the position of integer n in array a
     int[] b = new int[a.length-1];
     for(int j=0; j<i; j++)
        b[j]=a[j];
     for(int j=i+1; j<a.length; j++)
        b[j-1]=a[j];
     return b;
    }
 else
    {
     // This is the case where integer n does not occur in array a
     int[] b = new int[a.length];
     for(i=0; i<a.length; i++)
        b[i]=a[i];
     return b;
    }
}
Here we have given full comments to explain exactly what this method does. The comment at the top of the method is for the benefit of someone who may want to use it. It doesn't say how the method works because someone who just wants to use it isn't interested in that, they would just want to know exactly what it returns. It explains that it removes an integer from an array of integers by moving all following integers one place down. You may think this is obvious, but as we have noted above, it needs to be spelt out to make sure it is quite clear this is exactly what is done.

The comments inside the body of the method are for the benefit of someone who is trying to work out how the code actually provides the desired result. They point out the blocks of code which deal with the two possible alternatives. Note that in the case where n is removed from the array, everything before the position where n was is copied into the same position in the new array, but everything after it is copied into the position indexed by one less, thus accomplishing the “moving down” effect.

Although you may sometimes hear the advice that it's good to write lots of comment in your code, too many comments can make the structure of the code less easy to see. In general you should only put comments internally in code where the structure is fairly complex and it isn't clear just from looking at it what each part is for. See here for some advice on good and bad use of comments. With comments at the start of a method to explain how it is used, it is good practice to use Javadoc format, see below.

The comment describing what the method remove does makes clear that it only removes one occurrence of the integer argument, and that it's the “lowest indexed” one. There can be no doubt about what this means, which is good.

Removing all occurrences of a particular item

An alternative interpretation of the “remove” operation might be to remove all occurrences of the integer argument in the case where it occurs several times. So if the array argument referenced the array

and the integer argument was 6, the method would return a reference to the array:

In this case we could first find the size of the new array to be returned, which is the length of the argument array less the number of times the argument integer occurs in the array. We then create the new array of the appropriate length and copy integers into it from the argument array. This gives the following code:

public static int[] removeAll(int[] a,int n)
// Returns the array resulting from removing all occurrences of
// integer n from array a, keeping the remaining contents of the
// array in the same order.
{
 int count=0;
 for(int i=0; i<a.length; i++)
    if(a[i]==n)
       count++;
 // Here count gives the number of times n occurs in a
 int[] b = new int[a.length-count];
 for(int i=0, j=0; i<a.length; i++)
    if(a[i]!=n)
       {
        b[j]=a[i];
        j++;
       }
 return b;
}
Again, there is a comment at the top describing the method in terms of what it produces, not how it does it, but also a comment inside the method to help understand how it does it. We have called the method removeAll to distinguish it from the method remove given previously. Note how when copying from the argument array (referenced by variable a in the method) to the new array (referenced by variable b in the method) there are two variables used to index the array, i used to index a and j used to index b. This is because the integers do not go into the same position in b as they were in a. The variable j is increased by one only when an integer is copied from a into b so that j gives the position of the next integer to be copied. This means that j is not increased by one when a[i] is equal to n.

Another thing you can note in the code for removeAll is that it gives an example of more than one variable being declared in the initialisation part of a for-loop. The variable j is declared there alongside the variable i, but j does not appear in the update part of the for-loop because it is not changed every time the loop body is executed.

However, the code here is actually not the best way to deal with the requirement to remove some elements from an array. It involves going through the array and testing whether or not each element a[i] is equal to n twice, the first time when finding the length needed for the array that is returned, the second time when actually putting elements into the new array. In this case that is not a serious problem, as testing whether two integers are equal is not a time-consuming operation. In other cases though, where the test of whether an element needs to be removed would take some time to do, it would make a significant difference to the overall time taken if each test occurred twice. So, here is an alternative way of solving the problem where each element is only tested for equality to n once:

public static int[] removeAll(int[] a,int n)
// Returns the array resulting from removing all occurrences of
// integer n from array a, keeping the remaining contents of the
// array in the same order.
{
 int[] b = new int[a.length];
 int count=0;
 for(int i=0; i<a.length; i++)
    if(a[i]!=n)
       b[count++]=a[i];
 // Here count gives the number of times n does not occur in a
  int[] c = new int[count];
  for(int i=0; i<count; i++)
     c[i]=b[i];
  return c;
 }
What happens here is that a new array of the same length as the initial array is created, with variable b referring to it. Then all elements not equal to n are put into the new array. The variable count works like the variable j in the previous code. Once the whole of the array a has been gone through, it will give a count of the number of elements in a which are not equal to n. The array referred to by b will be longer than the length of the array that is to be returned, whose length is then given by count. However, its elements up to the position given by count will be those that need to be in the array that is to be returned in the position they need to be, so an array of that length is then created with c referring to it. The elements up to that position are then copied from the array referred to by b into the array referred to by c without having to be tested again, and the array referred to by c is returned.

Removing the item at a particular position

Another way of interpreting the general idea of a method which removes an integer from an array of integers would be for the argument to the method to be not the integer that is being removed but the position of the integer that is to be removed. So if the array argument to the method was the array as before:

and the integer argument was the number 2 the method would return a reference to the array

where the integer indexed by 2 is removed, and all integers after it are moved down one place. Remember that in Java indexing we start at 0, so that means the third integer in the array is removed, which happens to be the first occurrence of 6. Here is code to do this, and we have called the method removePos to distinguish it from the previous remove methods:

public static int[] removePos(int[] a,int pos)
// Returns the array resulting from removing the integer
// indexed by pos from the array a, moving all subsequent
// integers down one place
{
 int[] b = new int [a.length-1];
 for(int i=0; i<pos; i++)
    b[i]=a[i];
 for(int i=pos+1; i<a.length; i++)
    b[i-1]=a[i];
 return b;
}
When writing code dealing with arrays of integers you should always have it clear in your mind when an integer variable is intended to hold data to go in the array, and when it is intended to hold an index to the array. It helps to use the variable name pos in the method above rather than n as we used for the integer argument previously, as the name pos indicates clearly that it is intended to represent a position. It is a common source of programming errors amongst inexperienced programmers writing code dealing with indexed collections of integers to use an integer variable inconsistently so that data and indexes to data get confused.

Note, the above code for removePos does not cover every possibility. If the method is called with the second argument set to an integer either equal or greater than the length of the array of the first argument, or less than zero, running the code will cause it to halt and throw an ArrayIndexOutOfBoundsException. Either we should accept this, in which case it would be sensible to make a note of it in the comment which explains how it works:

// Throws an ArrayIndexOutOfBoundsException if pos is less than 0 or
// greater than or equal to a.length
or we should modify the code so that it gives us some reasonable behaviour in these cases (maybe return a copy of the original array), and document that in the comment describing the method.

To demonstrate use of the methods discussed here, you can look at the files UseArrays8.java, UseArrays9.java, UseArrays10.java, UseArrays11.java and UseArrays12.java linked to in the code index. Each of these files contains one of the methods given here, plus a main method which exists just to read in some example data and show the result of calling the method which is to be demonstrated.


Footnotes

(1) To see the difference between null and an array of length 0, consider a variable a which is set to refer to an object which is an array of length 0. In that case, if a.length is evaluated it will evaluate to 0, which is the value of the length variable in the object. However, if a is set to null and a.length is evaluated, a NullPointerException is thrown because there is no object that a is referring to, and so there is no variable that can be accessed by using a.length.

It's possible for any variable of an object type to have the value null, and since an array is an object that includes variables of an array type. Variables of primitive types, however, cannot be set to null. When writing a method which has a parameter of an object type, so that includes any method that takes an array as an argument, it is possible the method could be called with the argument passed through that parameter being null. In some cases it may be specified how to handle that situation. If not, you may need to consider how to handle it. However, if there is no special code written to handle it as soon as an attempt is made to use the parameter variable as if it refers to an object, a NullPointerException will be thrown, meaning the method terminates. In many cases that is the best way to handle it, and since NullPointerException is an unchecked exception there is no need to specifically indicate it can be thrown by putting throws NullPointerException in the method header.

(2) Javadoc is a comment processor system provided with that Java language which you can find explained here. If you write your comments in the Javadoc style, this processor will use them to generate nicely laid out pages annotated with HTML tags, which can then be displayed in a browser. The Javadoc processor is used automatically in most IDEs, it results in documentation for your code which looks like the official documentation of Java's APIs.

In Java the // form of commenting means the rest of the line after it is treated as a comment, that is the Java compiler completely ignores it. Another way of introducing a comment is to use the combination /* which means that everything up to the next */ (could be on the same line, could be many lines later) is treated as a comment. If the comment starts with /** it will be treated as a Javadoc comment by the Javadoc processor.

In order to concentrate attention on the actual code, in this module comments in the code have been kept short or not used at all. Providing proper documentation is an essential part of programming in any context where the code is intended to be long lasting and used by others. Javadoc is the standard way of providing such documentation for Java code, and its automated generation of web pages means high quality documentation can be provided for only a small amount of effort.


Matthew Huntbach

Last modified: 22 February 2019