Sorting Arrays and ArrayLists: Part 1

Constructive Insertion sort

We have previously looked at sorting algorithms using linked lists. Now let us consider sorting algorithms using arrays and arrayLists. The first sorting algorithm we considered was insertion sort. The idea of this is that we start off with a new collection which is empty at the start, and we go through the items of the collection which is to be sorted, inserting each one of them into this new collection. We need to insert them in a way which keeps the new collection in sorted order.

This is easily done with arrayLists, since an arrayList can change in size. You will remember that the class ArrayList defines two methods called add. One adds the argument it takes to the end of the arrayList it is called on, but the other specifies a position it is to go in. If we assume an arrayList is sorted, and we want to insert a new item and keep it sorted, we can go through the items in the arrayList one at a time until we find the first one which is not less than the one being inserted. Then we insert the new item at that position, which will cause the item already at that position and all the ones following it to be moved up one place. This gives the following code for insertion sort on an arrayList of integers:

 public static ArrayList<Integer> sort(ArrayList<Integer> a)
 {
  ArrayList<Integer> b = new ArrayList<Integer>();
  for(int i=0; i<a.size(); i++)
     insert(a.get(i),b);
  return b;
 }

 private static void insert(Integer n,ArrayList<Integer> a)
 {
  int i;
  for(i=0; i<a.size()&&a.get(i).compareTo(n)<0; i++) {}
  a.add(i,n);
 }

Remember that the base type for ArrayList (as for any generic type) must be an object type, so we use Integer rather than int. We use the compareTo method to compare Integer values. So if m and n are expressions of type Integer (maybe variables, but maybe method calls whose return type is Integer) then m.compareTo(n) gives a negative value of type int if m represents an integer value less than that represented by n, a positive int value if m represents a value greater than n, and 0 if m and n represent the same integer value.

Note that the loop in the method insert has to have a conjunction of two tests. It has to check that the integer indexed by the loop variable i is less than the integer n to be inserted, which is done by a.get(i).compareTo(n)<0. It also has to check that the loop index is less than the size of the arrayList. If the loop index is equal to the size of the arrayList, it means we have gone through the whole arrayList and even the last item is less than the item to be inserted, so we insert the new item at the end. In that case, i<a.size() is false and the test a.get(i).compareTo(n)<0 is never tried because of the "short circuit" effect of the && operator. That is why the i<a.size() test has to come first, because if i has reached the value of a.size() then calling a.get(i) will throw an IndexOutOfBoundsException. When writing code you always need to consider if there are any special cases which need careful dealing with, such as this case where inserting an item in an ordered happens to mean inserting it after all other items in the collection because all other items are smaller.

There are three files in the directory ~mmh/DCS128/code/sort which have exactly the same code for the sorting algorithm, but which differ in the support code to demonstrate it. In InsertSort1.java, the user is prompted to type in numbers to be sorted. In InsertSort2.java, the user is only asked for the number of numbers and the highest possible number. Then random numbers are generated within this range, but the numbers are printed before and after sorting. In InsertSort3.java, the numbers are generated randomly as before, but they are not printed; they are still sorted but only the time taken to sort them is printed. This enables you to test the code with your own numbers, then test it with so many numbers that it would be tedious to type that many yourself but they are still displayed so you can see the code works, then to test it with so many numbers that it would be impractical to display them.

The issue of efficiency is something we will deal with later. In practice, computer programs often have to deal with large amounts of data, much more than you could type by hand or display on one screen. Differences of efficiency between different algorithms become much more apparant when they are run with large amounts of data. So it's important to test code on large amounts of data.

Note that while testing code by running it with data and observing the results is an important part of software development, we can also analyse algorithms mathematically to prove they work correctly and to find out their efficiency. Although the emphasis in this course is on programming algorithms, the scientific study of algorithms is much more concerned with mathematical analysis of them then on coding issues.

Insertion sort in place

Destructive sorting of a collection involves moving the items from their original locations to new locations in the collection rather than constructing a new sorted collection. Another name for this destructive sorting is sorting in place. Destructive insertion sort works like constructive insertion sort by going through the collection one item at a time and inserting it in place. However, instead of inserting the item into a new collection, it works by having part of the collection holding sorted data, and the rest unsorted. As each item is gone through in the unsorted data it is inserted into the sorted data with the portion of the data which is sorted growing until it becomes all of it.

The method for insertion sort in place of an arrayList of integers is:

 public static void sort(ArrayList<Integer> a)
 {
  for(int i=1; i<a.size(); i++)
     insert(a.get(i),a,i);
 }

At the start of each execution of the loop body, the portion of the arrayList up to but not including that component indexed by i is sorted. The call of insert(a.get(i),a,i) in the loop body takes the component indexed by i and inserts it into the sorted part. It does this by moving up one place all those items in the sorted part which are greater than the item indexed by i, then putting the item which was indexed by i in its place. This makes the code for insert:

 private static void insert(Integer n,ArrayList a,int i)
 {
  for(; i>0&&a.get(i-1).compareTo(n)>0; i--)
     a.set(i,a.get(i-1));
  a.set(i,n);
 }

This is best illustrated using an example shown by a diagram. Suppose we want to sort the following arrayList:

When the loop body in the sort method has been executed five times, leaving i with the value 6, the arrayList will be:

The bar through it indicates the division between the sorted part and the unsorted part. The method insert is executed in an environment where n is set to a.get(i), which is 28, and there is a local variable i set to the value of i in the environment for sort, which is 6, and a refers to the same arrayList. Diagrammatically, the situation is:

After the loop in the insert method has been executed, the items in the sorted part of the arrayList which are greater than 28 will have neen "moved up" one place, making the situation:

with the local variable i in the environment for the call to insert having its value changed to 3. Then, executing a.set(i,n) sets the arrayList to:

where the bar again indicates the division between a sorted and unsorted part, and you can see it has moved up one place. When execution returns to the environment for the sort method call, the contents of the arrayList referred to by a will have been changed in this way. The loop on the sort call finishes when its i variable has as its value the size of the arrayList, indicating that now all of it has been sorted.

Since this algorithm for sorting an arrayList in place does not change the size of the arrayList at any time, the same algorithm can be used to sort an array in place. The code for this is:

 public static void sort(Integer[] a)
 {
  for(int i=1; i<a.length; i++)
     insert(a[i],a,i);
 }

 private static void insert(Integer n,Integer[] a,int i)
 {
  for(; i>0&&a[i-1].compareTo(n)>0; i--)
     a[i]=a[i-1];
  a[i]=n;
 }

Full code for sorting an arrayList in place can be found in the files InsertSort4.java and InsertSort5.java, in the first case with randomly generated numbers which are displayed, in the second with randomly generated numbers and only timing figures, no display. Similarly, full code and supporting code for demonstrating for sorting in place of an array can be found in InsertSort6.java and InsertSort7.java.

Selection Sort

Selection sort is an algorithm which is quite similar to insertion sort in place. As with insertion sort in place, the array is divided into a portion which is unsorted and a portion which is sorted. But with insertion sort in place, the sorted portion of the array consists of the first item from the original array, then the first and second item from the original array, then the first and second and third item from the original array, and so on. With selection sort, the sorted portion of the array consists of the lowest item from the original array, then the lowest and second lowest item from the original array, then the lowest and second lowest and third lowest item from the original array and so on.

The algorithm for selection sort works by first finding the lowest item in the original array and swapping that in position with the item in the first indexed component of the array. Then the lowest item in the remainder of the array (i.e. the second lowest item in the whole array) is found and swapped in position with the item in the second indexed component of the array. Then the lowest item in the portion of the array starting at the third indexed position is found and swapped with the item at the third indexed position, and so on. Here is the code to do that:

 public static void sort(Integer[] a)
 {
  for(int i=0; i<a.length-1; i++)
     {
      int pos = findMinPos(a,i);
      swap(a,i,pos);
     }
 }
 
 private static int findMinPos(Integer[] a,int pos)
 {
  for(int i=pos+1; i<a.length; i++)
     if(a[i].compareTo(a[pos])<0)
        pos=i;
  return pos;
 }

 private static void swap(Integer[] a,int pos1,int pos2)
 {
  Integer temp = a[pos1];
  a[pos1]=a[pos2];
  a[pos2]=temp;
 }

Here the method findMinPos with arguments a and pos returns the position of the smallest item in the portion of array a starting with the component indexed by pos. The method swap with arguments a, pos1 and pos2 changes the array a destructively by swapping round the item at position pos1 with the item at position pos2.

Code for selection sort of arrays of integers can be found in the files SelSort1.java SelSort3.java and SelSort5.java. In the first case there is supporting code for the numbers to be sorter to be typed in, in the second they are generated randomly and displayed, in the third they are generated randomly and only timing figures are given. In SelSort2.java SelSort4.java and SelSort6.java selection sort is similarly applied to an array of strings. You will also need the file WordGen.java which randomly generates pronounceable words. The code for selection sort of an array of strings is:

 public static void sort(String[] a)
 {
  for(int i=0; i<a.length-1; i++)
     {
      int pos = findMinPos(a,i);
      swap(a,i,pos);
     }
 }
 
 private static int findMinPos(String[] a,int pos)
 {
  for(int i=pos+1; i<a.length; i++)
     if(a[i].compareTo(a[pos])<0)
        pos=i;
  return pos;
 }

 private static void swap(String[] a,int pos1,int pos2)
 {
  String temp = a[pos1];
  a[pos1]=a[pos2];
  a[pos2]=temp;
 }

As you can see, it is exactly the same as the code for selection sort of an array of integers, apart from the type String used where Integer was used before. The reason it would not be possible to write a simple generic version with a type variable replacing String and Integer is the call to compareTo in the method findMinPos. Only some types may have the method compareTo called on them, so we cannot call it on an object where we know nothing about its type, which is the case with type variables as we have seen them up till now. Later we will see a way round this, but for now we will give code for sorting algorithms in terms of sorting collections of integers.

If we use the same example as before, sorting the array which starts out as [30,25,47,99,8,12,23,63,16,20], the lowest item in the whole array is found and swapped with the first item, giving [8,25,47,99,30,12,23,63,16,20]. Then the lowest item in the rest of the array is found and swapped with the item in the second position, giving [8,12,47,99,30,25,23,63,16,20]. Following each step in the algorithm, the successive steps in the transformation of the array are:

[30,25,47,99,8,12,23,63,16,20]

[8|25,47,99,30,12,23,63,16,20]

[8,12|47,99,30,25,23,63,16,20]

[8,12,16|99,30,25,23,63,47,20]

[8,12,16,20|30,25,23,63,47,99]

[8,12,16,20,23|25,30,63,47,99]

[8,12,16,20,23,25|30,63,47,99]

[8,12,16,20,23,25,30|63,47,99]

[8,12,16,20,23,25,30,47|63,99]

[8,12,16,20,23,25,30,47,63|99]

[8,12,16,20,23,25,30,47,63,99]

with the | character used to indicate the division between the sorted portion and the unsorted portion.

Selection sort may be applied to an arrayList as well, the only difference with its application to arrays being the use of method calls rather than the special array syntax. Here is the code for selection sort on an object of type ArrayList<Integer>:

 public static void sort(ArrayList<Integer> a)
 {
  for(int i=0; i<a.size()-1; i++)
     {
      int pos = findMinPos(a,i);
      swap(a,i,pos);
     }
 }

 private static int findMinPos(ArrayList<Integer> a,int pos)
 {
  for(int i=pos+1; i<a.size(); i++)
     if(a.get(i).compareTo(a.get(pos))<0)
        pos=i;
  return pos;
 }

 private static void swap(ArrayList<Integer> a,int pos1,int pos2)
 {
  Integer temp = a.get(pos1);
  a.set(pos1,a.get(pos2));
  a.set(pos2,temp);
 }

The files SelSort7.java and SelSort8.java give code for selections sort on arrayLists of integers, together with supporting code.

Matthew Huntbach

Last modified: 3 August 2005