Algorithms and Data Structures in an Object-Oriented Framework (“ADSOOF”)

Lisp Lists

Lisp and functional programming

In this section we introduce another generic collection type, this time called LispList. This type is not provided as a standard part of Java, so to use it you will have to copy the file LispList.class from the code directory for this section. As a generic type, it should be thought of as having type LispList<E>, meaning that a full type needs to be made by combining LispList with some element type in the same way as we combined ArrayList with element types. So LispList<Integer> is the type “Lisp list of integers”, LispList<String> is the type “Lisp list of strings” and so on. The code in the file LispList.class makes use of code from another file LispList$Cell.class, so you will need to copy this from the same directory to the same directory as the file LispList.class.

The name “Lisp list” comes from the programming language Lisp, which is one of the oldest programming languages in existence, having first been introduced in the 1950s. Its name is a shortening of “list processor”, with lists being its fundamental data structure. Lisp was the first example of a functional programming language, which means a programming language based on a symbolic form of calculation called the lambda calculus. The idea that programming languages should have this sort of basis was pioneered by Peter Landin, who was for many years a professor at Queen Mary. While the programming language Haskell is considered a better example of a functional language because it was developed after people like Peter Landin had worked on a deeper understanding of the theory behind it, Lisp never disappeared, and has recently been revived in a version called Clojure. One of the reasons Clojure has been successful is that is compiles into Java Virtual Machine (JVM) code. Another programming language with a functional style that compiles into Java is Scala, which is now in fairly widespread use in industry.

Java itself now has a whole new set of features, centred around lambda expressions introduced in the version Java 8, to enable Java to be used with a functional style of programming.

The reason for considering programming with Lisp lists is that it is a particularly simple data structure, yet a very powerful one. Because of this, some fundamental aspects of algorithms and data structures are best illustrated by considering them in terms of Lisp lists. Some universities, for example Oxford, Cambridge and Imperial College in the UK and MIT in the States, use a functional language as the first language for teaching programming, believing it the best way to put across important issues which get buried in the detail of more widely used languages such as Java.

Note that although we use the word “LispList” here, it is meant to capture only the general idea, and is adapted to fit into the object-oriented syntax of Java, so the operations are given as methods called on objects rather than functions. Also, as we are using this type just for learning purposes, it has been deliberately reduced to the bare minimum of operations. As one of the objectives of this module is to give you a thorough understanding of the basics of object-oriented programming, we will keep to using standard object-oriented syntax to introduce a functional style of thinking about programming, rather than do it through the additional syntax features of Java 8.

Lisp Lists

A Lisp list is a data structure which consists of an ordered collection of items, in that way it is like the arrayList we considered previously. However, arrayLists are mutable, that is there are operations on them which change their content, and are considered as an enhanced form of array, that is programming with them is centred round accessing their components using an integer index. Lisp lists are immutable and are not primarily manipulated using indexes. Only the following operations are provided:

head - returns the first item of the list it is called on
tail - returns a new list consisting of all but the first item of the list it is called on

cons - takes an argument and returns a new list whose head is the argument and whose tail is the list it is called on

isEmpty - returns true if the list it is called on is the empty list, returns false otherwise
empty - returns an empty list

Here are the Java method signature for these operations:

E head()
LispList<E> tail()
LispList<E> cons(E item)
boolean isEmpty()
static <T> LispList<T> empty()

Note that the method empty is provided as a static method to return a new empty list. Otherwise, all the methods are called on a LispList<E> object, which is why they do not have their own type variable declaration, as the E in their signature means the type argument of the LispList<E> object they are called on.

Conventionally, we write lists enclosed with the square brackets [ and ], with the items in the list in order separated by commas. So, for example, [30,15,7,12] is the way we would write a list of type LispList<Integer> whose head is 30 and whose tail is [15,7,12]. If ls is a variable of type LispList<Integer> which refers to an object whose value we write as [30,15,7,12], then ls.head() will return 30 and ls.tail() will return a reference to an object whose value is written as [15,7,12]. The call ls.tail() does not change the object ls refers to, it remains with the value represented as [30,15,7,12], but the assignment ls1=ls.tail() would cause the variable ls1 to start referring to a new object which can be represented as [15,7,12]. Also the call ls.tail() could be passed directly as an argument to a method, or used directly to have a further method called on it. So, for example, ls2=ls.tail().tail() would cause ls2 to refer to a new object whose value can be written [7,12]. Similarly, ls.cons(8) does not change the object ls refers to, it continues to be one represented by [30,15,7,12], but ls3=ls.cons(8) would cause the variable ls3 to start referring to a new object whose value can be represented by [8,30,15,7,12]. The empty list is represented by [], an attempt to call the methods head or tail on a variable referring to an object representing the empty list would cause an exception to be thrown, but the cons operation can be called on such a variable.

In the actual language Lisp the operations we call head and tail are called car and cdr respectively, this is for historical reasons connected with the original implementation of the language. In some data structures textbooks, the term “tail” is used to mean the last element of a list. It is unfortunate that these two different uses of “tail” have arisen with lists, but in these notes we will only ever use the term “tail” when applied to a list to mean “all but the first element”. So Lisp lists are broken down using the operations head and tail, and assembled using the operation cons. We also need a way of getting an object representing the empty list, which in Java is done by a static method called empty. In general, when you call a static method from another class, you attach the call to the class name, but here we have the additional aspect of calling a generic static method. To create a new object of type LispList<Integer> representing an empty list, you use the call LispList.<Integer>empty(). The <Integer> after the dot which attaches the method call to the class name LispList gives a value to the type variable of this method, in this case setting it to Integer. But you can omit it if the value of the type variable can be worked out otherwise, for example if the value being returned by the call to the method empty is being assigned to a variable of type LispList<Integer> we know the type variable must be Integer so the call can be just LispList.empty().

To maintain the simplicity of the LispList generic type, there are no additional methods provided except for a toString method so that a text representation of Lisp Lists in the form used above is given, and an equals method to enable them to be compared for equality. There are no public constructors at all. Any other operations must be provided by writing static methods as required. The methods cons, tail and empty are the only way to construct new LispList objects. You can find code for Lisp lists in the directory code directory for this section. To use the type LispList, you will need to copy two .class files into your directory, LispList.class and LispList$Cell.class. The second of these is a class which is only used internally by objects of type LispList.

Abstract Data Types

As was done previously with the DrinksMachine class, you are deliberately given the .class file without the .java file that was used to produce it. The point is to get you used to the idea of dividing code up, so that you think in terms of abstract data types, that is data structures viewed only in terms of the methods that can be called on them, not in terms of how they are stored in the computer hardware.

There is code to implement LispList, and we will look at it later, just as we will look at the code to implement DrinksMachine later. With DrinksMachine we used the idea that is commonly used when introducing object-oriented programming of objects that model something physical in the real world. With LispList we are modelling an abstract computational concept. Whereas writing a class to model real world objects means abstraction in the form of trying to work out what are the essential features of those objects that need to be modelled computationally, with an abstract data type the methods actually define the objects.

To be able to deal with large scale software systems it is essential to be able to see them in terms of their individual parts, because it soon becomes impossible to manage if the only way to understand the system is to think of it just in terms of all its code. It helps if we can write and think about code which uses objects of a particular class just in terms of their public methods and not also in terms of the code insider their classes that makes those methods work. That is what we will be doing initially with LispLists, using them as if they are a built-in aspect of Java. If the class is not already provided, we can then write the code to provide it, as we will with LispLists. But that is not fundamentally different from the classes that are provided directly with Java, such as ArrayLists so we shall also see later how code could be written to provide the class ArrayList<E> if it were not already provided with Java.

The code to implement an abstract data type must be written in a way that just provides the required public methods, with it not being possible to manipulate or access objects of that type in any way except through those public methods. So variables in the class must be declared as private, and if any extra helper methods are written because it helps structure the code to have them, they too must be declared as private so they cannot be used directly by code outside the class they are in. This is again seeing large scale software systems in terms of small independent parts, because the code that implements an abstract data type can be written without having to be aware of the code that uses it, apart from providing the required methods for it to use.

The code used to implement an abstract data type will have a data structure inside it to store it directly in the computer, and this is termed its “concrete data structure”. It is important to understand the distinction between an abstract data type and the concrete data structure that implements it. If from your previous experience you are already familiar with the concept of a “linked list”, you may be thinking that a LispList as defined here and a linked list as you know about are the same thing. They are not, although a linked list is the best concrete data structure to implement LispList, as we shall see later. The concrete data structure that implements LispList will be mutable, and is manipulated directly by accessing its variables, but the public methods restrict access to make it immutable and are methods, not direct use of variables.

Iterative methods on Lisp Lists

If the variable ls refers to an object of type LispList<T>, the items in the list will be of type T. The first item in the list is given by ls.head(), the second item in the list is given by ls.tail().head(), the third item in the list is given by ls.tail().tail().head(), and so on. This is because the tail of a list is the list of every item except the first one, so the head of the tail must be the second item, and so on. A common way of dealing with lisp lists is to have a variable which is first set to a list, then set to the tail of the list, then to the tail of the tail, then to the tail of the tail of the tail, and so on. If the variable is set to its own tail, it will refer to the portion of the list with one less item at the front than it used to, and keep resetting it this way means it goes through the whole list with the head of the object referenced by the variable being each item in the original list in turn. When the variable refers to the empty list, it has gone through the whole list.

For example, suppose ls1 is of type LispList<Integer> and we want to add together all the integers in the list. The following code fragment will do this:

int sumSoFar=0;
for(LispList<Integer> ls2=ls1; !ls2.isEmpty(); ls2=ls2.tail())
   {
    int n = ls2.head();
    sumSoFar = sumSoFar+n;
   }

Note that, strictly, ls2.head() is of type Integer, not int, but assigning it to a variable of type int will convert it to the int equivalent, using the principle of “unboxing” we described previously.

After this code fragment has been executed, the variable sumSoFar holds the value obtained by adding together all the integers in the list referred to by ls1, and ls1 continues to refer to the whole list. Suppose ls1 refers to the list [8,30,15,7,12]. Then the first time round the loop, ls2 also refers to [8,30,15,7,12], and ls2.head() returns 8. The update statement in the for loop causes ls2 to be reset to refer to [30,15,7,12] on the second time round the loop, with ls2.head() returning 30. The next time round the loop, ls2 refers to [15,7,12] and ls2.head() returns 15 and so on. This could be put into a static method:

static int sum(LispList<Integer> ls)
{
 int sumSoFar=0;
 for(; !ls.isEmpty(); ls=ls.tail())
    {
     int n = ls.head();
     sumSoFar = sumSoFar+n;
    }
 return sumSoFar;
}

Code for reading a Lisp list of integers and adding the integers in it can be found in the files UseLispLists1.java and UseLispLists2.java. The second of these shows the operation done through the static method given above. It demonstrates that, for the reasons we showed previously, although ls=ls.tail() is executed within the method, this does not change what the list variable used as an argument to the method refers to. The variable ls inside the method is a local variable, which is initially set to the object passed as the argument to the method, but if it is changed to refer to another object, the variable that was used as the argument to the method call stays referring to its original object. So a call to sum(ls) will return the sum of integers in the Lisp list referred to by ls, but will not change what ls refers to.

The code given in the files also includes a method parseIntLispList which takes a string and gives the Lisp list of integers it represents using the text representation given above. This uses some of the methods provided in the Java library we discussed previously. However, please remember this is “support code”, put into these files to enable you to run them. The important thing in these files is the part of code which adds the elements in a Lisp list of integers, not the support code which enables a Lisp list of integers to be constructed from what you type in. Note that when the front-end code asks you to “Enter a list of integers” it uses that method parseIntLispList to convert what you typed into a LispList<Integer> object. So it is expecting the integers to be typed in the standard textual representation for Lisp lists, which is with [ and ] surrounding the numbers and with commas separating them. It is not just asking for numbers to be typed with spaces between them, as was asked for in previous front-ends. The code is written to be simple, so there is nothing in it which works out what you probably meant if you typed say 10 20 30 instead of the required [10,20,30].

The method sum here is not generic because it relies on the element type of the Lisp list it handles being Integer. However, we can consider generic methods for operations which do not rely on the element type. We have seen that for arrayLists there is a method provided by Java which gives the position of an item in an arrayList. Here is a static method which provides a similar operation for Lisp lists:

 public static <T> int indexOf(LispList<T> ls,T item)
 {
  for(int pos=0; !ls.isEmpty(); ls=ls.tail(),pos++)
     if(ls.head().equals(item))
        return pos;
  return -1;
 }

This gives the position of any item in a Lisp list of items of any type, so long as the type of the item and the element type of the Lisp list are the same. It uses the Java convention that counting positions starts at position 0 and that -1 is returned instead of a position if the item does not occur in the list. It uses the technique we used previously of a for loop, where the exit test for the for loop is when the whole of the structure has been gone through and the item not found, and where there is a return statement in the for loop which is executed to return the position (and hence exit the for loop and the whole method) when it is found. This is a case of a for-loop with two update statements: ls=ls.tail() and pos++, changing what the list variable ls refers to and the value of pos giving the position in the original list of the current head of ls. It is possible to have more than one update statement in a for loop, if so the update statements are separated by commas.

Loops with empty bodies

An alternative way of programming the loop would be to have it exit either if the list becomes the empty list, or if the head of the list is the item whose position we are trying to find. Then -1 is returned if the list is the empty list, and the value of the position variable if it is not as that means the loop has been exited because the position of the item has been found. Here is the code for this:

 public static <T> int indexOf(LispList<T> ls,T item)
 {
  int pos=0;
  for(; !ls.isEmpty()&&!ls.head().equals(item); ls=ls.tail(),pos++) {}
  if(ls.isEmpty())
     return -1;
  else
     return pos;
 }

The body of the loop is empty, all the loop does is the update and test.

The body of the loop is given by the symbols {} indicating a block of statements with 0 statements in it. It is easy to miss this and so think that the following if-statement is inside the loop whereas actually it comes after the loop. However, the fact that the if-statement is not indented, that is the keyword if is at the same position as the keyword for, should be enough to make it clear that it is not inside the loop. It is the {}, not the indentation, that tells the Java compiler that the if-statement is not inside the loop. In Java (unlike some other languages such as Python), indentation has no meaning for the compiler, but using it correctly and consistently makes it much easier for other programmers to see the general structure at a glance. Code examples you are shown in this module will always be correctly indented.

In Java, you can just use ; on its own after the for-loop header rather than {} to indicate a loop with an empty body. Remember that the body of a for-loop is the statement following it, with { and } combining multiple statements into one, but ; used to end a single statement, so ; on its own following the for-loop header means the single statement consisting of nothing is its body. This is even easier to miss, so I will always use {}.

The files UseLispLists3.java and UseLispLists4.java give these two versions of the method indexOf, together with support code to test them with lists of integers and lists of strings. A separate method parseStringLispList is needed to convert a string representing a list of strings into the actual Lisp list object. The text representation for Lisp lists of strings is the one we used above, starting with [ and ending with ] with the strings separated by commas, spaces before and after the commas are not treated as part of the strings in the list, otherwise all characters are treated as characters in the strings.

Stack and Reverse

Now let us consider the operation of changing all occurrences of one item to another, as we have already considered with arrays and arrayLists. Since Lisp lists are immutable, we can only do this constructively. A first attempt at such a change method might look rather like the one we used previously for a constructive change in arrayLists, going through the items of the argument collection, adding them to a new collection which is initially empty, but if the item is the one to be changed, adding the one it is to be changed to instead. With Lisp lists, the cons method is the only way to construct a new list which represents adding an item to an existing list. This gives the following code:

 public static <T> LispList<T> change(LispList<T> ls1,T m,T n)
 {
 // NOTE - not quite correct code for the "change" operation
  LispList<T> ls2 = LispList.empty();
  for(; !ls1.isEmpty(); ls1=ls1.tail())
     {
      T h = ls1.head();
      if(h.equals(m))
         ls2 = ls2.cons(n);
      else
         ls2 = ls2.cons(h);
     }
  return ls2;
 }

You will find this code and supporting code to demonstrate it in the file UseLispLists5.java. If you run this, you will find it makes the change as required, but returns the list in reverse order. If ls is set to represent the list written [2,12,4,17,21,4,9,10,4] and m is set to 4 and n is set to 50, then the call change(ls,m,n) will return a reference to a Lisp List object with textual representation [50,10,9,50,21,17,50,12,2].

The reason for this is that we can only take items from the front of a list and join them to the front of a list. If you repeatedly take items from the front of one list and join them to the front of another, you will end up reversing the list. If we want the list in the correct order, a simple way to do it would be to use the same technique again to reverse the reversed list, so this would make the method:

 public static <T> LispList<T> change(LispList<T> ls1,T m,T n)
 {
  LispList<T> ls2 = LispList.empty();
  for(; !ls1.isEmpty(); ls1=ls1.tail())
     {
      T h = ls1.head();
      if(h.equals(m))
         ls2 = ls2.cons(n);
      else
         ls2 = ls2.cons(h);
     }
  LispList<T> ls3 = LispList.empty();
  for(; !ls2.isEmpty(); ls2=ls2.tail())
     ls3 = ls3.cons(ls2.head());
  return ls3;
 }

This code and supporting code to demonstrate it can be found in the file UseLispLists6.java. Note that ls3 = ls3.cons(ls2.head()) means the same as

   {
    T h = ls2.head();
    ls3 = ls3.cons(h);
   }

You can express this using a local variable to hold the value returned by ls2.head() then making that variable the argument to the call to cons, or you can make the call ls2.head() directly the argument to the call to cons. It makes no difference to the way the code executes when run in Java, deciding how to write it is just a matter of what you feel looks clearer. Experienced programmers tend to make less use than novice programmers of variables whose only role is to hold a value of an expression which is then immediately used, tending instead just to use the expression itself. However, it can lead to over-complex expressions when one is made up of several sub-expressions. Also, you should always use a variable to hold the value of a method call if you want to use it more than once and you know it will always result in the same value, because that is more efficient than actually repeating the method call.

Methods which are centred around the use of a loop are known as iterative. An iterative algorithm is a way of solving a problem which is best explained in terms of making repeated changes to some data until the desired condition is met. In the above example, the data being changed in each loop consist of two variables referring to Lisp list objects. Each time round the loop, one of them is changed to refer to a shorter list, the other is changed to refer to a longer list. We know the loop will only be gone through a certain number of times, because the loop test is that the variable which keeps being set to a shorter list refers to a non-empty list - eventually it will refer to an empty list and the loop will be exited. Then the variable which kept being set to a longer list holds the desired result. No actual object is changed, since Lisp list objects are immutable meaning they can't be changed. But new objects are created and the variable change their values because they are set to refer to these new objects.

The technique used here for solving Lisp list problems iteratively can be called stack and reverse. We could imagine our lists as a stack of blocks, one on top of each other, with the head of the list being the top block. We can take the top block off a stack, and we can put a new block on top of a stack. But we can't take a block from the middle of a stack or put one in the middle. So our problems have to be solved by taking blocks off one stack, investigating them, and putting them on top of another. We are then likely to want to go through the process of taking from one stack and putting onto another again, in order to reverse the order.

Recursive methods on Lisp lists

The alternative to iteration when thinking about problem solving is recursion. We have seen this concept previously, but it works more naturally with Lisp lists. A recursive approach to problem solving uses the consideration that we can often solve a problem by solving a smaller version of the same problem then adapting the solution, or sometimes using more than one solutions to smaller versions of the problem. For example, suppose I need to make a journey from town A to town B, and I don't know a direct route. I could pick another town, let's call it C, where I know a direct route from A to C and C is closer to B than it is to A. Then I have reduced the problem of finding a route from A to B to one of finding a route from B to C and putting the route from A to C I know on the front of it. Finding a route from C to B is done in the same way, by reducing the problem to finding a route to B from a town closer to B, and eventally the problem will be reduced to one where there is a clear direct route and it does not have to be broken down further. An alternative way of solving the problem is to pick a town midway between A and B, again we'll call it C. Now the problem has been reduced to two similar problems: finding a route from A to C and finding a route from C to B. Eventually, if we solve those two problems in the same way, we'll get down to routes which are short enough for a direct route to be found.

Recursive approaches to solving Lisp list manipulation problems often work well because Lisp lists are a recursive data type. A Lisp list is either the empty list, or it consists of two parts its head and its tail, and the tail is itself a Lisp list. So a Lisp list is defined in terms of a Lisp list. Suppose we want to change all occurrences of m to n in a Lisp list, ls. If we change all occurrences of m to n in the tail of ls (remember this involves creating a new Lisp list object representing the change, and not destructively changing the tail of ls) we almost have the desired Lisp list. All we need to do is put n on the front of it if m was on the front of the original list, or put the front item of the original list at the front of the new list if m was not at the front of the original list. We also need to consider that if we wish to change all occurrences of m to n in the Lisp list ls, then if ls is the empty list, then the answer is the empty list. This gives us the following code which performs the same change operation as above:

 public static <T> LispList<T> change(LispList<T> ls,T m,T n)
 {
  if(ls.isEmpty())
    return LispList.empty();
  else
     {
      LispList<T> t = change(ls.tail(),m,n);
      T h = ls.head();
      if(h.equals(m))
         return t.cons(n);
      else
         return t.cons(ls.head());
     }
 }

Again, there is no right or wrong way to express a solution to this problem. You may find the iterative solution easier to understand, or you may find the recursive solution easier to understand. Recursive solutions often have a simpler structure, but require a bit more intuitive thinking to appreciate. A good computer scientist should be able to understand and devise recursive and iterative solutions to problems. If recursion seems a rather strange way to go about tackling a problem, you will need to practice with it until you find it natural.

Let us consider recursive versions of the other operations we have discussed on Lisp lists. The sum of the all the elements in the empty list is clearly 0. The sum of all the elements in any other list is equal to the first element added to the sum of the elements in the rest of the list. I hope this seems intuitively obvious, and yet it is very close to a program to give the sum. The “sum of all the elements in the rest of the list” is a recursive call to the sum operation taking as its argument the tail of the original argument. So this gives the following code:

static int sum(LispList<Integer> ls)
{
 if(ls.isEmpty())
    return 0;
 else
    return ls.head()+sum(ls.tail());
}

A recursive method must always have a case when it doesn't make a recursive call. With Lisp lists this is often, but not always, when the argument list is the empty list. The case when no recursive call is made is known as the base case. Any recursive call must always have arguments which are closer to a base case than the original arguments, since we want a succession of recursive calls eventually to come to an end in the base case. So with Lisp lists this generally (but not always) means the recursive call takes as an argument the tail of an original argument.

Tail Recursion

Another way of calculating the sum of integers in a Lisp list of integers is this:

static int sum(LispList<Integer> ls)
{
 return sumPlus(ls,0);
}

private static int sumPlus(LispList<Integer> ls, int sumSoFar)
{
 if(ls.isEmpty())
    return sumSoFar;
 else
    return sumPlus(ls.tail(),sumSoFar+ls.head());
}

If you are asked to write a method that takes a particular argument, in this case a LispList<Integer> object, it should always do just that, and not expect the outside code to provide another argument with a particular value. So here the work is done by a recursive helper method, with the initial call to it with the required initial value done in the public method which does nothing else but make that call. This is because the recursive method takes an extra argument. However, that argument is set to 0 in the initial call. You can think of a call to the helper method sumPlus(ls,n) as returning the sum of the integers in the list ls added to n, which is n if ls is the empty list, otherwise it is the sum of the integers in the tail of ls added to n plus the head of ls. This is tail recursion, as discussed previously. It is really the iterative solution given above expressed as recursion. If, as in this case, there is a simple recursive solution, but your approach to solving it recursively would be to come up with a more complex tail recursive solution, it suggests that you are still thinking of it in an iterative way rather than thinking in terms of recursion.

If we consider the operation indexOf we considered previously, there are two base cases. One is when the list is empty, then the item cannot occur in it so it must return -1. The other is when the item is the first item in the list, then it must return 0. Otherwise, the index of an item in a list is one more than its index in its tail. For example, the position of 80 in [20,90,70,50,80,60] is 4 (remember with the usual Java convention of starting counting at 0), this is one greater than the index of 80 in [90,70,50,80,60]. But this relationship is not quite right. The index of 40 in [20,90,70,50,80,60] is -1 (meaning it doesn't occur), but the index of 40 in [90,70,50,80,60] is also -1. So the index of an item in a Lisp list is -1 if the list is empty, 0 if it is the first element, -1 if the index in the tail is -1, otherwise it is one greater than its index in the tail. This translates to the following code:

 public static <T> int indexOf(LispList<T> ls, T item)
 { 
  if(ls.isEmpty())
     return -1;
  else if(ls.head().equals(item))
     return 0;
  else
     {
      int index = indexOf(ls.tail(),item);
      if(index==-1)
         return -1;
      else
         return index+1;
     }
 }

You can see from this that when programming using recursion, it often makes more sense to think in terms of logical relationships between the result of a call and the result of a recursive call than it does to think of it in terms of one-by-one steps as with iteration. In this case, the direct recursive solution is rather complex, and the tail recursive version:

 public static <T> int indexOf(LispList<T> ls, T item)
 { 
  return indexOfPlus(ls,item,0);
 }
 
 private static <T> int indexOfPlus(LispList<T> ls, T item, int pos)
 { 
  if(ls.isEmpty())
     return -1;
  else if(ls.head().equals(item))
     return pos;
  else
     return indexOfPlus(ls.tail(),item,pos+1);
 }

is the more obvious way of solving it recursively. Thinking of it recursively, the call indexOfPlus(ls,item,n) returns -1 if item does not occur in ls, otherwise it returns n added to the position of item in the tail of the list referred to by ls.

The file UseLispLists2a.java is the same as the file UseLispLists2.java and the file UseLispLists3a.java is the same as the file UseLispLists3.java except that they use recursion as given above, rather than iteration. You will see that when you run the code using the recursive method the result is exactly the same as when the code is run using the iterative method. This is correct - the important thing is not how the code is programmed, but whether it gives the right result. There may be more than one way of programming an operation to give a particular result, but someone who uses a program to get a particular result is only concerned that the result is correct.

Actually, as we shall see later, there is also the issue of efficiency. Different ways of programming an operation may give the same result in terms of what the method for the operation does or returns, but one way may take much longer than another. Someone who uses a program will be concerned that it operates as quickly as possible.

Static variables

When a variable is declared in a class outside a method, it usually means it is meant to be an object variable, that is there is a separate variable of that name for each object of that class. We have not covered this yet, but we will see it when we get to defining objects. However, if the keyword static is put before the variable declaration, it means the variable is a static variable, also often called a “class variable”. It means instead of there being a different variable of that name for every object of that class, there is just one variable of that name which is shared by every object of that class and also is accessible from every static method in that class.

This breaks what we have been saying about static methods being self-contained and executing in an environment which is just their own. Instead it means every environment for every method call contains the same variable, not a different variable of the same name when that variable is a static variable in that class. So one method call can set the variable to something and another method call can use that new value, or change it. This is BAD programming practice, the reason being that it adds another way in which method calls may interact, not just by returning values or changing objects passed as arguments, but also by changing these shared variables. It can make code very difficult to follow and debug. The static variable may be used in code buried deep within the method so this way of communicating cannot be seen from the method's header. Furthermore, it means that one method call relies on another method call to set the static variable to the right value, and relies on there being no other method calls which changes it in some other unexpected way. It means a static method call may have one effect if the static variable is set to one value, and another effect if it is set to another value, even though arguments to the method call are identical.

For this reason you should simply not use static variables, there is almost never any need for them as there is almost always a better way of programming which gets round where you might be tempted to use one. One exception to this rule is variables declared as both static and final, the final means their value can't be changed. As their value can't be changed, the problems involved with a shared variable which can be changed are not there. A static final variable is useful when there is some constant value which code in the class refers to, it means the value is stored in one place instead of multiple copies being made of it.

Here is a method which adds together all the integers in a Lisp list of integers which uses a static variable:

 static int sumSoFar;

 public static int sum(LispList<Integer> ls)
 // This is how NOT to implement sum recursively!!
 {
  if(ls.isEmpty())
    {
     sumSoFar=0;
     return 0;
    }
  else
    {
     sum(ls.tail());
     sumSoFar=sumSoFar+ls.head();
     return sumSoFar;
    }
 }

The static variable sumSoFar keeps a running total of the sum of the numbers, it is initialised to 0 when recursion reaches the empty list, and incremented by the heads of the lists ls in each environment as recursion goes back up. Code like this sometimes gets produced by people who have started using recursion, but have not got used to recursive thinking. As a result they want to have a variable whose value can be changed, in a way that is used in iteration, like the variable sumSoFar in the iterative version of summing the integers in a Lisp list of integers. The code above is longer and more messy than the recursive version given previously. There isn't any benefit to it, although it does work and gives the correct result. It is really mixing recursive practice with iterative thinking, so is more complex than doing either plain recursion or plain iteration.

The best way of approaching recursion is to try and think of it in terms of seeing how the result of the recursive call relates to the final desired result and putting it together to get the final desired result. This ought to be simpler, although the problem may be that it requires a different way of thinking which it is hard to get into if you have become used to thinking in terms of iteration and changing variable values.

Matthew Huntbach

Last modified: 10 October 2019