The name of the class can be used as a type to declare variables and
method parameters in just the same way as the types provided automatically
by Java such as int
and String
. It's important
to note there's a difference between a variable of a particular type
and an object of that type. The variable can be considered a name
that can be given to the object. We can only refer to objects through
the names they are given in this way. But declaring a variable of a particular
object type is a separate thing from creating a new object of that type.
A variable is declared by a Java statement having the class name followed
by the variable name. A new object is created by a Java value which
consists of the word new
followed by the class name followed by (
followed by
some arguments used to set up the object followed by )
.
Often variable declaration and object creation are combined into one
statement where a variable is declared and initialised to a new object.
You need to distinguish between a statement and a value in Java. A
statement is a command, when executed as the program is run it will
have an effect. A value, however, is something which is used within
a statement, perhaps as the right-hand of an assignment statement or
as an argument in a method call. As a simple example, x+y
is a value, it cannot do anything as a command on its own, but
x=x+y
is a statement, it has the effect of changing the
value in variable x
. The situation is confused because
there are some things in Java which can be both statements and values.
For example, x++
has both an effect (increasing the value
held in x
by 1) and a value (the value held in x
before it was increased by 1), so it can be used as a statement on its own,
or as a value inside a statement.
Java has two sorts of methods, static methods and object methods.
You have seen static methods in your previous programming course, they
can be considered as self-contained "mini-programs", which are sometimes
called because they have the effect of doing something like displaying
some value, and sometimes called because they calculate and return a
value. When object methods are called, unlike static methods, the call
has to be "attached" to an object. This is generally done by having the
method call coming after a variable name with the .
character
separating them. We say the method has been called on the object which is
named (or "referred to" or "pointed to" - these all mean the same thing)
by the variable. Like a static method call, an object method call may have
the effect of doing something externally like printing, and/or returning
a value. But it may also have the effect of changing the internal state
of the object it is called on. If an object method has an external
effect or returns a value, the effect it has or value it returns will
depend on the state of the object it is called on.
It's important to understand that a method call which only returns
a value, but does nothing externally and makes no change to any object,
only makes sense when used as a value. A method call which does not
return a value (indicated by it having return type void
)
only makes sense when used as a statement. Method calls which both make
changes and return a valuie can be used either as statements or values.
In Java, assignment causes aliasing. This means that if
a
and b
are variables, the effect of
executing the statement b=a
is that "b
becomes another name for the object named by a
".
With a
and b
being two names for the
same object, any change to the object caused by calling a method
on variable a
will also be observed to cause a change
in the object referred to by variable b
because they
are, in fact, the same object.
We saw that when static methods are called, they are executed in their own environment, meaning that the code in them is executed using the variables from their parameter list and variables declared locally in the method, and making no use of the variables from the code the method was called from. The parameter variables of the method can be considered to have been assigned the matching arguments of the method call. Only when execution of the method call is finished does computation return to its previous point, and back to using the variables which were in existence at that previous point. If a method returns a value, that may be used to set a variable to this value in the code the method was called from.
A method call may also have the effect of changing the state of an object passed to it as an argument. The reason for this is that when a method argument is a variable referring to an object, the method call causes the variable in the method to be an alias of the method argument variable. So if a change is made to an object referred to inside the method through a variable which is one of its parameters, that change will be observed to have taken place to the object outside the method referred to by the variable which was the matching argument in the method call. The change continues after the method call is finished
The DrinksMachine
class is something written by a programmer
(me), not a part of the Java language as provided by the suppliers of Java.
Later in this course, we shall see how to write our own classes. This week
the important thing was to see how to use classes already provided, maybe
as part of the Java code library, maybe by some other programmer.
Week 2: 16/17 January
We started off by considering the usage of Java variables. A
variable is used to hold a value which is used elsewhere. A
method may be called on a variable which refers to an object,
or an object may be passed as an argument to a method call through
having a variable referring to it used as the argument. If an object, once
created, needs to be referred to more than once, we need a variable
to hold it and be used in the different places where it is
referred to. If an object is created and only used once, however,
rather than have a separate variable which is used only to transfer
reference of it from where it is created to where it is used, it
is possible to use the method call which creates an object directly
to refer to that object when a method is called on it, or directly
as the argument to another method. This may make the program look
neater as there are less variables to have to consider, and the
creation and use of an object is all in one place. It leads to
more complex statements, however, so in some cases it is more clear
to break down a complex statement into separate statements with
variables holding temporary results. It is important to note that this
makes no difference whatsoever to the operation of the program.
It is entirely a matter of what you as the programmer consider to
be the best way of setting out the program to make it understandable
to other programmers who may have to use it or modify it. Similar
applies to such things as how you lay out the program in terms of
spaces and empty lines between code, and whether you omit brackets
in cases where they are not necessary (for example, we saw that a
for-loop takes the form
for(initialisation;test;update)
{ body }
, but the {
and
}
can be omitted if body
consists of just one statement).
The main topic we considered this week was arrays.
An array is the most basic form of data structure in Java,
similar constructs can be found in most programming languages,
and directly reflect the way storage is managed in the hardware
of a computer. A data structure is an object which can be treated
as a single value, but can also be treated as being made up of components.
For any type t
in Java, we can also consider
t[]
to be a separate type, which we call
"array of t
". We refer to t
as the base type of an array of type t[]
.
Note that Java (unlike, for example,
the Pascal programming language) does not make use of the word
array
in any special way, you do not use it to create
or declare arrays, but you could use it as a variable name if you
liked. Note that the []
combination without anything between
the square brackets is used only to combine with type names
to make new type names (for some reason, a common mistake amongst
those new to Java programming seems to be to suppose you have to add
[]
to the name of any variable if that variable is
meant to refer to an array object). An array type name, created using
[]
, can be used anywhere in Java where any other type
name could be used: to declare variables and method arguments of that
type, or as they return type of a method.
If a
is a variable which refers to an object of type
t[]
, then a[intval]
, where
intval
is a Java value which evaluates to an
integer, can be used as if it is a variable of type t
: a value
can be assigned to it like any other assignment, and it can be used
as a value of type t
, for example as the argument in a
method call where a value of type t
is required, or as
the value which is assigned to a variable of type t
.
The important thing is that intval
could be
a variable of type int
, or an arithmetic expression
involving a variable of type int
. Then, if we assign a new
value to that variable, the variable space that intval
means will also change, so we can write code which depends on
a[intval]
changing to refer to different
components in an array as a variable in intval
changes.
Commonly we use the variable name i
for a variable whose
main use will be as an index to an array.
Array types are like any other object type in Java: declaring a variable
of an array type does not create an object of that type, that has
to be done separately, and assigning one variable of the array type
to another creates an "alias", the two variables can be considered
two different names for the same object. So, for example, if
a
and b
are two variable of type t[]
,
and a
has been set to refer to an array object, then
executing b=a
causes b
to refer to the same
object. Following that, if a[i]=val
is executed, where
val
is a value of type t
, then
the value of a[i]
and b[i]
will both change
to val
. This is because a[i]
and b[i]
are actually the same thing.
A new array object is created by the word new
followed by
a space followed by the base type name, followed by [
,
followed by an integer value, followed by ]
. This
creates an array of the length given by the integer to which
the integer value evaluates. So, for example
String[] a = new String[n];is a statement which declares a variable called
a
which can refer to arrays of strings, and sets that variable to refer
to an array of strings of the length of whatever integer is held in
variable n
.
If a
refers to an array object, then a.length
gives the length of the array (note, there is no ()
following the word length
). We can consider that there
are also then variables named a[0]
, a[1]
,
a[2]
, and so on up to a[a.length-1]
. So note
that Java array indexing always starts at 0 and goes up to one less than
the length of the array.
We saw that a common pattern in programming using arrays is to go through the elements of an array one element at a time, using a loop which will have a header of the form
for(int i=0; i<a.length; i++)We may wish to halt when we have found an array element which satisfies some particular test. This could be done using a
break
statement rather than have the test in the test part of the for loop.
However, if we want to know the value of the array index when we halted,
we then have to declare the array index variable i
before
the for loop, so we can access it after. This will give us code which
takes the form:
int i; for(i=0; i<a.length; i++) if(test(a[i])) break;In the lecture, the test of
a[i]
was that it was an
integer with value equal to n
so it was written
a[i]==n
.
If the loop is inside a method, and we want to return the array index
as the result of the method, then we can use return i
instead of break
in the above. Note that a
return
statement always has the effect of completely
ending execution of a method call, even if it is inside a loop.
We would also need a return
statement following the
loop to return a value indicating that no component of the array satisfies
the test. A convention is that -1
is returned in such a case.
We noted that in most cases in this course we will be working on
writing classes and methods to represent particular structures and
perform particular operations. In order to experiment with a
particular method, we will need support code which enables a
human user to type in some data and prints some output as the result
of the method call with that data as arguments. It is important that
you concentrate on the method that performs the operation,
and not on the support code used to run it in the lab.
The support code is of no relevance to what you are meant to
learn in this course. In exams and tests, you will never
be asked to write code that interacts with a human user. When
you are asked to write a method, it is important that you
understand this means with a signature giving its arguments and
return type, but not a program with a method called
main
taking an array of strings as its argument.
Do not confuse "take as argument" with "read from the screen" and
"return" with "print to the screen".
When we write a method, we need to specify the result we want it to have. It is possible our specification is open to more than one interpretation, in which case the method is underspecified. For example, the specification to return the index of a component of an array which satisifes a particular test does not say which index to return when more than one component satisfies the test. The loop above returns the index of the lowest indexed component which satisfies the test, code could be written which is similar but checking the array elements from the highest indexed downwards, in which case the index of the highest indexed component which satisfies the test will be returned. Here the loop would take the form:
int i; for(i=a.length-1; i>=0; i--) if(test(a[i])) break;
We considered the distinction between destructive and constructive operations on an array. If we pass an array as an argument to a method and make changes on it inside the method, those changes will remain in place once the method call has finished. This is because the variable in the method is an alias to the variable which refers to the array outside the method, so changing the array through the variable inside the method will change the array outside the method as they are the same object. A destructive method on an array is one which has its effect by changing the array (and hence destroying how it was originally). If a destructive method is called, since the actual array is changed, the change is observed through any variable which is an alias to the one passed as an argument to the method as well. A constructive method is once which has its effect by creating a new array which is like a copy of the old array with the desired changes, but does not change the old array
We noted that operations which change the size of an array can only
be done using constructive methods. Once an array is created in
Java, its size can never be changed. We can, however, create a new
array of the new size, and assign the variable which referred to
the original array to refer to the new one. This is not the same as
changing the actual array, so if other variables were aliases to
the variable whose value has been changed, they will continue to refer
to the original array.
Week 3: 23/24 January
We started by continuing to look at static methods which manipulate arrays,
looking in particular at the code for for methods which remove items from arrays.
As noted last week, array objects cannot change their size, so such
methods have to be programmed to work constructively: the array passed
as an argument is not changed, instead a new array is created representing
the desired change, and this is returned as the result of the method.
With a method to remove items from an array, we do not initially know the
size of the array that is to be returned. If the item to be removed
does not occur in the array, then a copy of the original array
should be returned, that is an array of the same length with each component
in the new array assigned the value of the component with the same index
in the original array. This is done rather than return a reference to
the original array so that we can be sure, whether or not the item to be
removed from the original array occurred in it, the result of the call to
the remove
method is not an alias of the original array.
Otherwise, after executing the statement b=remove(a,n)
we would not know whether making changes to the array named by b
would also change the array named by a
or not.
If we are to remove exactly once occurrence of an item from an array, and we have checked to see it does occur, the resulting array is of length one less than the original array, and must be created before elements from the original array are copied into it. The elements up to the position where the item to be removed occurs are copied into the same position in the new array, the elements after it are copied to a position indexed by one less than their position in the original array.
If we are deleting all occurrences of a particular item from an array, we need to count how many times it occurs first in order to create a new array of the size needed. Then when elements are copied from the original array into the new array, a separate index variable is used to give the position of the next place to copy an element on each step of the for-loop that does it. This separate index is increased by one only when the next element considered in the original array is not the one to be deleted.
We also noted we could have a method which deletes an item from an
array where the item to be deleted is given not by an argument to
the method set to the actual value of the item, but by an argument of
type int
which gives the position of the item to be
deleted. When programming with arrays it is important to keep clear
the distinction between variables which hold actual values to be
copied to or from positions in the array, and variables which hold
positions. It is particularly easy to confuse the two when we are dealing
with arrays of type int[]
where both sorts of variables
will be of type int
. We noted that we needed to consider
how our method which took an array and a position and removed the item at
that position would handle the case where the position argument variable was
set to a negative value or to a value greater than or equal to the array
length. Either code should be written to handle this situation (maybe
return a copy of the original array), or the comment which describes
the behaviour of the code should state that the position argument must
be in the correct range. This again shows the importance of fully
specifying a method, covering cases of arguments having unexpected values.
We noted that it is a good idea to comment code, but that there are two sorts of comment which a method would have. One would go with the signature of the method, and would describe how the method appears to work to code that calls it. This is for the benefit of anyone else who wants to use the method, it would not normally say what happens inside the method for it to have the required external result or effect. The other sort of comment is inside the method, and is for the benefit of anyone who wants to see how the method works, perhaps to modify it. This would only be necessary where the code is complex, so perhaps it is helpful for a comment to be attached to a particular section to explain the particular circumstances that section was added to deal with.
We next looked at arrayLists. An arrayList can be considered like
an array but more flexible. Where the type of an array whose individual
components are of type t
is t[]
,
the type of an arrayList array whose individual components are of type
t
is ArrayList<t>
.
Although you might want to think of ArrayList
as a
Java keyword, it is actually a Java class provided in Java's code
library, in package java.util
, so if you use it in a class
you have to include the import statement in the file
import java.util.*;
.
Whereas arrays use special symbols in Java (the square brackets
[
and ]
are used only for array operations),
arrayLists are manipulated entirely by Java method calls. The
equivalent of setting the component of an array indexed by i
to value n
with arrayLists is to use the
method set
attached to the variable naming the arraylist.
So if a
is of type t[]
and als
is of type ArrayList<t>
and e
is of type
t
and i
is of type int
, then the
equivalent of a[i]=e
is als.set(i,e)
.
The equivalent of accessing the component of an array indexed by
i
is to use the method get
. So what is
done by e=a[i]
with arrays is done by e=als.get(i)
with arrayLists. It is important to note that with arrayLists the
"angled brackets" (<
and >
, the same
characters as used for arithmetic comparison, but treat them here as brackets,
not arithmetic operators) are just one aspect of a wider aspect of Java
called generic types, which was introduced with the new version of Java,
Java 5, in 2004. They are not used like the square brackets with arrays to
access individual elements.
Unlike arrays, arrayList objects can have their size changed destructively.
If als
refers to an arrayList of type ArrayList<t>
and e
is of type t
, then the statement
als.add(e)
increases the size of the arrayList referred to by
als
by one, and sets the new highest indexed component to
e
. If i
is of type int
then
als.add(i,e)
increases the size of the arrayList by one,
sets the component indexed by i
to e
, and
causes every component of als
that was indexed by an integer
higher than the value of i
to become indexed by an integer
one higher than its previous index. The operation als.remove(e)
will cause the lowest indexed occurrence of an object equal in value to
e
in the arrayList to be removed, so the size of the
arrayList decreases by one, and every component indexed by an integer
i
greater than the index of the object that was removed is
changed so that it refers to the object that was previously referred to
by the component indexed one higher.
The method size
gives the length of an arrayList object
meaning the total number of elements it has, and thus also one greater than
the maximum index i
that can be used in the call
als.get(i)
where als
refers to an arrayList.
So the call als.size()
gives the current length of the
arrayList referred to by als
. Unlike a.length
where a
refers to an array, als.size()
may change the value it returns
without als
changing the object it refers to, for example, it
will be one greater after the call als.add(e)
to what it was
before. Where array objects are always created of a fixed length,
arrayLists objects are generally created with initial length 0, which
is increased as elements are added to them. The call
new ArrayList<t>()
creates a new arrayList to
store elements of type t
, initially storing 0 elements.
There is not the direct equivalent of creating an array of a particular
length with its components initially set to null
, as with
new t[n]
.
One restriction on arrayLists which does not apply to arrays is that the
base type must be an object type. This is any type (including String
)
except the primitive numerical types (int
, double
and the others representing numerical values stored in extra or less space),
and char
and boolean
. Java has wrapper type
which match the primitive types, Integer
matching int
,
Character
matching char
, and the others the same
word as the primitive type except the initial letter is upper case.
These should be used instead of the primitive types, so an arrayList of
integers is given by the type ArrayList<Integer>
.
In older versions of Java, to convert a value from a primitive type to its
equivalent in the wrapper type, or vice versa, required calling special methods.
In Java 5, it is done automatically every time a value of a primitive type
is used where a value of the equivalent wrapper type is needed, or vice versa.
So you can almost treat Integer
as if it was just another
way of writing int
and so on. One place where you can't is
that als.remove(i)
where i
is of type
int
is taken to mean removing the element at the position
indexed by i
from the arrayList referenced by als
,
rather than removing an element equal to i
. If als
is of type ArrayList<Integer>
and you want to remove the
number i
from wherever it is stored in als
,
you will need to do that by declaring i
to be of type
Integer
when using it in als.remove(i)
.
We saw some examples of methods which worked with arrayLists, noting the
distinction between methods which work destructively and methods which
work constructively. As arrayList object may have their size changed,
unlike with arrays, destructive methods are not limited to those which
keep the size of the collection unchanged. We saw that, for example,
a method to perform some operation on an arrayList of integers would
look exactly like a method to perform the same operation on arrayLists
of strings, except that everywhere the word String
occurred
in one, the word Integer
would occur in the other. We might
want to perform the same operation on arrayLists with other base types,
does this mean we have to have a different version of the method for
every base type we might want to use? The answer is "no", because Java 5
introduces the idea of type parameters to methods.
In a method signature, if there are type parameters they are given between angled brackets, separated by commas if there are more than one, before the return type of the method. Then this variable name can be used as a type name in the rest of the method. When the method is called, the value of the type parameter is not given as a separate argument, instead it is worked out from the types of the arguments in the method call. For example,
static <T> ArrayList<T> replace(ArrayList<T> a,T m,T n)is the signature of a static method which is intended to take an arrayList of any base type and two values of that base type, and return an arrayList of the same base type where all occurences of the first value are replaced by the second. Here
T
is the type variable,
and may be used further in the body of the method. When the method call
replace(als,p,q)
is made, the method will be executed
with T
actually meaning String
if
als
is of type ArrayList<String>
and
p
and q
are of type String
, and
with T
actually meaning Integer
if
als
is of type ArrayList<Integer>
and
p
and q
are of type Integer
.
String
which is provided
as part of the Java language. Although in some ways, String
has special treatment in Java, in others it can be regarded as just
another class. An object of type String
in Java is an
indexed collection of value of type char
. Unlike some
other programming languages (C, for example) a string is not directly
equivalent to an array of characters. A string should be considered an
object, which like other objects can have methods called on it. Strings
are an example of immutable objects, meaning there are no methods
which can be called on them which change the object itself. The methods
that can be called on a string object either provide some information about
the string without changing it, or represent a change but do it
constructively, that is by returning a new string object which
is like the string object with the required changes, but the object the
method is called on is not changed. For example, the method
toUpperCase
can be thought of as changing all lower cases
letters in a string to their upper case equivalent, but it does so
by creating a new string. So if we have variable str1
of
type String
set to "Hello World"
, then the
statement str2=str1.toUpperCase()
will cause the variable
str2
of type String
to refer to "HELLO WORLD"
,
but str1
still refers to "Hello World"
.
The method charAt
takes an int
value as its
argument and is called on a string and returns the character indexed by
the integer argument, with the usual Java convention that indexing starts
at 0. So with str1
set as above, if variable i
of type int
is set to 4
then str1.charAt(i)
will return 'o'
. As it is a method call, you should not
expect it to work like array indexing, so for example,
str1.charAt(1)='u'
will not change str1
to
"Hullo world"
, it will not be accepted by the Java compiler as
the Java language simply does not have assignment statements where the
assignment is to a method call.
The 0-argument method length
gives the length of a string,
so with the above, str1.length()
returns 11
.
Another important method is substring
which takes two arguments
of type int
and returns a new string consisting of
characters from the string it is called on starting with the one indexed by
the first argument and going up to but not including the one indexed by the
second argument. So str1.substring(1,7)
would return
"ello W
. There is also a one-argument version of
substring
which returns a new string starting with the character
from the argument position and going up to the end of the string it is called
on. So str1.substring(7)
returns "orld"
.
The class String
in the Java language provides a few other
methods which would commonly be used with strings, some were covered in
the lecture, but it is not important for this course that you should know
them all.
One feature of strings which makes them different from other Java objects
is that there is a way of writing literal values of string
objects, meaning a way of referring to a particular object value rather
than referring to it through a variable set to it. As we have already
seen, a string literal is the characters of the string, in order, between
two double-quote symbols. Java also treats the +
symbol
as a string append operations (do not assume from this that *
and -
can also be used as operator on strings, they can't).
The value str1+str2
is a new string consisting of all the
characters from str1
followed by all the characters from
str2
. Also, str+obj
where str
is
of type String
but obj
any object variable,
is treated as meaning str+obj.toString()
, while str+n
,
where n
is a numerical value, is treated by meaning the
string obtained from appending the string representation of n
to str
. Always be aware that a string consisting entirely of
digits is a different thing from the integers value it represents. For example,
"57"
is a string of length 2 whose first element is '5'
and whose second element is '7'
. It is a different thing from
the integer whose literal value is 57
but which is stored in the
computer as a binary value which makes no reference to 5
or
7
.
We noted that the wrapper classes which we considered last week in the
context of their use as base type for arrayLists also contain convenient
static methods relevant to their types. For example, the class
Character
has methods for testing whether individual
characters fall into various categories, so for instance
Character.isDigit(ch)
where ch
is a variable
of type char
returns true
if ch
is set to a numerical digit character, and false
otherwise.
Note that when we call a static method, unless we are calling it in the
class where it is defined, we have to call it attached to the class
name, which is what is being done here.
We looked at how we would program methods which operate on strings.
We cannot ourselves add new methods to the class String
,
so we have to code any additional operations we might want as static
methods. For example, Java provides a method in class String
to return the string obtained by replacing all occurrences of one
character by another. If str
is of type String
,
and ch1
and ch2
of type char
,
then str.replace(ch1,ch2)
gives the string obtained by
replacing all occurrences of ch1
by ch2
in
str
, but if we wanted to write a static method to do this,
the string would have to be an argument as well, so the call would
be replace(str,ch1,ch2)
.
We looked at methods which involve using a loop to go through the
characters in a string one at a time, this technique is known as
iteration. When the method has to return a new string,
the loop will also add characters to a string to build up the
new string. Note that str=str+ch
might be said to be
"adding character ch
to the back of string str
",
but what it is actually doing is creating a new string object and
setting the variable str
to refer to it, it is not
really changing an object.
Another technique that can be used for operations on strings is recursion.
It is often easy to express an operation on a string in terms of an
operation on a smaller string, often the string which consists of
all but the first character of the original string (given by
str.substring(1)
if str
is the original
string). Thinking this way translates directly to writing methods
which involve a call to the same method. The important thing to note
when considering how this actually works is that each method call
has its own environment. So when a method "calls itself",
new variables with the same names, but possibly different values, as
the variables in the previous method call are set up. A recursive method
must always have a base case which is a condition when it
does not make a further "call to itself". So what happens is that
computation keeps making calls until it reaches a base case, then does
further work when it returns to the old environments which are left
waiting. When there is no work left to be done on return from the
recursive call (that is when the return
statement returns
no more than the value of the recursive call), it is known as
tail recursion. Tail recursion can easily be changed into
iteration by changing the recursive call to a re-use of the existing variables
with new values, and adding a loop where the exit condition is what was
the base case.
To take consideration of these algorithm issues (iteration and
recursion) further, we considered Lisp lists. Although
this is not a type provided by Java, it is easy to write a class
which provides it, and this has been done for the purpose of this
course. As with previous classes you have been introduced to, at this
point you were shown the use of the class without considering what
goes inside it (the implementation). The class you have been given,
LispList
is a generic class, so just like ArrayList
which you have already seen, it is parameterised by giving it a base
class to provide a full type, so for example, LispList<Integer>
represents the type "Lisp list of integers".
The name "Lisp list" comes from an early programming language called
Lisp, which was intended for processing lists of data (hence its name,
from "List processing"). It was one of the earliest example of an
abstract programming language - start from the sort of computation you
want and see how you can get the computer to do it - as opposed to
the standard programming language which starts from what the electronics
of the computer can do and works upwards towards something more human oriented.
It was the forerunner to a class of programming languages known as the
functional languages. Although these languages have not obtained
widespread use, they have been influential in the development of more
practical programming languages, and many top universities have an
introductory course in functional programming, seeing it as a good way to
introduce some fundamental topics in programming.
Using the class LispList
enables us to get some of the
benefits other universities see in having a dedicated functional
programming course, while remaining in the Java programming framework.
There are just four operations which can take place on a Lisp list:
head
which returns the first item in the list,
tail
which returns a new list containing all the items
of the original list in the same order, but missing the first one,
cons
which takes an item as its argument, and returns
a new list which consists of all the items of the original list in the
same order, with the argument item added to the front, and
isEmpty
which returns a boolean saying whether the list
it is called on is empty. These can all be represented by Java methods.
In addition, there is a Java static method, empty
which
returns an object representing a list which contains zero elements.
You can call cons
on an empty list, but you cannot call
head
or tail
on it (in Java terms, attempting
to do so will cause an exception to be thrown). Note the relationship
will always hold that if c
is set to t.cons(h)
,
then c.head()
will be equal to h
and
c.tail()
will be equal to t
. Like strings,
Lisp lists are immutable: the operations cons
and
tail
return new Lisp list objects, they do not change the
object they are called on.
The type "Lisp List" is an example of an Abstract Data Type (often abbreviated to "ADT"), meaning a data type which is completely defined by the operations possible on it, not by how it is stored in the computer.
We saw some simple examples of methods which gave further operations on
Lisp lists. As with strings, these methods are static methods, as we are
not making any additions to the basic methods of the type. As with strings,
methods that may be described as making changes to a structure actually
create new objects representing the change, because as with strings, Lisp
lists are immutable, there is no operation which actually changes the object
it is called on. We saw that with various problems it is often possible
to come to an iterative solution by thinking of it one way, and a
recursive solution by thinking of it another.
Week 5: 6/7 February
We looked at examples of various methods which work on Lisp lists.
As Lisp lists are a recursive data type, it often makes sense to try and
think recursively when programming with them. For example, suppose we
want to change all occurrences of item M to item N in a Lisp list.
Recursive thinking involves noting that if we change all occurrences of M
to N in the tail of the list, then all we have to do to get the final
result is to put the head of the old list onto the front of the result, or
to put N on the front of the result if the head was M. The Lisp list operation
cons
works to put an item at the front of a list.
We saw how this actually works is that the work following the recursive
call in the method is left waiting to be done once the recursive call
has finished. So, with the previous example, as one recursive call makes
another recursive call, a trail of cons
calls is left waiting,
which are picked up as computation of each recursive call finishes and
a return is made to the recursive call which made it.
When we solve a Lisp list problem iteratively, the general process is
to construct a new list starting with the empty list and adding items to it using
cons
as we go through the argument list. Since as we go through
the argument list from front to back we add new items to the new list at the
front each time (as that is how cons
works) the result is often
a list almost like the one we want except in reverse. So we then need to
use the same technique to reverse this list: go through it one element at a
time, adding each element to a new list. We call this technique
stack and reverse. One way of thinking about this is to consider
the lists as stacks of building blocks, one on top of the other. The only
way we can manipulate them is to take an item off the top of the stack
(equivalent to calling tail
on a list variable then setting that
variable to refer to the result) and to put an item on top of the stack
(equivalent to calling cons
on a list variable, with the
item as the argument, then setting the list variable to the result),
and to look at the top item of a stack without changing it (equivalent
to calling head
on a list variable).
We looked at three different sort algorithms working on Lisp lists, and saw code to implement them.
The first was insertion sort. The idea is that we set up a new Lisp list, initially empty, then go through the list to be sorted taking each item in turn and inserting it in its ordered position in the new list. We can also express this recursively: to sort a list, sort its tail, then insert its head in ordered position into the result. Inserting an item into its correct position in an ordered list can be expressed iteratively: take items off the list, stacking them onto a temporary list, until the correct position is found. Then take items off the temporary list and stack them back onto the original list. The correct position to insert is either when the original list is empty or when the first (top) item is greater than the item being inserted. Alternatively, insertion can be thought of recursively. To insert an item into an empty list, or into a list whose first item is greater than the item being inserted, add that item to the front of the list. To insert an item into a list whose first item is less than the item being inserted, insert that item into the tail of the list, then put the original first item onto the front of the result.
Next, we looked at quick sort. This is an algorithm for sorting collections which can be described simply as follows: "take one item out of the collection (the "pivot"), and then divide the collection into two parts: all those items less than the pivot and all those items greater than the pivot; then sort those two collections, then join the two sorted collections together with the pivot in the middle". This is a recursive description, sorting is described in terms of sorting. The intention is that the two collections should be sorted using the same algorithm, so this is a complete description, nothing more is required except to note that the base case is when the collection being sorted has zero or one elements, in which case it can be returned as it is. Since the recursive calls always take as arguments collections of smaller size than the original argument, we know they will always be a step closer to the base case, so the recursion will always come to a halt.
A variant of quick sort avoids the need for a separate join operation, it works on the idea that sorting a list is a special case of sorting a list and joining the result to another list (the "accumulator"), with that other list set to the empty list. In this case, if the list being sorted is empty, return the accumulator. Otherwise, as before divide the list into a pivot and two parts, elements smaller than the pivot and elements greater than the pivot. Sort the greater list with the accumulator, join the pivot to the front of the result, then use this as the accumulator to sort the smaller list.
We also looked at an algorithm called merge sort. Like quick sort, merge sort works by dividing a collection into two parts, sorting each part, and joining the results together (with the base case being the collection having none or one elements). With quick sort, the work on comparing elements is done in the division into two parts, but with merge sort is done in joining the two sorted parts. With merge sort, any element can be put into either part so long as the result is two roughly equal parts. But joining the two sorted lists in merge sort involves merging, where whichever element is the smallest from either sorted list is taken off and put onto a new list, and this is done repeatedly until one of the sorted lists is empty, then the rest of the elements from the other sorted list are added in order. So the comparison of elements takes place in this merging.
The final thing noted was that recursion may sometimes be
mutual recursion. This is where rather than having a method
"call itself", it may call another method and that method may call the first
method. There could even be a chain of three or more method calls to have
a method "call itself" indirectly.
Week 6: 13/14 February
The Monday lecture slot was taken over by the first test for the course.
You can find a copy of the test on-line
here.
In the Tuesday lecture we started looking at implementing objects.
Before Tuesday we had only looked at using objects, we saw we could use the
object name as a type for variables, we could call methods on objects, and
we could create new objects using constructors. But we had not seen the
code that provides those objects.
In some cases, classes to provide objects are already provided as part of
the Java library. In others, we must write our own. The class
DrinksMachine
which we used in
week 1 is not provided as
part of the Java library. You were able to use it because it has already been
written, all you needed was the .class
file that the original
file DrinksMachine.java
compiled into.
A Java class is generally stored in a separate file with the same name
as the class with the addition of .java
to the end. It
starts with the word class
, followed by the name of the
class, followed by an opening {
, followed by the code which
implements the class, followed by a closing }
. In the case of
a class which uses classes from the Java library that are part of any package
except the java.lang
package, the file needs an import
statement before the class
to use them, but this is the only
thing which occurs outside the class definition. The name of the class
may be used as a type in other Java classes. If you have the class file
in the same directory as the file of code which uses that type, when you
compile the code which suses the type, the file which implements the type
will also be compiled automatically.
Variables which are declared in a Java class but not inside any methods
in that class, and which are not declared as static
,
are known as object variables or fields.
Every object of that class has its own variables of those names.
Methods which are declared in a class, and are not declared
as static
, are known as instance methods.
When an instance method is called it must be attached to an object.
The method may have its own variables, both those given as its parameters
and those declared inside it as local variables. Like the variables
in static methods, these exist only for as long as the method call is
executing, with the parameters being assigned to the matching argument
values when the method was called. But instance methods may also use
object variables from the class they are in, and in this case the object
variables when inuse in a method call are taken to be the object
variables of the object the method was called on. These object variables
remain in existence for as long as the object remains in existence. So the
call of one instance method may change the value of an object variable,
and that change will be observed when another method is called on the
same object.
Note, an instance method may be called inside the code for another instance method without it being given as attached to any object. In this case, it is taken to be attached to the object the enclosing method call is attached to.
The code for constructors is like the code for methods, it just looks like
a method with no name and the return type the same type as the class.
The code in constructors sets initial values to the object variables.
We saw that the method call this
in a constructor followed
by (
followed by some arguments followed by )
has the effect of using the code from another constructor in the same
class whose parameters match the arguments. It is possible to have more
than one constructor in a class so long as the number or types of the
arguments differs, we saw an example in the DrinksMachine
class.
It is also possible to have other cases of methods with the same name
but differing in number or types of arguments, though we didn't see an example
of this. More than one constructor or method of the same name is known
as overloading.
We ended up by going through question 1 of the mid-term test. The important
thing here is that the types in the method headings tell us how to fit
together method calls and variables. A method call must be made on a
variable of the class the method call is in. If a variable is assigned
to the result of a method call, the variable must have the same type as the
return type of the method call. Arguments to a method call must be variables
or literals of the same type as the matching parameter in the method
call signature, or the argument can itself be a method call whose return
type matches the type of the parameter. A method call may be made on
a method call so long as it is from the class of the return type of the
method call it is made on. If a method call is used as the test in an
if
or while
statement, it must have return type
boolean
. If a method call is used as part of an arithmetic
expression, it must have a numercial return type, such as int
.
Week 7: 20/21 February
We considered how we would implement a type like ArrayList
if it wasn't already provided by Java. We would need a class which
has some fields to hold the data representing the arrayList object,
and public methods to represent the operations. A simple class
to implement an array-like object might just have an array inside
it, with methods get
and set
to
return the value of the cell of the array at a particular index, and to
change the value to another one, respectively. The constructor for
this class would just take an integer argument and create an array
of its length. So what we have is a layer around an array which enables
it to be manipulated through methods like a standard Java object.
This was our first example of a generic class. A generic class
is defined by following the class name (after the word class
)
in the header of the class by the type variables enclosed in angled
brackets. The rest of the class is then written as if those
variables were actual types. So, for example if a class is headed
class MyClass <T>
, we can create actual
objects of type MyClass<Integer>
,
MyClass<String>
, in fact MyClass<...>
with ...
replaced by any type (including a type
which itself has type arguments). Then when the methods for object
of that class are executed, T
will be taken as meaning
whatever type the ...
is. One problem is that Java does
not allow you to create an array directly using a type variable. So
we cannot have new T[n]
, where T
is a
type variable and n
is an integer value. Instead, we have to
use (T[]) new Object[n]
, which achieves the same
thing. As Java is set up at present, when this is compiled it will give
a warning about "unsafe" code, but you can ignore that.
Java's ArrayList
objects, however, do not behave exactly
like arrays. They have methods which enable them to grow or shrink in
size by adding or removing items. ArrayLists are created of initial
length 0, and then grow in size as items are added using the
add
method. They do not have the equivalent of the array
constructor where an array is constructed of an initial length with
all its cells initialised to null
. In order to simulate
an arrayList growing or shrinking in our own class, the variable inside
the object referring to the array could be changed to refer to a
new array representing the addition or removal of an item. Although
the new array is created constructively, because the variable inside
the object is changed to refer to it, the object itself is changed
destructively.
A better way of dealing with this is to use an array and count technique to represent the arrayList object. This means that the array inside the object is of fixed length, but only part of it is used to store the data. There is a separate variable which holds a count of the number of cells in the array currently counted as data of the arrayList. When a new item is added to the end of the arrayList, it is put in the next unused cell in the array representing it, and the count variable is increased by one. The other methods which appear to change the size of the arrayList are actually managed by moving items up and down the array as required, and changing the value of the count variable. This means there does not have to be a copying of every item into a new array every time the arrayList changes size.
A modification to the array and count technique overcomes the problem
that we might increase the size of the arrayList beyond the size of
the array used to represent it. The modification is to change the array
to a new much bigger one if the size of the arrayList reaches the size
of the array. So the array is replaced at times, but not every time
the arrayList changes size. This is how Java actually implements
ArrayList
.
Note that if we had some code which used our own class for arrayLists, and we changed the internal representation in that class from an array the size of the arrayList to an array and count, we would not have to change the code. That is because a clear distinction has been made between the way the arrayList object operate in terms of the public methods which manipulate them, and the way they are represented in terms of the private variables used to implement them, which are manipulated by these public methods. This uses the principle known as information hiding. The idea is that information in some parts of the program is hidden from other parts. This is a good principle because we know if one part of a program cannot access data in another part, it cannot rely on it, so we can safely make changes to that data without causing problems. The fewer ways two parts of a program can interact, the easier it is to keep track of the interactions and ensure nothing done in one part of the program causes something unexpected to happen in the other.
We also looked at sorting algorithms on arrays and arrayLists. One algorithm was insertion sort. A simple version of this algorithm creates a new arrayList, initially empty, and goes through each of the items in the original arrayList inserying them into the new arrayList in order. A method is needed which takes an arrayList with its items assumed to be stored in order, and takes another item, and has the effect of adding that item to the arrayList in the correct place to keep it in order. So the insertion method goes through the items in the arrayList to which the new item is being added, until it reaches the end or the first item greater than the item being inerted, it then inserts the item at that position.
A variation of insertion sort does not keep a separate sorted arrayList. Instead, it has part of the arrayList sorted and part unsorted. At each step in the sorting, the first item from the unsorted part is put in order in the sorted part. This is done by moving items in the sorted part up one place, covering the initial position of the item currently being inserted, until the correct place for the item being inserted is found. So each step in the sorting increases the size of the part of the arrayList being sorted until the whole arrayList is sorted. As this never changes the size of the arrayList, the same algorithm may be used with arrays as well.
A similar algorithm is selection sort. Again, it works by having the array divided into a sorted and unsorted part, with each step in the algorithm increasing the size of the sorted part and decreasing the size of the unsorted part until the whole array is sorted. With selection sort, the lowest item in the unsorted part is found and is swapped in position with the first item in the unsorted part. Since the sorted part will consist of those items previously picked out as being the lowest, the next item picked out will be the next lowest, and so is in the correct order for the final result. So the consequence is that the sorted part can be taken to have been increased by one in size, and the unsorted part decreased by one.
Finally, we looked at solutions to question 2 of the mid-term test.
It was noted that a destructive method does not return a new
value, but changes an object passed as a parameter, so its return type
is void
. A constructive method constructs a new
object, so its return type is the type of the object constructed, in
this case int[]
. The basic loop to move items one place
up an array referenced by variable a
, doing it destructively is:
for(int i=a.length-1; i>0; i--) a[i]=a[i-1];The original last item of the array is saved in a variable before this loop so it can be put in the first cell after it. Note, you have to go through the loop from the highest indexed position downwards. If the loop went the other way:
for(int i=1; i<a.length; i++) a[i]=a[i-1];the result would be to copy
a[0]
into a[1]
,
then to copy a[1]
into a[2]
but a[1]
now holds a copy of a[0]
so a[0]
is copied into
a[2]
as well, and in fact the original value of
a[0]
will be copied into every cell.
We saw a variation on insertion sort where instead of setting up a new array or arrayList to hold the sorted collection, items were moved around the array or arrayList being sorted, with it remaining of a fixed size. The technique was to consider the array being divided into a sorted part and an unsorted part, with the sorted part initially being just the first element. Then, repeatedly, the first element in the unsorted part is placed into the sorted part by moving elements up one place in the sorted part (covering the initial position of this element) until its place in order in the sorted part is found. Each time this is done, the sorted part increases in size by one element, and the unsorted part decreases in size by one element. So eventually the whole array gets sorted.
A similar algorithm is called selection sort. Again, it works on the basis of the array being divided into a sorted and an unsorted part, with the sorted part growing in size by one and the unsorted part decreasing in size by one in each iterative step of the algorithm until the whole array is sorted. With selection sort, at each step the lowest item in the unsorted part of the array is swapped in position with the item at the first position in the unsorted part of the array, and this results in the sorted part growing by one as this swapped item must now be in its correct final position.
Destructive merge sort still requires extra arrays to copy the contents of the original array into (two of half the size of the original array, but the recursive sort on a half-sized array will require two more arrays and so on). However, to make it destructive, the sorted elements from the two arrays are copied back into the space of the original array when they are merged.
Destructive quick sort can be done without any extra storage. Instead of creating two new arrays, one of elements less than the pivot, the other of elements gretaer than the pivot, an index is started at each end of the array, and the two indexes changed to become closer to each other, with elements at the position they refer to swapped in place in the array every time the element at the position given by the upper index is less than the pivot and the element at the position given by the lower index is greater than the pivot. The separate portions of the array, now holding all elements less than the pivot in one, and all elements greater than the pivot in the other, are sorted recursively (that is, the same algorithm is applied, only to those sections).
We noted that the Java language already has static methods to sort arrays
and arrayLists
provided in its library in the classes Arrays
and
Collections
respectively. We learn how these
algorithms work in this course, not because we would need to write the code
for them in practice (it is always better to use library code if suitable
library code exists, rather than write our own code), but because it serves
as a good introduction to the topic of algorithms. We have seen that algorithms
may be expressed recursively or iteratively, that algorithms to change a
structure may be expressed as working constructively or destructively, and
that there may be more than one algorithm for a particular problem.
We looked at algorithm efficiency in terms of time taken to complete an operation. Experimenting with running the code for the sort algorithms will show that whatever type of elements is being sorted and on whatever computer, quick sort and merge sort will always complete a sorting task quicker than insertion sort and selection sort, with a huge difference in time taken once we apply them to collections of thousands of elements. In order to consider this further, we first looked at a simpler problem than sorting, the problem of searching, that is checking whether a particular element occurs in some collection, possibly giving its position in the collection.
If the collection we are searching is not in any order, then we have no choice but to look at each of its elements in turn until we have either found the one we are searching for, or we have looked at every one and not found it. Looking at each element in turn like this is called linear search. If the collection is ordered and we use linear search, if we come across an item which is greater than the one we are searching for, we know the rest must be even greater, so we can halt the search and report the item not found without looking through the other items. However, if the collection is ordered, a much better way of searching is to use binary search. With binary search, we look at the middle item first. If it is not the item we are searching for, we can then restrict our search either to the items positioned before it or to the items positioned after it in the array, depending on whether the middle item is greater than or less than the item we are searching for.
So, consider how linear search works. At each step in the algorithm it
reduces the size of the data it is searching by one. If the search is
of an array of size N
, at most it will look at every item in
the array and check it to see if it is equal to the one being searched for.
On average, assuming the items in the array are a representative sample of
all the possible items, it will make N
/2 checks. With binary
search, each step in the array makes one check of an item being looked at
against the item being searched for, and if they are not equal, cuts the
amount of data being searched by half. It must end when there is just one data
item being searched. So the total number of checks for an array of size
N
is at most the total number of times N
can be
cut in half. In mathematical terminology, this is the logarithm
to the base 2 of N
, written "log2N
".
We say that linear search is "of order N
" and binary search
is "of order log N
". This is conventionally indicated by
O(N
) and O(log N
) respectively.
The order of an algorithm is an estimate of the number of operations
is takes to complete expressed as a function of the size of the problem
(such as the length of the array being processed). The estimate ignores
constant mutiplying factors, and considers only the highest term. This
is known as the big-O notation due to this convention of indicating
it using the capital O letter.
With selection sort, it can be seen that with an array
of length N
, on the first step in the algorithm N
-1
comparisons of one item against another are done to find the lowest item in
the unsorted section, on the second step N
-2, and so on down
to the last step where just one comparison is made to find the lowest of
the last two items. With insertion sort, 1 comparison is made on the first
step to insert the first item from the unsorted part into the sorted part
(consisting of the first two items). Then, up to 2 comparisons to insert
the next item into the sorted part, and so on until up to N
-1
comparisons are made to insert the last item into its correct position to
make the whole array sorted. We know that the sequence
(N
-1)+(N
-2)+...2+1 adds up to
N
2/2. This means we categorise both selection
sort and insertion sort as O(N
2).
With merge sort, the array is divided into two halves, which are sorted.
Then merging the two halves into one takes approximately N
comparisons for an array of length N
. But each of the two
arrays, of length N
/2, was itself produced by merging
two arrays of length N
/4. In total, that's another
approximately N
comparisons. At the next level down, there's
eight arrays of length N
/8 merged to form four arrays of
length N
/4. We continue down until the arrays have been
halved until they are of length 1. So there are approximately N
comparisons multiplied by the number of times N
can be
halved, which as we already said is log2N
. So
we say that merge sort is O(N
log N
).
Note we don't give the base of the logarithm in big-O notation, because the
logarithm of any number in one base is the logarithm of the same number
in another base multiplied by a constant factor (the logarithm of the first
base in the second base).
By a similar argument, quick sort can also be categorised as
O(N
log N
), except that
the N
comparisons at each level are made as the arrays are
divided into two, and not as they are joined. We noted that quick sort
has a worst case: when applied to an array which is already sorted
and the first element is used as the pivot, it partitions into an empty
array and one of size one less than the original array, with the same
happening when that second array is partitioned and so on. So the
result is that on each step of recursion the size of the problem is reduced
by one, not halved, which means that quick sort when considering the
worst case is O(N
2).
We finished by considering question 3 of the mid-term test. One point to note
is that it asked for generic code, which implies a type variable. A type
variable is surrounded by angled brackets, as in <T>
, when
it is initially declared and then only when it is combined with
a generic type, as in ArrayList<T>
. Angled brackets are
not used in any other circumstances. It was also noted that when a question
asks you to write a method which "takes ... and returns ...", what it
takes becomes parameters of the appropriate types in the method heading,
and what it returns becomes the return type. You should never interpret
such a question as asking you to write code that prompts a human user to
type something on the screen, or which prints anything to the screen.
Also, you will be given an English description of the problem, and usually
an example which illustrates it. But when, for example, the question says
"takes an integer" and illustrates it by an example where that integer is 3,
that does not mean you are being asked to write code that deals with
the integer 3, your code should deal with any integer as given as an
argument to the method call not just the one that happened to have been given
in the example.
In part a) of question 3, to get the reversing effect, the best thing was
to go through the original arrayList in reverse order, so using a loop
variable initialised to one less than the size of the arrayList and then
reducing it by one each time round the loop. Then a.add(n)
with
adds the argument n
to the end of the arrayList a
.
Alternatively, a.add(0,n)
adds n
to the beginning
of a
if you did use a loop which went through the original array
in normal order.
In part b) of question 3, you needed to note that the loop
for(; j<a.size(); j++) if(a.get(j).equals(thing) break;ends either when
j
has reached a.size()
or when
j
indexes a cell in a
whose contents are
equal to thing
, where previous code makes
thing
the item stored at
position i
in a
, and j
start the
loop with value i+1
. So if the loop exits with j
equal to a.size()
it must be because no duplicate of the item at
position i
has been found later in the arrayList a
.
In this case, a count held in variable n
is increased by 1. That
is what the code
if(j==a.size()) n++;does. All this is in a loop which goes through the entire arrayList with
i
as the index of each item in turn.
So the count is increased for each item where there is no later occurrence
of the item in the arrayList (including the last occurrence of a repeated
item). Hence the effect of the method is to return a count of the number
of items in the arrayList argument, ignoring duplicates. Note that when
you are asked what value a method returns, what is required is a
description in terms of the arguments to the method only, not a description in
terms of variables internal to the method.
interface
types, followed by
bounded type variables. The first two of these are topics
you will also have covered in your
object-oriented programming course. These are such an important topics,
but ones that can be quite difficult to grasp properly, that it makes sense
for them to be covered in two different ways with different sets of examples.
The whole issue of type variables and generic classes isn't being covered
in the object-oriented programming course. That is because they were
introduced only recently in the
latest revision of Java,
whereas the decision in the object-oriented programming course has been to use
the version of Java before type variables were introduced. So note in
that course there is an introduction to Java's Collections framework in its
week 9,
but described using the "raw" collection types, in which generic types
store elements only of type Object
, and type-casting has
to be used when elements are retrieved from collections to restore a
view of their original type. It is a useful contrast for you to see
both approaches.
Inheritance describes a new class in terms of its difference from another
class. The key word extends
is used after the class
name in the header to the class, followed by the name of the class it
changes, to indicate this. The new class only gives those aspects where
it differs from the class it extends. It may add new variables and methods
to that class, or it may change the behaviour of existing methods. You
can use methods from the class which is being extended (which is referred
to as the superclass) on objects of the new class (which is
referred to as the subclass) even if those methods are not mentioned in
the code for the sub class, and they will operate as they did in the
superclass. If there is code in the subclass for a method with the same
name and number and types as one in the superclass, that method is said to be
overridden, and when a method call is made on objects of the subclass,
this new code will be used, so the effect is to change the behaviour of the
method. Objects of the subclass can be assumed to
have the internal variables of the superclass within them, but those variables
can only be accessed in the code of the subclass if they have been declared
as protected
rather than private
.
An important point is that a variable which is declared as of the type of the superclass may be set to refer to an object of the subclass, including variables which are parameters to methods. This means that any use of a variable of an object type has an apparent type (the type the variable is declared as having), and an actual type (the type of the constructor used when the object it is referring to was created), which may not be the same. Methods that are called on a variable must be from its apparent type (that includes from classes which are superclasses of the apparent type). However, if a method which has been overridden in the actual class is used on a variable whose apparent type is the superclass, the code from the subclass is used to execute the method call. This behaviour is known as late binding or dynamic binding, meaning that the decision on what actual code is used to execute a method call is made when the code is executed and not when it is compiled (since it is not always possible to tell at compile time what the actual type of a variable will be when the code is executed).
A variable of a subclass may not be directly assigned the value of a variable
of a superclass, even if it is known the actual type of that value will be
the subclass. To do such an assignment, it is necessary to use
casting, which means putting the name of the subclass in round brackets
before the value. Java has an operator, instanceof
,
which enables you to test whether a variable refers to a value which is
of a subclass of its apparent type.
The usefulness of the mechanisms described above is that it enables you
to write generalised code. If a method only relies on operations from
a superclass, you can write it with parameters declared of the type of
the superclass, then the same method will work for objects of any of
its subclasses, you do not have to write separate versions of the method
for every possible subclass. Taking this further, a Java interface
enables you to define a type solely as something on which a particular
method or set of methods may be called. A Java interface is written like
a Java class, but methods are given only in terms of their signatures.
The idea is that the interface is a superclass given in order that generalised
code can be written, but variables of that superclass will always have an
actual type of one of its subclasses. The same rules for superclasses and
subclasses apply, except that a class is decalred as implements
an interface rather than extends
, and it must have methods with
full code for all the methods named in the interface. Also, a class can
only extend one class, but it can implement any number of interfaces.
So, if we have class Sub extends Super
where
we also have class Super
, or we have
class Sub implements Super
where we also have
interface Super
, a variable of type Super
can always refer to an object of type Sub
. However, Java
does not also allow a variable of type ArrayList<Super>
to refer to an object of type ArrayList<Sub>
, and the
same for any generic type with the type variables set to Super
and Sub
. The reason for this is that an object of type
ArrayList<Super>
could contain elements of type
Sub
, but could also contain elements of other subclasses
of Super
, whereas an object of type ArrayList<Sub>
could only contain elements of type Sub
. So there are operations
you can do with the former that you can't do with the latter, so it is not
a proper superclass-subclass relation. This means you cannot write
generalised code with parameters of type ArrayList<Super>
and expect it to deal with matching arguments of type
ArrayList<Sub>
, or of an arrayList of any other
subclass of Super
. This would seem to weaken our ability to
write generalised code, but Java provides a way round it. If we declare the
parameter as of type ArrayList<T>
, we can then state that
T
must be set to Super
or to a subclass of
Super
(such as Sub
) by bounding the type
variable T
when it is declared. This is done by writing the
declaration of the type variable as
<T extends Super>
instead of just <T>
. Doing this also means that a
variable declared as of type T
in the method may have
methods from the class Super
called on it, and may be
passed as an argument in method calls requiring arguments of type
Super
.
We then noted that the Java library has many interface types built into
it, one of the most important being Comparable
.
You can see this documented in the official Java documentation
here.
It is actually a generic type, so it takes type variable T
,
and is defined as having a single method with signature
int compareTo(T o)
. Any class Thing
(where Thing
could be any class name) which is declared
as implements Comparable<Thing>
must have a
method with signature int compareTo(Thing t)
in it.
This method establishes the natural order for items of type
Thing
. If t1
and t2
are two
variables referring to objects of type Thing
, then
t1.compareTo(t2)
returns a negative integer if t1
is considered to be less than t2
, a positive integer if
t1
is considered to be greater than t2
and
0 if they are considered to be equal in this ordering. For example,
the natural ordering of objects of type String
is their
alphabetic ordering. Java's built-in sort methods, which you can find in
class Collections
documented
here
and in class Arrays
documented
here,
use this natural ordering to sort data structures.
So if you have written your own class, and you want to be able to sort it using
Java's built-in sorting, you can do so by declaring it as implementing
Comparable<C>
, where C
is the name of the
class repeated, and writing a compareTo
method
to give it the ordering you want. If you are writing generic code for your own
algorithm, and that code relies on the items it is working with having
an ordering, you need to define the code as working on objects of
a type variable T
declared as
<T extends Comparable<T>>
, and use
compareTo
in the code to compare the items in order. We saw
this done when we wrote our own sort algorithms, and we saw our own made-up
class ScrabbleWord
written to represent words with a natural
ordering given by their score in the game
Scrabble.
One complication is that a more general way of writing generic code
which relies on ordering is to declare the type variable as
<T extends Comparable<? super T>>
.
This means that T
may actually be a subclass of some other
class S
where S
is declared as
implements Comparable<S>
. That is, it allows
for the possibility of working with a class of items ordered not by their
own compareTo
method, but by a compareTo
method
inherited from a superclass.
Week 10: 13/14 March
We noted the difference between implementation and
application of an abstract data type. The application is the
code which uses objects of that type. This code views objects of that
type purely in terms of the public methods the class for that type
provides, with the assumption that they work according to a predefined
set of rules. For example, if we have variable t
of type
Thing
, and variables list1
and list2
of type LispList<Thing>
, we know that if we execute
list2=list1.cons(t)
then list2
will be set
to an object such that list2.head()
will return an object
equal to t
and list2.tail()
will return an
object equal to list1
and list2.isEmpty()
will
return false
. We do not have to consider the code that is
executed when these methods are called, we can just assume that it always
functions in this way.
The implementation of an abstract data type is the code within the
class for that type which actually makes the methods work as they should.
There will be a set of variables, declared as private
so
they cannot be interfered with directly by code outside the class, which
together form a data structure which represents objects of the
abstract data type. The methods will have code which manipulates the
data structure to represent destructive changes, or creates a new data
structure and encloses it inside a new object of the type to represent
a constructive method.
We first considered using an array as the data structure representing the
abstract data type LispList
. Elements are stored in the
array in the same order as in the corresponding Lisp list. The operations
tail
and cons
return a new LispList
object, so they are implemented by setting up new arrays of size one
less and one greater respectively than the size of the array in the
object the methods are called on. Then items are copied into the new array
from the old into their correct positions (one position lower for
tail
with the item at position 0 not being copied, one position
higher for cons
with position 0 holding the new item). The
new array is incorporated into a new LispList
object via
a constructor. The constructor is declared as private
, so it is
not possible to use it outside the class LispList
to create
new LispList
objects, it is used only to create new
LispList
objects to be returned by the public methods
tail
, cons
and also empty
(which
returns an object representing the empty list, with an internal array of
length 0).
Next we considered an entirely different implementation of LispList
using a data structure known as a linked list. A linked list is made
up by linking together "cells", which are objects of a recursive class.
A recursive class is a class which has a variable inside it of its own type.
A linked list is composed of objects of the simplest sort of recursive type,
where each cell object has one variable holding a data item of its base type,
and one recursive variable, linking to another cell. The chain formed can
come to an end because the recursive variable can always be set to
null
(another way it can come to an end, which we did not
consider, is for the recursive variable to be set to refer to a cell
earlier in the list, forming a circular list).
We used diagrams to represent this structure called
cells and pointers diagrams. A cell object is represented
by a box with two parts, one holding the data of the cell, the other
holding a reference to a further cell. A reference to a cell is indicated
by an arrow leaving from the part of the box representing the reference
(or a single one-part box if the reference is in a separate variable which is
not part of a cell) and pointing to the whole two-part box representing the
cell being referred to. Since a variable may only refer to one object,
there can only be one arrow leaving a particular box, but since
several variables may alias one object, there may be any number of
arrows leading to a particular box. There is no concept of the arrows
pointing to a particular part of a box. A box with a diagonal line through
it rather than an arrow leaving from it represents a variable set to
null
.
The operation to add a new item to the front of a linked list is done by creating
a new cell whose data variable holds the item, and whose recursive field
is set to the linked list. This is used to implement the cons
operation of Lisp lists, making a reference to the new cell inside a
new LispList
object to make cons
constructive.
The tail
operation is implemented by a method which creates
a new LispList
object whose cell variable is set to the
value of the cell variable of the linked list of the LispList
object it is called on, thus making it an alias of the second cell.
The result is that although Lisp List objects implemented in this way
may share cells, there is no way a cell reached from one can be changed by
an operation on another Lisp List objects, since all operations work by
creating new objects, not by changing existing ones.
We also saw how the type ArrayList
could be implemented using
a linked list data structure, rather than the array or array and count
data structure we considered in week 7.
Since the type ArrayList
has destructive methods, we do need
to alter the linked lists inside them destructively. However, this will
not cause a problem if the cells of the linked list are not shared by any
other object.
To destructively remove a cell from a linked list, we need a variable
(referred to as a pointer) which refers to (or "points to") the
previous cell in the linked list. We then set this cell's recursive variable
to the value of the recursive variable of the cell it refers to, so
changing it to refer to the next cell in the list. This implements the
remove
operation. Adding a new cell to implement the
add
operation
again involves having a pointer to the previous cell, but in this case
setting the recursive variable of the cell it points to to a new cell
which holds the data item to be added and whose recursive variable holds
the previous value of the recursive variable which now points to this new
cell. Destructively changing the value of cell to implement the
set
operation involves having a pointer to the cell being
changed and using this to change the data variable of that cell.
In implementing these operations, a common pattern is to use a loop
with a pointer initially at the first cell in the linked list, and with
the update moving it one cell down, possibly also with a count variable
holding the number of links moved down. The common operation of adding a
new element to the end of an arrayList (single argument add
)
is implemented, when the arrayList is implemented by a linked list,
by moving the pointer down the list until it points to the last cell,
the one whose recursive variable is set to null
, and then setting that
recursive variable to refer to a new cell holding the data item to be added
and a recursive variable set to null
.
The array implementation of ArrayList
has the advantage
that the get
and set
operations work
very efficiently, as access to an indexed position is gained in one
step. However, it is less efficient with the add
and
remove
operations for an indexed item, as this involves
moving all following items up or down the array. In this case, a
linked list implementation may be more efficient as it is easier to
add or remove a cell from a linked list. This shows how there is not
always one best way of implementing an abstract data type, sometimes one
way is best if one sort of operation is more commonly used, another way is
best if another sort of operation is the one used more often. This is
recognised in the Java code library, where there is not only the type
ArrayList
, which we have considered before, but also the
type LinkedList
, which has official documentation
here.
The type LinkedList
has similar operations to the type
ArrayList
, but its implementation underneath is with a
linked list rather than an array and count. The methods they have in
common are shown by them both implementing Java library interfaces
such as List
documented
here
which give those methods.
Week 11: 20/21 March
We continued looking at ways of implementing the abstract data type
ArrayList
. This abstract data type has operations on it
which take an integer argument representing a position in the
collection. If such an argument is set to a value beyond the
current size of the collection, an exception should be thrown.
With just a linked list as the data structure representing an
arrayList, we would only find if a position value is greater than
the size by moving a pointer down the linked list until it has
reached the end, and keeping a count of the number of links followed
which has not reached the position required when the end of the linked
list is reached. To give correct Java behaviour, at this point a
new exception should be created and thrown. A Java method which is
called when it "can't be done" should always throw an exception
of an appropriate type, it should not cause an error message
to be printed, or throw an exception of a type which does not indicate
what the problem was. The appropriate type of exception to be thrown
when an index value is greater than the size of the collection it is
indexing is IndexOutOfBoundsException
(rather than,
say NullPointerException
, which would be thrown if there
was no check the end of the linked list had been reached before the
required position).
If the data structure used to represent an arrayList was a linked list and a separate variable giving the number of cells in the linked list, when an operation involving an index is called and the index is beyond the size of the collection, an exception can immediately be thrown after comparing it with the value in the size variable, rather than only finding out after the whole list has been gone through. So although the size variable does not give any additional information that could not be worked out (by sending a pointer down the linked list until it reaches the end, and counting the number of links passed), it improves the efficiency of the operation in time terms. The cost is the additonal storage required for each arrayList object to store the size variable. With data structures used to represent abstract data types, there often is this tradeoff between efficiency in time and efficiency in storage: to save time we must use more storage.
Another way of representing an arrayList, which we saw, has an additional
variable in the object of type ArrayList
which holds
a pointer directly to the last cell in the linked list representing the
arrayList. The advantage of this representation is that a common
operation on arrayLists is to add a new item to the end of the list.
With a pointer to the last cell of the linked list, this can be done in
the two steps of creating a new cell following from the cell this back
pointer points to, then moving the back pointer to point to this new
cell. This is more efficient in time than starting a pointer at the
first cell in the linked list and moving it down one cell at a time
until it reaches the last cell.
With these extra variables in the representation of arrayLists, care
needs to be taken to ensure they are always correctly updated in
all operations. Whatever happens, the back point must always be left
pointing to the last cell in the linked list (or set to null
in the representation of an empty collection) and the size variable
must always be left having the value of the number of items stored
in the collection.
Finally, we looked at a representation of arrayLists using doubly-linked lists. This uses cells like the linked lists we have been considering, but each cell has two recursive variables, not one. The second recursive variable is always set to point to the cell whose first recursive variable points to it. In other words, there is one chain of links pointing one direction in the list, and another pointing in the reverse direction. This is an improvement on just having one pointer in the arrayList object to the back cell, since if the back cell is removed that back pointer can be reset in one step rather than by following down the entire linked list to get to the new back cell. Operations to reach a particular cell in the list can be implemented by starting at whichever end of the doubly-linked list is closest and moving either backwards or forwards in the list. Adding or removing a cell from the linked list involves resetting the links so that when the operation is finished, the correct doubly-linked property is maintained. However, only the links surrounding the cell which is to be added or removed have to be changed. The rest of the data structrue can be left unchanged.
As noted last week, the Java library has two classes which appear
to operate very similarly, ArrayList
and
LinkedList
. The main difference is the representation
underneath, with ArrayList
using an array-based
representation, and LinkedList
using a linked list
representation, which is actually a doubly-linked list. These two
classes are linked by both implementing the interface type
List
. As a result, if we write methods whose
arguments are given as of type List
, the code will work
for objects which are of type ArrayList
or
LinkedList
. In most cases, the only time we need
refer to these types rather than the more general interface type is
when we actually create new objects.
We noted that these types form just part of the Collections
Framework in Java, and gave brief consideration to other
types provided in this framework. The main property of
List
objects (whose actual types will be one of the
types which implements the List
interface) is that they are
indexable: they have the concept of their contents being stored
at particular locations given by an integer value index in the range
from 0 to one less than the size of the collection. The methods given
in the List
interface include those necessary for an indexable
collection. Another interface type in the collections framework is
Set
, again a generic type so it must be combined with
a base type or type variable. A set does not have the concept of its elements
occurring at particular locations, nor does it have the concept of
duplicates of elements being stored. So the operation to add a new
item to a set object is defined as not changing that set object if
an item equal to it is already contained in it.
We noted that the Java Collections framework assumes the existence
of a method equals
which is called on an
object, with one argument, and returns a boolean. For example,
if we attempt to add an object referred to through the variable
obj1
to a set, it will not be added if there is another
object in the set, which if referred to by variable obj2
has the property that obj1.equals(obj2)
returns true
.
The equals
method is also assumed in the operation
remove
for List
objects, where if
list
is of type List<Thing>
, and
obj1
is of type Thing
, then
list.remove(obj1)
will remove an item from the
collection referred to by list
if when obj2
is a reference to that object, obj1.equals(obj2)
returns
true
.
Since all classes in Java which are not declared as extending other
classes implcitly extend the most general class,
Object
, they inherit the equals
method
from the class Object
unless they are given their
own equals
method to override it. The default
equals
method inherited from object
works as an
alias-tester: obj1.equals(obj2)
returns true
if and only if obj1
and obj2
are aliases:
that is, both variables refer to the same object. The
==
operation always works this way,
so whatever types obj1
and obj2
are,
obj1==obj2
will return true
if and only
if they refer to the same object. In some cases, however, we may
wish two different objects to be treated as being equal, for example
if they are two different objects but the variables inside one of them all
have equal values to the corresponding variable inside the other.
To achieve that, we need to write our own equals
method
to go in the class of these objects. We saw an example to go in our
class LispList
. Note, the argument type of this
equals
method must be Object
rather than
the class it appears in, in order for it to correctly override the
default equals
and also ensure obj1.equals(obj2)
works for any obj1
and obj2
(returning
false
if they are of different types). Then the
instanceof
operator is used to check that the argument
is of the correct type, and type casting is used to view it as being
of this type. Java's dynamic binding mechanism means that when
equals
is called in the implementation of the collection
framework methods, it will be the equals
method from the
actual class of the object it is called on.
We noted that the Set
interface type is implemented by
two actual object types, HashSet
and TreeSet
.
The difference between these two types is that HashSet
appears to store its contents in a random order, while
TreeSet
appears to store them in their natural order
as described previously.
The reason for this is to do with the different data structures used
to implement these types, hash tables for HashSet
and
ordered trees for TreeSet
. These data structures were
covered when this course was run previously (see
here)
but they are not covered this year. For this year, you need only know the
results of using HashSet
and TreeSet
, not
what happens underneath when they are used. HashSet
compensates for its lack of ordering by being more efficient in locating
elements in a set.
Since Set
objects have no indexing, you may wonder how ordering
is involved. The ordering is shown in the text representation, but also
when iterators are used. An iterator is an object which is
dependent on some collection object, and goes through the items of the
collection one at a time. The Java collections framework contains a
generic type Iterator<E>
, with objects of this
type being constructed from the call iterator()
in objects
of some collection type with base type E
. Then, each call
of next()
on the iterator object returns a different item
from the collection until all items have been returned, with the
call hasNext()
saying whether there are more items to be
returned. Java 5 introduced a new from of loop called a
for-each loop, which uses iterators automatically.
The code for(E item: a) body
where a
refers to any collection with base type
E
(it also works if a
is an array of type
E[]
) executes body
repeatedly
with variable item
set to each of the elements in
the collection in turn.
We considered comparator object. The Java collections framework
gives the generic type Comparator<E>
, which is
defined by an interface which contains just the method signature
int compare(E obj1,E ;obj2)
. The idea
is that if comp
refers to an object of type
comparator<Thing>
, and obj1
and
obj2
are of type Thing
, then
comp.compare(obj1,obj2)
behaves similarly to
obj1.compareGo(obj2)
as used to define the natural
ordering. However, it may define a different ordering.
So we can write code where the ordering of a class of objects is
given by a comparator object passed as an argument rather than fixed
as the natural ordering. The type TreeSet
has a
constructor which takes a comparator object as its argument. The
result is to create a TreeSet
object where the ordering
of objects within it is as defined by the comparator object used when
it was created, and not by the natural ordering of its base type.
When a TreeSet
object is created with a 0-argument
constructor, ordering is by natural order.
Week 12: 27/28 March
The Monday lecture slot was taken over by the second test for the course.
You can find a copy of the test on-line
here.
On Tuesday, the last lecture for this course, we summarised the course
contents, and discussed the format of the exam in May, and its role in
the assessment of your degree overall.
One way of checking the main topics in this course is to look at all the words in bold type in this lecture summary. You should make sure you understand each of these terms.
We started off this course by looking at the use of objects, how we can call methods on them which may return values and/or change their internal state, how there is a distinction between objects and the variables that refer to them, so a variable (like a name) may change the object it refers to, and (like names) there may be more than one variable referring to a single object. Then we considered arrays, the idea of writing methods which work on arrays, and that these methods be destructive (change arrays passed as arguments) or constructive (leave argument arrays unchanged, but create and return new arrays).
Next we considered that Java generic types, such as ArrayList
provide a more flexible way of holding and manipulating collections of data
than arrays. We considered the type LispList
as an alternative
form of collection to ArrayList
, which was recursive rather
than indexable, and immutable rather than mutable. We used Lisp lists to
look at operations expressed using recursion as an alternative to
iteration.
We followed by turning to how we could write classes which implement the objects we have been using when these classes have not already been provided in the Java library. We contrasted the idea of an Abstract Data Type, which is viewing a class of objects in terms of the operations which can be called on it, with a data structure, which is the structure of variables inside an object which are manipulated so that it appears to give the behaviour as dictated by its abstract data type.
We looked at sorting algorithms, and noted that the speed taken to sort a large collection of data could vary dramatically depending on the algorithm chosen. We saw the "big-O" notation as a way of categorising algorithms in terms of their efficiency by estimating the number of significant operations (such as comparisons between data items) that occur if a particular algorithm is used.
Next we looked at inheritance, how one class of object can be declared as a variant on another class (modifying it and/or adding extra capabilities), and how we can write Java code which is general for a class and for all classes which extend it in this way. This leads into the mechanisms needed to handle the fact that a Java variable declared as of one class may refer to an object created using the constructor of a class which extends that class. We saw the idea of type variable used to write code which is general for a range of types, and bounded type variables used to write code which is general for a range of types which extend some particular type.
We saw the idea of linked lists as an alternative data structure to
arrays used to implement collection classes. We noted that just as
there may be different algorithms used as alternative ways of solving the
same problem, so there may be different data structures used to implement
the same abstract data type. This is recognised in the Java code library,
which, for example, has the type LinkedList
, which behaves
similarly to ArrayList
, but has a doubly-linked list
data structure inside it rather than an array-and-count. We looked at
some of the other features provided by Java in its collections framework.
This includes classes which give set-like rather than array-like
behaviour, interface types which enbable us to write objects which can
be passed as arguments to tell some code how it should order a class
of objects, and iterators which are objects that go through a collection
of objects one item at a time.