ArrayList
and LinkedList
which seem to behave
in the same way. Actually, this is not quite correct because the type
LinkedList
has some extra operations which the type
ArrayList
does not have.
You can find the complete documentation for the class ArrayList
here
and the complete documentation for the class LinkedList
here.
Note that correctly these types should be described as
ArrayList<E>
and LinkedList<E>
to indicate they are types with a type parameter, to get the full type
you must give the type of the objects which are held in the
ArrayList
or LinkedList
object.
If you look at the complete Java documentation, you will see the extra
operations which LinkedList
has and ArrayList
does not. In particular, there are some concerned with accessing the first and
last items of the list. This reflects the fact that LinkedList
is implemented using a
doubly-linked list,
so these operations are particularly suited to this implementation.
But also, ArrayList
has some operations which LinkedList
does not have. One is removeRange
which takes two integers as
its arguments and changes the object so that all elements starting with
the one indexed by the first argument and going up to but not including the
one indexed by the second argument are removed. The others reflect the
underlying implementation of ArrayList
using the
array and count technique.
The method trimToSize
changes the array in the object so that
it is of the exact size of the object with no spare capacity. The method
ensureCapacity
takes an integer argument and changes the
array in the object so that it is of the size of the integer argument if it
was previously smaller. Using these operations would only make sense if you
had a particular concern about efficiency.
You will note that the Java documentation refers to the types
ArrayList<E>
and LinkedList<E>
as
being subtypes of other types, and implementing various interface
types. The reason for this is to allow generalised code to be
written. The various classes and interfaces of which these are
part are referred to as the Java Collections Framework. We
will cover the most important aspects of this in this and the
following
section of notes, but if you are interested you can find a full guide
here. For our purpose now, the main thing to note is that both
ArrayList<E>
and LinkedList<E>
implement the
interface List<E>
. You can find the full documentation for this type
here.
This means that if you write a method whose parameter type is List
,
you can use that same method for arguments that are of type
ArrayList
or LinkedList
. This is the best thing
to do unless the code in the method really needs to use methods which are
only found in ArrayList
or LinkedList
. To
make your code as general as possible you should always use the type
List
rather than the types ArrayList
or
LinkedList
except when you are actually creating objects
using new
. As List
is an interface type,
you cannot create an object whose actual type is List
,
you can only make a variable of type List
refer to an object
whose actual type is one of those which implements List
.
Apart from ArrayList
and LinkedList
, another
type which implements List
is Vector
, which
you may see referred to in some older Java textbooks. You can find the
full reference to this type
here,
but the intention of Java's supplier is that though it is still made available
for old code, the type ArrayList
is a replacement for it.
The type List
provides all the operations needed to manage
a collection of items stored in a way so that each has a position in
the collection. As a simple example, here is one program:
import java.util.*; class BiggestInt { public static void main(String[] args) throws Exception { Scanner input = new Scanner(System.in); System.out.println("Enter some integers (all on one line):"); String line = input.nextLine(); String[] words = line.split(" "); List<Integer> a = new ArrayList<Integer>(); for(int i=0; i<words.length; i++) a.add(Integer.parseInt(words[i])); int pos = ListOps.posBiggest(a); System.out.print("The biggest integer is "+a.get(pos)); System.out.println(" at position "+pos); } }which creates an
ArrayList
of integers
and here is another one:
import java.util.*; class LastAlphabetically { public static void main(String[] args) throws Exception { Scanner input = new Scanner(System.in); System.out.println("Enter some words (all on one line):"); String line = input.nextLine(); String[] words = line.split(" "); List<String> a = new LinkedList<String>(); for(int i=0; i<words.length; i++) a.add(words[i]); int pos = ListOps.posBiggest(a); System.out.print("The last word alphabetically is \""+a.get(pos)); System.out.println("\" at position "+pos); } }which creates a
LinkedList
of strings. In both these
cases, the variable a
is of a List
type, but the
actual object created can't be of this type as List
is
only an interface. Both programs make use of
a static method in the class ListOps
(we have just made this up,
it is not built into Java) which is written:
public static <T extends Comparable<? super T>> int posBiggest(List<T> ls) { int pos = 0; for(int i=1; i<ls.size(); i++) if(ls.get(i).compareTo(ls.get(pos))>0) pos=i; return pos; }So this method can be used to find the position of the "biggest" item in any collection, provided the collection is of a type which implements the
List
interface. Note it also uses the bounded generic concept
we saw previously
to indicate that the elements of the collection must be of a type
which has its own sense of ordering, and the "biggest" object in
the collection is as given by this sense of ordering. So the same method
can be used to find the highest number from an ArrayList
of
integers, and the last alphabetically from a LinkedList
of strings.
You can find all the code examples given here in the directory
~mmh/DCS128/code/collections
Collection<E>
is an even more general type
than the type List<E>
. You can find its full documentation
here.
This is an interface which contains a method add
, which
adds an item to a collection, a method remove
which
removes an item from a collection, and a method contains
which tests whether an item is in a collection. But it contains no methods
which refer to the position of an item in the collection. The interface
List
extends the interface Collection
by adding
those methods which are concerned with items having a position in a
collection. So if we want to write code which depends only on an
item being either in or not in a collection, and makes no reference to
its position, we can use the more general type Collection
.
The type Set
(documented
here)
is an alternative extension of Collection
which does not
have any concept of items being in a position in the collection, but which
enforces the rule that an item may be in a set only once at most. So if
s
is of type Set<T>
and t
is
of type t
, then if s.contains(t)
returns
true
(meaning t
is already in s
) then
s.add(t)
will have no effect on s
. Like
List
, the type Set
is itself an interface which
is implemented by two actual classes, HashSet
and
TreeSet
(documented
here and
here)
which implement the underlying concept in two completely different ways.
For the purposes of this course, you need not be concerned with how those
two classes work underneath, but you would be able to find details in any
good book on algorithms and data structures: one uses hash tables, the other
uses trees. Using the "big-O"
notation we considered previously,
accessing a particular data item is O(1) with
HashSet
, but O(log N) with TreeSet
.
This suggests that HashSet
should be used rather than
TreeSet
, unless we particularly need to use the extra
operations and conditions which TreeSet
provides, as
HashSet
gives more efficient access.
Here are two simple programs which use these two classes:
import java.util.*; class HashSetTest { public static void main(String[] args) throws Exception { Scanner input = new Scanner(System.in); System.out.println("Enter some words (all on one line):"); String line = input.nextLine(); String[] words = line.split(" "); Collection<String> a = new HashSet<String>(); for(int i=0; i<words.length; i++) a.add(words[i]); System.out.println("The words are: "+a); } }using
HashSet
and
import java.util.*; class TreeSetTest { public static void main(String[] args) throws Exception { Scanner input = new Scanner(System.in); System.out.println("Enter some words (all on one line):"); String line = input.nextLine(); String[] words = line.split(" "); Collection<String> a = new TreeSet<String>(); for(int i=0; i<words.length; i++) a.add(words[i]); System.out.println("The words are: "+a); } }using
TreeSet
. All these programs do is read some words
in, store them in a collection of strings, and print out the text
representation of the collection. You will see if you run this code
that in both cases any duplicated words are stored only once. In the
TreeSet
implementation, you will see the text representation
puts the words in alphabetical ordering, while in the HashSet
representation, the words are jumbled up in an ordering which is neither
alphabetical nor the ordering in which they were added.
Now suppose you wanted to go through all the items in a TreeSet
or HashSet
object to find which is the largest according to
their own sense of ordering (last alphabetically if the items are strings).
The problem is that since sets have no sense of items being in a particular
position, there is no method get
which takes an integer argument
i
and returns the i
th element of the set. So we
could not use a for-loop in a way similar to that we used in generalised
code for List
objects above.
The Java collections framework gives us a way round this using the concept
of an iterator.
The method iterator
is provided in the Collection
interface,
it is documented
here.
So for any object whose actual type is a subtype of Collection<E>
,
if a
is a reference to the object, a.iterator()
returns
an object of type Iterator<E>
. So this can be used for objects
of type HashSet
, TreeSet
, ArrayList
or
LinkedList
. If it
refers to an object of type
Iterator<E>
and was produced by the call
it=a.iterator()
then each time it.next()
is
called it will return a different item from the collection referenced by
a
until all the items in that collection have been returned
exactly once. You can also call a.hasNext()
which will return
a boolean value, true
if there are some items in a
which have not yet been returned by a call of a.next()
and
false
otherwise. Attempting to call a.next()
when
a.hasNext()
would return false
causes an exception
of type NoSuchElementException
to be thrown.
Using this idea of an iterator, the following loop:
for(Iterator<E> it=a.iterator(); it.hasNext(); ) { E element = it.next(); ... }will cause whatever code is given by
...
to be executed
for the variable element
having each of the value from the
collection referenced by the variable a
. The type written
as E
can be any type. This pattern
is so common that Java 5 introduced a variation on the for-loop,
called the for-each loop which represents it. The code
for(E element: a) { ... }will behave exactly the same as the previous for loop. All that is required is that
a
(or whatever variable name is used there) is such that
the call a.iterator()
will return a value of type
Iterator<E>
. Any variable name can be used in the place
of element
and then ...
will be executed with
that variable set in turn to each of the elements from the collection
referenced by a
. Note, this also works if a
is
of type E[]
, that is an ordinary array.
Here, for example, is code which reads some words, stores them in a
TreeSet
and prints them out using a for-each loop:
import java.util.*; class TreeSetPrint2 { public static void main(String[] args) throws Exception { Scanner input = new Scanner(System.in); System.out.println("Enter some words (all on one line):"); String line = input.nextLine(); String[] words = line.split(" "); Collection<String> a = new TreeSet<String>(); for(int i=0; i<words.length; i++) a.add(words[i]); int count=1; System.out.println("The words (duplicates deleted) are: "); for(String str: a) { System.out.println(count+": "+str); count++; } } }This will cause the words to be printed out in alphabetic order. If
TreeSet
is replaced by HashSet
they will
not be printed in alphabetic order. So the difference between a
TreeSet
and a HashSet
is that it is possible,
by using an iterator, to access the elements in the TreeSet
in
their natural order. A HashSet
, however, has more efficient access
to data, but this is something that would only be noticeable with a large
amount of data. Note also that TreeSet
has some extra methods
over HashSet
, for example headSet
takes as its
argument an item from the type of object stored in the set and returns the
set of all objects from the original set less than that item according to
its natural ordering.
As another example of the use of an iterator, here is a static method which
returns the biggest item (according to the items' own natural ordering) from
any collection of items:
public static <T extends Comparable<? super T>> T biggest(Collection<T> coll)
{
Iterator<T> it = coll.iterator();
T biggest = it.next();
while(it.hasNext())
{
T nextItem = it.next();
if(nextItem.compareTo(biggest)>0)
biggest = nextItem;
}
return biggest;
}
Suppose this method is in a class called CollectionOps
, then a
call CollectionOps.biggest(a)
will return the biggest item from
the collection referenced by a
, where that collection could be
an ArrayList
, a LinkedList
, a TreeSet
or a HashSet
. Note that an iterator for a List
object (that is, either ArrayList
or LinkedList
)
will go through the items in the list in the order of their index.
List
objects also have a method listIterator
which produces an object which is like an iterator but can move through
the items in the collection in both directions. A call to
listIterator
on an object of type List<E>
produces an object of type ListIterator<E>
. This is like
an object of type Iterator<E>
except that as well as
having methods next
and hasNext
for moving forward
through the collection and testing whether the end of the collection has
been reached, it also has methods previous
and
hasPrevious
for moving backwards through the collection
and testing whether the start of the collection has been reached.
The method Comparator
objectsbiggest
above
will always return the "biggest" item from a collection according to the
natural ordering of the base type of the collection. As we noted
previously, however, we
may want to consider orderings other than the natural ordering of some
type. For example, we may want to order strings not by their alphabetic
ordering, but by their length. We can generalise methods which rely on
ordering objects by using a comparator object as an argument.
The Java system provides an interface type for such objects,
Comparator<T>
, documented
here.
You can write your own Comparator
classes, all you have
to do is write a class which extends Comparator
(possibly
specifying a particular base class), and provides the method
compare
which takes two objects of the base class and
returns an integer. The integer returned should be negative if the first
object is considered less than the second object in the ordering the
comparator gives, it should be positive if the first object is considered
greater than the second in that ordering, and it should be 0
if
the two objects are considered equal in that ordering. As an example,
here is a class for a comparator object which orders strings according to
their length:
import java.util.Comparator;
class LengthComparer implements Comparator<String>
{
public int compare(String str1,String str2)
{
return str1.length()-str2.length();
}
}
Note you have to specify that the method compare
is
public
and you need the import statement to make the
reference to Comparator
refer to the type from the
java.util
package. You will see that if comp
refers to an object of type LengthComparer
, then
comp.compare(s1,s2)
where s1
and s2
refer to strings will return a negative number if s1
is
shorter than s2
, a positive number if s2
is shorter
than s1
and 0
if they are of equal length.
Now here is a version of the method biggest
which takes
a comparator object as an argument:
public static <T> T biggest(Collection<T> coll,Comparator<T> comp) { Iterator<T> it = coll.iterator(); T biggest = it.next(); while(it.hasNext()) { T nextItem = it.next(); if(comp.compare(nextItem,biggest)>0) biggest = nextItem; } return biggest; }You will see that where, in the previous version of the method
biggest
, we had
nextItem.compareTo(biggest)>0
to compare the item referred
to by nextItem
with the item
referred to by biggest
we now have
comp.compare(nextItem,biggest)>0
. Since we no longer
rely on the items being in a class which has the method
compareTo
in it, we no longer have to specify that the
type parameter T
extends
Comparable<? super T>
. If this method
biggest
is in a class called CollectionOps
then
the following code provides a framework for testing it with a
LengthComparer
argument turning it into a method which
returns the longest string from a collection of strings:
import java.util.*; class Longest { public static void main(String[] args) throws Exception { Scanner input = new Scanner(System.in); System.out.println("Enter some words (all on one line):"); String line = input.nextLine(); String[] words = line.split(" "); List<String> a = new ArrayList<String>(); for(int i=0; i<words.length; i++) a.add(words[i]); Comparator<String> comparer = new LengthComparer(); String longest = CollectionOps.biggest(a,comparer); System.out.println("The longest word is \""+longest+"\""); } }Another comparator object we could use orders strings by their Scrabble scores (as discussed previously). So the method
biggest
with its second argument set to such
a comparator will take a collection of strings as its first argument
and return the string from that collection with the highest Scrabble score.
Here is the front-end program which does this:
import java.util.*; class BestScrabble1 { public static void main(String[] args) throws Exception { Scanner input = new Scanner(System.in); System.out.println("Enter some words (all on one line):"); String line = input.nextLine(); String[] words = line.split(" "); List<String> a = new ArrayList<String>(); for(int i=0; i<words.length; i++) a.add(words[i]); Comparator<String> comparer = new Scrabble(); String best = CollectionOps.biggest(a,comparer); System.out.print("The word with the highest Scrabble score is \""+best); System.out.println("\" with score "+Scrabble.score(best)); } }Here is the class
Scrabble
which provides the necessary
comparator:
import java.util.Comparator; class Scrabble implements Comparator<String> { public static final int[] scores = {1,3,3,2,1,4,2,4,1,8,5,1,3,1,1,3,10,1,1,1,1,4,4,8,4,10}; public static int score(String str) { String str1 = str.toUpperCase(); int score = 0; for(int i=0; i<str1.length(); i++) score+=scores[str1.charAt(i)-'A']; return score; } public int compare(String str1,String str2) { return score(str1)-score(str2); } }This class also has a static method
score
in it, which takes
a string and returns its Scrabble score. This shows that just because
a class implements some interface provided by Java does not mean it can't
also be used to provide some additional methods.
Comparator objects may also be used to vary the behaviour of TreeSet
objects. As we saw above, a
TreeSet object stores a collection of data (with no duplicates) and
has the property that if we go through the data using an iterator,
it will be given in its natural ordering. But the class
TreeSet<E>
has, besides its zero-argument
constructor which gives this sort of behaviour, another constructor
which takes an argument of type Comparator<E>
.
If a TreeSet
is constructed using this constructor, when its data is
gone through using an iterator, the items will be given in the order
determined by the comparator object given as the argument to the
constructor. The following shows some code which demonstrates using
a TreeSet
constructed using a comparator of type
LengthCompararer
as given
above:
import java.util.*; class TreeSetPrint3 { public static void main(String[] args) throws Exception { Scanner input = new Scanner(System.in); System.out.println("Enter some words (all on one line):"); String line = input.nextLine(); String[] words = line.split(" "); Collection<String> a = new TreeSet<String>(new LengthComparer()); for(int i=0; i<words.length; i++) a.add(words[i]); int count=1; System.out.println("The words in order of length with duplicates deleted are: "); for(String str: a) { System.out.println(count+": "+str); count++; } } }It is meant to take some words, store them in a
TreeSet
ordered
by a LengthComparer
object, and print them out.
If you run it, however, you will see it doesn't behave as you
might expect. It doesn't print out every word that was typed in.
To see why this, consider that a TreeSet
object does not store
duplicates. Two objects referred to by obj1
and
obj2
are considered equal if obj1.compareTo(obj2)
returns 0
. But if the TreeSet
object is constructed
using a comparator referred to by comp
it regards
obj1
and obj2
as equal if
comp.compare(obj1,obj2)
returns 0
.
So if obj1
is in the set, then attempting to add
obj2
to the set won't change it, because it is considered
that obj2
is just a duplicate of obj1
.
As we defined LengthComparer
, two strings are considered
equal if they are of the same length, regardless of the letters in them.
So the TreeSet
created by
new TreeSet<String>(new LengthComparer())
will be such that once a string of a particular length is added to it,
no other string of the same length may be added to it, attempting to
add one will just cause it to stay the same without indicating an
"error". If we want a TreeSet
which stores words in order of
their length, but stores different words of the same length, we need
a modified comparator object which does not regard different words
of the same length as equal. Here is code for such an object:
import java.util.Comparator; class LengthComparer1 implements ComparatorThis causes strings to be ordered by their length, but if their lengths are equal they are then ordered alphabetically. So creating a{ public int compare(String str1,String str2) { if(str1.length()==str2.length()) return str1.compareTo(str2); else return str1.length()-str2.length(); } }
TreeSet
using new TreeSet<String>(new LengthComparer1())
will give us what we want.
We noted previously that
Collections.sort(a)
would cause the elements in
an arrayList referenced by a
to be sorted in their
natural ordering (with the arrayList being changed destructively).
Note Collections
(with a final s) in the java.util
package is a separate class from Collection
(without a final s).
You can find the documentation for the class Collections
here.
It contains a number of static methods (hence they are called attached
to the name Collections
and not to an object reference)
which deal with collections. Apart from the single argument sort
discussed previously, there is a double argument sort which takes
a Java List
object (so an object of type ArrayList
or LinkedList
) and a comparator, and sorts the list in the
ordering given by the comparator. Many of the other operations in the
class Collections
are ones we could write ourselves,
but it makes sense to use the code already provided as part of the
Java system, both because that code can be guaranteed to be correct and
it helps to standardise the names used for these common operations.
For example, the two methods max
are the exact equivalents
of the methods we have called biggest
here. One takes
a collection and returns the biggest element of the collection according
to its natural ordering, the other takes a collection and a comparator
and returns the biggest element of the collection according to the
ordering given by the comparator. This means that in the program
above we could replace the
line:
String best = CollectionOps.biggest(a,comparer);by
String best = Collections.max(a,comparer);and it would work the same, but this time does not rely on us providing the method
biggest
in our own class called CollectionOps
Matthew Huntbach
Last modified: 17 November 2005