Algorithms and Data Structures in an Object-Oriented Framework (“ADSOOF”)
Further aspects of Java's Collections Framework
Another interface provided as part of the Java Collections Framework
is Map
. You can find this documented
here.
This is a generic type with two type parameters, so its type is
given as Map<K,V>
. Like List
and
Set
which we saw in the
previous set of notes, this interface
type is implemented by two main actual types. For Map
,
the two main implementing types are HashMap
and
TreeMap
.
The idea of an object of type Map<K,V>
is that
it is a collection of objects of type V
indexed by values
of type K
. In some ways you can think of it as
like an arrayList where the components can be indexed by values of
any type rather than only by integers. Objects of the indexing type
are known as keys. Commonly, the indexing type will be
String
, but Java allows it to be any object type.
Consider that for an object of type ArrayList<E>
referenced by a variable of type List<E>
called
a
, the call a.add(i,item)
, where i
is of type int
and item
is of type E
,
causes item
to be put into the collection referred to by
a
such that a.get(i)
will return item
.
For an object of type TreeMap<K,V>
or
HashMap<K,V>
, referenced by a variable m
of type TreeMap<K,V>
, the call
m.put(key,item)
where key
is of type K
and item
is of type V
, causes item
to be put into the collection referred to by
m
such that a.get(key)
will return item
.
One difference between lists and maps is that in lists if an object
is added at a particular index, all objects with higher indexes are shifted
up one index. With maps, when an item is added with a particular key,
that same key will always be used to index it until the item is removed
or another item is put in with the same key as its index. For a map
referred to by variable m
, the method call m.remove(key)
will remove the item from the map indexed by key
, in the same
way that for a list referred to by variable a
, the method
call a.remove(i)
will remove the item from the list indexed by
integer i
. However, a.remove(i)
causes all
items with indexes higher than i
to be moved down by one place,
but m.remove(key)
does not change the index of any other
item in the map. Also, both m.remove(key)
and
a.remove(i)
return the item that has been removed.
If there is no item with index key
in the
map referred to by m
, then m.remove(key)
returns null
and does not change the map. The call
a.remove(i)
causes an IndexOutOfBoundsException
to be thrown if there is no item in the list referred to by a
which is indexed by i
because i
is greater than
or equal to the size of the list. Similarly, m.get(key)
returns null
if there is no item with index key
in the map referred to by m
.
Maps are commonly used to store data where each possible item of data has a unique key which can be used to identify it. For example, a student record system will store records of students with all the data about an individual student being indexed by the student number, since we know there is a different student number for every student. A database for cars would probably use the car registration number (which is actually a combination of digits and alphabetic characters, hence it would be represented by a string not an integer) as the key.
We saw with sets previously,
collections of type TreeSet
store their elements in
order, so that an iterator derived from a TreeSet
object will return
the elements in order; collections of type HashSet
do
not store their elements in any order. With maps, collections of
type TreeMap
store their elements in order of their
key (as given by the natural ordering of the key type), while
collections of type HashMap
do not have any ordering.
This is balanced by the operations for HashMap
being
more efficient than the operations for TreeMap
.
The reason for these characteristics is given by the underlying
representations, trees for TreeSet
and TreeMap
,
hash tables for HashSet
and HashMap
, but there
is not time to discuss these data structures in this course. As with
TreeSet
, a TreeMap
object may be
constructed with a Comparator
object as an argument to the constructor, this will result in a map
which is ordered according to the ordering the comparator object gives
to the keys which index items in the map, rather than according to the
natural ordering of the keys.
It is not possible to produce an iterator directly from a map to show
the ordering, but it can be done in two stages. If m
refers to a map of type TreeMap<K,V>
, then the call
m.keySet()
returns an object of type Set<K>
which gives the keys of the items in the map in their order,
and m.values()
returns an object of type
Collection<V>
which gives the items in the map in
the order (given by their keys) in which they are stored. An
iterator can be obtained from the objects returned from
m.values()
and m.keySet()
to go through
the items in them in order. These methods may also be called on objects
of type HashMap
, but in this case the sets returned would
not be in any particular order.
The method entrySet
called on a Map
object
returns a set of key/value pairs. Formally, if m
refers to
an object of type Map<K,V>
(which could be a
HashSet
or a TreeSet
object), then
m.entrySet()
returns an object of type
Set<Map.Entry<K,V>>
. This is a set of objects of
type Map.Entry<K,V>
. This is an example of a nested
class, it means that Entry
is a type defined within the type
Map
. If p
refers to an item of type
Map.Entry<K,V>
, then p.getKey()
will
return the key of the pair, and p.getValue()
will return
the value of the pair, where the value was put into the original map with
the key as its index.
HashMap
and TreeMap
are implemented using the
same data structures as HashSet
and TreeSet
.
This means that HashMap
has the advantage of its operations
to find or place a particular object with a particular key being
O(1),
whereas with TreeMap
they're
O(log N). So HashMap
should be used unless we know there are occasions where we need to go through
the entire collection of data in key ordering or an ordering given by a
comparator. Note that if we generalise our code so that we always use just
Map
for variable types, and only use HashMap
when
actually creating a new object (as we must then use an actual type), it
is easier to start off using HashMap
and change to TreeMap
if we find that becomes necessary as it would only be the code that creates the
new object which needs to be changed.
Below is an example of a program that uses the Map
type.
Here it is Map<String,Integer>
, a map from strings
to integers. If the variable table
refers to an object of this
type (maps are also often known as “tables”), and str
refers
to a string, then map.get(str)
will return the integer with
value n
added to the table by table.put(str,n)
.
In the example below, the strings are student exam numbers, and the
integers are marks.
import java.util.*; import java.io.File; class Marks { public static void main(String[] args) throws Exception { Scanner screen = new Scanner(System.in); System.out.print("Enter file name for marks: "); String fileName = screen.nextLine(); Scanner file = new Scanner(new File(fileName)); Map<String,Integer> table = new HashMap<String,Integer>(); while(file.hasNext()) { String examNo = file.next(); Integer mark = file.nextInt(); table.put(examNo,mark); } System.out.print("Do you want to know any mark (yes/no) ? "); boolean flag = getYesNo(screen); while(flag) { System.out.print("Enter student number: "); String examNo = screen.next(); Integer mark = table.get(examNo); if(mark==null) System.out.println("No student with number "+examNo); else System.out.println(examNo+" has mark "+mark); System.out.print("Do you want to know any other mark (yes/no) ? "); flag = getYesNo(screen); } } private static boolean getYesNo(Scanner input) { String word = input.next(); while(!word.equals("yes")&&!word.equals("no")) { System.out.print("Please enter \"yes\" or \"no\": "); word = input.next(); } return word.equals("yes"); } }Note also this program shows an example of a scanner which reads from a file. The statement:
Scanner file = new Scanner(new File(fileName));creates a scanner referred to by the variable
file
which
reads, not from the screen, but from the file whose name is given by
the value of the string variable fileName
. Note it is
necessary to have an import statement for the type File
from the java.io
package to use this. A sample file for use
with this program can be found in the file
marks.txt. If this file is
in the same directory as the Marks.class
file being run,
then entering marks.txt
in response to the prompt for a
file name will open a scanner linked to this file. The program expects
the file to have the format of a student number (which can actually be
an string which does not contain white spaces) followed by an integer.
The following are the first three lines in the example file:
AN528 25 AN884 29 CJ653 27although the program does not test that the student numbers are in this format of two alphabetic characters followed by three numeric ones. The program reads in data from this file, storing marks indexed by exam numbers. It then allows the user to access marks as indexed by exam numbers.
The methods unmodifiableList
, unmodifiableSet
and unmodifiableCollection
provided by class
Collections
are useful in that they provide what appear to be
immutable
copies of the collections they take as arguments. Thus if a
is of type List<T>
(meaning the object it refers to is of
type ArrayList<T>
or LinkedList<T>
),
then
List<T> b = Collections.unmodifiableList(a);will cause
b
to refer to an object which appears to
be a copy of the object which a
refers to.
Since b
is of stated type List<T>
then
later code can call on b
any method from the interface
<List>
.
However, if an attempt is made to execute a call on b
of any
method which would cause a change to the contents of the list it
refers to, the list is unchanged and an UnsupportedOperationException
is thrown. This would be useful if we want to pass a copy of the list
to another part of the program, but we did not know how that part of the
program operated and we wanted to make sure it couldn't change the
actual list. So even though Java does not provide a range of extra types
or immutable collection objects, such objects can be produced.
Actually, unmodifiableList
, unmodifiableSet
and unmodifiableCollection
do not produce copies of the
original collection, they produce a view of it. This means they
share the data with the original collection, so if it is changed, the view
changes, So with our declaration of b
above, although
we cannot make a change to the collection viewed through b
,
if we make a change to it viewed through a
, the collection
viewed through b
will also change. This means that it is not really
correct to describe the objects returned by these methods as “immutable”,
because that suggests they can never change. They can change, but not through a call
on the actual object returned. So they are described as “unmodifiable views”
of a modifiable object.
Here is some code which demonstrates this:
import java.util.*; class UnmodifiableTreeSetTest { public static void main(String[] args) throws Exception { Scanner input = new Scanner(System.in); System.out.println("Enter some words (all on one line):"); String line = input.nextLine(); String[] words = line.split(" "); Set<String> a = new TreeSet<String>(); for(int i=0; i<words.length; i++) a.add(words[i]); System.out.println("The words are: "+a); Set<String> b = Collections.unmodifiableSet(a); System.out.println("Unmodifiable view of the words is: "+b); System.out.println("Enter a word to delete: "); String delWord = input.next().trim(); a.remove(delWord); System.out.println("Changes words to: "+a); System.out.println("Unmodifiable view is: "+b); } }If the code were the same but with
b.remove(delWord)
instead
of a.remove(delWord)
, it would compile, but when it was run and
reached that statement, it would throw an UnsupportedOperationException
.
The method headSet
which we mentioned
previously returns a view
of the original set rather than copying. Given s1
of type
TreeSet<E>
and item
of type
E
, then the call s2=s1.headSet(item)
sets s2
to a view of all items in the set referenced by s1
which
are less than item
. This is according to the ordering of
this set, which could be the natural ordering of E
or
some other ordering given by a comparator argument when the set was constructed.
But as it is a view, if a new item less than item
is added to
the set referenced by s1
, it will also be added to the set
referenced by s2
. Here is code that demonstrates this:
import java.util.*; class HeadSetTest { public static void main(String[] args) throws Exception { Scanner input = new Scanner(System.in); System.out.println("Enter some words (all on one line):"); String line = input.nextLine(); String[] words = line.split(" "); SortedSet<String> s1 = new TreeSetHere the variables(); for(String word : words) s1.add(word); System.out.println("The words are: "+s1); System.out.println("Enter a word to split at: "); String splitWord = input.next().trim(); SortedSet<String> s2 = s1.headSet(splitWord); System.out.println("The words before this word are: "+s2); System.out.println("Enter a word to add: "); String addWord = input.next().trim(); s1.add(addWord); System.out.println("The words are now: "+s1); System.out.println("The words before the split are now: "+s2); } }
s1
and s2
are given as
type SortedSet<String>
rather than
TreeSet<String>
, as SortedSet
is an
interface type which gives the operations needed for a sorted set
which are actually implemented by TreeSet
. Note also this
code demonstrates a for-each loop, the lines
for(String word : words) s1.add(word);mean the same as
for(int i=0; i<words.length; i++) a.add(words[i]);but do not explictly rely on
words
being an array and do not
explicitly use any indexing on words
.
Another operation returning a view of a collection is subList
,
which you can find documented
here.
It is found in the interface List
so may be applied to objects of
actual type ArrayList
and LinkedList
. If ls
is a variable of type List<E>
and start
and
finish
are of type int
, then
ls.subList(start,finish)
returns an object of type
List<E>
which contains all those
objects from the original list starting at the one indexed by start
and going up to but not including the one indexed by finish
.
Since this is a view of the original list, any change made to it will also
change the original list. If the original list is changed in the portion
viewed by the sublist, the sublist object will change in the same way.
The following code demonstrates this:
import java.util.*; class SubListTest { public static void main(String[] args) throws Exception { Scanner input = new Scanner(System.in); System.out.println("Enter some words (all on one line):"); String line = input.nextLine(); String[] words = line.split(" "); List<String> a = new ArrayList<String>(); for(String word : words) a.add(word); int count=0; System.out.println("The words are: "); for(String str : a) { System.out.println(count+": "+str); count++; } System.out.print("Enter a start index: "); int start = input.nextInt(); System.out.print("Enter a finish index: "); int finish = input.nextInt(); List<String> sublist = a.subList(start,finish); System.out.println("The sublist in this range is: "); count=0; for(String str : sublist) { System.out.println(count+": "+str); count++; } System.out.print("Enter a position in this sublist you wish to add at: "); int addPos = input.nextInt(); System.out.print("Enter a word you wish to add: "); String word = input.next(); sublist.add(addPos,word); System.out.println("The sublist is now: "); count=0; for(String str : sublist) { System.out.println(count+": "+str); count++; } System.out.println("The full list is now: "); count=0; for(String str : a) { System.out.println(count+": "+str); count++; } } }If you run this, you will see that adding an item to the list referred to by
sublist
, which is a portion of the initial list referred to
by a
, causes both lists to change, due to sublist
being a view of a
rather than an independent object.
An issue with the subList
method is that since its return
type is List<E>
, you cannot expect a variable of type
ArrayList<E>
or LinkedList<E>
to
be set without casting to refer to the result of a call of it, even if that
call is on a variable of the same type. For example, if you had a
declaration ArrayList<String> a
, and later had an
assignment a=a.subList(0,a.size()-1)
which you expected to
reset a
to the list of all except its last element, you might
be surprised to get a compiler error saying that an
ArrayList<String> a
was expected but a
List<String> a
was found. That is because the
return type of subList
even when called on an ArrayList
object is List
with the element type of that object, and not
ArrayList
with the element type. This is an indication of why
variables should always be declared as of the interface type. There would
be no problem if the declaration of a
was
List<String> a
, it can still be set to an object
of type ArrayList<String>
, but why limit it to that
when it is more general and avoids this and other problems to declare it
of the interface type?
It is also possible to get a list which is a view of an array. The static
method asList
in the class Arrays
(as mentioned
previously, this is a class which provides some routine operations on arrays
expressed as static methods) does this. If a
is of type
T[]
then Array.asList(a)
returns an object of
type List<T>
, which is a view of a
, that is
changes to the list will change the array, and vice versa.
Here is some code which demonstrates this:
import java.util.*; class AsListTest1 { public static void main(String[] args) throws Exception { Scanner input = new Scanner(System.in); System.out.println("Enter some words (all on one line):"); String line = input.nextLine(); String[] words = line.split(" "); List<String> list = Arrays.asList(words); System.out.println("The words viewed as a list are: "); int count=0; for(String str: list) { System.out.println(count+": "+str); count++; } System.out.print("Enter a position: "); int pos = input.nextInt(); System.out.print("Enter a word: "); String word = input.next(); words[pos] = word; System.out.println("After changing word at position in array, list is: "); count=0; for(String str: list) { System.out.println(count+": "+str); count++; } } }Here, the array is changed, and you will find it means the view as a list gets changed as well. If the list is changed, the array will change, but only if the list is changed in a way that does not change its length. An
add
operation on
the list changes its length, but if you attempted an add
operation on a list that was a view of an array, it would not change the
length of the array. Here is an example:
import java.util.*; class AsListTest2 { public static void main(String[] args) throws Exception { Scanner input = new Scanner(System.in); System.out.println("Enter some words (all on one line):"); String line = input.nextLine(); String[] words = line.split(" "); ListIf you attempt to run this, the calllist = Arrays.asList(words); System.out.println("The words viewed as a list are: "); int count=0; for(String str: list) { System.out.println(count+": "+str); count++; } System.out.print("Enter a position: "); int pos = input.nextInt(); System.out.print("Enter a word: "); String word = input.next(); list.add(pos,word); System.out.println("After adding word at position in list, array is: "); for(int i=0; i<words.length; i++) System.out.println(i+": "+words[i]); } }
list.add(pos,word)
will cause an UnsupportedOperationException
to be thrown.
However, if you change the statement list.add(pos,word)
to
list.set(pos,word)
it will work. The reason for this is that
list.add(pos,word)
adds a new word to the list, so it changes
the size of the list, whereas list.set(pos,word)
does not change
the size of the list, it just changes the word stored at a particular position to
another word. You will see this causes the array of which the list is a view
to change the word at the given position.
Previously, we saw how we could
create our own comparator object which we could pass into a method to
tell the method how it is to order objects it is dealing with. So we could
pass in an object of type LengthComparer
which implemented
Comparator<String>
; it would need to have a method
called compare
which took two String
objects
as its arguments, and returned an int
value. This was
defined in a separate class which extended the interface type
Comparator
with its type argument set to String
.
It may seem a nuisance to have a large number of classes, each of which
exists just to be used in one place, and is short, consisting of just a
short method. Actually, the Java language recognises this, and allows us
to avoid having to create a class if it is only going to be used once.
The way it does this is the idea of an anonymous class. An
object of an anonymous class is created by the word new
followed by the name of the class or interface it extends and the
()
to make it a constructor call, followed by
the code it extends it with enclosed within {
and }
brackets. So instead of having, as we did
previously, the separate class
LengthComparer
, we could replace the call which created an
object of that type:
Comparator<String> comparer = new LengthComparer();by the creation of an anonymous object which is defined as extending
Comparator<String>
by the method which was previously
in that separate class. This gives the following code:
import java.util.*; class Longest2 { public static void main(String[] args) throws Exception { Scanner input = new Scanner(System.in); System.out.println("Enter some words (all on one line):"); String line = input.nextLine(); String[] words = line.split(" "); List<String> a = new ArrayList<String>(); for(int i=0; i<words.length; i++) a.add(words[i]); Comparator<String> comparer = new Comparator<String>() { public int compare(String str1,String str2) { return str1.length()-str2.length(); } }; String longest = CollectionOps.biggest(a,comparer); System.out.println("The longest word is \""+longest+"\""); } }In fact we don't even need to have a separate variable just to hold the
Comparator
object and pass it into the biggest
method, so we could just have the creation of the anonymous object
directly as the argument to the biggest
method, giving us:
import java.util.*; class Longest3 { public static void main(String[] args) throws Exception { Scanner input = new Scanner(System.in); System.out.println("Enter some words (all on one line):"); String line = input.nextLine(); String[] words = line.split(" "); List<String> a = new ArrayList<String>(); for(int i=0; i<words.length; i++) a.add(words[i]); String longest = CollectionOps.biggest(a,new Comparator<String>() { public int compare(String str1,String str2) { return str1.length()-str2.length(); } }); System.out.println("The longest word is \""+longest+"\""); } }It's a matter of taste whether you choose to express it this way. You have to look carefully to see that the call to the method
biggest
extends over eight lines, starting with
String longest = CollectionOps.biggest(a,newand ending with
});with what would previously have been in a separate class now actually inside a method call. In fact, we could go even further than this, and note that since the variable
longest
is only used to
hold the result of the call to the method biggest
and pass
it into the System.out.println
call, we could just have the
method call to biggest
directly where previuously we had the
use of the variable longest
. This gives the code you can
see in Longest4.java
, but now it
maybe has got too convoluted due to trying too much to remove “use once”
names, although it still compiles and executes fine.
As another example of extending Comparator
with an anonymous
class, instead of defining a separate class Scrabble
as
we did previously to find the word from an arrayList with the highest
Scrabble score, we could use an anonymous class to do it. Then we
have to pass the object of this anonymous class
as the Comparator
object argument to the call to the
static method max
from Java's Collections
class,
giving the following code:
import java.util.*; class BestScrabble3 { public static void main(String[] args) throws Exception { Scanner input = new Scanner(System.in); System.out.println("Enter some words (all on one line):"); String line = input.nextLine(); String[] words = line.split(" "); List<String> a = new ArrayList<String>(); for(int i=0; i<words.length; i++) a.add(words[i]); Comparator<String> comparer = new Comparator<String>() { int[] scores = {1,3,3,2,1,4,2,4,1,8,5,1,3,1,1,3,10,1,1,1,1,4,4,8,4,10}; int score(String str) { String str1 = str.toUpperCase(); int score = 0; for(int i=0; i<str1.length(); i++) score+=scores[str1.charAt(i)-'A']; return score; } public int compare(String str1,String str2) { return score(str1)-score(str2); } }; String best = Collections.max(a,comparer); System.out.print("The word with the highest Scrabble score is \""+best); System.out.println("\" with score "+Scrabble.score(best)); } }The code in this anonymous class is like the code in the class
Scrabble
, with the extra method score
which
the method compare
uses, and the array scores
which the method score
uses. However, they cannot be
declared as static
because an anonymous class cannot have
static variables or methods declared inside it.
Last modified: 7 July 2019