Further aspects of Java's Collections Framework

Java Maps

Another interface provided as part of the Java Collections Framework is Map. You can find this documented here. This is a generic type with two type parameters, so its type is given as Map<K,V>. Like List and Set which we saw in the previous set of notes, this interface type is implemented by two main actual types. For Map, the two main implementing types are HashMap and TreeMap.

The idea of an object of type Map<K,V> is that it is a collection of objects of type V indexed by values of type K. In some ways you can think of it as like an arrayList where the components can be indexed by values of any type rather than only by integers. Objects of the indexing type are known as keys. Commonly, the indexing type will be String, but Java allows it to be any type.

Consider that for an object of type ArrayList<E> referenced by a variable of type List<E> called a, the call a.add(i,item), where i is of type int and item is of type E, causes item to be put into the collection referred to by a such that a.get(i) will return item. For an object of type TreeMap<K,V> or HashMap<K,V>, referenced by a variable m of type TreeMap<K,V>, the call m.put(key,item) where key is of type K and item is of type V, causes item to be put into the collection referred to by m such that a.get(key) will return item. One difference between lists and maps is that in lists if an object is added at a particular index, all objects with higher indexes are shifted up one index. With maps, when an item is added with a particular key, that same key will always be used to index it until the item is removed or another item is put in with the same key as its index. For a map referred to by variable m, the method call m.remove(key) will remove the item from the map indexed by key, in the same way that for a list referred to by variable a, the method call a.remove(i) will remove the item from the list indexed by integer i. However, a.remove(i) causes all items with indexes higher than i to be moved down by one place, but m.remove(key) does not change the index of any other item in the map. Also, both m.remove(key) and a.remove(i) return the item that has been removed. However, if there is no item with index key in the map referred to by m, then m.remove(key) returns null and does not change the map. The call a.remove(i) causes an IndexOutOfBoundsException to be thrown if there is no item in the list referred to by a which is indexed by i because i is greater than or equal to the size of the list.

Maps are commonly used to store data where each possible item of data has a unique key which can be used to identify it. For example, a student record system will store records of students with all the data about an individual student being indexed by the student number, since we know there is a different student number for every student. A database for cars would probably use the car registration number (which is actually a combination of digits and alphabetic characters, hence it would be represented by a string not an integer) as the key.

We saw with sets previously, collections of type TreeSet store their elements in order, so that an iterator derived from a TreeSet object will return the elements in order; collections of type HashSet do not store their elements in any order. With maps, collections of type TreeMap store their elements in order of their key (as given by the natural ordering of the key type), while collections of type HashMap do not have any ordering. This is balanced by the operations for HashMap being more efficient than the operations for TreeMap. The reason for these characteristics is given by the underlying representations, trees for TreeSet and TreeMap, hash tables for HashSet and HashMap, but there is not time to discuss these data structures in this course. As with TreeSet, a TreeMap object may be constructed with a comparator object as an argument to the constructor, this will result in a map which is ordered according to the ordering the comparator object gives to the keys which index items in the map, rather than according to the natural ordering of the keys.

It is not possible to produce an iterator directly from a map to show the ordering, but it can be done in two stages. If m refers to a map of type TreeMap<K,V>, then the call m.keySet() returns an object of type Set<K> which gives the keys of the items in the map in their order, and m.values() returns an object of type Collection<V> which gives the items in the map in the order (given by their keys) in which they are stored. An iterator can be obtained from the objects returned from m.values() and m.keySet() to go through the items in them in order. These methods may also be called on objects of type HashMap, but in this case the sets returned would not be in any particular order.

The method entrySet called on a Map object returns a set of key/value pairs. Formally, if m refers to an object of type Map<K,V> (which could be a HashSet or a TreeSet object), then m.entrySet() returns an object of type Set<Map.Entry<K,V>>. This is a set of objects of type Map.Entry<K,V>. This is an example of a nested class, it means that Entry is a type defined within the type Map. If p refers to an item of type Map.Entry<K,V>, then p.getKey() will return the key of the pair, and p.getValue() will return the value of the pair, where the value was put into the original map with the key as its index.

HashMap and TreeMap are implemented using the same data structures as HashSet and TreeSet. This means that HashMap has the advantage of its operations to find or place a particular object with a particular key are O(1), whereas with TreeMap it's O(log N). So HashMap should be used unless we know there are occasions where we need to go through the entire collection of data in key ordering or an ordering given by a comparator. Note that if we generalise our code so that we always use just Map for variable types, and only use HashMap when actually creating a new object (as we must then use an actual type), it is easier to start off using HashMap and change to TreeMao if we find that becomes necessary as it would only be the code that creates the new object which needs to be changed.

Below is an example of a program that uses the Map type. Here it is Map<String,Integer>, a map from strings to integers. If the variable table refers to an object of this type (maps are also often known as "tables"), and str refers to a string, then map.get(str) will return the integer with value n added to the table by table.put(str,n). In the example below, the strings are student exam numbers, and the integers are marks.

import java.util.*;
import java.io.File;

class Marks
{
 public static void main(String[] args) throws Exception
 {
  Scanner screen = new Scanner(System.in);
  System.out.print("Enter file name for marks: ");
  String fileName = screen.nextLine();
  Scanner file = new Scanner(new File(fileName));
  Map<String,Integer> table = new HashMap<String,Integer>();
  while(file.hasNext())
     {
      String examNo = file.next();
      Integer mark = file.nextInt();
      table.put(examNo,mark);
     }
  System.out.print("Do you want to know any mark (yes/no) ? ");
  boolean flag = getYesNo(screen);
  while(flag)
      {
       System.out.print("Enter student number: ");
       String examNo = screen.next();
       Integer mark = table.get(examNo);
       if(mark==null)
          System.out.println("No student with number "+examNo);
       else
          System.out.println(examNo+" has mark "+mark);
       System.out.print("Do you want to know any other mark (yes/no) ? ");
       flag = getYesNo(screen);
      }
 }

 private static boolean getYesNo(Scanner input)
 {
  String word = input.next();
  while(!word.equals("yes")&&!word.equals("no"))
      {
       System.out.print("Please enter \"yes\" or \"no\": ");
       word = input.next();
      }
  return word.equals("yes");
 }
}

Note also this program shows an example of a scanner which reads from a file. The statement:

Scanner file = new Scanner(new File(fileName));

creates a scanner referred to by the variable file which reads, not from the screen, but from the file whose name is given by the value of the string variable fileName. Note it is necessary to have an import statement for the type File from the java.io package to use this. A sample file for use with this program can be found in the file marks.txt. If this file is in the same directory as the Marks.class file being run, then entering marks.txt in response to the prompt for a file name will open a scanner linked to this file. The program expects the file to have the format of a student number (which can actually be an string which does not contain white spaces) followed by an integer. The following are the first three lines in the example file:

AN528   25
AN884   29
CJ653   27

although the program does not test that the student numbers are in this format of two alphabetic characters followed by three numeric ones. The program reads in data from this file, storing marks indexed by exam numbers. It then allows the user to access marks as indexed by exam numbers.

View of collections

The methods unmodifiableList, unmodifiableSet and UnmodifiableCollection provided by class Collections are useful in that they provide immutable copies of the collections they take as arguments. Thus if a is of type List<T> (meaning the object it refers to is of type ArrayList<T> or LinkedList<T>), then

List<T> b = Collections.unmodifiableList(a);

will cause b to refer to an object which appears to be a copy of the object which a refers to. Since b is of stated type List<T> then later code can call on b any method from the class . However, if an attempt is made to execute a call on b of any method which would cause a change to the contents of the list it refers to, the list is unchanged and an UnsupportedOperationException is thrown. This would be useful if we want to pass a copy of the list to another part of the program, but we did not know how that part of the program operated and we wanted to make sure it couldn't change the actual list. So even though Java does not provide a range of extra types or immutable collection objects, such objects can be produced.

Actually, unmodifiableList, unmodifiableSet and UnmodifiableCollection do not produce copies of the original collection, they produce a view of it. This means they share the data with the original collection, so if it is changed, the view changes, So with our declaration of b above, although we cannot make a change to the collection viewed through b, if we make a change to it viewed through a, the collection viewed through b will also change. Here is some code which demonstrates this:

import java.util.*;

class UnmodifiableTreeSetTest
{
 public static void main(String[] args) throws Exception
 {
  Scanner input = new Scanner(System.in);
  System.out.println("Enter some words (all on one line):");
  String line = input.nextLine();
  String[] words = line.split(" ");
  Set<String> a = new TreeSet<String>();
  for(int i=0; i<words.length; i++)
     a.add(words[i]);
  System.out.println("The words are: "+a);
  Set<String> b = Collections.unmodifiableSet(a);
  System.out.println("Unmodifiable view of the words is: "+b);
  System.out.println("Enter a word to delete: ");
  String delWord = input.next().trim();
  a.remove(delWord);
  System.out.println("Changes words to: "+a);
  System.out.println("Unmodifiable view is: "+b);
 }
}

If the code were the same but with b.remove(delWord) instead of a.remove(delWord), it would compile, but when it was run and reached that statement, it would throw an

UnsupportedOperationException.

The method headSet which we mentioned
previously returns a view
of the original set rather than copying. Given s1 of type
TreeSet<E> and item of type
E, then the call s2=s1.headSet(item) sets s2
to a view of all items in the set referenced by s1 which
are less than item. This is according to the ordering of
this set, which could be the natural ordering of E or
some other ordering given by a comparator argument when the set was constructed.
But as it is a view, if a new item less than item is added to
the set referenced by s1, it will also be added to the set
referenced by s2. Here is code that demonstrates this:
import java.util.*;

class HeadSetTest
{
 public static void main(String[] args) throws Exception
 {
  Scanner input = new Scanner(System.in);
  System.out.println("Enter some words (all on one line):");
  String line = input.nextLine();
  String[] words = line.split(" ");
  SortedSet<String> s1 = new TreeSet();
  for(String word : words)
     s1.add(word);
  System.out.println("The words are: "+s1);
  System.out.println("Enter a word to split at: ");
  String splitWord = input.next().trim();
  SortedSet<String> s2 = s1.headSet(splitWord);
  System.out.println("The words before this word are: "+s2);
  System.out.println("Enter a word to add: ");
  String addWord = input.next().trim();
  s1.add(addWord);
  System.out.println("The words are now: "+s1);
  System.out.println("The words before the split are now: "+s2);
 }
}

Here the variables s1 and s2 are given as
type SortedSet<String> rather than
TreeSet<String>, as SortedSet is an
interface type which gives the operations needed for a sorted set
which are actually implemented by TreeSet. Note also this
code demonstrates a for-each loop, the lines
  for(String word : words)
     s1.add(word);

mean the same as
    for(int i=0; i<words.length; i++)     
       a.add(words[i]);

but do not explictly rely on words being an array and do not
explicitly use any indexing on words.

Another operation returning a view of a collection is subList,
which you can find documented
here.
It is found in the interface List so may be applied to objects of
actual type ArrayList and LinkedList. If ls
is a variable of type List<E> and start and
finish are of type int, then 
ls.subList(start,finish) returns an object of type 
List<E> which contains all those
objects from the original list starting at the one indexed by start
and going up to but not including the one indexed by finish.
Since this is a view of the original list, any change made to it will also
change the original list. If the original list is changed in the portion
viewed by the sublist, the sublist object will change in the same way.
The following code demonstrates this:
import java.util.*;

class SubListTest
{
 public static void main(String[] args) throws Exception
 {
  Scanner input = new Scanner(System.in);
  System.out.println("Enter some words (all on one line):");
  String line = input.nextLine();
  String[] words = line.split(" ");
  List<String> a = new ArrayList<String>();
  for(String word : words)
     a.add(word);
  int count=0;
  System.out.println("The words are: ");
  for(String str : a)
      {
       System.out.println(count+": "+str);
       count++;
      }
  System.out.print("Enter a start index: ");
  int start = input.nextInt();
  System.out.print("Enter a finish index: ");
  int finish = input.nextInt();
  List<String> sublist = a.subList(start,finish);
  System.out.println("The sublist in this range is: ");
  count=0;
  for(String str : sublist)
      {
       System.out.println(count+": "+str);
       count++;
      }
  System.out.print("Enter a position in this sublist you wish to add at: ");
  int addPos = input.nextInt();
  System.out.print("Enter a word you wish to add: ");
  String word = input.next();
  sublist.add(addPos,word);
  System.out.println("The sublist is now: ");
  count=0;
  for(String str : sublist)
      {
       System.out.println(count+": "+str);
       count++;
      }
  System.out.println("The full list is now: ");
  count=0;
  for(String str : a)
      {
       System.out.println(count+": "+str);
       count++;
      }
 }
}

If you run this, you will see that adding an item to the list referred
to by sublist, which is a portion of the initial list referred to
by a, causes both lists to change, due to sublist
being a view of a rather than an independent object.

It is also possible to get a list which is a view of an array. The static
method asList in the class Arrays (as mentioned
previously,
this is a class which provides some routine operations on arrays expressed
as static methods). If a is of type T[] then
Array.asList(a) returns an object of type List<T>,
which is a view of a, that is changes to the list will change
the array, and vice versa. Here is some code which demonstrates this:
import java.util.*;

class AsListTest1
{
 public static void main(String[] args) throws Exception
 {
  Scanner input = new Scanner(System.in);
  System.out.println("Enter some words (all on one line):");
  String line = input.nextLine();
  String[] words = line.split(" ");
  List<String> list = Arrays.asList(words);
  System.out.println("The words viewed as a list are: ");
  int count=0;
  for(String str: list)
     {
      System.out.println(count+": "+str);
      count++;
     }
  System.out.print("Enter a position: ");
  int pos = input.nextInt();
  System.out.print("Enter a word: ");
  String word = input.next();
  words[pos] = word;
  System.out.println("After changing word at position in array, list is: ");
  count=0;
  for(String str: list)
     {
      System.out.println(count+": "+str);
      count++;
     }
 }
}

Here, the array is changed, and you will find it means the view as
a list gets changed as well. If the list is changed, the array
will change, but only if the list is changed in a way that does not
change its length. An add operation on
the list changes its length, but if you attempted an add
operation on a list that was a view of an array, it would not change the
length of the array. Here is an example:
import java.util.*;

class AsListTest2
{
 public static void main(String[] args) throws Exception
 {
  Scanner input = new Scanner(System.in);
  System.out.println("Enter some words (all on one line):");
  String line = input.nextLine();
  String[] words = line.split(" ");
  List list = Arrays.asList(words);
  System.out.println("The words viewed as a list are: ");
  int count=0;
  for(String str: list)
     {
      System.out.println(count+": "+str);
      count++;
     }
  System.out.print("Enter a position: ");
  int pos = input.nextInt();
  System.out.print("Enter a word: ");
  String word = input.next();
  list.add(pos,word);
  System.out.println("After adding word at position in list, array is: ");
  for(int i=0; i<words.length; i++)
      System.out.println(i+": "+words[i]);
 }
}

If you attempt to run this, the call list.add(pos,word)
will cause an UnsupportedOperationException to be thrown.
However, if you change the statement list.add(pos,word) to
list.set(pos,word) it will work. The reason for this is that
list.add(pos,word) adds a new word to the list, so it changes
the size of the list, whereas list.set(pos,word) does not change
the size of the list, it just changes the word stored at a particular position to
another word. You will see this causes the array of which the list is a view
to change the word at the given position.


Matthew Huntbach

Last modified: 17 November 2005