Algorithms and Data Structures in an Object-Oriented Framework (“ADSOOF”)

Further aspects of Java's Collections Framework

Java Maps

Another interface provided as part of the Java Collections Framework is Map. You can find this documented here. This is a generic type with two type parameters, so its type is given as Map<K,V>. Like List and Set which we saw in the previous set of notes, this interface type is implemented by two main actual types. For Map, the two main implementing types are HashMap and TreeMap.

The idea of an object of type Map<K,V> is that it is a collection of objects of type V indexed by values of type K. In some ways you can think of it as like an arrayList where the components can be indexed by values of any type rather than only by integers. Objects of the indexing type are known as keys. Commonly, the indexing type will be String, but Java allows it to be any object type.

Consider that for an object of type ArrayList<E> referenced by a variable of type List<E> called a, the call a.add(i,item), where i is of type int and item is of type E, causes item to be put into the collection referred to by a such that a.get(i) will return item. For an object of type TreeMap<K,V> or HashMap<K,V>, referenced by a variable m of type TreeMap<K,V>, the call m.put(key,item) where key is of type K and item is of type V, causes item to be put into the collection referred to by m such that a.get(key) will return item. One difference between lists and maps is that in lists if an object is added at a particular index, all objects with higher indexes are shifted up one index. With maps, when an item is added with a particular key, that same key will always be used to index it until the item is removed or another item is put in with the same key as its index. For a map referred to by variable m, the method call m.remove(key) will remove the item from the map indexed by key, in the same way that for a list referred to by variable a, the method call a.remove(i) will remove the item from the list indexed by integer i. However, a.remove(i) causes all items with indexes higher than i to be moved down by one place, but m.remove(key) does not change the index of any other item in the map. Also, both m.remove(key) and a.remove(i) return the item that has been removed. If there is no item with index key in the map referred to by m, then m.remove(key) returns null and does not change the map. The call a.remove(i) causes an IndexOutOfBoundsException to be thrown if there is no item in the list referred to by a which is indexed by i because i is greater than or equal to the size of the list. Similarly, m.get(key) returns null if there is no item with index key in the map referred to by m.

Maps are commonly used to store data where each possible item of data has a unique key which can be used to identify it. For example, a student record system will store records of students with all the data about an individual student being indexed by the student number, since we know there is a different student number for every student. A database for cars would probably use the car registration number (which is actually a combination of digits and alphabetic characters, hence it would be represented by a string not an integer) as the key.

We saw with sets previously, collections of type TreeSet store their elements in order, so that an iterator derived from a TreeSet object will return the elements in order; collections of type HashSet do not store their elements in any order. With maps, collections of type TreeMap store their elements in order of their key (as given by the natural ordering of the key type), while collections of type HashMap do not have any ordering. This is balanced by the operations for HashMap being more efficient than the operations for TreeMap. The reason for these characteristics is given by the underlying representations, trees for TreeSet and TreeMap, hash tables for HashSet and HashMap, but there is not time to discuss these data structures in this course. As with TreeSet, a TreeMap object may be constructed with a Comparator object as an argument to the constructor, this will result in a map which is ordered according to the ordering the comparator object gives to the keys which index items in the map, rather than according to the natural ordering of the keys.

It is not possible to produce an iterator directly from a map to show the ordering, but it can be done in two stages. If m refers to a map of type TreeMap<K,V>, then the call m.keySet() returns an object of type Set<K> which gives the keys of the items in the map in their order, and m.values() returns an object of type Collection<V> which gives the items in the map in the order (given by their keys) in which they are stored. An iterator can be obtained from the objects returned from m.values() and m.keySet() to go through the items in them in order. These methods may also be called on objects of type HashMap, but in this case the sets returned would not be in any particular order.

The method entrySet called on a Map object returns a set of key/value pairs. Formally, if m refers to an object of type Map<K,V> (which could be a HashSet or a TreeSet object), then m.entrySet() returns an object of type Set<Map.Entry<K,V>>. This is a set of objects of type Map.Entry<K,V>. This is an example of a nested class, it means that Entry is a type defined within the type Map. If p refers to an item of type Map.Entry<K,V>, then p.getKey() will return the key of the pair, and p.getValue() will return the value of the pair, where the value was put into the original map with the key as its index.

HashMap and TreeMap are implemented using the same data structures as HashSet and TreeSet. This means that HashMap has the advantage of its operations to find or place a particular object with a particular key being O(1), whereas with TreeMap they're O(log N). So HashMap should be used unless we know there are occasions where we need to go through the entire collection of data in key ordering or an ordering given by a comparator. Note that if we generalise our code so that we always use just Map for variable types, and only use HashMap when actually creating a new object (as we must then use an actual type), it is easier to start off using HashMap and change to TreeMap if we find that becomes necessary as it would only be the code that creates the new object which needs to be changed.

Below is an example of a program that uses the Map type. Here it is Map<String,Integer>, a map from strings to integers. If the variable table refers to an object of this type (maps are also often known as “tables”), and str refers to a string, then map.get(str) will return the integer with value n added to the table by table.put(str,n). In the example below, the strings are student exam numbers, and the integers are marks.

import java.util.*;
import java.io.File;

class Marks
{
 public static void main(String[] args) throws Exception
 {
  Scanner screen = new Scanner(System.in);
  System.out.print("Enter file name for marks: ");
  String fileName = screen.nextLine();
  Scanner file = new Scanner(new File(fileName));
  Map<String,Integer> table = new HashMap<String,Integer>();
  while(file.hasNext())
     {
      String examNo = file.next();
      Integer mark = file.nextInt();
      table.put(examNo,mark);
     }
  System.out.print("Do you want to know any mark (yes/no) ? ");
  boolean flag = getYesNo(screen);
  while(flag)
      {
       System.out.print("Enter student number: ");
       String examNo = screen.next();
       Integer mark = table.get(examNo);
       if(mark==null)
          System.out.println("No student with number "+examNo);
       else
          System.out.println(examNo+" has mark "+mark);
       System.out.print("Do you want to know any other mark (yes/no) ? ");
       flag = getYesNo(screen);
      }
 }

 private static boolean getYesNo(Scanner input)
 {
  String word = input.next();
  while(!word.equals("yes")&&!word.equals("no"))
      {
       System.out.print("Please enter \"yes\" or \"no\": ");
       word = input.next();
      }
  return word.equals("yes");
 }
}

Note also this program shows an example of a scanner which reads from a file. The statement:

Scanner file = new Scanner(new File(fileName));

creates a scanner referred to by the variable file which reads, not from the screen, but from the file whose name is given by the value of the string variable fileName. Note it is necessary to have an import statement for the type File from the java.io package to use this. A sample file for use with this program can be found in the file marks.txt. If this file is in the same directory as the Marks.class file being run, then entering marks.txt in response to the prompt for a file name will open a scanner linked to this file. The program expects the file to have the format of a student number (which can actually be an string which does not contain white spaces) followed by an integer. The following are the first three lines in the example file:

AN528   25
AN884   29
CJ653   27

although the program does not test that the student numbers are in this format of two alphabetic characters followed by three numeric ones. The program reads in data from this file, storing marks indexed by exam numbers. It then allows the user to access marks as indexed by exam numbers.

Views of collections

The methods unmodifiableList, unmodifiableSet and unmodifiableCollection provided by class Collections are useful in that they provide what appear to be immutable copies of the collections they take as arguments. Thus if a is of type List<T> (meaning the object it refers to is of type ArrayList<T> or LinkedList<T>), then

List<T> b = Collections.unmodifiableList(a);

will cause b to refer to an object which appears to be a copy of the object which a refers to. Since b is of stated type List<T> then later code can call on b any method from the interface <List>. However, if an attempt is made to execute a call on b of any method which would cause a change to the contents of the list it refers to, the list is unchanged and an UnsupportedOperationException is thrown. This would be useful if we want to pass a copy of the list to another part of the program, but we did not know how that part of the program operated and we wanted to make sure it couldn't change the actual list. So even though Java does not provide a range of extra types or immutable collection objects, such objects can be produced.

Actually, unmodifiableList, unmodifiableSet and unmodifiableCollection do not produce copies of the original collection, they produce a view of it. This means they share the data with the original collection, so if it is changed, the view changes, So with our declaration of b above, although we cannot make a change to the collection viewed through b, if we make a change to it viewed through a, the collection viewed through b will also change. This means that it is not really correct to describe the objects returned by these methods as “immutable”, because that suggests they can never change. They can change, but not through a call on the actual object returned. So they are described as “unmodifiable views” of a modifiable object.

Here is some code which demonstrates this:

import java.util.*;

class UnmodifiableTreeSetTest
{
 public static void main(String[] args) throws Exception
 {
  Scanner input = new Scanner(System.in);
  System.out.println("Enter some words (all on one line):");
  String line = input.nextLine();
  String[] words = line.split(" ");
  Set<String> a = new TreeSet<String>();
  for(int i=0; i<words.length; i++)
     a.add(words[i]);
  System.out.println("The words are: "+a);
  Set<String> b = Collections.unmodifiableSet(a);
  System.out.println("Unmodifiable view of the words is: "+b);
  System.out.println("Enter a word to delete: ");
  String delWord = input.next().trim();
  a.remove(delWord);
  System.out.println("Changes words to: "+a);
  System.out.println("Unmodifiable view is: "+b);
 }
}

If the code were the same but with b.remove(delWord) instead of a.remove(delWord), it would compile, but when it was run and reached that statement, it would throw an UnsupportedOperationException.

The method headSet which we mentioned previously returns a view of the original set rather than copying. Given s1 of type TreeSet<E> and item of type E, then the call s2=s1.headSet(item) sets s2 to a view of all items in the set referenced by s1 which are less than item. This is according to the ordering of this set, which could be the natural ordering of E or some other ordering given by a comparator argument when the set was constructed. But as it is a view, if a new item less than item is added to the set referenced by s1, it will also be added to the set referenced by s2. Here is code that demonstrates this:

import java.util.*;

class HeadSetTest
{
 public static void main(String[] args) throws Exception
 {
  Scanner input = new Scanner(System.in);
  System.out.println("Enter some words (all on one line):");
  String line = input.nextLine();
  String[] words = line.split(" ");
  SortedSet<String> s1 = new TreeSet();
  for(String word : words)
     s1.add(word);
  System.out.println("The words are: "+s1);
  System.out.println("Enter a word to split at: ");
  String splitWord = input.next().trim();
  SortedSet<String> s2 = s1.headSet(splitWord);
  System.out.println("The words before this word are: "+s2);
  System.out.println("Enter a word to add: ");
  String addWord = input.next().trim();
  s1.add(addWord);
  System.out.println("The words are now: "+s1);
  System.out.println("The words before the split are now: "+s2);
 }
}

Here the variables s1 and s2 are given as type SortedSet<String> rather than TreeSet<String>, as SortedSet is an interface type which gives the operations needed for a sorted set which are actually implemented by TreeSet. Note also this code demonstrates a for-each loop, the lines

  for(String word : words)
     s1.add(word);

mean the same as

    for(int i=0; i<words.length; i++)     
       a.add(words[i]);

but do not explictly rely on words being an array and do not explicitly use any indexing on words.

Another operation returning a view of a collection is subList, which you can find documented here. It is found in the interface List so may be applied to objects of actual type ArrayList and LinkedList. If ls is a variable of type List<E> and start and finish are of type int, then ls.subList(start,finish) returns an object of type List<E> which contains all those objects from the original list starting at the one indexed by start and going up to but not including the one indexed by finish. Since this is a view of the original list, any change made to it will also change the original list. If the original list is changed in the portion viewed by the sublist, the sublist object will change in the same way. The following code demonstrates this:

import java.util.*;

class SubListTest
{
 public static void main(String[] args) throws Exception
 {
  Scanner input = new Scanner(System.in);
  System.out.println("Enter some words (all on one line):");
  String line = input.nextLine();
  String[] words = line.split(" ");
  List<String> a = new ArrayList<String>();
  for(String word : words)
     a.add(word);
  int count=0;
  System.out.println("The words are: ");
  for(String str : a)
      {
       System.out.println(count+": "+str);
       count++;
      }
  System.out.print("Enter a start index: ");
  int start = input.nextInt();
  System.out.print("Enter a finish index: ");
  int finish = input.nextInt();
  List<String> sublist = a.subList(start,finish);
  System.out.println("The sublist in this range is: ");
  count=0;
  for(String str : sublist)
      {
       System.out.println(count+": "+str);
       count++;
      }
  System.out.print("Enter a position in this sublist you wish to add at: ");
  int addPos = input.nextInt();
  System.out.print("Enter a word you wish to add: ");
  String word = input.next();
  sublist.add(addPos,word);
  System.out.println("The sublist is now: ");
  count=0;
  for(String str : sublist)
      {
       System.out.println(count+": "+str);
       count++;
      }
  System.out.println("The full list is now: ");
  count=0;
  for(String str : a)
      {
       System.out.println(count+": "+str);
       count++;
      }
 }
}

If you run this, you will see that adding an item to the list referred to by sublist, which is a portion of the initial list referred to by a, causes both lists to change, due to sublist being a view of a rather than an independent object.

An issue with the subList method is that since its return type is List<E>, you cannot expect a variable of type ArrayList<E> or LinkedList<E> to be set without casting to refer to the result of a call of it, even if that call is on a variable of the same type. For example, if you had a declaration ArrayList<String> a, and later had an assignment a=a.subList(0,a.size()-1) which you expected to reset a to the list of all except its last element, you might be surprised to get a compiler error saying that an ArrayList<String> a was expected but a List<String> a was found. That is because the return type of subList even when called on an ArrayList object is List with the element type of that object, and not ArrayList with the element type. This is an indication of why variables should always be declared as of the interface type. There would be no problem if the declaration of a was List<String> a, it can still be set to an object of type ArrayList<String>, but why limit it to that when it is more general and avoids this and other problems to declare it of the interface type?

It is also possible to get a list which is a view of an array. The static method asList in the class Arrays (as mentioned previously, this is a class which provides some routine operations on arrays expressed as static methods) does this. If a is of type T[] then Array.asList(a) returns an object of type List<T>, which is a view of a, that is changes to the list will change the array, and vice versa. Here is some code which demonstrates this:

import java.util.*;

class AsListTest1
{
 public static void main(String[] args) throws Exception
 {
  Scanner input = new Scanner(System.in);
  System.out.println("Enter some words (all on one line):");
  String line = input.nextLine();
  String[] words = line.split(" ");
  List<String> list = Arrays.asList(words);
  System.out.println("The words viewed as a list are: ");
  int count=0;
  for(String str: list)
     {
      System.out.println(count+": "+str);
      count++;
     }
  System.out.print("Enter a position: ");
  int pos = input.nextInt();
  System.out.print("Enter a word: ");
  String word = input.next();
  words[pos] = word;
  System.out.println("After changing word at position in array, list is: ");
  count=0;
  for(String str: list)
     {
      System.out.println(count+": "+str);
      count++;
     }
 }
}

Here, the array is changed, and you will find it means the view as a list gets changed as well. If the list is changed, the array will change, but only if the list is changed in a way that does not change its length. An add operation on the list changes its length, but if you attempted an add operation on a list that was a view of an array, it would not change the length of the array. Here is an example:

import java.util.*;

class AsListTest2
{
 public static void main(String[] args) throws Exception
 {
  Scanner input = new Scanner(System.in);
  System.out.println("Enter some words (all on one line):");
  String line = input.nextLine();
  String[] words = line.split(" ");
  List list = Arrays.asList(words);
  System.out.println("The words viewed as a list are: ");
  int count=0;
  for(String str: list)
     {
      System.out.println(count+": "+str);
      count++;
     }
  System.out.print("Enter a position: ");
  int pos = input.nextInt();
  System.out.print("Enter a word: ");
  String word = input.next();
  list.add(pos,word);
  System.out.println("After adding word at position in list, array is: ");
  for(int i=0; i<words.length; i++)
      System.out.println(i+": "+words[i]);
 }
}

If you attempt to run this, the call list.add(pos,word) will cause an UnsupportedOperationException to be thrown. However, if you change the statement list.add(pos,word) to list.set(pos,word) it will work. The reason for this is that list.add(pos,word) adds a new word to the list, so it changes the size of the list, whereas list.set(pos,word) does not change the size of the list, it just changes the word stored at a particular position to another word. You will see this causes the array of which the list is a view to change the word at the given position.

Anonymous classes

Previously, we saw how we could create our own comparator object which we could pass into a method to tell the method how it is to order objects it is dealing with. So we could pass in an object of type LengthComparer which implemented Comparator<String>; it would need to have a method called compare which took two String objects as its arguments, and returned an int value. This was defined in a separate class which extended the interface type Comparator with its type argument set to String.

It may seem a nuisance to have a large number of classes, each of which exists just to be used in one place, and is short, consisting of just a short method. Actually, the Java language recognises this, and allows us to avoid having to create a class if it is only going to be used once. The way it does this is the idea of an anonymous class. An object of an anonymous class is created by the word new followed by the name of the class or interface it extends and the () to make it a constructor call, followed by the code it extends it with enclosed within { and } brackets. So instead of having, as we did previously, the separate class LengthComparer, we could replace the call which created an object of that type:

Comparator<String> comparer = new LengthComparer();

by the creation of an anonymous object which is defined as extending Comparator<String> by the method which was previously in that separate class. This gives the following code:

import java.util.*;

class Longest2
{
 public static void main(String[] args) throws Exception
 {
  Scanner input = new Scanner(System.in);
  System.out.println("Enter some words (all on one line):");
  String line = input.nextLine();
  String[] words = line.split(" ");
  List<String> a = new ArrayList<String>();
  for(int i=0; i<words.length; i++)
     a.add(words[i]);
  Comparator<String> comparer = new 
    Comparator<String>()
       {
        public int compare(String str1,String str2)
        {
         return str1.length()-str2.length();
        }
       };
  String longest = CollectionOps.biggest(a,comparer);
  System.out.println("The longest word is \""+longest+"\"");
 }
}

In fact we don't even need to have a separate variable just to hold the Comparator object and pass it into the biggest method, so we could just have the creation of the anonymous object directly as the argument to the biggest method, giving us:

import java.util.*;

class Longest3
{
 public static void main(String[] args) throws Exception
 {
  Scanner input = new Scanner(System.in);
  System.out.println("Enter some words (all on one line):");
  String line = input.nextLine();
  String[] words = line.split(" ");
  List<String> a = new ArrayList<String>();
  for(int i=0; i<words.length; i++)
     a.add(words[i]);
  String longest = CollectionOps.biggest(a,new
    Comparator<String>()
       {
        public int compare(String str1,String str2)
        {
         return str1.length()-str2.length();
        }
       });
  System.out.println("The longest word is \""+longest+"\"");
 }
}

It's a matter of taste whether you choose to express it this way. You have to look carefully to see that the call to the method biggest extends over eight lines, starting with

  String longest = CollectionOps.biggest(a,new

and ending with

});

with what would previously have been in a separate class now actually inside a method call. In fact, we could go even further than this, and note that since the variable longest is only used to hold the result of the call to the method biggest and pass it into the System.out.println call, we could just have the method call to biggest directly where previuously we had the use of the variable longest. This gives the code you can see in Longest4.java, but now it maybe has got too convoluted due to trying too much to remove “use once” names, although it still compiles and executes fine.

As another example of extending Comparator with an anonymous class, instead of defining a separate class Scrabble as we did previously to find the word from an arrayList with the highest Scrabble score, we could use an anonymous class to do it. Then we have to pass the object of this anonymous class as the Comparator object argument to the call to the static method max from Java's Collections class, giving the following code:

import java.util.*;

class BestScrabble3
{
 public static void main(String[] args) throws Exception
 {
  Scanner input = new Scanner(System.in);
  System.out.println("Enter some words (all on one line):");
  String line = input.nextLine();
  String[] words = line.split(" ");
  List<String> a = new ArrayList<String>();
  for(int i=0; i<words.length; i++)
     a.add(words[i]);
  Comparator<String> comparer = new
   Comparator<String>()
     {
      int[] scores = {1,3,3,2,1,4,2,4,1,8,5,1,3,1,1,3,10,1,1,1,1,4,4,8,4,10};

      int score(String str)
      {
       String str1 = str.toUpperCase();
       int score = 0;
       for(int i=0; i<str1.length(); i++)
          score+=scores[str1.charAt(i)-'A'];
       return score;
      }

      public int compare(String str1,String str2)
      {
       return score(str1)-score(str2);
      }
    };

  String best = Collections.max(a,comparer);
  System.out.print("The word with the highest Scrabble score is \""+best);
  System.out.println("\" with score "+Scrabble.score(best));
 }
}

The code in this anonymous class is like the code in the class Scrabble, with the extra method score which the method compare uses, and the array scores which the method score uses. However, they cannot be declared as static because an anonymous class cannot have static variables or methods declared inside it.

Matthew Huntbach

Last modified: 7 July 2019