Algorithms and Data Structures in an Object-Oriented Framework (“ADSOOF”)

`String` and related classes in Java

Strings in Java

You have already seen the idea of String as a type in Java. Since we use String values from early on just as names and so on, it is easy to forget that a value of type String in Java is actually an object. So values of type String are like values of type DrinksMachine which we saw previously: you can create new objects of type String using the constructors in class String, and you can call the non-static methods in class String with the calls attached to references to String objects.

However, DrinksMachine is not a built-in type of Java, to use it you had to have a copy of the file DrinksMachine.class in your directory. The class String is provided as part of Java, it is always available for you to use in any Java program you write. To use most types provided with Java you have to use an import statement (we mentioned these briefly here), but a few are regarded as so common that you don't even have to import them, String is one of these (the others are all those in the java.lang package). In fact every type in Java is an object type, with the exception of the numerical types, int, double and so on, the type boolean and the type char. These non-object types are known as the primitive types.

One reason you may not have thought of strings as objects is that, unlike any other type of object, there is a special way of representing string values, that is the representation using the quote characters enclosing the characters that make up the string. You can also make new string objects by Java's use of + to mean string joining. So there is little need to create new strings using a constructor. However, for example, there is a constructor in class String which takes an array of characters as its argument and produces a string where each character in the string is equal to the character in the same position in the array. So if a is a variable of type char[], and has been set to refer to an array object, then new String(a) will return the equivalent string.

A representation of a particular value, as opposed to a variable that holds a value is termed a literal value. So, for example, "hello" is the literal value for a string of length 5, first character 'h', second character 'e' and so on. Note the distinction between use of the double quote symbol for literal values of strings and the single quote symbol for literal values of characters. In Java, a character is a primitive value, not an object, although as mentioned below every primitive type has an equivalent object type. So, for example, 'x' represents the character, while "x" represents an object of type String with a length of 1 and the character at position 0 being 'x'. Java's primitive type char stores characters as their integer values in Unicode.

Strings in Java (unlike, for example, the language C) are not the same thing as arrays of characters. If we have a variable a of type char[] and a variable str of type String, then for any expression i which evaluates to an integer, a[i] gives the ith character of the array referred to by a, while str.charAt(i) gives the ith character of the string referred to by str. As we saw here, we can use a[i] as a variable, so we can put it on the right hand side of assignment statement and assign a value to it. For example, a[3]='x' will change the fourth character of the array referred to by a (remembering the first character is indexed by 0) to lower-case 'x'. However, str.charAt(i) is a method call, and you cannot assign a value to a method call. The Java compiler will give a error message if you attempted to compile code with the statement str.charAt(3)='x' in it. So string objects can't be changed in the way array objects can be changed, in fact string objects are immutable: they cannot be changed at all, you can only create new ones.

A minor but perhaps rather annoying difference between arrays and strings in Java is that the length of an array a is given by a.length and the length of a string str is given by str.length(). The difference is the () at the end of str.length() which indicates this is just a method call, the zero-argument method length, called on the object str.

Methods in class `String` in Java

When dealing with the libraries provided with Java, you can find full documentation of all the methods they provide on the official web site provided by the Oracle company here. The documentation for the class String is provided here. You should be aware that as this is the official documentation, it has to cover every aspect, including rarely used methods and methods which only make sense when you have become familiar with more advanced aspects of Java. It would be a good idea to get used to using this official documentation, but for the purposes of this module don't feel you have to become familiar with the vast range of classes and methods that are available. We will cover just a small number of them in the module, either because they are particularly fundamental, or because they illustrate an important point. Remember, the main purpose of this module is to introduce concepts of algorithms and data structures rather than to concentrate on detailed aspects of the Java language.

As an example, the method replace acts in a similar way to the method change we discussed previously with arrays. It takes a string and two characters and returns the result of replacing all occurrences of the first character by the second character in the string. It does this constructively rather than destructively, that is it returns a new string representing the change, it does not change the string it is called on. In fact the class String has no destructive methods, this ensures that String objects are immutable, there is no method that can be called on them to change them, there are only methods that can be called on them to return new strings representing a change.

With the constructive method change we developed for arrays of integers, if a1 and a2 were of type int[] and n1 and n2 were of type int then the statement a2=change(a1,n1,n2) causes a2 to be set to an array which represents the effect of changing all occurrences of the integer n1 to the integer n2 in the array a1. With the method replace in class String, if str1 and str2 are variables of type String and ch1 and ch2 are variables of type char, then the statement str2=str1.replace(ch1,ch2) causes str2 to be set to a string which represents the effect of changing all occurrences of the character ch1 to the character ch2 in the string str1. For example, if str1 holds a reference to the string "happy", and ch1 hold the character 'p' and ch2 holds the character 'r', then str1.replace(ch1,ch2) will return a reference to the string "harry".

The difference in construct between str1.replace(ch1,ch2) and change(a1,n1,n2) is because our change was a static method while the replace method in class String is not a static method. So the array being changed had to be passed as one of the arguments to the method change, and this method was not called attached to any object. The method replace has to be called attached to a String object, which is the object to be changed, and this is done instead of the object being passed as one of the arguments to the method call.

Other useful methods in class String which return strings based on transforming the string they are called on include:

trim returns the string with all blank characters at either end removed. So if str represents " Hello World " then the call str.trim() will return the string "Hello World".
toUpperCase returns the string with all lower case letters replaced by the equivalent upper case letters. So if str represents "Hello World" then str.toUpperCase() will return the string "HELLO WORLD".
toLowerCase returns the string with all upper case letters replaced by the equivalent lower case letters. So if str represents "Hello World" then str.toLowerCase() will return the string "hello world".
substring takes two integers as arguments and returns the portion of the string it is called on which starts with the character indexed by the first integer and goes up to but does not include the character indexed by the second integer. So if str represents "Hello world", and p1 stores 3 and p2 stores 8 then str.substring(p1,p2) will return "lo wo". A second version of substring takes just one integer argument and returns the portion of the string indexed by that integer argument up to the end of the string. So with the same value of str and with p storing 4, the call str.substring(p) would return "o world".
format is a static method which is used for formatting⁽¹⁾.

If these methods did not already exist in the code library provided as part of Java, we could write our own methods to do the same tasks. They would have to be static methods, however, since we cannot add new methods to classes already provided by Java (in a sense we can when we come to the object-oriented concept of inheritance, although Java declares the class String to be final meaning it can't be extended by inheritance anyway). Here is an example of a static method which has the effect of changing one character to another in a string:

static String replace(String str1,char ch1,char ch2)
// Returns the string resulting from replacing all occurrences
// of ch1 by ch2 in str.
{
 String str2="";
 for(int i=0; i<str1.length(); i++)
    if(str1.charAt(i)==ch1)
       str2=str2+ch2;
    else
       str2=str2+str1.charAt(i);
 return str2;
}

So if str references "happy" and c1 stores 'p' and c2 stores 'r' then replace(str,c1,c2) will return "harry".

Where methods to do a particular job already exist in the Java code library, it makes sense to use them rather than write our own method to do the job. Not only does it save us the time and effort, it makes our code easier to follow by those already familiar with the Java code library, and we can be sure that the distributors of Java have implemented their methods in the best way possible and they will be efficient and free of errors. In some cases in this module, however, we will show a method that does a job where there is already a method in the Java library to do the job. This will be done because looking at the code for the method ourselves helps develop coding skills and an understanding of algorithms and data structures. It should not be regarded as good practice in general to write our own code when we can call code that already exists in the Java library. When studying at university, however, do not confuse software re-use with plagiarism.

The method indexOf in class String takes a character and returns the position of the lowest indexed occurrence of that character in the string it is called on, or -1 if the character does not occur in the string. So it is very similar to the method we gave previously for finding the position of an integer in an array of integers. Once again, as it is not a static method, str.indexOf(ch) gives the position of character ch in string str whereas with our static method for arrays of integers, position(a,n) gave the position of integer n in array a. A separate method lastIndexOf in class String gives the position of the highest indexed occurrence of a character in a string. So if str is "Hello World" then str.indexOf('o') returns 4 and str.lastIndexOf('o') returns 7.

The methods startsWith and endsWith both take string arguments and are called on strings, and return boolean values as indicated by their names. So str1.startsWith(str2) returns true if str1 starts with str2 and false otherwise, and str1.endsWith(str2) returns true if str1 ends with str2 and false otherwise. So if str1 is "Hello World", str2 is "Hell" and str3 is "rld", then str1.startsWith(str2) evaluates to true, str1.startsWith(str3) evaluates to false, str1.endsWith(str3) evaluates to true and so on.

An important method in class String is compareTo, which like startsWith and endsWith takes another string as an argument. It is used to compare strings alphabetically, so str1.compareTo(str2) says whether str1 comes before str2 in standard alphabetic ordering or not. You might suppose that compareTo would return a boolean value, but its return type is actually int. This is so it can be used to distinguish three possibilities from the call str1.compareTo(str2): either str1 is before str2 alphabetically, or str2 is before str1 alphabetically, or str1 and str2 are equal. If str1 and str2 are equal, the call str1.compareTo(str2) returns 0. It returns a negative integer if str1 is before str2 alphabetically, and a positive integer if str1 is after str2 alphabetically. You can find the complete definition of the String method compareTo in the Java documentation here. Many other classes also have their own compareTo method which works similarly to the one in class String and enables objects of their class to be considered in some sort of ordering with each other. If an object is of a class that has its own compareTo method, it has what is termed natural order, which we will look at in more detail later.

The method compareTo can only be called on an object of a class that has that method, so it is not like the method equals that can be called on any object. The method equals can be called on any object because it is in the class Object, so if a class does not have its own equals method, it inherits the one from class Object. The class String has its own version of the method equals, see below for more details on this, and here for the issue of inheriting or overriding methods from Object.

The reverse of the constructor for class String which takes an array of strings as its argument and produces the equivalent String object is the method toCharArray. If str references a string, then str.toCharArray() returns an object of type char[], that is an array of characters of the same length as the string, where the characters in the array are the same and indexed in the same order as the characters in the string.

You can find code which demonstrates some of these methods in the code folder. See the files StringTest1.java, StringTest2.java, StringTest3.java, StringTest4.java, StringTest5.java, StringTest6.java and StringTest7.java. There is also a list of the code files used as examples in this section, with a brief explanation of the relevance of each in the code index for this section.

Strings and equality

Since strings are represented by objects in Java, if you want to test that two variables holding strings store the same strings, you should use the method equals and not the symbol ==. Remember that if v1 and v2 are two variables, v1==v2 gives true only if v1 and v2 are aliases, that is they are both set to refer to the same object. However, v1.equals(v2) may give true even if they refer to different objects. It depends how the method equals is implemented for the class of the object that v2 refers to. With class String it is implemented so that if str1 and str2 are both of class String then str1.equals(str2) will give true not only if they are aliases, but also if they refer to different String objects, but the lengths of the two different String objects are the same, and in each position they have the same character.

Here is a simple piece of code you can use to demonstrate this:

import java.util.*;

public class StringTest14
{
    public static void main (String [] args)
    {
     Scanner input = new Scanner(System.in);

     System.out.println("Enter two words:");
     String word1=input.next();
     String word2=input.next();
     System.out.println("The two words are:");
     System.out.println("   \""+word1+"\"");
     System.out.println("   \""+word2+"\"");
     System.out.println("Testing them for equality using == gives: "+(word1==word2));
     System.out.println("Testing them for equality using equals gives: "+word1.equals(word2));
    }
}

If you run it and type the same word twice, you will see the word1==word2 test gives false but the word1.equals(word2) test gives true. That is because the method next called on a Scanner will always create and return a new String object.

However, if you run the following main method:

    public static void main (String [] args)
    {
     String word1="Hello";
     String word2="Hello";
     System.out.println("The two words are:");
     System.out.println("   \""+word1+"\"");
     System.out.println("   \""+word2+"\"");
     System.out.println("Testing them for equality using == gives: "+(word1==word2));
     System.out.println("Testing them for equality using equals gives: "+word1.equals(word2));
    }

you will find it does give true in both cases. The reason for this is that if a literal value is used for a String object, and the same literal is used later, Java uses an alias to the first use rather than creating a new object.

We will consider the issue of the equals method in more detail later.

Input in Java

Java's full text input-output mechanisms are complex, with many different variations. One cause of complexity is that although Java stores characters as their Unicode values, which are of 16 bits, many text files are stored using the ASCII format in which characters are stored as 8 bit values. Since this module is about code that interacts with other code, not code that interacts with directly with files or databases or networks or human users, we do not need to consider all this. We can just use Scanner, and that is just so that we can provide simple front-end code for testing purposes.

We have seen the use of a Scanner to input text in our example code, you saw it first in this module here. An object of type Scanner is created by a call to one of its constructors, if the argument to the call is System.in it creates a Scanner object which reads from the console window. As we saw from our example, the class Scanner is in the java.util package, but nothing else from that package is required to make it work, so importing just that class with

import java.util.Scanner;

is enough to make it available for use. We can declare a variable of type Scanner called input and set it to refer to a Scanner object which reads from the console window by:

Scanner input = new Scanner(System.in);

Then input.nextInt() reads the next text typed in up to the next blank space or end-of-line, and returns the integer that it represents. If it happens that the next text typed in does not consist of all numerical characters it throws an exception of type InputMismatchException. However, it is possible to test in advance of reading whether the next text types in represents and integer, this would be done by the call input.hasNextInt() which returns a boolean. If you wanted to read value of type double there is the similar method nextDouble to read the next text typed in, convert it to a double value and return that value, together with hasNextDouble which tests whether the next unread text can be interpreted as a double value. The method next returns the next text typed in, starting with the next non-blank character and up to but not including the following blank character, as a string. Here “blank character” means either a space or an end-of-line. The method nextLine is equivalent to the method readLine for a BufferedReader object, it returns the whole line typed as a string value. The methods of Scanner do not throw checked exceptions, so unlike BufferedReader, there is no need to add exception catching or throws annotations to the code. However, Scanner does not provide a method like BufferedReader's read which reads and returns a single character.

Full documentation on the class Scanner can be found here. As we have noted previously with Java's official documentation, because it has to cover everything, this covers far more than is necessary to know for this module, so please look at it out of interest or for reference, but don't think you need to know anything more about Scanner for the purposes of this module apart from what appears in these on-line notes.

If you are interested in full details of Java's handling of input/output, you could look at the tutorial which the distributors of Java have written for it, here. Once again, none of this tutorial is information you are expected to know for the purposes of this module. The class Scanner was introduced in Java in 2004, in earlier version of Java the following was recommended to read text from a console window: create an object of the type BufferedReader using the following construction:

BufferedReader input = new BufferedReader(new InputStreamReader(System.in));

This is a standard variable declaration combined with an initialisation. The variable is called input, but it could be called anything. It is set to a new BufferedReader object created with a constructor, that constructor takes a new InputStreamReader object as its argument and the constructor for that takes the special value System.in as its argument. But you need not worry why this is done. All you need to know is that the object referred to by the variable input created in this way has two zero-argument methods that can be called on it, read and readLine. The first reads a single character, and the second reads a whole line of characters. So input.read() returns the next character typed, while input.readLine() returns a string consisting of all the characters which have been typed in but not yet read up to but not including the “end-of-line” character, and it will wait until the end-of-line character is typed in before returning.

The classes BufferedReader, InputStreamReader and IOException are all in the Java library package java.io. So they need to be imported to any file which uses them, this can be done by adding the line

import java.io.*;

at the beginning of the file. This has the effect of importing all the classes from the package java.io.

Splitting Strings

As we saw above, a Scanner object can split a line of text into individual words treating space characters as “word delimiters” rather than as part of the words. In fact, it is possible to use a Scanner object just to split up a string rather than to read text and split it up. This is done by using an alternative constructor for Scanner which takes as its argument a string rather than a reference to the input from the console window. Below is some simple code (in the file ScannerTest1.java in the code folder) which shows using BufferedReader to read a line of text, then Scanner to break it into words:

import java.util.Scanner;
import java.io.*;

class ScannerTest1
{
 public static void main(String[] args) throws IOException
 {
  BufferedReader input = new BufferedReader(new InputStreamReader(System.in));
  System.out.println("Enter some words: ");
  String words = input.readLine();
  Scanner splitter = new Scanner(words);
  System.out.println("\nThe words you entered are: ");
  while(splitter.hasNext())
      System.out.println(splitter.next());
 }
}

The method hasNext on a Scanner object returns true until the point where the Scanner object has gone through all its input and reached the end.

A similar effect can be obtained by the use of the method split in class String. Calling split on an object of class String returns an object of type String[], that is an array of strings, where the strings in the array are the separate word the string it was called on breaks up into. So if the variable str refers to the string "Hello world, how are you?", the variable declaration and assignment

String[] words = str.split(" ");

causes the variable words to refer to an array of strings of length 5, where words[0] is "Hello", words[1] is "world,", words[2] is "how", words[3] is "are" and words[4] is "you?".

Here the punctuation symbols are treated as normal characters and so are treated as part of the words they are next to. The argument to the method split tells it what it must treat as word delimiters. So str.split(" ") tells it to treat the space character as spacing between words, and all other characters as normal characters. If we made it str.split(",") it would tell it to split the string up treating the comma character as the separating characters. So in the above example, it would split the string into two parts, words[0] would be "Hello world" and words[1] would be " how are you?".

In the front-end code used to set up examples previously, you will have see that str.split(" +") is used rather than str.split(" "). The reason for this is that using " " tells split to treat a single space character as the spacing between two words, but using " +" tells it to treat one or more space characters as the spacing between two words. So using " +" means the front-end would still work if you decided to put two spaces rather than one between the numbers or words you types in. The full rules for how the argument to split works use a concept known "regular expressions" which is explained a little more below.

If you want to convert a string, perhaps obtained from a call of the method split, which represents an integer to the equivalent integer, you need to use the method parseInt which is a static method in the class Integer. Because the class Integer is in the java.lang package it does not have to be imported. If you want to call a static method from another class, the call is attached to the name of the class. So if str refers to a string object all of whose characters are numerical, then Integer.parseInt(str) returns the value of type int it represents. If ch represents the string "6315" then int n=Integer.parseInt(str) will declare the variable of type called n and set it to the number 6315. You need to be aware that "6315" is just the string of four characters '6', '3', '1' and '5', it is an entirely separate thing from the number 6315 which on the computer isn't even represented using decimal digits. If you call Integer.parseInt(str), but str is not a string which represents an integer, it will cause an exception of type NumberFormatException to be thrown. The static method parseDouble in class Double works in a similar way to parseInt, but converts strings that represent double values (that is, floating point numbers) to the actual double value they represent.

Wrapper Classes

The class Integer is one of the wrapper classes. It is the wrapper class for the type int. There is also the class Character which is the wrapper class for the type char. The other primitive types all have wrapper classes which are the same name as the primitive type except for an initial upper-case letter, so Double is the wrapper class for the type double for example.

Wrapper classes exist because in some places in Java you can only use types which are not primitive types. A variable of type Integer refers to an object which holds a value of type int inside it, so it is a different thing from a variable of type int which holds the value directly. In older versions of Java, if i is of type int, to create the Integer equivalent you would have to use the constructor, so Integer n = new Integer(i) would create a new variable of type Integer called n and set it to refer to an object which contains the value of i. The reverse conversion would use the method called intValue in class Integer, so int j = n.intValue() would declare a variable of type int called j and set it to the int value inside the object referred to by n. In Java 5 and later versions, however, the conversion is automatic. Integer n=i has the same effect as Integer n = new Integer(i), and int j=n has the same effect as int j = n.intValue(). In general in Java 5 and later versions, you can use references to objects of type int whenever a value of type Integer is required, and it will be converted automatically (referred to as boxing), and you can use a value of type Integer whenever a value of type int is required (referred to as unboxing).

The wrapper classes also contain a number of static methods dealing with values of the primitive type they wrap. For example, the static method toBinaryString in class Integer takes an int argument and returns a string of '1' and '0' characters representing its binary equivalent. So if variable n of type int holds the value 21 then Integer.toBinaryString(n) will return "10101". In the class Character, the static method toUpperCase takes a value of type char and returns its upper case equivalent, so if variable ch holds 'q' then Character.toUpperCase(ch) will return 'Q'. Class Character has various static methods for determining the category of characters. For example, Character.isUpperCase(ch) will return true if ch stores an upper case letter, and false otherwise.

In the code folder the file StringTest8.java demonstrates the toBinaryString method of class Integer and StringTest9.java demonstrates some methods from class Character.

Regular Expressions

It was noted previously that str.split(" ") returns an array of strings which comes from the the string referred to by str broken up treating the space character ' ' as a separator between strings and not part of the strings. Actually, it means a substring consisting of one space character is treated as the separator. However, with str.split(" +") the separator is any number of spaces. This is because the string argument to split is treated as a “regular expression”. In a regular expression, some characters have special meanings. The character '+' has the meaning “repeated one or more times”.

The Java code library provides some complex built-in code for pattern-matching with strings, in particular the classes Pattern and Matcher are provided for this. These classes enable Java to be used for tasks which are otherwise handled by programming languages in the class of scripting languages which are tailored to text manipulation, such as Perl. Although there is no time to cover any of this material in this module, if you are using Java extensively for text handling, it would pay to familiarise yourself with it, as the chances are that much of the code you might otherwise write yourself can be done by using these classes and their methods. The basic idea is to use regular expressions, in which certain characters are given a special meaning to match with various patterns of characters in ordinary strings. A tutorial to Java's regular expressions capability can be found here.

Please note, again, that as this module is on programming basic algorithms and data structures, if you are asked to write code for a task in an assessment for this module, ECS510, you are expected to write it only using the basic Java covered directly in the module. So you will not gain any marks by instead using built-in code from Java's code libraries unless you are told that you are meant to use it. The above links are provided out of interest and for future reference for you. If you are writing Java code for some other task, of course you should use all that the Java library provides, but if you are being assessed on your abilities in basic programming, you are not showing those abilities if instead of writing code to solve a problem, you use pre-existing code provided as part of the Java library.

Recursion with Strings

An alternative way of programming with strings to the way using loops we considered above is to use recursion. We looked at recursion with arrays previously, and we shall consider it in more detail when we look at its use with Lisp lists in the next section. Recursion means thinking of a solution to a problem in terms of a solution to a smaller version of the same problem. For example, above we considered the test that one string starts with another. If we had to program a static method to test whether str1 starts with str2, one way we could think of it is as follows. If str2 is the empty string, obviously str1 does start with str2. If str1 is the empty string, and we have already tested str2 and found it is not the empty string, then obviously str1 does not begin with str2. If str1 and str2 start with different characters, then obviously str1 cannot start with str2, we need consider no further. If str1 and str2 start with the same character, then str1 starts with str2 if the string consisting of everything except the first character of str1 starts with the string consisting of everything except the first character of str2. So, for example, we can tell that "woodpecker" starts with "wood" by first noting both have the same first character 'w', and then testing that "oodpecker" starts with "ood". This leads to the following code:

 public static boolean startsWith(String str1,String str2)
 // Returns true if str1 starts with str2, false otherwise
 {
  if(str2.length()==0)
     return true;
  else if(str1.length()==0)
     return false;
  else if(str1.charAt(0)!=str2.charAt(0))
     return false;
  else
     return startsWith(str1.substring(1),str2.substring(1));
 }

since str.substring(1) returns the string which is all of str except the first character. So we have the method startsWith making a call to the method startsWith, this “method calling itself” is what is referred to as “recursion”. You can find this code and supporting code to run it in the file StringTest11.java. Although it may seem odd if you are not used to this style of programming, it works. If you want to think through why it works, remember the call of startsWith inside the code for startsWith will execute in its own environment, as we discussed previously. This will be one where there is a separate variable called str1 which holds the value of str1.substring(1) from the previous environment, and a separate variable called str2 which holds the value of str2.substring(1) from the previous environment.

So although we tend to say that recursion is when “a method calls itself”, a better way of thinking about it might be to say it is when “a method call makes another call to the same method”. This makes it clearer that the method code is the pattern for a method call, just as a class is a pattern for an object, and just as every object of a class has its own variables, every call to a method has its own variables. A variable of a particular name in one method call is not the same variable as the variable of the same name in another call to the same method. So, you cannot assign a value to a variable of a particular name which is a parameter variable or local variable declared inside a method and cause the variable of the same name in another call to the same method to have its value changed.

In this method startsWith, the process of the method “calling itself”, won't go on forever because it will eventually get to the case where either str1 is the empty string or str2 is the empty string, or the two strings have initial characters which are not equal. When a method which contains a recursive call has arguments which mean the recursive call is not made, it is known as a base case. A recursive method must always have at least one base case, and each recursive call must be closer to a base case, otherwise it will be the equivalent of an infinite loop. With startsWith there are three base cases.

To contrast with “recursion” the use of loops is called iteration). For comparison, here is the same startsWith operation implemented iteratively:

 public static boolean startsWith(String str1,String str2)
 // Returns true if str1 starts with str2, false otherwise
 {
  int i;
  for(i=0 ;i<str1.length()&&i<str2.length(); i++)
      if(str1.charAt(i)!=str2.charAt(i))
         break;
  return i==str2.length();
 }

You can find this and supporting code to run it in the file StringTest10.java. Another way of doing it using loops is more similar to the recursive way:

 public static boolean startsWith(String str1,String str2)
 // Returns true if str1 starts with str2, false otherwise
 {
  while(str2.length()!=0&&str1.length()!=0&&str1.charAt(0)==str2.charAt(0))
      {
       str1=str1.substring(1);
       str2=str2.substring(1);
      }
  return str2.length()==0;
 }

You can find this and supporting code to run it in the file StringTest12.java. As you can see, it has a loop where the condition to stay in the loop is the opposite of the base case conditions in the recursive version. Instead of setting up a separate environment with new variables str1 and str2, the existing variables of these names are changed to hold the values the variables of the names would have in the recursive call. The value the method returns is the same value the base case returns in the recursive version (true if the length of str2 is 0, false otherwise).

Not all recursive code can be converted so easily to iterative code. The startsWith code is an example of what is called tail recursion, which is when the return statement has just a recursive call as its return value. A more complicated example is when something is done with the result of the recursive call to get the return value. For example, above, we saw an iterative method which took a string and two characters and returned the string resulting from changing all occurrences of the first character to the second. A recursive method which does the same thing is given below:

 public static String replace(String str,char ch1,char ch2)
 // Returns the string resulting from replacing all occurrences
 // of ch1 by ch2 in str.
 {
  if(str.equals(""))
     return "";
  else
     {
      String str1 = replace(str.substring(1),ch1,ch2);
      if(str.charAt(0)==ch1)
         return ch2+str1;
      else
         return str.charAt(0)+str1;
     }
 }

You can find this and supporting code to run it in the file StringTest13.java. Thinking about this code logically, if you want to change all occurrences of ch1 to ch2 in a string, if the string is empty, you just return the empty string. Otherwise, you get the result of changing all occurences of ch1 to ch2 in the string which consists of all but the first character of str. Then, if it happens the first character of the original string is ch1, you add ch2 to the front, otherwise you add the first character of the original string to the front.

The reason this can't be converted so easily into a loop is that you need to keep the old value of the string argument variable, str in this case, for use after the recursive call, so you cannot just reassign the variable and thus lose its old value. Tail recursion can convert easily into iteration, because we do not need to go back and use values from previous environments. With recursion where something is done after each recursive call, it will be done in the previous environment as it is returned to.

Footnote

(1) A call to format has a String argument which contains “format specifiers”, followed by a number of further arguments equal to the number of format specifiers in the first argument. It returns a string in which is like the first argument except that the first format specifier is replaced by a formatted version of the second argument, the second format specifier is replaced by a formatted version of the third argument, and so on. A format specifier consists of the character % followed by some further characters and ending in a code character specifying the type of formatting. The full details of what can be done with this formatting are complex, but it is most useful purpose is in laying out numerical information. The format % followed by a number ended by d matches with an integer value and gives the integer with extra spaces padding it out to make the length equal to the number. So you can use, for example %4d if you have a number you know will be between 0 and 9999 and you want it to take up exactly four characters. The format %, followed by a number, followed by . followed by another number, followed by f, matches with a floating point value and gives the value padded out with spaces to a full length of the first number in the format specifier and with the number of decimal places as given by the second number in the format specifier. The format %% gives the % symbol.

As an example, if units is an int variable holding 9 and mean is a double variable holding 57.66666666666, then the statement

str = String.format("You have %2d units and average %5.2f%% marks",units,mean);

causes str to be set to "You have 9 units and average 57.67% marks".

You can also use formatting in direct printing. If you have the statement System.out.printf(...), where the ... is the string and further arguments in the same way as used for the method format above, it will cause the formatted string to be printed out in the command window the program is running in.

An example of string formatting, and also the use of a StringTokenizer can be found in the file StringTokenizerTest1.java.

Matthew Huntbach

Last modified: 23 February 2019

String and related classes in Java