Algorithms and Data Structures in an Object-Oriented Framework (“ADSOOF”)
String
and related classes in Java
You have already seen the idea of String
as a type in Java.
Since we use String
values from early on just as names and so on,
it is easy to forget that a value of type String
in Java is actually an object. So values of type String
are like values of type DrinksMachine
which we saw
previously: you can create new
objects of type String
using the constructors in class
String
, and you can call the non-static methods in
class String
with the calls attached to references to
String
objects.
However, DrinksMachine
is not a built-in type of Java,
to use it you had to have a copy of the file DrinksMachine.class
in your directory. The class String
is provided as part
of Java, it is always available for you to use in any Java program you
write. To use most types provided with Java you have to use an
import statement (we mentioned these briefly
here), but a few are
regarded as so common that you don't even have to import them, String
is one of these (the others are all those in the java.lang
package).
In fact every type in Java is an object type, with the exception of the
numerical types, int
, double
and so on, the type
boolean
and the type char
. These non-object
types are known as the primitive types.
One reason you may not have thought of strings as objects is that, unlike any
other type of object, there is a special way of representing string values,
that is the representation using the quote characters enclosing the
characters that make up the string. You can also make new string
objects by Java's use of +
to mean string joining. So
there is little need to create new strings using a constructor.
However, for example, there is a constructor in class String
which takes an array of characters as its argument and produces a
string where each character in the string is equal to the character
in the same position in the array. So if a
is a variable
of type char[]
, and has been set to refer to an array
object, then new String(a)
will return the
equivalent string.
A representation of a particular value, as opposed to a variable that holds a value is
termed a literal value. So, for example, "hello"
is the literal
value for a string of length 5, first character 'h'
, second character
'e'
and so on. Note the distinction between use of the double quote symbol
for literal values of strings and the single quote symbol for literal values of characters.
In Java, a character is a primitive value, not an object, although as mentioned
below every primitive type has an equivalent object type.
So, for example, 'x'
represents the character, while "x"
represents
an object of type String
with a length of 1 and the character at position 0
being 'x'
. Java's primitive type char
stores characters as
their integer values in Unicode.
Strings in Java (unlike, for example, the language C) are not the same
thing as arrays of characters. If we have a variable a
of
type char[]
and a variable str
of type
String
, then for any expression i
which evaluates to an integer, a[i]
gives
the i
th character of the array referred to by
a
, while str.charAt(i)
gives the
i
th character of the string referred to by
str
. As we saw here,
we can use a[i]
as a variable, so we can put
it on the right hand side of assignment statement and assign a value to it.
For example, a[3]='x'
will change the fourth character of the
array referred to by a
(remembering the first character
is indexed by 0
) to lower-case 'x'. However,
str.charAt(i)
is a method call, and you cannot
assign a value to a method call. The Java compiler will give a
error message if you attempted to compile code with the statement
str.charAt(3)='x'
in it. So string objects can't
be changed in the way array objects can be changed, in fact string
objects are immutable: they cannot be changed at all, you can only
create new ones.
A minor but perhaps rather annoying difference between arrays and strings
in Java is that the length of an array a
is given by
a.length
and the length of a string str
is given by
str.length()
. The difference is the ()
at
the end of str.length()
which indicates this is just a
method call, the zero-argument method length
, called on the
object str
.
String
in Java
When dealing with the libraries provided with Java, you can find full
documentation of all the methods they provide on the official web
site provided by the Oracle company
here.
The documentation for the class String
is provided
here.
You should be aware that as this is the official documentation, it
has to cover every aspect, including rarely used methods and methods
which only make sense when you have become familiar with more
advanced aspects of Java. It would be a good idea to get used to using
this official documentation, but for the purposes of this module don't
feel you have to become familiar with the vast range of classes and
methods that are available. We will cover just a small number of them in
the module, either because they are particularly fundamental, or because
they illustrate an important point. Remember, the main purpose of this
module is to introduce concepts of algorithms and data structures
rather than to concentrate on detailed aspects of the Java language.
As an example, the method replace
acts in a
similar way to the method change
we discussed
previously with arrays.
It takes a string and two characters and returns the result
of replacing all occurrences of the first character by the
second character in the string. It does this constructively
rather than destructively, that is it returns a new string
representing the change, it does not change the string it is called
on. In fact the class String
has no destructive
methods, this ensures that String
objects are
immutable, there is no method that can be called on them to
change them, there are only methods that can be called on them to
return new strings representing a change.
With the constructive method change
we developed
for arrays of integers, if a1
and a2
were of type int[]
and n1
and
n2
were of type int
then the
statement a2=change(a1,n1,n2)
causes a2
to be set to an array which represents the effect of changing
all occurrences of the integer n1
to the integer
n2
in the array a1
. With the method
replace
in class String
, if
str1
and str2
are variables of type
String
and ch1
and ch2
are variables of type char
, then the statement
str2=str1.replace(ch1,ch2)
causes str2
to be set to a string which represents the effect of changing
all occurrences of the character ch1
to the
character ch2
in the string str1
.
For example, if str1
holds a reference to
the string "happy"
, and ch1
hold the character
'p'
and ch2
holds the character 'r'
,
then str1.replace(ch1,ch2)
will return a reference to
the string "harry"
.
The difference in construct between str1.replace(ch1,ch2)
and change(a1,n1,n2)
is because our change
was a static method while the replace
method in class
String
is not a static method. So the array being changed had
to be passed as one of the arguments to the method change
,
and this method was not called attached to any object. The method
replace
has to be called attached to a String
object, which is the object to be changed, and this is done instead
of the object being passed as one of the arguments to the method call.
Other useful methods in class String
which return
strings based on transforming the string they are called on include:
trim
returns the string with all blank characters at
either end removed. So if str
represents
" Hello World "
then the call str.trim()
will return the string "Hello World"
.
toUpperCase
returns the string with all lower case
letters replaced by the equivalent upper case letters. So if str
represents "Hello World"
then
str.toUpperCase()
will return the string "HELLO WORLD"
.
toLowerCase
returns the string with all upper case
letters replaced by the equivalent lower case letters. So if str
represents "Hello World"
then
str.toLowerCase()
will return the string "hello world"
.
substring
takes two integers as arguments and returns
the portion of the string it is called on which starts with the character
indexed by the first integer and goes up to but does not include the
character indexed by the second integer. So if str
represents "Hello world"
, and p1
stores
3
and p2
stores 8
then
str.substring(p1,p2)
will return "lo wo"
.
A second version of substring
takes just one integer argument
and returns the portion of the string indexed by that integer argument
up to the end of the string. So with the same value of str
and with p
storing 4
, the call
str.substring(p)
would return "o world"
.
format
is a static method which is used for
formatting(1).
String
to be
final
meaning it can't be extended by inheritance anyway).
Here is an example of a static method which has the effect of changing
one character to another in a string:
static String replace(String str1,char ch1,char ch2) // Returns the string resulting from replacing all occurrences // of ch1 by ch2 in str. { String str2=""; for(int i=0; i<str1.length(); i++) if(str1.charAt(i)==ch1) str2=str2+ch2; else str2=str2+str1.charAt(i); return str2; }So if
str
references "happy"
and
c1
stores 'p'
and c2
stores
'r'
then replace(str,c1,c2)
will
return "harry"
.
Where methods to do a particular job already exist in the Java code library, it makes sense to use them rather than write our own method to do the job. Not only does it save us the time and effort, it makes our code easier to follow by those already familiar with the Java code library, and we can be sure that the distributors of Java have implemented their methods in the best way possible and they will be efficient and free of errors. In some cases in this module, however, we will show a method that does a job where there is already a method in the Java library to do the job. This will be done because looking at the code for the method ourselves helps develop coding skills and an understanding of algorithms and data structures. It should not be regarded as good practice in general to write our own code when we can call code that already exists in the Java library. When studying at university, however, do not confuse software re-use with plagiarism.
The method indexOf
in class String
takes
a character and returns the position of the lowest indexed occurrence of that
character in the string it is called on, or -1
if the
character does not occur in the string. So it is very similar to the
method we gave previously for
finding the position of an integer in an array of integers. Once again,
as it is not a static method, str.indexOf(ch)
gives the
position of character ch
in string str
whereas
with our static method for arrays of integers, position(a,n)
gave the position of integer n
in array a
. A
separate method lastIndexOf
in class String
gives the position of the highest indexed occurrence of a character
in a string. So if str
is "Hello World"
then str.indexOf('o')
returns 4
and
str.lastIndexOf('o')
returns 7
.
The methods startsWith
and endsWith
both
take string arguments and are called on strings, and return boolean
values as indicated by their names. So
str1.startsWith(str2)
returns true
if
str1
starts with str2
and false
otherwise, and str1.endsWith(str2)
returns true
if
str1
ends with str2
and false
otherwise. So if str1
is "Hello World"
,
str2
is "Hell"
and str3
is
"rld"
, then str1.startsWith(str2)
evaluates
to true
, str1.startsWith(str3)
evaluates to
false
, str1.endsWith(str3)
evaluates to true
and so on.
An important method in class String
is compareTo
,
which like startsWith
and endsWith
takes
another string as an argument. It is used to compare strings
alphabetically, so str1.compareTo(str2)
says whether
str1
comes before str2
in standard alphabetic
ordering or not. You might suppose that compareTo
would
return a boolean value, but its return type is actually int
.
This is so it can be used to distinguish three possibilities from the
call str1.compareTo(str2)
: either str1
is before
str2
alphabetically, or str2
is before str1
alphabetically, or str1
and str2
are equal.
If str1
and str2
are equal, the call
str1.compareTo(str2)
returns 0
. It returns a
negative integer if str1
is before str2
alphabetically, and a positive integer if str1
is after
str2
alphabetically. You can find the complete definition of the
String
method compareTo
in the Java documentation
here.
Many other classes also have their own compareTo
method which
works similarly to the one in class String
and enables
objects of their class to be considered in some sort of ordering with
each other. If an object is of a class that has its own compareTo
method, it has what is termed natural order, which we will look at in more detail
later.
The method compareTo
can only be called on an object of a class that has
that method, so it is not like the method equals
that can be called on
any object. The method equals
can be called on any object because
it is in the class Object
, so if a class does not have its own equals
method, it inherits the one from class Object
. The class String
has its own version of the method equals
, see below for
more details on this, and
here
for the issue of inheriting or overriding methods from Object
.
The reverse of the constructor for class String
which
takes an array of strings as its argument and produces the equivalent
String
object is the method toCharArray
.
If str
references a string, then str.toCharArray()
returns an object of type char[]
, that is an array of
characters of the same length as the string, where the characters in
the array are the same and indexed in the same order as the characters in the
string.
You can find code which demonstrates some of these methods in the
code folder. See the files
StringTest1.java
,
StringTest2.java
,
StringTest3.java
,
StringTest4.java
,
StringTest5.java
,
StringTest6.java
and
StringTest7.java
.
There is also a list of the code files used as examples in this section, with a brief explanation of the relevance of each in the
code index for this section.
Since strings are represented by objects in Java, if you want to test that two variables
holding strings store the same strings, you should use the method equals
and
not the symbol ==
. Remember that if v1
and v2
are two
variables, v1==v2
gives true
only if v1
and v2
are aliases, that is they are both set to refer to the same object. However,
v1.equals(v2)
may give true
even if they refer to different
objects. It depends how the method equals
is implemented for the class of the
object that v2
refers to. With class String
it is implemented so
that if str1
and str2
are both of class String
then
str1.equals(str2)
will give true
not only if they are aliases,
but also if they refer to different String
objects, but the lengths of the
two different String
objects are the same, and in each position they have the
same character.
Here is a simple piece of code you can use to demonstrate this:
import java.util.*; public class StringTest14 { public static void main (String [] args) { Scanner input = new Scanner(System.in); System.out.println("Enter two words:"); String word1=input.next(); String word2=input.next(); System.out.println("The two words are:"); System.out.println(" \""+word1+"\""); System.out.println(" \""+word2+"\""); System.out.println("Testing them for equality using == gives: "+(word1==word2)); System.out.println("Testing them for equality using equals gives: "+word1.equals(word2)); } }If you run it and type the same word twice, you will see the
word1==word2
test gives false
but the word1.equals(word2)
test gives
true
. That is because the method next
called on a Scanner
will always create and return a new String
object.
However, if you run the following main
method:
public static void main (String [] args) { String word1="Hello"; String word2="Hello"; System.out.println("The two words are:"); System.out.println(" \""+word1+"\""); System.out.println(" \""+word2+"\""); System.out.println("Testing them for equality using == gives: "+(word1==word2)); System.out.println("Testing them for equality using equals gives: "+word1.equals(word2)); }you will find it does give
true
in both cases. The reason for this is that
if a literal value is used for a String
object, and the same literal is used
later, Java uses an alias to the first use rather than creating a new object.
We will consider the issue of the equals
method in more detail
later.
Java's full text input-output mechanisms are complex, with many different variations.
One cause of complexity is that although Java stores characters as their Unicode values,
which are of 16 bits, many text files are stored using the ASCII format in which characters
are stored as 8 bit values. Since this module is about code that interacts with other
code, not code that interacts with directly with files or databases or networks or human
users, we do not need to consider all this. We can just use Scanner
, and that
is just so that we can provide simple front-end code for testing purposes.
We have seen the use of a Scanner
to input text in our example code, you
saw it first in this module
here.
An object of type Scanner
is created by a call to one
of its constructors, if the argument to the call is System.in
it creates a Scanner
object which reads from the
console window. As we saw from our example, the class Scanner
is in the java.util
package, but nothing else from that
package is required to make it work, so importing just that class with
import java.util.Scanner;is enough to make it available for use. We can declare a variable of type
Scanner
called input
and set it
to refer to a Scanner
object which reads from the console
window by:
Scanner input = new Scanner(System.in);Then
input.nextInt()
reads the next text typed in up to the next
blank space or end-of-line, and returns the integer that it represents.
If it happens that the next text typed in does not consist of all
numerical characters it throws an exception of type
InputMismatchException
. However, it is possible to test
in advance of reading whether the next text types in represents and
integer, this would be done by the call input.hasNextInt()
which returns a boolean.
If you wanted to read value of type double
there is the
similar method nextDouble
to read the next text typed in,
convert it to a double
value and return that value, together
with hasNextDouble
which tests whether the next unread
text can be interpreted as a double
value. The method
next
returns the next text typed in, starting with the
next non-blank character and up to but not including the following
blank character, as a string. Here “blank character” means either a
space or an end-of-line. The method nextLine
is equivalent
to the method readLine
for a BufferedReader
object, it returns the whole line typed as a string value. The methods
of Scanner
do not throw checked exceptions, so unlike
BufferedReader
, there is no need to add exception catching
or throws
annotations to the code. However, Scanner
does not provide a method like BufferedReader
's
read
which reads and returns a single character.
Full documentation on the class Scanner
can be found
here.
As we have noted previously with Java's official documentation, because it
has to cover everything, this covers far more than is necessary to know for
this module, so please look at it out of interest or for reference, but don't
think you need to know anything more about Scanner
for the
purposes of this module apart from what appears in these on-line notes.
If you are interested in full details of Java's handling of input/output,
you could look at the tutorial which the distributors of Java have written
for it, here.
Once again, none of this tutorial is information you are expected to know for the
purposes of this module. The class Scanner
was introduced in Java in 2004,
in earlier version of Java the following was recommended to read text from a console window:
create an object of the type BufferedReader
using the following construction:
BufferedReader input = new BufferedReader(new InputStreamReader(System.in));This is a standard variable declaration combined with an initialisation. The variable is called
input
, but it could be called
anything. It is set to a new BufferedReader
object created
with a constructor, that constructor takes a new InputStreamReader
object as its argument and the constructor for that takes the special
value System.in
as its argument. But you need not
worry why this is done. All you need to know is that the object
referred to by the variable input
created in this
way has two zero-argument methods that can be called on it,
read
and readLine
. The first reads a single
character, and the second reads a whole line of characters. So
input.read()
returns the next character typed, while
input.readLine()
returns a string consisting of all the
characters which have been typed in but not yet read up to but not
including the “end-of-line” character, and it will wait until the
end-of-line character is typed in before returning.
The classes BufferedReader
, InputStreamReader
and IOException
are all in the Java library package
java.io
. So they need to be imported to any file which
uses them, this can be done by adding the line
import java.io.*;at the beginning of the file. This has the effect of importing all the classes from the package
java.io
.
As we saw above, a Scanner
object can split a line of
text into individual words treating space characters as “word
delimiters” rather than as part of the words. In fact, it is possible to
use a Scanner
object just to split up a string rather
than to read text and split it up. This is done by using an alternative
constructor for Scanner
which takes as its argument
a string rather than a reference to the input from the console window.
Below is some simple code
(in the file
ScannerTest1.java
in the code folder)
which shows using BufferedReader
to read a line of text, then Scanner
to break it into words:
import java.util.Scanner; import java.io.*; class ScannerTest1 { public static void main(String[] args) throws IOException { BufferedReader input = new BufferedReader(new InputStreamReader(System.in)); System.out.println("Enter some words: "); String words = input.readLine(); Scanner splitter = new Scanner(words); System.out.println("\nThe words you entered are: "); while(splitter.hasNext()) System.out.println(splitter.next()); } }The method
hasNext
on a Scanner
object
returns true
until the point where the Scanner
object has gone through all its input and reached the end.
A similar effect can be obtained by the use of the method
split
in class String
. Calling
split
on an object of class String
returns
an object of type String[]
, that is an array of strings,
where the strings in the array are the separate word the string
it was called on breaks up into. So if the variable str
refers to the string "Hello world, how are you?"
,
the variable declaration and assignment
String[] words = str.split(" ");causes the variable
words
to refer to an array of strings
of length 5, where words[0]
is "Hello"
,
words[1]
is "world,"
,
words[2]
is "how"
,
words[3]
is "are"
and
words[4]
is "you?"
.
Here the punctuation symbols are treated as normal characters and so
are treated as part of the words they are next to. The
argument to the method split
tells it what it
must treat as word delimiters. So str.split(" ")
tells it to treat the space character as spacing between words, and
all other characters as normal characters. If we made it
str.split(",")
it would tell it to split the string
up treating the comma character as the separating
characters. So in the above example, it would split the string into
two parts, words[0]
would be "Hello world"
and words[1]
would be " how are you?"
.
In the front-end code used to set up examples previously, you will
have see that str.split(" +")
is used rather than
str.split(" ")
. The reason for this is that using
" "
tells split
to treat a single
space character as the spacing between two words, but using
" +"
tells it to treat one or more space characters
as the spacing between two words. So using " +"
means
the front-end would still work if you decided to put two spaces
rather than one between the numbers or words you types in. The full rules
for how the argument to split
works use a concept known
"regular expressions" which is explained a little more
below.
If you want to convert a string, perhaps obtained from a call of the
method split
, which represents an integer to the
equivalent integer, you need to use the method parseInt
which is a static method in the class Integer
.
Because the class Integer
is in the java.lang
package
it does not have to be imported. If you want to call a static method
from another class, the call is attached to the name of the class.
So if str
refers to a string object all of whose characters
are numerical, then Integer.parseInt(str)
returns
the value of type int
it represents. If ch
represents the string "6315"
then
int n=Integer.parseInt(str)
will declare the variable
of type n
and set it to the number
6315
. You need to be aware that "6315"
is
just the string of four characters '6'
, '3'
,
'1'
and '5'
, it is an entirely separate thing
from the number 6315
which on the computer isn't even
represented using decimal digits. If you call Integer.parseInt(str)
,
but str
is not a string which represents an integer, it
will cause an exception of type NumberFormatException
to be thrown.
The static method parseDouble
in class Double
works in a similar way to parseInt
, but converts strings
that represent double
values (that is, floating point numbers)
to the actual double
value they represent.
The class
Integer
is one of the wrapper classes.
It is the wrapper class for the type int
. There is
also the class
Character
which is the wrapper class for
the type char
. The other primitive types all have wrapper
classes which are the same name as the primitive type except for an initial
upper-case letter, so Double
is the wrapper class for the
type double
for example.
Wrapper classes exist because in some places in Java you can only use
types which are not primitive types. A variable of type Integer
refers to an object which holds a value of type int
inside it,
so it is a different thing from a variable of type int
which
holds the value directly. In older versions of Java, if i
is
of type int
, to create the Integer
equivalent
you would have to use the constructor, so
Integer n = new Integer(i)
would create
a new variable of type Integer
called n
and
set it to refer to an object which contains the value of i
.
The reverse conversion would use the method called intValue
in
class Integer
, so
int j = n.intValue()
would declare a variable of type int
called j
and set it to the int
value inside the object referred
to by n
. In Java 5 and later versions, however, the conversion
is automatic.
Integer n=i
has the same effect as
Integer n = new Integer(i)
, and
int j=n
has the same effect as
int j = n.intValue()
.
In general in Java 5 and later versions, you can use references to objects
of type int
whenever a value of type Integer
is required, and it will be
converted automatically (referred to as boxing), and you can use
a value of type Integer
whenever a value of type int
is required (referred to as unboxing).
The wrapper classes also contain a number of static methods dealing with values
of the primitive type they wrap. For example, the static method
toBinaryString
in class Integer
takes an
int
argument and returns a string of '1'
and
'0'
characters representing its binary equivalent. So if
variable n
of type int
holds the value
21
then Integer.toBinaryString(n)
will return
"10101"
. In the class Character
, the static
method toUpperCase
takes a value of type char
and returns its upper case equivalent, so if variable ch
holds 'q'
then Character.toUpperCase(ch)
will
return 'Q'
. Class Character
has various static
methods for determining the category of characters. For example,
Character.isUpperCase(ch)
will return true
if ch
stores an upper case letter, and false
otherwise.
In the code folder the file
StringTest8.java
demonstrates the toBinaryString
method of class
Integer
and
StringTest9.java
demonstrates some methods from class Character
.
It was noted previously that str.split(" ")
returns
an array of strings which comes from the the string referred to by
str
broken up treating the space character ' '
as a separator between strings and not part of the strings. Actually,
it means a substring consisting of one space character is treated
as the separator. However, with str.split(" +")
the separator
is any number of spaces. This is because the string argument to
split
is treated as a “regular expression”. In a regular
expression, some characters have special meanings. The character
'+'
has the meaning “repeated one or more times”.
The Java code library provides some complex built-in code for pattern-matching
with strings, in particular the classes
Pattern
and
Matcher
are provided for this. These classes enable Java to be used for tasks which
are otherwise handled by programming languages in the class
of scripting languages
which are tailored to text manipulation, such as
Perl.
Although there is no time to cover any of this material in this module,
if you are using Java extensively for text handling, it would pay to
familiarise yourself with it, as the chances are that much of the code you
might otherwise write yourself can be done by using these classes and their
methods. The basic idea is to use
regular expressions,
in which certain characters are given a special meaning to match with various
patterns of characters in ordinary strings. A
tutorial to Java's regular expressions capability can be found
here.
Please note, again, that as this module is on programming basic algorithms and data structures, if you are asked to write code for a task in an assessment for this module, ECS510, you are expected to write it only using the basic Java covered directly in the module. So you will not gain any marks by instead using built-in code from Java's code libraries unless you are told that you are meant to use it. The above links are provided out of interest and for future reference for you. If you are writing Java code for some other task, of course you should use all that the Java library provides, but if you are being assessed on your abilities in basic programming, you are not showing those abilities if instead of writing code to solve a problem, you use pre-existing code provided as part of the Java library.
An alternative way of programming with strings to the way using loops
we considered above is to use
recursion. We looked at recursion with arrays
previously,
and we shall consider it in more detail when we look at its use with Lisp lists in the
next section. Recursion
means thinking of a solution to a problem in terms of a solution to a
smaller version of the same problem. For example,
above we considered the test
that one string starts with another. If we had to program a static
method to test whether str1
starts with str2
,
one way we could think of it is as follows. If str2
is the
empty string, obviously str1
does start with str2
.
If str1
is the empty string, and we have already tested
str2
and found it is not the empty string, then obviously
str1
does not begin with str2
.
If str1
and str2
start with different characters,
then obviously str1
cannot start with str2
, we
need consider no further. If str1
and str2
start with the same character, then str1
starts with
str2
if the string consisting of everything except the first
character of str1
starts with the string consisting of
everything except the first character of str2
. So, for
example, we can tell that "woodpecker"
starts with
"wood"
by first noting both have the same first character
'w'
, and then testing that "oodpecker"
starts
with "ood"
. This leads to the following code:
public static boolean startsWith(String str1,String str2) // Returns true if str1 starts with str2, false otherwise { if(str2.length()==0) return true; else if(str1.length()==0) return false; else if(str1.charAt(0)!=str2.charAt(0)) return false; else return startsWith(str1.substring(1),str2.substring(1)); }since
str.substring(1)
returns the string which is all
of str
except the first character. So we have the method
startsWith
making a call to the method startsWith
,
this “method calling itself” is what is referred to as “recursion”.
You can find this code and supporting code to run it in the file
StringTest11.java
.
Although it may seem odd if you are not used to this style of programming,
it works. If you want to think through why it works, remember the call of
startsWith
inside the code for startsWith
will
execute in its own environment, as we discussed
previously.
This will be one where there is a separate variable called
str1
which holds the value of str1.substring(1)
from the previous environment, and a separate variable called
str2
which holds the value of str2.substring(1)
from the previous environment.
So although we tend to say that recursion is when “a method calls itself”, a better way of thinking about it might be to say it is when “a method call makes another call to the same method”. This makes it clearer that the method code is the pattern for a method call, just as a class is a pattern for an object, and just as every object of a class has its own variables, every call to a method has its own variables. A variable of a particular name in one method call is not the same variable as the variable of the same name in another call to the same method. So, you cannot assign a value to a variable of a particular name which is a parameter variable or local variable declared inside a method and cause the variable of the same name in another call to the same method to have its value changed.
In this method startsWith
, the process of the method
“calling itself”, won't go on forever because it will
eventually get to the case where either str1
is the empty
string or str2
is the empty string, or the two strings
have initial characters which are not equal. When a method which
contains a recursive call has arguments which mean the recursive call
is not made, it is known as a base case. A recursive method must
always have at least one base case, and each recursive call must be closer
to a base case, otherwise it will be the equivalent of an infinite loop.
With startsWith
there are three base cases.
To contrast with “recursion” the use of loops is called
iteration). For comparison, here is the same startsWith
operation implemented iteratively:
public static boolean startsWith(String str1,String str2) // Returns true if str1 starts with str2, false otherwise { int i; for(i=0 ;i<str1.length()&&i<str2.length(); i++) if(str1.charAt(i)!=str2.charAt(i)) break; return i==str2.length(); }You can find this and supporting code to run it in the file
StringTest10.java
.
Another way of doing it using loops is more similar to the
recursive way:
public static boolean startsWith(String str1,String str2) // Returns true if str1 starts with str2, false otherwise { while(str2.length()!=0&&str1.length()!=0&&str1.charAt(0)==str2.charAt(0)) { str1=str1.substring(1); str2=str2.substring(1); } return str2.length()==0; }You can find this and supporting code to run it in the file
StringTest12.java
. As
you can see, it has a loop where the condition to stay in the loop
is the opposite of the base case conditions in the recursive version.
Instead of setting up a separate environment with new variables
str1
and str2
, the existing variables
of these names are changed to hold the values the variables of the
names would have in the recursive call. The value the method returns
is the same value the base case returns in the recursive version
(true
if the length of str2
is 0,
false
otherwise).
Not all recursive code can be converted so easily to iterative
code. The startsWith
code is an example of what
is called tail recursion, which is when the
return
statement has just a recursive call as its return value.
A more complicated example is when something is done with the result of
the recursive call to get the return value.
For example, above, we
saw an iterative method which took a string and two characters and
returned the string resulting from changing all occurrences of the
first character to the second. A recursive method which does the
same thing is given below:
public static String replace(String str,char ch1,char ch2) // Returns the string resulting from replacing all occurrences // of ch1 by ch2 in str. { if(str.equals("")) return ""; else { String str1 = replace(str.substring(1),ch1,ch2); if(str.charAt(0)==ch1) return ch2+str1; else return str.charAt(0)+str1; } }You can find this and supporting code to run it in the file
StringTest13.java
.
Thinking about this code logically, if you want to change all occurrences
of ch1
to ch2
in a string, if the string is
empty, you just return the empty string. Otherwise, you get the result
of changing all occurences of ch1
to
ch2
in the string which consists of all but the first
character of str
. Then, if it happens the first character of
the original string is ch1
, you add ch2
to the front, otherwise you add the first character of the original
string to the front.
The reason this can't be converted so easily into a loop is
that you need to keep the old value of the string argument
variable, str
in this case, for use after the recursive call,
so you cannot just reassign the variable and thus lose its old value.
Tail recursion can convert easily into iteration, because we do not need
to go back and use values from previous environments. With recursion where
something is done after each recursive call, it will be done in the previous
environment as it is returned to.
(1) A call to format
has a String
argument which contains
“format specifiers”, followed by a number of further arguments equal to the
number of format specifiers in the first argument. It returns a string in
which is like the first argument except that the first format specifier is
replaced by a formatted version of the second argument, the second format
specifier is replaced by a formatted version of the third argument, and so on.
A format specifier consists of the character %
followed by some
further characters and ending in a code character specifying the type of
formatting. The full details of what can be done with this formatting are
complex, but it is most useful purpose is in laying out numerical information.
The format %
followed by a number ended by d
matches
with an integer value and gives the integer with extra spaces padding it out
to make the length equal to the number. So you can use, for example
%4d
if you have a number you know will be between 0
and 9999
and you want it to take up exactly four characters.
The format %
, followed by a number, followed by .
followed by another number, followed by f
, matches with a floating
point value and gives the value padded out with spaces to a full length of
the first number in the format specifier and with the number of decimal places
as given by the second number in the format specifier. The format %%
gives the %
symbol.
As an example, if units
is an int
variable holding
9
and mean
is a double
variable holding
57.66666666666
, then the statement
str = String.format("You have %2d units and average %5.2f%% marks",units,mean);causes
str
to be set to
"You have 9 units and average 57.67% marks"
.
You can also use formatting in direct printing. If you have the statement
System.out.printf(...)
, where the ...
is the
string and further arguments in the same way as used for the method
format
above, it will cause the formatted string to be printed
out in the command window the program is running in.
An example of string formatting, and also the use of a StringTokenizer
can be found in the file
StringTokenizerTest1.java
.
Last modified: 23 February 2019