Java has a special syntax which is used just for dealing with arrays,
and for nothing else. It employs the square brackets, [ and ]. The
first thing to note is that the combination [] is the way to make an
array type out of any other type. Just add it to the end of the type
name, and you have a new type which means "array of ..." where "..."
is the type name you added it to. So int[] is the type "array of
integer", String[] is the type "array of strings", and Can[] is the
type "array of cans", where Can refers to a type defined in the Java
class Can, perhaps the class we used in the previous set of notes. You
can use a type name which ends in [] in just the same way as any other
type name: to declare variables, to declare arguments to methods, and
to give a return type for methods. If we have a type name which ends
in [] then we refer to the name which comes before the [] as the
"base type" of the array type.
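As a small illustration of array types appearing in all three places, here is a sketch. The class name ArrayTypes and the methods sum and firstLetters are invented for this example and are not part of the course code. It also uses array creation, indexing and length ahead of their explanation in these notes, and firstLetters assumes every string has at least one character.

```java
class ArrayTypes {
    // An array type as a parameter type: the base type of int[] is int.
    static int sum(int[] numbers) {
        int total = 0;
        for (int i = 0; i < numbers.length; i++)
            total += numbers[i];
        return total;
    }

    // An array type as a parameter type and as a return type.
    // Assumes every string in words is non-empty.
    static char[] firstLetters(String[] words) {
        char[] letters = new char[words.length];
        for (int i = 0; i < words.length; i++)
            letters[i] = words[i].charAt(0);
        return letters;
    }

    public static void main(String[] args) {
        // Array types used to declare variables.
        int[] numbers = {3, 4, 5};
        String[] words = {"array", "of", "strings"};
        System.out.println(sum(numbers));                     // prints 12
        System.out.println(new String(firstLetters(words)));  // prints aos
    }
}
```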
Arrays are a form of object. As with other objects, you need to distinguish between a variable which refers to the object and the object itself, and you need to be aware that assignment causes aliasing. For example,
int[] a,b;
declares two variables, a and b, which are of type int[], that is
"array of integer". It doesn't, however, cause any array objects to be
created. Suppose that later in the code, a and b have been made to
refer to some objects, which must be arrays of integers. In this case,
executing the assignment b=a will cause b to stop referring to the
array it was referring to before the assignment and start referring to
the array a was referring to before the assignment. This doesn't stop
a continuing to refer to the same array it referred to before the
assignment, and it doesn't stop any other variable which refers to the
array b referred to before the assignment from still referring to that
array after the assignment. Note that if you want to refer to the
whole array object, you just use the variable name, as with a and b
here. It is not needed, and it is not even correct, to add [] to the
variable name to indicate that the variable is of an array type.
Remember that the only use of [] as a single symbol is to make an
array type out of some other type.
To create a new array object you need to use the construction
consisting of the word new, followed by the base type of the array
object, followed by [ followed by an expression which evaluates to an
integer, followed by ]. This creates a new array object whose length
is the integer value the expression evaluated to. So, for example,
a = new int[10];
will cause a, assuming a has already been declared to be of type int[]
as above, to be set to refer to an array of integers of length 10.
Note that if a had previously been set to refer to another array, it
would stop referring to that array and start referring to the new
array of length 10. Declaration of a variable of an array type and its
initialisation to a particular array are often combined, as in:
int[] a = new int[10];
There is no restriction on the length of an array a variable of an array type may refer to, so a variable which refers to an array object of one length can be assigned to refer to another array object of another length, so long as it is of the same type. However, array objects themselves cannot change length. When an array object is created, its length is fixed at that time.
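The following sketch puts creation, aliasing and reassignment together. The class name ArrayVariables and the helper sameArray are invented for illustration. The helper relies on the fact that == applied to two array variables tests whether they refer to the same object.

```java
class ArrayVariables {
    // True only if x and y refer to the very same array object.
    static boolean sameArray(int[] x, int[] y) {
        return x == y;
    }

    public static void main(String[] args) {
        int[] a = new int[10];   // a refers to a new array of length 10
        int[] b = new int[4];    // b refers to a different array, of length 4
        int[] c = b;             // c is an alias for the array b refers to

        b = a;                   // b stops referring to the length-4 array

        System.out.println(sameArray(a, b));   // prints true
        System.out.println(sameArray(b, c));   // prints false
        System.out.println(b.length);          // prints 10
        System.out.println(c.length);          // prints 4

        a = new int[2 * c.length];             // a now refers to a new array of length 8
        System.out.println(a.length);          // prints 8
        System.out.println(b.length);          // prints 10: the old array is unchanged
    }
}
```

Note that the variable a ends up referring to arrays of two different lengths in turn, but no array object ever changes its own length.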
If a refers to an array object, then a[i] refers to the component of
the array object indexed by i. Here i can be any expression which
evaluates to an integer; it needn't be a single variable of type int.
You can treat a[i] exactly like a variable of the base type of a: you
can assign a value to it, for example as in a[i]=n, or you can use it
as an argument to a method, or, if the base type is a numerical type,
as part of an arithmetic expression, as in n=a[i]+1. The flexible
thing is that the variable a[i] refers to changes as i changes its
value. Note that the length of the array which a refers to is given by
a.length. You can use a.length in any expression or as a method
argument which requires a value of type int, but you can't assign a
value to it. If an array object is referred to by the variable a, then
its component parts are referred to by a[0], a[1] and so on up to
a[a.length-1], or, of course, by a followed by [ followed by an
expression which evaluates to any of the values from 0 to a.length-1,
followed by ].
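Here is a short sketch of indexing in use. The class name IndexDemo and the method lastComponent are invented for this example, and lastComponent assumes the array has at least one component.

```java
class IndexDemo {
    // Returns the component with the highest index; assumes a.length > 0.
    static int lastComponent(int[] a) {
        return a[a.length - 1];
    }

    public static void main(String[] args) {
        int[] a = new int[5];          // components a[0] to a[4], all initially 0
        for (int i = 0; i < a.length; i++)
            a[i] = i * i;              // a[i] is assigned to like an int variable
        int i = 1;
        int n = a[i] + 1;              // a[1] is 1, so n is 2
        a[2 * i + 1] = n;              // the index can be any int expression: sets a[3]
        System.out.println(a[3]);              // prints 2
        System.out.println(lastComponent(a));  // prints 16
    }
}
```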
Because you can change the value of the content of an array by the
assignment statement a[i]=expr, where a is any reference to an array,
i is any expression which evaluates to an int value, and expr is any
expression which evaluates to a value of the base type of a, arrays
are mutable objects.
This means that methods are sometimes written where arrays are
passed in as arguments with the intention that executing the
method will change the array, as that change will be passed on
once the method has finished. It also means you have to be careful
if two different variables refer to the same array to remember that
changing the array through one variable will cause the array to
be changed as it is viewed through the other.
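A minimal sketch of both points follows: a method that changes the array passed to it, and a change made through one variable being seen through another. The names MutableDemo and doubleAll are invented for this example.

```java
class MutableDemo {
    // Doubles every component of the array in place; nothing is returned.
    static void doubleAll(int[] a) {
        for (int i = 0; i < a.length; i++)
            a[i] = 2 * a[i];
    }

    public static void main(String[] args) {
        int[] a = new int[3];
        a[0] = 5; a[1] = 7; a[2] = 9;
        int[] b = a;                 // b is an alias for the same array

        doubleAll(a);                // the method changes the array itself
        System.out.println(b[1]);    // prints 14: the change is visible through b

        b[0] = 100;                  // changing the array through b ...
        System.out.println(a[0]);    // ... is seen through a: prints 100
    }
}
```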
Searching for an item in an array
A common array operation is to find if a particular value occurs in
an array. You can do this by looking at each value in the array in
turn, until either you have found the one you want or you have gone
through the whole array and not found it. Here is a code fragment
that, given an integer in the variable n and an array of integers
referred to by a, finds if the integer n is stored in the array a:
for(int i=0; i<a.length&&a[i]!=n; i++) {}
A variable called i is set to run through the numbers 0, 1, 2 and so
on in turn. Note the test condition here, i<a.length&&a[i]!=n. First
you have to test that you have not reached the end of the array, which
happens when i reaches the value a.length; then you have to test that
the integer in the part of the array indexed by the variable i is not
equal to n. In Java, when you have expr1&&expr2 where expr1 and expr2
evaluate to booleans (that is, to true or false), if expr1 evaluates
to false, the joint expression is given the value false without any
attempt to evaluate expr2. This is acceptable logically, since
"P AND Q" viewed as a statement in logic is only TRUE if both P and Q
are TRUE, so must be FALSE if P is FALSE regardless of the value of Q.
But it is also essential if the value of expr1 tells us we can't
evaluate expr2. If i<a.length is false then i is beyond the highest
index for referencing a component of a, so we shouldn't even try to
evaluate a[i]!=n since attempting to look at the value of a[i] will
cause an error.
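To see why the order of the two tests matters, here is a sketch comparing the loop as given with a version that has the tests swapped. The class and method names are invented for this example, and i is declared before the loop so that it can be returned. The error Java reports when a[i] is evaluated with i equal to a.length is an ArrayIndexOutOfBoundsException.

```java
class ShortCircuitDemo {
    // Returns the index at which the search loop stops:
    // the index of the first n, or a.length if n is absent.
    static int stopIndex(int[] a, int n) {
        int i = 0;
        for (; i < a.length && a[i] != n; i++) {}
        return i;
    }

    // The same loop with the two tests swapped, so a[i] is examined
    // before checking that i is a valid index.
    static int stopIndexUnsafe(int[] a, int n) {
        int i = 0;
        for (; a[i] != n && i < a.length; i++) {}
        return i;
    }

    public static void main(String[] args) {
        int[] a = new int[3];
        a[0] = 4; a[1] = 8; a[2] = 15;
        System.out.println(stopIndex(a, 8));    // prints 1
        System.out.println(stopIndex(a, 99));   // prints 3, the length of a
        try {
            stopIndexUnsafe(a, 99);             // goes on to evaluate a[3]
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("a[3] was evaluated, but there is no such component");
        }
    }
}
```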
The {} indicates this is a for-loop without a body. All it does is
update and test. We could write it as the while loop:
int i=0;
while(i<a.length&&a[i]!=n) {
    i++;
}
Another way of writing it would be:
for(int i=0; i<a.length; i++) {
    if(a[i]==n) {
        break;
    }
}
Here I have put the opening { and closing } for both the for loop and
the if statement. Note you can omit them if the code they enclose is
just a single statement. That is the case here: inside the for loop is
a single if statement, and inside the if statement (which here does
not have an else part) is a single break statement. So this could be
written as:
for(int i=0; i<a.length; i++)
    if(a[i]==n)
        break;
It's a matter of taste whether you omit the brackets when they are not required due to enclosing just one statement. My preference is to omit them in this case as I feel it makes the code look less cluttered, but other authors suggest they should always be used. Remember it makes no difference to how the program executes; it's just a matter of what makes the code look clearer to the human reader.
It's also a matter of taste whether you use the loop test with the
two parts as given at first, or prefer a simpler loop test with an
alternative exit from the loop using a break statement. A break
statement has the effect of execution immediately leaving the loop it
is in and starting on the statement following the loop. If there is
very little code in the loop body it may make the code clearer if the
two ways of exiting the loop are separated in this way. A break
statement hidden in a lengthy piece of code, however, could be easily
missed, so on glancing at the loop header the human reader may not
realise there is an alternative way of exiting the loop other than the
test there becoming false. So, in general, use break with caution.
Java also has a "labelled break" which enables execution to jump
out of a loop within a loop, or even more layers of loops. Just very
occasionally you may find this construct helps avoid what would otherwise
be very convoluted code, but it's not something you should make a
habit of using.
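For reference, here is a sketch of what a labelled break looks like. The example searches an array of arrays, which these notes have not covered: the type int[][] is simply "array of int[]" by the rule given at the start. The class name, the method isInGrid and the label search are all invented for this illustration.

```java
class LabelledBreakDemo {
    // Returns true if n occurs anywhere in the array of arrays.
    static boolean isInGrid(int[][] grid, int n) {
        boolean found = false;
        search:                                   // label on the outer loop
        for (int i = 0; i < grid.length; i++) {
            for (int j = 0; j < grid[i].length; j++) {
                if (grid[i][j] == n) {
                    found = true;
                    break search;                 // leaves both loops at once
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        int[][] grid = new int[2][3];             // two rows of three ints, all 0
        grid[1][2] = 54;
        System.out.println(isInGrid(grid, 54));   // prints true
        System.out.println(isInGrid(grid, 7));    // prints false
    }
}
```

An unlabelled break in the inner loop would leave only the inner loop, and the outer loop would carry on with the next row.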
Now we have seen a loop which halts either when we have gone through
the whole array or found the integer we are looking for, what are we
going to do with it? Note that when the loop terminates, the loop
index variable i either has the value a.length, in which case an
integer with the value equal to that in variable n hasn't been found
in the array, or the variable i has the value of the index of the
component of the array where the integer with value n has been found.
But an immediate problem with the code we wrote is that as the integer
variable i is declared in the initialisation part of the for loop, it
goes out of scope after the for loop. So if we want to access it after
the loop, it should be declared before it rather than in it, as in:
int i=0;
for(; i<a.length; i++)
    if(a[i]==n)
        break;
if(i==a.length)
    System.out.println("The integer "+n+" is not in the array");
else
    System.out.println("The integer "+n+" is in the array");
Of course, the System.out.println statements could be replaced by
whatever it is we want to do which varies depending on whether the
integer is in the array or not.
The operation of testing whether an integer is in a particular array
is so common we might want to make it a separate static method. If we
call the method isIn it must take an array of integers and an integer
as arguments, and return a boolean. Then the above could be written
just:
if(!isIn(a,n))
    System.out.println("The integer "+n+" is not in the array");
else
    System.out.println("The integer "+n+" is in the array");
We might initially think of writing the method as:
public static boolean isIn(int[] a,int n)
{
    int i=0;
    for(; i<a.length; i++)
        if(a[i]==n)
            break;
    if(i<a.length)
        return true;
    else
        return false;
}
But remember that i<a.length is itself a boolean value, so we could
make our code a little neater by writing it as:
public static boolean isIn(int[] a,int n)
{
    int i=0;
    for(; i<a.length; i++)
        if(a[i]==n)
            break;
    return i<a.length;
}
Whenever you find yourself with a method which returns a boolean and you are writing something of the form
if(test)
    return true;
else
    return false;
remember you can always write it as:
return test;
However, another way of writing the method is:
public static boolean isIn(int[] a,int n)
{
    for(int i=0; i<a.length; i++)
        if(a[i]==n)
            return true;
    return false;
}
Remember that return in a method acts to halt execution of the method,
so in this case it combines breaking out of the loop and returning the
value true. The final statement return false is only executed if the
loop terminates because its test is false, so the condition a[i]==n
which would have caused it to exit before never occurred. It is
important to note that this final statement is outside the loop. If
the brackets enclosing the loop body were put in, it would look like:
public static boolean isIn(int[] a,int n)
{
    for(int i=0; i<a.length; i++) {
        if(a[i]==n)
            return true;
    }
    return false;
}
which makes this clearer. Do not make the mistake of confusing this with:
public static boolean isIn(int[] a,int n) // THIS CODE IS SILLY!
{
    for(int i=0; i<a.length; i++)
        if(a[i]==n)
            return true;
        else
            return false;
}
where the return false is inside the loop. Here, when a[0]==n is true,
the method halts and returns true, but when a[0]==n is false, the
method halts and returns false without checking the rest of the array,
which is obviously not what we want.
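The difference can be seen by running the two versions side by side, as in this sketch. The class name IsInDemo and the method name isInSilly are invented here. One change has been made to the mistaken version: a final return false has been added after the loop, because without it the Java compiler rejects the method with a "missing return statement" error (the loop body never runs when the array is empty).

```java
class IsInDemo {
    // Correct version: false is returned only after the whole array has been checked.
    static boolean isIn(int[] a, int n) {
        for (int i = 0; i < a.length; i++)
            if (a[i] == n)
                return true;
        return false;
    }

    // The mistaken version: the else branch returns during the first iteration.
    // The last return is added here only so that the method compiles.
    static boolean isInSilly(int[] a, int n) {
        for (int i = 0; i < a.length; i++)
            if (a[i] == n)
                return true;
            else
                return false;
        return false;
    }

    public static void main(String[] args) {
        int[] a = new int[3];
        a[0] = 12; a[1] = 54; a[2] = 7;
        System.out.println(isIn(a, 54));        // prints true
        System.out.println(isInSilly(a, 54));   // prints false: only a[0] was checked
        System.out.println(isInSilly(a, 12));   // prints true: 12 happens to be at a[0]
    }
}
```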
Now, suppose a is of type String[], and we are searching for whether a
particular string, given by variable str of type String, is in the
array. Here is a code fragment which does this, and prints a message
saying whether the string is in the array:
int i=0;
for(; i<a.length; i++)
    if(a[i].equals(str))
        break;
if(i==a.length)
    System.out.println("The string "+str+" is not in the array");
else
    System.out.println("The string "+str+" is in the array");
Or we could write a static method that tests whether a string is in an array of strings:
public static boolean isIn(String[] a,String str)
{
    for(int i=0; i<a.length; i++) {
        if(a[i].equals(str))
            return true;
    }
    return false;
}
You should be able to spot the similarity to the previous code. The
types have to be changed, and also to test whether two integers m and
n are equal we use m==n, but to test whether two strings str1 and str2
are equal, we use str1.equals(str2), so if a is an array of strings,
to test whether the string in the component of a indexed by i is equal
to str we use a[i].equals(str).
What is happening here is that the algorithm is exactly the same since it does not depend on the types in the array. The word "algorithm" means "a way of solving a problem". The problem here is finding whether a particular item is in an array, the algorithm is to look at the components in the array one at a time in the order in which they are indexed until we have either found the item or gone through the whole array.
You might wonder whether we have to write a separate method isIn for
every possible type of item where we want to test whether a particular
item of that type is stored in an array of items of that type. Java
(since the version called Java 5, which was introduced in 2004) offers
a way round this which enables us to write a generic version of isIn
that can be specialised to work for objects of any type. But this is
something to be discussed later.
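Just to give a flavour of what that looks like, here is one possible sketch of a generic isIn. It is not necessarily the version these notes go on to develop. The class name GenericIsIn is invented, the method assumes no component of the array is null, and it works for arrays of objects such as String[] or Integer[] but not for int[].

```java
class GenericIsIn {
    // T stands for any object type; equals is used to compare items.
    // Assumes no component of a is null.
    static <T> boolean isIn(T[] a, T item) {
        for (int i = 0; i < a.length; i++)
            if (a[i].equals(item))
                return true;
        return false;
    }

    public static void main(String[] args) {
        String[] words = new String[2];
        words[0] = "tea"; words[1] = "coffee";
        Integer[] numbers = new Integer[2];
        numbers[0] = 12; numbers[1] = 54;
        System.out.println(isIn(words, "coffee"));   // prints true
        System.out.println(isIn(numbers, 7));        // prints false
    }
}
```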
In the directory ~mmh/DCS128/code/arrays you will find two files,
UseArrays1.java and UseArrays2.java, with a demonstration of testing
for membership of arrays, one with arrays of integers, the other with
arrays of strings. Supporting code is needed to read in the contents
of the arrays, but at this point you need not be concerned with how
this code works.
Finding a position in an array and the importance of specification
Suppose you want to find not just whether an item appears in an
array, but its actual position. You could use the same code as before,
but return the value of the loop index when it finds the item being
searched for. Obviously, your method must now return an int rather
than a boolean. Here is a version which uses the loop without a body
we considered first:
public static int position(int[] a,int n)
{
    int i=0;
    for(; i<a.length&&a[i]!=n; i++) {}
    return i;
}
What happens if the integer is not in the array? Here the value returned is equal to the length of the array, but this is probably not a good way of dealing with the problem. It would be clearer that we are dealing with the special case of an integer not occurring in the array if in that case we returned a value which could not otherwise be returned. A fairly standard way of dealing with this is to return -1. Here is some code which does this, this time using the technique of a return statement inside the loop:
public static int position(int[] a,int n)
{
    for(int i=0; i<a.length; i++) {
        if(a[i]==n)
            return i;
    }
    return -1;
}
Again, remember the final return statement only gets executed if the
return statement inside the loop never gets executed because at no
stage is a[i]==n true.
Suppose we decide to go through the array starting at the highest indexed component and working down:
public static int position(int[] a,int n)
{
    int i=a.length-1;
    for(; i>=0&&a[i]!=n; i--) {}
    return i;
}
In this case it happens that i will have the value -1 if n is not in
the array a. But there is a subtle difference between this code and
the above. What if the integer n appears more than once in the array
a? In the first case what will be returned is the lowest of all the
indexes of occurrences of n, in the second case the highest. When we
draw diagrams of arrays we generally show the contents listed from the
lowest indexed on the left to the highest indexed on the right, as in:
[Diagram not reproduced: an array of integers drawn from left to right, with 54 in the components indexed by 1 and 6.]
Here, the integer 54 occurs in the array component indexed by 1 and in
the array component indexed by 6. Depending on how we wrote the code,
the method position could return either 1 or 6 if its arguments were a
reference to the array shown diagrammatically above and the integer
value 54. Sometimes we actually refer to the lower indexed components
of an array as being to its "left" and the higher indexed components
as being to its "right". So we may say our method to return the
position of an integer in an array of integers will return the
position of the "leftmost" or "rightmost" occurrence.
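The two behaviours can be compared directly, as in the sketch below. The method names positionFromLeft and positionFromRight are invented so that both versions can sit in one class, and the array contents are made up apart from the 54s at indexes 1 and 6. The array is written using Java's array initialiser notation, a list of values in braces.

```java
class PositionDemo {
    // Index of the leftmost occurrence of n in a, or -1 if n does not occur.
    static int positionFromLeft(int[] a, int n) {
        for (int i = 0; i < a.length; i++)
            if (a[i] == n)
                return i;
        return -1;
    }

    // Index of the rightmost occurrence of n in a, or -1 if n does not occur.
    static int positionFromRight(int[] a, int n) {
        int i = a.length - 1;
        for (; i >= 0 && a[i] != n; i--) {}
        return i;
    }

    public static void main(String[] args) {
        // Invented contents: only the 54s at indexes 1 and 6 come from the text.
        int[] a = {30, 54, 17, 8, 91, 25, 54, 62};
        System.out.println(positionFromLeft(a, 54));    // prints 1
        System.out.println(positionFromRight(a, 54));   // prints 6
        System.out.println(positionFromLeft(a, 99));    // prints -1
        System.out.println(positionFromRight(a, 99));   // prints -1
    }
}
```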
If we describe the behaviour of some code in terms of diagrams
we have drawn up to help us visualise it, we should make sure the
person we are describing it to also understands it in terms of the
same diagram. We should remember the diagrams are not how it is
"really" represented on the computer.
The issues we have encountered with this problem indicate that when we write a piece of code to solve some problem, we should be careful to make sure we cover every possible circumstance. Here we started off saying we wanted a method to return the position of an integer in an array of integers, but when we came to write the method we found out we needed to decide how to deal with the case of the integer not occurring at all in the array, and the integer occurring more than once in the array. When a large program is being written, it will often be the case that one person writes the code that uses a method and states what they want the method to do, while another person writes the code for that method. The description of what the method is meant to do is called the specification and the actual code that does it is called the implementation. If what we are told about what a method should do does not cover every possibility, we term that method underspecified. A full specification of a method to return the position of an integer in an array of integers would say what is returned when the integer does not occur in the array, and what position is returned if it occurs more than once.
The danger with underspecified methods is that the person who uses
the method may just assume that in cases not specifically covered one
way of dealing with them will be used, while the person writing the
code may choose another way. This could lead to problems when the
code for the method and the code that uses it are put together to
make a complete program. In the above example it may be that the
person who wanted to use the method position just assumed it would
return 1 and never supposed the person writing the method would write
it so that it returned 6. It is for reasons like this that in large
scale programming, writing the specification for a piece of code is as
important a skill as writing the code itself.
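One common place to record a specification in Java is a comment directly above the method. The sketch below shows what a fuller specification of position might say. The wording is only one possible choice, picking the leftmost occurrence and -1 for absence, and the class name SpecifiedPosition is invented.

```java
class SpecifiedPosition {
    /**
     * Returns the position of the integer n in the array a.
     * If n occurs more than once, the lowest index at which it occurs is returned.
     * If n does not occur in a (including when a has length 0), -1 is returned.
     * The array a is not changed. The argument a must refer to an array.
     */
    static int position(int[] a, int n) {
        for (int i = 0; i < a.length; i++)
            if (a[i] == n)
                return i;
        return -1;
    }

    public static void main(String[] args) {
        int[] a = {54, 3, 54};
        System.out.println(position(a, 54));          // prints 0
        System.out.println(position(new int[0], 54)); // prints -1
    }
}
```

With a comment like this, the person calling the method and the person writing it are working from the same statement of what happens in each case.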
So we see here, as we saw with the drinks machines example, the importance of specification. We need to think of our methods and classes as things which are designed to do a particular job, and not just as arbitrary pieces of Java code put together. Then when in one piece of code we write a call to a Java method, we think of that call in terms of the job it is supposed to do rather than in terms of the Java code that it executes when the call is made. Quite often, thinking about the exact job we want a method or class to perform helps us to write better code. But sometimes also when we come to write the code we find there were aspects which weren't specified originally, and so the task of writing the code leads us to refine the specification.
In the directory ~mmh/DCS128/code/arrays you will find two files,
UseArrays3.java and UseArrays4.java, which demonstrate finding the
position of an integer in an array of integers. Note, the important
issue here is the code in the method position which actually
implements the algorithm to find the position. The code in the method
main is just support code which enables a demonstration of a call to
the method to be run. The two files show the search of the array
starting from different ends, UseArrays3.java from the lowest indexed
to the highest and UseArrays4.java the other way round. Remember when
running the demonstrations that the position of components of an array
starts at position 0, then position 1 and so on. So the number given
may be one less than you are expecting if you didn't realise this.
You will also find a file UseArrays5.java in the directory, which
demonstrates finding the position of a string in an array of strings.
You can see that the method which does this is identical in pattern to
the method in UseArrays3.java, differing only in the name of the base
type and the use of equals for equality of strings rather than == for
equality of integers.
Last modified: 16 June 2005