Algorithms and Data Structures in an Object-Oriented Framework: Introduction

This is a level 5 programming module. Most people taking it will have taken two level 4 programming modules taught in the department, Procedural Programming and Object-Oriented Programming. If not, it is assumed you are joining the Queen Mary Computer Science degree programme on the basis of having studied something similar elsewhere.

This section of notes introduces this module, Algorithms and Data Structures in an Object Oriented Framework (ADSOOF), and puts it in the context of modules you have already taken and modules you will be taking alongside ADSOOF.

Structured Programming

In Procedural Programming you start seeing computers as instructed to obey commands based on manipulating a set of variables. As a programmer, you write those commands. The commands do things like take values from the variables they are stored in, calculate new values, put values into variables. There are also commands to read values from the computer screen and to write values to the computer screen. This way of seeing computing is very close to the way computers work according to the electronics inside them, the hardware, which you may have studied in the module Computer Systems and Networks. So the variables are like the locations in the computer's memory, except that you refer to them just by the names you have given to them, and not by their position in the computer's memory. Also, you think of them as holding integers, or floating point numbers, or characters, or strings and so on, not as just holding binary values. You can do things like put together complex mathematical formulas, you do not have to break them down into individual calculation steps as you would if you were writing instructions in machine code, which is the language directly obeyed by computers as it works directly on their hardware.

Actually, the instructions you write in a programming language like the one you used in the Procedural Programming module are changed by the computer into something which is closer to machine code, so the computer can obey them. For the purposes of this module, you do not need to worry about how this is done (a specialised area of Computer Science called compilers and interpreters covers this). You do not have to worry about what the actual hardware of the computer is doing underneath. The whole point of programming languages like Java, designed for human use, and sometimes termed high level programming languages, is that they enable instructions to be written in a way which corresponds more to the way humans think than to the electronics of a computer.

Also moving away from how computers work at the electronics level, you were encouraged to think in terms of structured programming, which means instead of seeing a computer program as just one long list of instructions, you see it as a structure made up from loops and selections. So a piece of program like:

A;
while(T1)
   {
    B;
    if(T2)
       {
        C;
        D;
       }
    else
       E;
    F;
   }
G;
should be considered as consisting of three parts, A, a while loop and G. The while loop has a test, T1, and three parts inside it, B, an if statement and F. The if statement has a test, T2 and two branches, one branch consisting of C followed by D, the other consisting of just E. It helps us human programmers to indent the code, meaning we put some spaces before the instructions to show how they fit inside each other. This is what is done above, it is not part of the Java language's requirement that it is laid out in this way, you could actually use any arrangement of blank spaces and line breaks, and it would still work.

The rules for laying out the code above are that the braces which form part of a while loop or if statement are put several spaces to the right of the while, if or else keyword they come with, and on a new line. The } which matches a { is always put the same number of spaces to the right. Everything within the { and } is put the same number of spaces plus one to the right, or even further if it is part of another while loop or if statement. Different people may use slightly different styles of layout (for example, some put the { on the same line as the word while or if or else it comes with), but the basic principle is the same - clear layout with indentation makes the structure clear when a human being who knows about programming looks at it.

You could think of this structure as being something like:

      A;   while;   G
           /  \
          /    \
         /      \
       T1     B; if; F
                /|\
               / | \
              /  |  \
            T2  C;D  E
where the pattern
   while
    / \
   /   \
  /     \
Test    Body
means “do Test, if it is true then do Body, and repeat”, and
      if
     /|\
    / | \
   /  |  \
Test B1  B2
means “do Test, if it is true then do B1, if it is false then do B2”. In Java, when Body or B1 or B2 consists of two or more instructions following each other, they have to be combined together by putting a { before them and a } after them. You don't need these symbols if Body or B1 or B2 is just a single instruction (including one which itself is actually a while loop or an if statement, but that's a single instruction), though some people recommend putting them in anyway when that's the case. What you should not see this code as is:
 1: A
 2: if not T1 go to 11
 3: B
 4: if not T2 go to 8
 5: C
 6: D
 7: go to 9
 8: E
 9: F
10: go to 2
11: G
even though this is how the computer represents it. It's much harder for a human to see a more general structure when it looks like this. You should be thinking of how the whiles and ifs and fors and { and }s structure code, and not looking at it in terms of just lines around which computer execution jumps.

Structuring using static methods

Following learning to see computer instructions in this structured form, you learned to use methods, or more strictly static methods. A method is like a mini program. You write a method for example like:

public static Type1 instruction(Type2 a,Type3 b)
followed by a program within { and } brackets. then in your original program something like:
u = instruction(v,w);
will cause the main program the method call is in to be halted and the program for the method to be obeyed, with variable a set to the value of the variable v from the main program and variable b set to the value of the variable w from the main program. The types of the variables have to match, so v must be of type Type2 and w must be of type Type3, also u must be of type Type1 (when we consider inheritance, we will see that Java's use of types is actually a little more complex than this, but this will serve as a first explanation). When the program for the method obeys the instruction
return val;
the effect is that u is given the value val (which must be of type Type1) and then we carry on with the main program. What we have done is invented a new instruction which takes two things, of type Type2 and Type3, and gives a thing of type Type1. Obviously, we can give any name to our new instruction, and it can take whatever number of things of whatever different types we define. If the return type (which is Type1 in our example) is given as void, it means an instruction which doesn't actually give anything back, though it may do some other action.

So you should really think of defining a method as inventing a new instruction, and then just use that instruction just as if it was provided for you by Java rather than written by yourself. The only thing is that since you wrote it yourself, it might go wrong, whereas we can be sure the people who wrote Java made sure the commands it provides never go wrong. So if you write a program which makes use of your own methods, and it goes wrong, you have to check whether that's because the program goes wrong, or the program in the method definition goes wrong.

Breaking up a program into method definitions and method calls is another way of structuring a program. Instead of looking at the code as just one big list of instructions, you can break it down into its individual method definitions. It's important to know that when you call a method, it works just like a mini program, it doesn't remember the values of its variables from other calls of the same method. So in our instruction example, there is actually a separate variable called a and a separate variable called b for every time the method is called, not a single a and b which they all share. Because of this, you can even have a method call itself (inside the program for the method there's a call to the same method), an important technique known as recursion. It works because the inside method call has different variables (even though they have the same names!) to the variables in the program that called it. When the inside method call has finished, you go back to the outside method call and back to using that method call's variables.

Also, although for introductory programming exercises you may think of a program as mainly centred round a main method, for more realistic programs there are method calls making method calls making method calls ... . A method called main is actually just the way Java sets up a complete program running from a console window. You shouldn't think, despite its name, that it should be the main focus of writing the whole program, where the whole program consists of all classes and methods which together make it up.

Object Oriented Programming

Object oriented programming is a further way of structuring programs. The idea is that now we define and use our own types as well as our own instructions. A call of a method which is non-static, sometimes called an instance method, has to be a call “on an object”. Suppose I have a variable obj which is of type MyType, and my method called instruction instead of being static as before, is now not static and is defined inside the class MyType, so:
public Type1 instruction(Type2 a,Type3 b)
That means my method call has to be attached to an object of type MyType, as in:
u = obj.instruction(v,w)
The mini-program for the method instruction can now not only use variables named a and b, and any variables it defines itself, it can also use variables which aren't defined inside any method, but are defined just inside the class MyType. Unlike a and b and variables defined inside the program for the method instruction, these variables will remain in existence after the method call has finished. They are inside the object called obj and will be used again when another method call is made on obj. So a method call on the object might be intended to change the object rather than return a result.

It's important to distinguish between an object and a class. A class can be considered a “blueprint” for an object. Every object of its type is built to the format given by the blueprint, but they are separate entities. So, for example if class MyType started:

class MyType
{
 private Type4 p;
 private Type5 q;
it would mean that every object of type MyType has its own variables called p and q inside it, there isn't just one variable called p and one variable called q. If I call obj1.instruction(v,w) and obj2.instruction(v,w), each call has its own variables a and b, both set at the start to the value of v and w though the method program for instruction could change them. If the method mini-program for instruction makes use of the variable names p and q, for the two different calls this will refer in the one case to the variables called p and q inside the object called obj1 and in the other case to the variables called p and q inside the object called obj2. As an additional complexity, it is possible in Java for an object to have more than one name, this is called aliasing. So it may be the case that obj1 and obj2 are two different names for the same object of type MyType, in which case the p and q variables used by the two calls obj1.instruction(v,w) and obj2.instruction(v,w) will be the same variables.

Although it is not necessary to put the word private in front of the declaration of variables in a class as in MyType above, it is good practice always to do so. The reason is that there should be a clear separation between the internal structure of an object, and how it interacts with other objects. When using an object you should think of it in terms of the methods it provides, not in terms of the variables inside it. This helps break down a program into separate parts which can be understood on their own without reliance on other parts. When a variable in a class is declared as private, the only code that can access and change the value of that variable is methods in the same class. Code outside the class can only access and change the variables indirectly by calling those methods. Knowing that these limitations apply makes code much easier to follow.

Object-oriented programming has been successful as an approach to building programs because the objects in a program often represent real objects in the world which we are simulating in some way in our program, or with which we are interacting. An early step in designing a large scale program often involves object-oriented analysis in which the objects which a program simulates or works with are decided after looking at its general requirements. This is something which is covered in the Software Engineering module, which is discussed further below.

Note that we have stopped thinking in terms of how these things are done on the actual computer hardware. All these things will be translated to instructions in machine code, but it really doesn't help us to know how and what they are, as that just makes the whole thing more complicated. Just think of them as doing what we say they do here, leave it to the computer to make them work that way. It's the same as working any piece of machinery: some specialists might need to know what happens when you take the front cover off and look at the mechanisms inside, but most people just trust it all works according to instruction. Being able to think in this way: abstraction, as it is called, is the key to becoming a good computer programmer. You are able to think of a piece of computer program just in terms of the level you are using it at, you don't have to go into the complexity of the other bits of computer program that make it work, let alone the computer hardware underneath. That is how you can break down the hugely complicated thing which is a computer program into separate parts and get a feeling for its general structure.

So writing computer programs is a matter of defining your own types of objects using classes, and your own instructions using methods, as well as just writing instruction calls. But also you will be making use of classes and methods which are provided as part of the Java language, there are huge numbers of them, which you can find listed here. Don't worry, you don't need to know all of these, or even a large proportion of them. A few of them are so general that any good Java programmer should know them, but many of them are for more specialist use so you would only need to know them if you were writing a specialist program.

Software Engineering

In fact most professional programmers don't just write programs on their own, they write them as part of a team which together is working on building a big system. So you may uses classes and method provided by other members of the team as well as those provided by the Java library, and you may write classes and methods yourself which other members of the team may use. This whole area of building software systems as part of a team is another aspect covered by the Software Engineering module which you are likely to be taking alongside ADSOOF.

As was said above, with the sort of large scale programming that is done in practice, you need to drop the idea that a program consists of a “main” program and mini-programs in methods, and that the main thing it does is read and write things from the computer console window. You may have got used to thinking of programs in this way when writing the tiny programs which got you started in your introductory programming modules, but those were only exercises. In a more realistic approach to programming, you may be writing classes that are designed to interact with other classes. You may be using classes written by other people, or writing a class to be used by other people. Code which interacts with human users is likely to do so through a complex graphical interface, rather than through text in a console window. There is a whole module, Graphical User Interfaces, which covers in detail the interaction of computers with humans through graphical interfaces, some of this module is about programming them, but much of it is about the human issues of interacting with computer programs. This whole vast area isn't covered at all in ADSOOF, for simplicity we won't look at all at Java's classes for graphical user interfaces, and our examples will use just simple text-based interaction with the user. The object-oriented way of programming, however, fits in very well with graphical user interfaces, since the objects which appear on a screen can be represented directly by objects in the programming language.

This module on Algorithms and Data Structures in an Object-Oriented Environment is about those aspects of a computer program which form its internal structure rather than its interface with the outside world. So you should think about it as writing components which will be fitted together to form a larger program. Designing that larger program as a whole, and deciding what components are required and how they are to fit together is covered in the Software Engineering module. In order to demonstrate and test the components which are developed in this ADSOOF, we will use simple “front end” programs where there's a method called main which starts code execution and interacts with the user using text. But don't pay too much attention to these front ends, they are not meant to be anything you can learn from or need to study.

Please feel free to use the BlueJ system to manage and run your code which you are likely to be familiar with it from your first year object oriented programming module. In order to be accessible to everyone, and to concentrate on the basics, the notes for ADSOOF do not assume the use of any Interactive Development Environment (IDE) and the programs developed are small enough that managing the files just using Linux shouldn't be difficult. But in realistically size programs, developing and managing the code files through a tailored environment is essential. It will give you tools for displaying the structure of the program, testing and debugging it, and so on. BlueJ is designed for student use, but in the Software Engineering module you will use an industrial standard IDE such as Netbeans or Eclipse. You could experiment with using such an IDE on the ADSOOF material, but it is probably best also to handle your code directly. An advanced IDE can fill in much of the basic code structure in a semi-automatic way, which is great if you already have a good feel for basic coding, maybe not so good if it writes code for you that you don't really understand.

If you are writing pieces of a computer program which interact with other pieces of a computer program, it is important that what each piece does is defined exactly. This is referred to as its specification. When you are using someone else's piece of program, it's best if you can just assume it works exactly according to its specification, and you don't need to know what happens inside it to make it work that way. This is how you use the classes provided by the Java code library, and as a programmer writing a component you will be asked to write code that works in the same way - it shouldn't be defined by what's inside it, but instead by its specification. Someone else using your component should be confident it will work according to its specification, and that specification should fully explain it without reference to what is inside it. One of the things this means is that you can change what is inside your component without this affecting the other parts of the program which use it, so long as the way it interacts with those other parts is according to its specification. The reason why you might want to change what is inside a component even though its specfication doesn't change is efficiency. You may have thought of a way of writing your component so that it behaves the same in terms of the values of results it gives when methods are called, but it returns those values much faster using the changed code.

Real scale Java programs will also commonly interact with databases, with the detailed handling of databases done by a specialist database language such as SQL. The module Database Systems covers this. Java's code library provides classes to link in with databases, which you can find details of here. The complex collection of classes and interfaces in the API specification may look intimidating, a big jump from the introductory programming you saw in your first year. By covering the basics of object oriented programming in more detail, ADSOOF will strengthen your understanding of various concepts which you will need to make effective use of code libraries used in modern software development.

Algorithms

An algorithm is a “way of doing something”. The study of algorithms has been described as the “Spirit of Computing”. So in that way it's more fundamental to Computing than learning how the electronics of a computer works, or learning how to write instructions which make those electronics work, or learning how a computer system fits in with human requirements. The topic of algorithms is a vast one, it involves not just learning about well-known algorithms, but also developing new algorithms, proving that algorithms do the tasks they are designed to do, and proving they do those tasks within a particular time or memory usage limit. There is a final year module, Algorithms and Complexity which looks at these issues in much more detail. It is important to note that an algorithm is not the same as a computer program. An algorithm is a general description of a way to solve a problem, perhaps written in English, or in some mathematical notation, or described using diagrams, or a mixture of these. You can write a computer program which works in the way described by an algorithm (described as an implementation of the algorithm), but the details of the program, or the particular programming language used aren't part of the algorithm. When you are writing a computer program to solve some problem, you should think in terms of the algorithm first, and then in terms of how you would represent that algorithm in the programming language you are using.

This module, Algorithms and Data Structures in an Object-Oriented Framework, can only give a very brief introduction to the topic of algorithms. It will show there may be more than one algorithm to solve a problem. It will show that the choice of algorithm used can make a big difference to the time taken to solve a problem. You will see two basic approaches to designing an algorithm. One, iteration, considers repeating a step which changes some structure until that structure has reached a desired form, for example, changing the position of numbers in a list until the list is sorted. The other, recursion considers solving a problem by breaking it down into smaller version of the same problem, solving those, and then putting together the solutions to make a solution to the whole problem. For example, thinking about sorting a list in a recursive manner might involve dividing the list into two parts, sorting each of those parts, and then putting them together to make a whole sorted list.

ADSOOF is a practical module, so we will move from thinking about algorithms very quickly to writing programs which implement them. This is a step towards making practical use of the programming skills you learnt in Procedural Programming. There you learnt the basic structures of programming, loops, selections and methods. You learnt how they are represented in the Java language, though many other programming languages are similar. Now in ADSOOF you start to learn how to make use of those structures to solve problems.

Data Structures

Some algorithms are purely numerical, they work on just numbers. But many of the things we would want to do with a computer work with collections of information. A collection is an object which can be considered as a whole, but also in terms of the parts which make it up. Different forms of collections may have different ways of accessing the parts. For example, a queue is a structure where the only way something can join it is to join at the back end, while the only way something can leave it is to leave from the front end. Or at least we can define a queue as something which works in that way, and write a program component which implements queues working in just that way. If we wanted the sort of queue where things could leave before reaching the front, or push into the middle, we would have to write a program component which works that way. Computers do not have minds of their own, so they work only as we have programmed them to work. As mentioned previously, when writing program components, we should have a precise description of how they should work in terms of their interaction with the rest of the program, and we should make sure that what is inside then makes them work that way. We can distinguish between what is called an abstract data type, which is a description of a program component representing a collection in terms of how it interacts with other program components, and how the data inside the component is actually stored. Also, we need to distinguish between a description of how a data structure might be, and an actual case of that data structure - this is similar to the distinction noted above between a class and an object.

The term “concrete data structure” is used when we want to mean a data structure as it is actually represented in terms of variables in order to distinguish that concept from a data structure expressed as an abstract data type. So, in an object-oriented language like Java we will have a class which implements an abstract data type by providing public methods which is how the objects of that class interact with other code. The variables inside that class are the concrete data structure used to make the methods work, but they should be declared private so that no other code can access them directly. Just as there may be more than one algorithm to solve a particular problem, for example there are several different sorting algorithms, so in some cases there may be more than one concrete data structure that could be used to implement an abstract data type.

There is a close relationship between algorithms and data structures. Many algorithms are defined as working on particular data structures. In both cases we distinguish between a general description, which could be programmed in any programming language, and a particular program in a particular language which implements it. As with algorithms, starting to think in terms of data structures and using them to implement abstract data types is a step away form thinking about programming just in terms of the programming language we are using towards using it to do a particular task.

Just as one algorithm may be more efficient than another to solve a problem, so one concrete data structure may be more efficient than another to implement an abstract data type. Efficiency can mean better use of time or better use of space. In some cases a concrete data structure that is efficient in terms of time usage is inefficient in terms of space and vice versa. In some cases, a particular concrete data structure is efficient for some of the methods of an abstract data type it implements but inefficient for others.

Data structures have two basic forms, which can be compared to the two forms of algorithm. The array form corresponds directly to a block of store in computer memory, and can be considered as the data structure analogue of iteration. It's a structure in which there are many data items each of which can be accessed from its individual position in the array. A linked cell structure is the analogue of recursion. In this case we have a structure which contains a piece of data and links to further structures of the same type (so they too may contain data and further links). We can use these basic data structures to build up more complex data structures, just as we can use the basic loop and selection structures of procedural programming to build up more complex structures of instructions.

An Object-Oriented Framework

Although part of the purpose of the Algorithms and Data Structure in an Object-Oriented Framework module is to introduce basic concepts of algorithms and data structures, it is also intended to build up your general programming skills, in the object-oriented style, and using the programming language Java.

Java is used as a general example of an object oriented progamming language. Much of what is learnt in this module will have very similar equivalents in other object-oriented programming languages, and programming skills learnt with the use of one programming language may easily be transferred to another. The module is not a general course on the Java programming language. If the module was more oriented towards training in Java, we would spend more time on aspects of the Java code library. An expert in a particular programming language is someone familiar with the most commonly used parts of its code library as well as its basic structuring mechanisms. It is important when programming in an object-oriented language, however, to be aware that a major aspect of the object-oriented programming style is an encouragement of re-use, meaning that wherever possible we will use existing code rather than write our own. So we will spend some time looking at one aspect of Java's code library, its Collections Framework.

The Collections Framework provides for you code already written for the most common algorithms and data structures you are likely to need to use. Although in this module we study how to write code yourself to provide these algorithms and data structures, that is done just to introduce the topic of algorithms and data structures and because it is useful to know how the library code works underneath. As a practical programmer, however, there is no need to write code yourself when code to do what you what exists in the code library provided with the programming language you are using. This is also a good example of the principle of information hiding, which is what we described above - that is you can make use of class in Java without knowing the code underneath, so the information in objects of the class is hidden. All you need to know is the public methods of the class.

If code is written so that it may be re-used, it should be written in a way that makes it as flexible as possible so that it may be re-used in a variety of circumstances. For example, a method that can be used to sort a collection of any type of data is much more open to re-use than a method which can be used only to sort a collection of integers. The inheritance aspect of object-oriented programming helps with this. It means you can define a general type, which covers just those aspects of an object which it needs to be able to work with an algorithm or data structure, then you can use inheritance to extend that to more specific types to cover you actual data when you use the code built around the general type. A sorting algorithm, for example, requires the data being sorted to have a notion of ordering and the collection being sorted to have the notion of items in it being in a particular position. Both of these notions can be expressed using a general type. Java uses type variables1, in which you can write code where the type the code works with is given by a variable rather than set to a particular type when the code is written. We shall give some coverage to this issue, which is known as generics.

Another way of generalising code is through parameterisation. In the previous paragraph it was suggested that if you want to sort a collection of items, those items need to have a notion of ordering between themselves. But an alternative way would be to introduce an extra parameter into the sorting method which is used to provide an object which gives an ordering between the items. So consider you were sorting a collection of data representing students and their marks in an exam. This extra parameter could be used to make the same code sort in order of student name, or in order of mark, or in any other order.

Structure of the module

This module is provided with an extensive set of notes. Although there is a recommended textbook all the material you need to pass the module can be found in the notes. The notes will be covered at the pace of roughly one section per week.

Much of the notes cover material, perhaps in more detail and in a different way, and definitely using different examples, but the same general principles, which you have already covered in previous programming modules. You still need to work carefully through it and make sure you fully understand it. Do not be tempted to slack off because you think you know it already, and miss when new ideas are introduced. If you have found programming difficult in the past, as many do, work particularly carefully and methodically through the notes and examples. It is extremely important that you cover the material as it is given, and do not be tempted to put it off with a promise to yourself that you will “catch up later”. Experience suggests that people who do this rarely do manage to catch up and often get completely lost. This is not material it is easy to grasp with last minute revision.

The notes start off covering the topic of using objects, then the simple data structures provided by Java, and then move onto an abstract data type you won't have seen before, the “Lisp List”. One reason for introducing Lisp lists is that they are a good way of giving examples of the use of recursion and recursive thinking in algorithm design, which is a topic many people find particularly difficult to grasp. There is then a section on implementing objects, including a first look at building our own collection object.

The name “Lisp List” comes from one of the earliest programming languages, Lisp, which used something similar as its basic data structure when other early programming languages just used arrays. You may think it is strange to refer to such an old programming language, but there has been a recent growth of interest in the ideas it introduced, with some companies switching to Scala, Clojure and similar languages.

The sections on inheritance, interface types and generics are deliberately stretched out over two weeks as the principles covered here can be quite tricky so need to be considered carefully and slowly. Following this, there is coverage of linked data structures, including using these as an alternative way of building collection objects. Finally there are notes on details of Java's Collections Framework.

With the notes for each week there is an accompanying section giving all the code used in the notes, plus a programming exercise. Please note, the code is used to illustrate the principles in the notes, and so that you can experiment yourself by running it. It is not necessary or recommended to try and memorise the code, there is too much of it to do that! If you are still at the stage where you confuse memorisation with learning, you need to get out of that and take a more mature approach to education. What you need to be doing is understanding the general principles, and then when you need to write code it will come naturally from that. Think of it as like learning another human language. If you just tried to memorise useful sentences without understanding the structure of those sentences, you wouldn't get very far in the language. On the other hand, if you studied carefully and experimented by talking the language and reading it, and came to understand its structures, you would eventually be able to make up your own sentences and understand other people's sentences in the language as and when you need them.

It is important that you try to tackle the programming exercises in the week they are given. You will not necessarily be able to complete them in the one or two hour official lab slot you will be allocated, you will be expected to spend some extra time outside those hours working on them as well. While it is not essential that you finish every part of every exercise, you should try your best to do what you can allowing perhaps four hours a week to do it. Remember, as with learning a human language, or driving a car, or many other practical skills, the only way you can really learn to program well is to practice, practice and practice.

Tests and assessment

The main assessment for this module will be the exam which you take at the end of the academic year, but there will also be one or more exam-style tests during the time it is taught, marks for lab work, and marks for a mini-project.

Some parts of the tests and the final exam will ask you to write explanations in English. Writing clear explanations of technical subjects is an important skill. You have to be able to explain something in a way that means other people who didn't understand it or know about it before they read what you wrote do know and understand it afterwards. So you have to place yourself in the position of someone who isn't you, and see how something you write might not make sense or could be interpreted in some other way because it relies on something you have in your mind that someone else wouldn't know. You have to cover everything necessary, but nothing else. You must avoid using words you don't really understand and sentences which don't really mean anything but you put them in because you think they sound clever. In technical writing trying to sound clever to hide the fact that you don't really know what you are writing about almost always has the opposite effect. Good technical writing skills are something which obviously have applications beyond Computer Science!

Many people like to answer questions which ask for explanations in English because they think it is just a matter of memorising and “regurgitating” explanations. Unfortunately, a lot of poor pre-university education encourages this attitude. My experience is that a “memorising” approach to learning never works, and in the long run requires more effort than simply understanding the material. Questions which require a written explanation are meant to test your understanding, and my experience is that if you don't really understand, it shows in your answers to them. Questions which require written explanation rather than written computer code are popular because many people are a bit scared of programming, and feel more confident they can bluff their way through an exam question which involves written English rather than programming code. As a result of this mistaken attitude, the non-coding questions on exams and tests in this module are often the ones which are most poorly done.

Why people who have chosen to do a Computer Science degree are scared of programming is a bit of a mystery, but it may again have to do with poor pre-university education where “Information Technology” is a popular subject and it often seems to mean “everything to do with computers, but with the programming bit taken out”. Although Computer Science covers much more and not everyone with a Computer Science degree goes on to work as a programmer, programming is really the core topic of Computer Science, and this module is intended to cover enough of that core so that you can be qualified as a Computer Scientist. For that reason, a large part of the test and exams will be questions which ask you to write code.

I am often asked why tests and exams where you are asked to write code on paper are such a big part of the assessment of this module when in practice most people program by writing code on the computer, running it, and modifying it until it works. The answer to this is that in my experience the ability to write correct code on paper is just about the best test of deep understanding of programming. You should develop a confidence with programming so that you can write down some code and have a reasonable feel for how it would work, and you can have a feel for an algorithm and be able to translate that algorithm into programming code. Of course, it is easy to make mistakes, so even expert programmers will write code, run it, find it doesn't work quite as they expected, and modify it. However, a good programmer will generally write code which is close to correct and understand what is required to modify it when it doesn't quite work first time round. An approach to programming which consists of writing something down, or taking some existing code, which you don't really understand, and then applying fairly random modifications until it seems to work, is very bad. It is time-consuming, and because the programmer doesn't really know why the code works, is likely to be full of undiscovered errors. Having to develop the skill of writing correct code by thinking about it first, even if it is a bit artificial to have to write it down in an exam, will make you a much better programmer. The time constraints of an exam too, while a bit artificial, help test your understanding, if you don't have to think too long because it comes naturally to you, that is a sign of a deep level of understanding. Again, all this only comes with practice, practice, practice, the exam is just a test that you have this practice.

In fact there is a growing consensus in the commercial world which agrees with me on this. It is becoming standard among companies which recruit for software development positions to have what is known as a programming interview, or “coding interview”, meaning an interview in which applicants will be asked to write code by hand in a short time. There is now a whole range of books written to help with this task. Beware, however, it is good to use such books for example practice in small programming exercises, but it defeats the purpose of the programming interview (and it won't work anyway) if you use them to try and memorise solutions to common problems. A good book on this topic will make this clear, any book which suggests you can use it to “cheat” on programming interviews by memorising is a bad book, do not use it, it will take you down the wrong path. I have collected together some further links on coding interviews here.

A small part of the marks for this module is for assessed labwork. This is partly to make sure you do come to the lab and do the exercises - I would like students to do work because they understand it is necessary to develop the required understanding, but my experience is that attaching a small mark which contributes to the final mark serves as an incentive not to put off doing it. The mark will not be just for attendance, you will need to demonstrate some working code and be able to talk about it in a way that suggests you did not just copy it from someone else.

The mini-project is intended to test research, experimentation and presentation skills which are not easily tested in exams, as well as programming skills something requiring more code than could be expected in an exam question. The research means good marks for the mini-project require you to learn by yourself about an algorithms and data structures topic which is not directly covered in the notes for this module, though it will use and take further what is covered in the module. The experimentation means youneed to devise tests which show the efficiency as well as correctness of your code. The presentation skills are using technical writing at a greater length than can be expected in an exam, and also good explanation using appropriate diagrams and tables.

The tests in this module are done under exam conditions, they contribute enough final marks to ensure you will take them seriously, but not so many that your chances of passing the module are finished if you do badly in them. The tests are not just for assessment, you should treat them as an important part of learning. Answers for the tests will be provided after they have been taken, and I will give a general feedback which points out the most common misunderstandings found in the test scripts. A mistake in a test is more likely to indicate you have understood something wrongly than that you have just written one thing when you meant something else. So you should try and work out what your wrong understanding is and learn from your mistake and correct it. In fact, this is very much like running a program, finding an error, and correcting the code to remove the error. Except that the “program” is the understanding inside your head, so only you can work out exactly what is wrong and correct it.


Footnote

(1) Type variables were introduced into Java in 2004, with a major update of the language called Java 5. You may come across old textbooks and websites that ignore this aspect of Java. However, type variables have been a part of Java for long enough now that I feel it is better to introduce them from the start rather than treat them as an advanced topic.


Matthew Huntbach

Last modified: 23 September 2016