
Key Advances

At the time the research proposal was submitted, game semantics was still a very new branch of the semantics of programming languages. The work on full abstraction for PCF [1] was just beginning to be known among researchers, and we thought it would be interesting to investigate possible developments of this new discipline in the field of semantics. In the proposal I hence listed a series of topics worth studying.

Five years on, the status of Game Semantics in Theoretical Computer Science has improved significantly. Games have now become a well known field of semantics, with its own sessions at major international conferences (e.g. LICS). As a consequence of this success, the amount of work generated has made it possible to answer almost all of the issues raised in the proposal.

For example, Danos, Herbelin and Regnier have related game semantics to the geometry of interaction via abstract machines; several full abstraction results for various linear sublogics have been achieved; and Ralph Loader has shown that there is no simple (i.e. recursive) presentation of any fully abstract model of PCF, implying that synthetic descriptions of the fully abstract model are impossible. Precise relations between Hyland-Ong games and our approach have also been investigated, for example by Herbelin, and fully abstract models for extensions of PCF with various imperative and object-oriented features have been built, mainly by Abramsky and McCusker.

As for me, when I began to investigate the first topic further, I soon realised that, by exploiting the relation between games and the geometry of interaction via abstract machines (of which I was aware thanks to some unpublished work done in 1992 [5]), a new domain of investigation became available, because of the operational insights that games provide into program evaluation. This perspective seemed quite exciting: it went beyond the traditional semantic issues cited above and could provide a bridge between theoretical and application-oriented research.

My line of thought could be summarized as follows. If we consider semantics as a process by which we gain knowledge about what a program is (as an object independent of its infinitely many representations), and if this knowledge includes detailed evaluation procedures, then one test of how good the knowledge provided by such a semantics is would be to develop, based on the semantics, algorithms for the areas where automated knowledge about the structure of a program is required. Examples of such algorithms appear in program analysis, compilers, optimisers, program transformation, etc., where a program is transformed into an equivalent one, in general aiming at more efficient code. As a very simple example, consider the sequential program ...;x:=1;x:=0;.... Here it is clear that the command x:=1 is just a waste of time, so if the compiler knew this it could reduce the code to ...;x:=0;....
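The dead-code example above can be sketched as a tiny backwards pass. The following Python fragment is purely illustrative: the statement representation and all names are my own, and expressions are assumed not to read any variables, to keep the sketch minimal.

```python
# Illustrative sketch of dead-assignment elimination for straight-line
# code. Statements are ("assign", var, expr) or ("use", var); to keep
# the sketch minimal, expressions are assumed not to read variables.

def eliminate_dead_assignments(program):
    """Scan backwards, keeping only assignments whose value may be read."""
    live = set()          # variables whose current value may still be read
    result = []
    for stmt in reversed(program):
        if stmt[0] == "assign":
            _, var, _ = stmt
            if var in live:
                result.append(stmt)   # value is read later: keep it
                live.discard(var)     # earlier writes to var are now dead
            # else: overwritten before any read, e.g. x:=1 before x:=0
        else:                         # ("use", var)
            result.append(stmt)
            live.add(stmt[1])
    return list(reversed(result))

prog = [("assign", "x", "1"), ("assign", "x", "0"), ("use", "x")]
```

On the program above, the pass discards x:=1, exactly the reduction just described.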

So the aim became: Can we use game semantics to derive good algorithms in these areas?

In early 1997 I hence started a collaboration with an international expert in the field of static analysis, Chris Hankin, and developed game-semantics-based analysis algorithms targeting a typed higher-order object-oriented language, which was the target language of the TCOOL project (with S. Abramsky and C. Hankin as investigators and myself as co-investigator).

The first topic we investigated was control flow analysis. When considering programming languages with higher-order procedures, like LISP or ML (or, to some extent, object-oriented languages like JAVA), one of the biggest problems in devising program-transformation algorithms, to be used for example in compiler optimisers, is finding a way to follow the flow of the program, i.e., what follows what in the program. Typically, the problem of finding exactly all the procedures that a given procedure can call during evaluation is not easy; it is in fact non-recursive. Any recursive algorithm that uses knowledge about the flow of a program must therefore settle for a safe approximation of this flow information (a safe set here means a set containing all procedures that will actually be called). An example of this phenomenon can be seen even in a simple imperative language: when building the flow chart for a conditional, it is agreed that both branches of the conditional can be reached through the guard, because it is impossible to know which branch will actually be chosen without evaluating the program.
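The flow-chart treatment of conditionals just mentioned can be made concrete in a few lines. This is only an illustration, with a hypothetical string representation of statements:

```python
# Safe flow-chart edges for a conditional: the guard is connected to
# both branches, since we cannot know statically which will be taken.
# The string representation of statements is purely hypothetical.

def conditional_edges(guard, then_branch, else_branch, after):
    """Return flow-chart edges for `if guard then A else B; after`."""
    return [
        (guard, then_branch),   # safe: both branches assumed reachable
        (guard, else_branch),
        (then_branch, after),
        (else_branch, after),
    ]

edges = conditional_edges("x>0", "x:=1", "x:=2", "print x")
```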

A simple solution to the problem of flow analysis in the presence of higher-order procedures would be to say that a procedure can call every procedure in the program; this would be safe but not very useful. For example, we might have a procedure which is never called, and we would like (in most cases) to discard that code; this is impossible under the approach just described.

A good control flow analysis should hence be able to restrict the number of procedures that a procedure can call during evaluation to the smallest safe computable set.

It soon became clear that games could provide an original contribution to the problem. The root of this contribution can be found in a sentence of the original proposal: ``interpreting a program as an information particle travelling in a fixed structure (the code seen as a graph)'', which referred to the feature, common to games and the geometry of interaction, of considering the program as a graph and its evaluation as a path in this graph. A simple consequence of this property is that however long the evaluation process is, its length is due to the reiteration of paths in the code; hence if we bound the number of passages we allow through the edges of the graph, we obtain a path of length polynomial in the input, which should give a reasonable safe set for the analysis. Safeness follows from the fact that we visit at least once every node visited by the actual evaluation path. In this context, flow analysis amounts to detecting which nodes in the path represent procedures. This technique is the basis of the first work with Hankin, in which we define a control flow analyser for a PCF-like language [10].
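The bounded-passages idea can be illustrated as follows. This is a sketch of the principle only, not the actual analyser of [10]; the graph representation and all names are hypothetical.

```python
# Illustrative sketch: evaluation is a path in the program graph, and
# bounding the number of passages through each edge yields a traversal
# that still visits every node the real evaluation visits at least once,
# so the reported set of procedures is a safe over-approximation.

def bounded_flow_set(graph, start, procedures, bound=1):
    """graph: dict node -> list of successor nodes.
    Returns the procedure nodes reachable when each edge may be
    traversed at most `bound` times (loops are thus cut off)."""
    used = {}                      # edge -> number of traversals so far
    visited = set()
    stack = [start]
    while stack:
        node = stack.pop()
        visited.add(node)
        for succ in graph.get(node, []):
            edge = (node, succ)
            if used.get(edge, 0) < bound:
                used[edge] = used.get(edge, 0) + 1
                stack.append(succ)
    return visited & procedures

g = {"main": ["f", "g"], "f": ["h"], "g": [], "h": []}
```

Every procedure reachable from main is reported, and cyclic graphs terminate because each edge is crossed a bounded number of times.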

Two pleasing properties of this approach are that the analysis comes directly from the semantics and that its correctness holds ``by construction''. Moreover, the theoretical approach does not increase the complexity of the algorithm, which is shown to be cubic, i.e., optimal.

A subsequent investigation was then started, aiming at combining control flow analysis and data flow analysis for Idealised Algol, a higher-order language with imperative features which was also the reference language for the TCOOL project. Following Reynolds' work, we used an interpretation of class-based languages into Idealised Algol to show that our analysis could be used for (some) object-oriented features.

Our main aim was to define a notion of flow chart for higher-order languages generalising the usual one. This would make data flow analysis work on these extended structures much as it does for first-order imperative languages. To this aim we added to our graphs nodes for imperative constants such as if-then-else, assignment and local variables, and modelled their effect on a particle of information passing through them, following the game interpretation of those constants (using the fully abstract model of Idealised Algol [2]).

All along these works, the relation between the algorithmic constructions and game semantics was stated but not fully explained. Explaining it was the aim of the work presented at LICS 99 [8]. There it is shown that the graph over which the analysis of an Idealised Algol program is carried out can be seen as the image of a lax functor from the fully abstract game model to the category of non-deterministic games. What this functor does is, essentially, to identify different occurrences of the same move, and to identify all the ``answer'' moves. It is lax because the image of the composition of two strategies is larger than the composition of the images of the strategies: once we have identified different occurrences of the same move, we create new paths, for example one where a move that follows the second occurrence but not the first now also follows the first. E.g., from the sequence m.a.b.a.d.n we get both m.a.b.a.d.n and m.a.d.n.
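The effect of identifying the two occurrences of a in m.a.b.a.d.n can be reproduced mechanically. The following sketch (with an invented representation of sequences as lists of move names) collapses a sequence into a successor graph and enumerates the paths that use each edge at most once:

```python
# Illustration of why identifying occurrences of a move creates new
# paths: collapse a move sequence into a successor graph on move names,
# then enumerate paths from the first to the last move, crossing each
# edge at most once. The list-of-strings representation is invented.

def collapsed_paths(seq):
    """Return the dot-joined paths from seq[0] to seq[-1] in the
    collapsed graph, each edge traversed at most once per path."""
    succ = {}
    for a, b in zip(seq, seq[1:]):
        succ.setdefault(a, set()).add(b)
    target, results = seq[-1], []

    def walk(path, used):
        node = path[-1]
        if node == target:
            results.append(".".join(path))
            return
        for nxt in sorted(succ.get(node, ())):
            edge = (node, nxt)
            if edge not in used:
                walk(path + [nxt], used | {edge})

    walk([seq[0]], set())
    return sorted(results)
```

On m.a.b.a.d.n this yields both the original sequence and the new shortcut m.a.d.n: identifying the two occurrences of a is exactly what creates the extra path.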

Apart from being mathematically pleasing, this presentation in terms of a lax functor between two game categories plays a fundamental role in the theory because it implies the safeness of the analysis. Indeed, every sequence of moves in a strategy of the original model lies in the image of that strategy in the non-deterministic category, which means that if a procedure A is called by a procedure B, then a path from B to A will be present in the game-based analysis.

Following this work it seemed a good time to investigate the framework in a more practical sense, and hence an implementation was developed. The aim of the implementation is to take an object-oriented concurrent higher-order program and perform a control flow analysis, which is then displayed as a graph representing the dependencies between the various procedural parts of the program (in this broad framework, ``procedural'' means an object, a method, a procedure or a function). The core of this software tool, written in OCAML, is the algorithm described in [10]; however, several extensions were required to deal with features which our game analysis did not cover, i.e., higher-order references, objects (mainly the self parameter) and concurrency (in the form of multi-threading).

The program is now publicly available through the web [7], with examples and usage information. It is reasonable to think that, by providing this tool with a front-end translator from a widely used language like JAVA, and by adding a user-friendly interactive interface, it could evolve into an interesting product for the software engineering community.

A major benefit of this journey from purely theoretical work to an implementation has been the chance to better understand the different requirements that are relevant at different stages of the process. One place where I came across this kind of problem is the work on security [8]. Here, theoretical efforts are devoted to devising language features which do not allow breaches of security. Practitioners, however, have a more relaxed attitude: they consider security as a function of the effort that would be required to break into the system. A theory of quantitative program analysis, capable for example of extracting a quantity measuring the amount of interference between program variables (security leakage being a particular case of this setting), hence seems an exciting topic in the quest to reduce the gap between theory and practice, by providing practitioners with a precise way of extracting quantities which are so far mainly obtained via heuristics. I have hence started a collaboration with S. Hunt and D. Clark on this subject, and an opening paper laying the foundations of the theory and the analysis is now nearing completion [3].



Pasquale Malacaria
2001-04-24