Monday, January 07, 2008

The art of the science of software

Gregor Kiczales gave one of the keynotes at last year's OOPSLA (slides and audio). The abstract was promising, and it sounds like the talk was well received. Nonetheless, I think Kiczales is off target. But as the talk has a number of important themes running through it, and the subject matter is extremely pertinent to my own research interests, I will attempt some course correction, rather than just pick holes. As with my last post, I intend these remarks as friendly constructive criticism. The talk was in the exploratory spirit, after all, and it deserves a response in the same spirit. Unfortunately the various strands of the talk are woven together in ways that are both confusing, and confused.

Pluralism in software, pluralism in science

Kiczales' central observation is this: software abstractions are transient beasts, subservient to our equally transient purposes. Abstractions in a sense represent points of view, and thus flex and mutate as we shift our perspective. Philosophers sometimes call this kind of "perspective-friendly" meta-perspective pluralism. Pluralism acknowledges that there can be a plurality of distinct descriptions of a system - perhaps with radically different ontologies - without requiring that at most one of these descriptions is "correct". Put like this, pluralism sounds like plain common sense. And it seems inevitable that one day our languages and tools will reflect this pluralism. What's less clear of course is how painful the journey will be.

Kiczales also makes the interesting point that the shift towards perspectivalism in software in many respects mirrors a similar shift in how we think about natural systems. (At one point he asks the audience, leadingly, whether anyone believes in "scientific objectivity".) Here it seems that Kiczales has been heavily influenced by Brian Smith's book On the Origin of Objects. Smith was in the office next door to Kiczales for several years, and some of his ideas seem to have cross-pollinated. I'll come back to Smith's book shortly. But first let's consider this analogy between software systems and natural systems. (I apologise for the brief philosophical digression, but I think it's one of the threads of the talk which hits on an important point.)

Twentieth-century philosophy of science, certainly since the decline of logical empiricism, was dominated by various flavours of a metaphysical position called realism. The basic supposition of realism is that there exist theory-independent facts about the macroscopic organisation of the physical world, and that these facts determine whether a given successful theory is really true, or "merely" capable of accounting for all the empirical data one could, in principle, obtain. A realist might, for example, claim that there is an objective fact about exactly how many hairs there are on my head, and that this fact exists independently of - is ontologically prior to - any particular theory of what hairs are, or indeed what heads are, or what it is for a hair to be attached to a head. Some such commitment to an objective, theory-independent, "natural kind" ontology - what Nelson Goodman famously called a "ready-made world" - is the cornerstone of any realist world-view.

What's wrong with the realist picture is that there is something smelly about the idea of a fact which is in principle beyond the reach of empirical science. Scientific theories typically "parse" low-level ontologies into higher-level ontologies; these macro-ontologies are really nothing more than patterns swirling in the low-level structure, and the "truth", or otherwise, of such theories is, in scientific terms, fully exhausted by the empirical success of that theory. From science's point of view, "molecules" are just patterns in the quantum-mechanical substrate (or whatever) which satisfy a certain behavioural or structural description, and the extent to which the theory of molecules is more (or less) objectively true than any other theory is just the extent to which that theory successfully (or unsuccessfully) systematises the phenomena. There are no further scientific facts - no "trans-empirical" facts - which determine which theory is "the one true theory".

I can't hope to have done justice to this topic in that one paragraph, even if I knew enough about the subject to give it proper treatment (and I don't). But my aim here is only to concede to Kiczales and Smith what I think is fair concession: that any plausible alternative to realism must be pluralistic. It must allow for there to be multiple descriptions of the same natural system - perhaps with radically differing ontologies - without imposing the requirement that at most one of them is "correct". A theory of quantum gravity, if we ever find one, will not reveal General Relativity to have been "false" - but mysteriously successful - all along. It will just be a better theory.

So let us grant the point that realism is a metaphysical red herring. And I think we can also agree that the analogy with what's wrong with our traditional conception of software is compelling. We tend to think of there as being a unique objective fact about what a piece of software "does" - a unique theory of its behaviour. Our awareness of the existence of some underlying source code tends to fuel this intuition. But really we need to be much more pluralistic, and accept that what a piece of software "does" inescapably depends on your point of view. A security engineer might have a completely different view of a system than an end user. Each end user probably has a different view than other users, in as much as she can't see what other users are doing. Reports or audits produced for management are really nothing more than abstractions of how the system behaves. Even a bug-fix, without too much of a stretch of the imagination, is just a view of an erroneous program that applies a correcting delta to its behaviour. And many of these views and perspectives aren't just design-time artifacts, but are live perspectives on a running program. This pluralistic way of thinking about software is even more dynamic and fluid than fluid AOP: let's call it superfluid AOP.

An exciting possibility, then, is that fixing our philosophy of the natural world and learning how to think properly about software might end up dovetailing rather nicely. God's not a mathematician, he's a programmer, right?

Deconvolve this

So far so good. But where Kiczales' talk goes awry is in its leap from pluralism, to the ushering in of a new era of "formality-free computing". In this fluffy new future, we will sit around engaging in social negotiation and situated action, interactions which will somehow manage not to be "formal all the way down". Unfortunately, these are just Smith's bad memes at play. I guess I can see how the kind of pluralism just discussed might be innocently mistaken for post-modernism of the sort offered by Smith in his cryptic book, but it's a serious mistake. All I know is that if there is a place for post-modernist, lit-crit, social constructivist thinking in the modern world, it's nowhere near the field of computing.

The following excerpt from the Amazon "review" of Smith's book (presumably written by his publisher) captures the sickly flavour of Smith's vision:
Critics of programming practice have compared it to alchemy and Smith recalls the characterisation of Newton as the last of the magicians. Is this a pre-Newtonian phase, lacking "Laws", awaiting the differential calculus? Another position is suggested:

"... that we are post-Newtonian, in the sense of being inappropriately wedded to a particular reductionist form of scientism, inapplicable to so rich an intentional phenomenon. Another generation of scientists may be the last thing we need. Maybe, instead, we need a new generation of magicians". [p362]

Magician? Magus? Seeking the secret of how it is we "deconvolve the deixis" - plus ça change, plus c'est la même chose. The Alchemist: not a charlatan, but one possessed of much empirical wisdom stumbling after the scheme of things; as this new Science of the Artificial must do, self constructed, self referential, post-post-modern, a metaphysics for the 21st century.
I'm sorry, what?? When exactly did Gary Gygax get together with Jacques Derrida? It's somewhere between uninformative and downright misleading to attach significance to the idea that software is intentional (in the philosophical sense originally popularised by Dennett, and somewhat misappropriated by Smith). We can, too, skip gaily past Smith's notions of registration and zest with no fear that we're missing any useful insights. Like it or not, the bedrock of computing is the scientific world view, and Smith's anti-scientistic stance and vaguely Continental-style philosophy are about as compatible with this world view as creationism. Indeed, with the situation arguably inverting - and computation gradually becoming the conceptual foundation upon which science is built - it is even more important that we keep computing free of this kind of pretentious twaddle. It matters too much.

And while it may be true that interfaces are, unsurprisingly, often socially negotiated, we must be careful what we infer from this. So is the spelling of identifiers, the pattern of whitespace in a source file, the arrangement of plant pots in an office, after all. What we must cleanly demarcate are the forces that define a particular technical problem, and any particular solution to that problem. The problem that Kiczales has quite rightly identified is just this: abstractions are essentially dynamic and context-sensitive. There is no unique "correct" ontology for any man-made system, any more than there is a unique correct theory of any natural system. And one of the key forces that happens to drive this dynamism and context-sensitivity - but only one among many - is social interaction. ("One man's constant is another man's variable", as Alan Perlis nicely put it.) But it is a mistake to think that any observations about computing as a social activity offer insight into potential solutions to this problem.

Formality all the way down

This leads us to the final Smithesque strand we need to extract from Kiczales' talk and lay to one side. We are all familiar with the observation that simple interactions between parts often give rise to "emergent" phenomena, behaviours that are somehow novel or surprising, such as the macroscopic behaviour of ant colonies or eBay shoppers, but which are not in any way mystical or magical. As Figure 1 attempts to show, emergent behaviours are in a sense dual to the requirements on a solution. Requirements are known and obligate the system in certain ways, whereas emergent behaviours ("emergents", one could call them) are those which are permitted by the system, but which were not known a priori.

Figure 1: Required behaviours vs. emergent behaviours


Emergence is an important topic. But again, we must be careful not to make the leap from the uncontroversial phenomenon of emergence, to the highly controversial idea that reality (and by analogy software) might not be "formal all the way down", as Kiczales, following Smith, suggests. Formal all the way down is exactly what reality is. What else could it possibly be?

Smith's new-age version of emergentism is just an invalid inference from the failure of the reductionist programme in science. In the 1960s, many scientists, as well as philosophers such as Ernest Nagel, were optimistic that we would eventually be able to deductively derive all of science from fundamental physics, by establishing the right "bridge laws" between theories. Half a century later, this optimism looks naive. There has been only limited success, for example, in deriving much of chemistry from quantum mechanics on a "first principles" basis.

But the failure of this kind of reductionist programme does not mean giving up on formalism. We simply need a more mature perspective on the relationship between two theories, perhaps seeing the relationship as closer to one of computational abstraction than one of deductive derivation. During the Q&A session after the talk, someone asked whether "biology rather than formalism" was a better model of software. Yet much of the recent success in biology has come from developing new technical perspectives and formalisms. Witness the fast-evolving population of process calculi, equipped with ambients, stochasticity, branes, and whatnot, which are fast becoming mainstream tools in systems biology. Biological reductionism may no longer be plausible, but there is no inference to the inadequacy of formalism.

Once we concede this, then as with social negotiation, we can see that emergence is only indirectly related to the technical problem of enabling "perspectival programming". We don't need to design for emergence; what we mean by "emergent" is, after all, just that which doesn't come built-in. There are no insights we can export from emergence itself to the foundations of computing. Emergence comes about from the way we use things, the way things contingently interact, not from the mechanisms of interaction per se, which can and indeed can only be strictly formal.

The technical challenge: a new paradigm for interactive computing

So at last, I think we can distill the central challenge lurking at the heart of Kiczales' talk. How do we expect to realise the task-centric, perspectival model of programming that we know is coming? If abstractions indeed need exist only in the service of specific interactions the programmer or user has with the program, then in the future we may be abstracting and unabstracting as frequently as we switch between edit buffers today. In their various ways, systems like Mylyn, fluid AOP, and Subtext offer a glimpse of what this world might look like (although Subtext is the only one of these that offers a glimpse of just how fluid the new paradigm might be). But do we have the technical maturity to realise this superfluid, aspects-on-steroids vision?

I suspect that Kiczales would agree that the answer is no. We simply lack a compelling paradigm for building robust interactive systems. But contra Kiczales, and as I argued in my last post, working out this new paradigm will require us to embrace the formal, not reject it. The answer is not going to be to make things less effective (in his semi-technical sense of the word - roughly synonymous with "executable", I think, or perhaps "live" or "connected") but precisely the opposite. My own suspicion is that to make this sort of fancy stuff work properly, we will ultimately need a paradigm where software components are intrinsically, persistently and bidirectionally connected, and where interactive computation is the automatic and incremental synchronisation of distributed state. I've talked about this before, and will hopefully have more to say about it in the future, but for now I only wish to suggest that this, or something like it, is where we should focus our attention. It's where the real challenge lies.

Conclusion: less pop, more sci

To sum up, I sincerely doubt that there is an impending "post-formalist" reconstruction of the foundations of computing. If we want things to be fluffy, they're damn well going to have to be fluffy in some kind of technical, mathematically robust sense, not in some...well, fluffy sense. We should embrace pluralism and perspectivalism, both in science and computing, but not at the price of sloppy pop sci or new-age philosophy. The member of the audience who wondered whether Kiczales' "radical thesis" had more in common with quantum mechanics than classical mechanics should be sent to bed without any tea.

What this kind of question, the popularity of Smith's book, and to a lesser degree, Kiczales' talk, ultimately brings home is perhaps this: that if it's socially negotiated artifacts we're after, we need look no further than the world of technical ideas, mediated by the situated action of conference attendance. Smart people often believe strange things; maybe that's the object lesson.


Monday, October 01, 2007

Sweet Home West Midlands

Not my actual home, obviously. I'll still be living in Bristol. But Birmingham is going to be my spiritual home: where I go when I lose the faith and stop believing in life after Dynamic Aspects. My first day left me feeling refreshingly (if inaccurately) only mildly stupid. My supervisor Paul Levy seems to be an all round nice bloke; handy, as we've got to hang out for the next three years.

I'm currently learning modal logic. It's turning out to be awesome. I'm hoping, sometime (oh, in the next hundred years or so) to be able to use it to formalise the possible worlds view of interactive systems which I've been informally mincing about with for the last 6 years or so. The basic idea of modal logic is that a modal formula is a little automaton that can wander about in a structure of worlds which are mutually accessible through various relations, taking on different values at different worlds. Different access relations between worlds induce different kinds of structural properties on the space of worlds (or contexts, states, instants or points, depending on how you want to construe the relata).

Any kind of labelled transition system can be talked about modally, but so can many other things, such as time. A simple example: defining two modalities F and P (future and past), plus their respective duals G and H, such that F and P are inverses means that theorems such as

p -> GPp


and

p -> HFp


are universally true of such systems. (Respectively: if something is true now then at all future times it will have been true in the past; if something is true now, then at all past times it was going to be true in the future.) Temporal modal logic is obviously relevant to interaction [Pnu92], but modality in general seems to offer a nice framework for thinking about lots and lots of things: state, control, and dynamic scoping, to mention some existing applications, but also incremental computation, transactional concurrency, declarative concurrency, mobile code; the list goes on. And on.
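To make the two theorems concrete, here is a minimal sketch that checks them by brute force on one small frame. The four-instant frame, the relation names and the helper functions are all my own illustrative assumptions; the only thing taken from the text is that the accessibility relation for P is the converse of the one for F.

```python
from itertools import product

# Hypothetical finite frame: four instants ordered by "strictly later than".
WORLDS = [0, 1, 2, 3]
FUTURE = {(w, v) for w in WORLDS for v in WORLDS if w < v}
# P is the inverse of F, so its accessibility relation is the converse of FUTURE.
PAST = {(v, w) for (w, v) in FUTURE}


def diamond(rel, prop):
    """Worlds from which SOME rel-accessible world satisfies prop (F or P)."""
    return {w for w in WORLDS if any((w, v) in rel and v in prop for v in WORLDS)}


def box(rel, prop):
    """Worlds from which EVERY rel-accessible world satisfies prop (G or H)."""
    return {w for w in WORLDS if all(v in prop for v in WORLDS if (w, v) in rel)}


def all_valuations():
    """Every subset of WORLDS, i.e. every possible truth assignment for p."""
    for bits in product([False, True], repeat=len(WORLDS)):
        yield {w for w, b in zip(WORLDS, bits) if b}


def valid_p_implies_GPp():
    # p -> GPp holds if every p-world is also a GPp-world, for every valuation.
    return all(p <= box(FUTURE, diamond(PAST, p)) for p in all_valuations())


def valid_p_implies_HFp():
    return all(p <= box(PAST, diamond(FUTURE, p)) for p in all_valuations())


def valid_p_implies_GFp():
    # Not a theorem: included only as a contrast case.
    return all(p <= box(FUTURE, diamond(FUTURE, p)) for p in all_valuations())
```

On this frame the first two checks succeed for every valuation of p, while the contrast formula p -> GFp fails (take p true only at the first instant). A check on one finite frame is evidence rather than proof, of course; the general argument is the one given in the parenthesis above.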

That's the theory side of things, for now. On the practical front, I'm trying to understand monads and monad transformers well enough to implement a backtracking interpreter in Haskell that addresses the famously poor interaction between a normal-order or lazy reduction strategy and function caching or memoisation. The problem is simple: to be any good as an incrementalisation technique, memoisation needs to know the values of arguments to functions up front. But normal-order reduction defers the evaluation of arguments until they're needed. A prima facie conflict of interests.

I have a vague notion of a solution, an idea gleaned from Avi Pfeffer's work on IBAL [Pfe06], but much more investigation is required. The nice thing about implementing the interpreter using monads is that it should transfer easily into a modal language, as it turns out monads themselves are closely related to modal logic [Kob97][Pfe01].

(Actually call-by-need - the technique usually used to implement call-by-name - uses something like memoisation to avoid evaluating arguments more than once. But call-by-need is still a just-in-time evaluation strategy. What I'm looking for is something with the useful termination characteristics of call-by-name, for unneeded arguments, but the eagerness, for needed arguments, of call-by-value, without requiring a full global strictness analysis.)
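The tension can be sketched in a few lines of Python, modelling call-by-need with explicit thunks. This is an illustration of the problem only, not of any solution, and the names (Thunk, memoise_by_value, first_doubled) are hypothetical ones introduced for the example.

```python
class Thunk:
    """A suspended computation, evaluated at most once (call-by-need)."""

    def __init__(self, compute):
        self._compute = compute
        self._forced = False
        self._value = None

    def force(self):
        if not self._forced:
            self._value = self._compute()
            self._compute = None  # drop the closure once it has been run
            self._forced = True
        return self._value


evaluations = []  # log of which arguments were actually evaluated


def expensive(n):
    evaluations.append(n)
    return n * n


def first_doubled(x, y):
    """Needs only its first argument; under lazy evaluation y is never forced."""
    return x.force() + x.force()


def memoise_by_value(fn):
    """Naive function cache keyed on argument VALUES."""
    cache = {}

    def wrapper(*thunks):
        # Building the key forces every argument, whether fn needs it or not.
        key = tuple(t.force() for t in thunks)
        if key not in cache:
            cache[key] = fn(*thunks)
        return cache[key]

    return wrapper


# Plain call-by-need: the unneeded argument is never evaluated.
lazy_result = first_doubled(Thunk(lambda: expensive(3)), Thunk(lambda: expensive(1000)))
after_lazy = list(evaluations)

# Memoised call: the cache lookup forces both arguments up front.
memo_first_doubled = memoise_by_value(first_doubled)
memo_result = memo_first_doubled(Thunk(lambda: expensive(3)), Thunk(lambda: expensive(1000)))
after_memo = list(evaluations)
```

After the plain call only the argument 3 has been evaluated, and only once, despite being used twice. After the memoised call the argument 1000 has been evaluated as well, purely so that the cache could compute its key. If that argument had been a diverging computation, the memoised version would never have returned.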


Friday, March 02, 2007

FInCo 2007

Here is the draft of a FInCo 2007 paper which elaborates (although still entirely informally) the programming model called declarative interaction which I mentioned in my first posting. FInCo is one of the satellite workshops of ETAPS 2007.

The review comments were useful, and reasonably positive. I'm looking forward to some interesting discussions. I conspicuously failed to give even a passing nod to agent-oriented programming; given the likely focus at the workshop, this is something I need to learn about.


Tuesday, January 02, 2007

Programming languages for interactive computing

Almost all software systems today are interactive, in that they maintain ongoing interactions with their environments, rather than simply producing a result on termination [GSW06]. Indeed, a consistent trend since the beginning of the digital era has been towards increasingly interactive systems. The trend has progressed on at least two fronts: enhancements to end-user interactivity, and increasingly integrated systems. The trend began with the first teletype and textual interfaces and continued through early GUIs and LAN-based operating systems. It continues with today's 3D virtual worlds and applications deployed over the wide-area network.

With the Internet emerging as the "global operating system", the pressure on our software to be interactive is greater than ever before. Consider how the following requirements can be understood in terms of enhanced interactivity:

  • Ability to reconfigure or repair applications without taking them offline → interaction with code as well as data
  • Long-running, continuously-available applications → interaction must be robust
  • Sessions resumable from wherever we happen to be located → persistence of interactions
  • Transparent recovery from latency problems and intermittent connectivity → interaction should not be semantically sensitive to the network
  • Mobile code whose behaviour depends on execution context → dynamically scoped interactions

A variety of process algebras and other formalisms have been developed for modelling and reasoning about interactive systems. Yet despite the trend towards greater interactivity, we continue to lack a simple and coherent paradigm for building robust interactive systems. The main bugbear that has faced us has been what we might characterise as an "impedance mismatch" between traditional algorithmic programming languages and the way interactive systems abstractly work. Whereas an algorithmic language treats a program as a black box which produces a final value on termination, an interactive system allows other systems to observe and influence its behaviour as it executes, and must adjust its internal state in response to each interaction to maintain the consistency of the computation.

In current desktop systems, the mismatch is usually resolved by representing the state of the system as a set of mutable stores and then employing a notification scheme to maintain the consistency of the state. Rather than the host language being used to execute a single sequential program to termination, it is employed to execute fragments of imperative code as interactions occur. Each executed fragment must produce exactly the side-effects required to synchronise the state of the system correctly.
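As a rough illustration of the scheme just described, here is a minimal sketch of mutable stores kept consistent by notification. The Store class and the price/quantity/total example are assumptions of mine, chosen only to show the shape of the approach.

```python
class Store:
    """A mutable cell that notifies registered listeners whenever it changes."""

    def __init__(self, value):
        self._value = value
        self._listeners = []

    @property
    def value(self):
        return self._value

    def subscribe(self, listener):
        self._listeners.append(listener)

    def set(self, value):
        self._value = value
        # Copy the list so a listener may subscribe or unsubscribe safely.
        for listener in list(self._listeners):
            listener(value)


price = Store(10)
quantity = Store(2)
total = Store(price.value * quantity.value)


def recompute_total(_changed_value):
    """Imperative fragment run on each interaction to resynchronise `total`."""
    total.set(price.value * quantity.value)


# The intended invariant (total = price * quantity) is never stated anywhere;
# it holds only because these two subscriptions happen to be in place.
price.subscribe(recompute_total)
quantity.subscribe(recompute_total)

quantity.set(5)
total_after_quantity = total.value

price.set(3)
total_after_price = total.value
```

The thing to notice is that the relationship between the three stores is declarative in intent but is maintained entirely by side effects. Forget one subscription, or write to `total` directly, and the state silently drifts out of sync.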

Unfortunately this near-universal reliance on the imperative to support interaction has come at an enormous cost in complexity. It will be useful here to recall Brooks' distinction between essential and accidental (or inessential) complexity [Bro87]. Complexity inherent in the problem itself (from a user's perspective), or which can be attributed to human factors, is essential; what remains is accidental or inessential. Interactive systems as currently implemented are dominated by inessential complexity, the main culprits being this ad hoc management of state, explicit concern with flow of control, and unnecessary use of non-deterministic concurrency. Imperative languages (ab)used in this way are the "Turing tar-pit in which everything is possible but nothing of interest is easy" [Per82]. Web applications only complicate things further, by adding a variety of ill-defined staging policies and interaction models into the mix.

Ironically, what unifies much of this inessential imperative complexity is that it exists in the service of essentially declarative ends. Indeed, I suggest that the reason the current paradigm works at all is that systems implemented in this way approximate a much simpler, declarative model of interactive computation. Experienced software designers rely tacitly on something like this declarative model in order to make decisions about what counts as reasonable behaviour. Rather than being ill-suited to interaction, as is sometimes assumed, perhaps because of the somewhat arcane feel of techniques such as monads [JW93][Wad99] for modelling effectful computation, the declarative paradigm - with a simple orthogonal extension to support interactivity - fits the abstract behaviour of interactive, stateful systems surprisingly well. But today, because this declarative behaviour is achieved only indirectly, interactive systems are significantly less reliable and understandable than they need to be, despite the widespread use of design patterns such as Observer [GHJV95] to manage some of the inessential complexity. State-related dysfunctions in particular, such as the gradual degradation of a system over time, spurious sensitivities to the order in which two operations take place, deadlocks and race conditions, are common. I propose that the best way to address the impedance mismatch between algorithmic languages and interactive systems is not to wallow in the tar-pit of imperative languages, but to lift declarative languages into a computational model which supports interaction directly.

The aim of this blog is to explore the conceptual foundations of such a programming model, which I shall refer to as the declarative interaction model. Developing a formal semantics will be a topic for later treatment. Like some other declarative approaches to interaction, the declarative interaction model maintains a clean separation of (stateless, deterministic) declarative computation and (stateful, non-deterministic) reactive computation. The model is distinguished by its modal construal of interaction, whereby an interactive system is taken to be a space of canonically represented "possible worlds", each expressing a purely declarative computation, along with an index into that space indicating the "actual world", which represents the system's current state. To interact with such a system is to select a new actual state from this space of possible states. Interactions (effectful operations) are lifted to meta-programs that manipulate values of modal type; crucially, they are unable to interfere with the purely declarative semantics of each possible state, and in any case are only required when they are essential to application logic. Non-determinism arises from concurrent interactions, which are handled transactionally.
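A deliberately crude sketch of the shape of this idea follows, assuming a world can be indexed by a simple tuple. It leaves out modal types, canonical representation, incrementality and transactional concurrency entirely, and every name in it (World, System, interact, total) is hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Callable, Tuple

# Hypothetical world index: a (price, quantity) pair.
World = Tuple[int, int]


def total(world: World) -> int:
    """A purely declarative computation, evaluated at one possible world."""
    price, quantity = world
    return price * quantity


@dataclass(frozen=True)
class System:
    compute: Callable[[World], int]  # the same pure computation at every world
    actual: World                    # index of the "actual world"

    def observe(self) -> int:
        return self.compute(self.actual)


def interact(system: System, select: Callable[[World], World]) -> System:
    """An interaction selects a new actual world; it cannot touch `compute`."""
    return replace(system, actual=select(system.actual))


s0 = System(total, (10, 2))
s1 = interact(s0, lambda world: (world[0], 5))  # change the quantity to 5
s2 = interact(s1, lambda world: (3, world[1]))  # change the price to 3
```

Observing s0, s1 and s2 yields 20, 50 and 15 respectively, and s0 itself is untouched by the later interactions. There is no notification code to get wrong, because the declarative computation is simply re-read at whichever world is currently actual. Doing that efficiently is where incrementality would have to come in.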

The declarative interaction model relates to many active research areas, including modal type systems, incremental computation, meta-programming, declarative concurrency, transactional concurrency, dataflow computing, interactive computing, and wide-area computing. As we shall see, four closely related concepts in particular are central to the model: modality, incrementality, concurrency, and persistence. A key validation of the model will be the development of an interactive programming language which supports the model directly.
