Why syntax won’t go away

Some time ago Jonathan Edwards wrote:

Let’s face it: syntax is a simple and intuitive way to represent structure. But a fundamental principle of Subtext is to avoid syntactically encoding programs so that their semantics are explicit and immediate. It should be possible to use syntactic techniques as a visualization in the UI instead of as a representation of the source.

Jonathan’s hope is that we will one day extricate ourselves from the muddy waters of syntax and float free in the semantic stratosphere. It’s hard to argue with the spirit of this hope, and his recent achievement with Subtext 2.0 is an exciting glimpse of the kinds of virtual realities for programming we might inhabit in 5 to 10 years’ time. (Let’s recap…he did all that thinking about schematic tables on paper – ok, electronic paper – and then just went ahead and implemented it, only to have it actually work…I’m jealous.)

The demo itself is a nice reminder of how we are still in the pre-dawn of the computing age, fumbling forlornly in the dark, largely ignorant of our own ignorance. (Paraphrasing Jonathan’s OOPSLA presentation: “In the beginning was the static void…”) Subtext is far from a fumbling exercise, though; or at least it’s clever fumbling, fumbling in the direction of the light switch. But Subtext is both nascent, and philosophically radical, and like most radical youth it’s still a little confused: not all of its ideas quite make sense yet. My own research interests are largely compatible with Jonathan’s, and I have a few thoughts on how Subtext might come of age…a little friendly constructive criticism (although I guess Jonathan would agree with much of what I have to say), but also some more general considerations for the long-term future of these sorts of highly visual, highly interactive languages.

So where is Jonathan trying to go with his idea of “explicit and immediate semantics”? Just what is a UI visualisation of a program, after all, if it is not a representation of “the source”? Conversely, what modern tools really use unstructured text – character sequences interspersed with line-terminators – as representations for programs? Compilers certainly don’t, and nor do any non-trivial editors. Most work with richer representations. (How successfully they do so is another question. The fact that they are often flaky and unrobust is perhaps what feeds our intuition that they are still just working with flat files “under the hood”.) Flat files endure as an interoperability format, but they are not how anything that does anything with programs represents programs. So the thought must be something deeper: that we should be able to somehow present the “semantics” of the program directly to the user – for example using direct links in place of textual names – reaping significant ergonomic benefits by so doing.

Appealing though this idea obviously is, it is problematic. The problem is not with the claim that there are significant ergonomic gains to be had from advances in UIs, but with the idea that non-textual UIs are doing something different – deeply different – from textual UIs. Because user interfaces are syntactic…and not just in some academic sense, but quite straightforwardly. They have a recursive, tree-like structure; they can be described inductively by a term algebra over a fixed set of constructors. They have semantically irrelevant flourishes, like rounded corners and drop shadows; like most textual syntaxes, in fact, with their curly braces and fancy keywords. UI’s must have syntax because that’s how they convey meaning: there’s no way to represent a set or a function (qua mathematical object), or whatever it is you take your program to mean, “directly”; you have to denote it in some way. (And unless you’re some kind of mathematical Platonist, perhaps there are only denotations. If you are you need help.)

I’m being pedantic, I admit, but I’m not just being pedantic. I think it’s crucial that we start to treat fancy UIs as fancy syntaxes, for the following reason. As UIs become more powerful and interactive – and as VPLs become more compelling – they need to become better behaved. We need to stop thinking of our programming environments as (metaphorically, as well as often quite literally) little “windows” through which we squint awkwardly at the “real” program underneath. Instead our UIs need to become our programs, and our programs our UIs. This is exactly the vision Jonathan has for Subtext, of course. But it can only be practically achieved if we understand UIs as bona fide syntactic structures in their own right, so that we can treat interaction (by end-users or other software processes) as the creation and manipulation of views and transformations of that structure. Think of those cool projections that approximate “method views” in Subtext: if they are to be any use for building real-world programs (whose [partial] meaning we need to be able to grasp from a [partial] visualisation), then these views need to both have a well-defined structure – yes, a syntax – and a well-defined interpretation. To borrow from Jonathan again: notation matters.

So here’s my main thesis: only if we take the syntactic nature of UIs seriously we can turn advanced browsing and editing features into the powerful multi-dimensional programming constructs that they should be. Such features become robust and well-defined transformations: sometimes preserving behaviour, sometimes changing behaviour in well-defined ways. In this hypothetical future, refactoring, browsing and AOP become closely interrelated: browsing becomes a kind of refactoring, where the “factors” are something like aspects. This may sound fanciful, but this is just what the Subtext 2.0 UI gives us a tantalising glimpse of. Refactoring is already an interactive activity, but is nothing like robust enough: for refactoring to come of age it needs to be as formally reliable as the kind of batch-mode transformations that a (well-behaved) optimising compiler might implement. The same applies to advanced browsing features, particularly if they support “putback” editing – otherwise we won’t be able rely on these views or trust edits mediated by them. And conversely, traditional AOP is a robust batch-mode transformation, but lacks interactivity: and so for AOP to come of age, we need to turn aspect-weaving into a live, interactive browsing activity – a UI, in other words, that allows programs to be woven and unwoven while they are being edited [HoK05].

So VPLs have the potential to turn UIs, in an important and I hope non-contrived sense, into rich and full-featured languages for meta-programming. This is potentially in tension with the fact that, to be practical to design, describe, and implement, programming languages need to be simple, or at least modular. Are schematic tables part of a programming language or part of a UI? Think about all the advanced features that schematic tables provide: rendering of predicates in Blake canonical form, enforcement of partioning. Are these features part of the Subtext language or the Subtext UI? And if this distinction is never to be made, how is anyone else to implement Subtext? Must we require bona fide implementations to be isomorphic to the full Subtext UI? (Must it be possible to encode the entire state of the Subtext UI in the state of the candidate implementation, and vice versa?) If we agree this is almost certainly an unreasonable burden to place on implementors, then it is not clear where to draw the line. The engineering risk to which Subtext is thus exposed is non-modularity: neatly sidestepping all the horrible complexity of maintaining a live mapping to textual syntax, only to gradually mire itself in a new kind of complexity that arises from pushing all those fancy views and direct manipulation features into a single “language”.

The way to avoid this potential tarpit is, I suggest, not to shun syntax, but to embrace it. Visual programming languages do not get rid of syntax; they enrich it and make it come alive. But to think of powerful UIs as syntaxes and interactions as syntactic transformations that satisfy certain semantic properties presents a significant architectural challenge. We need be able to layer views of top of views on top of views…so that it’s views all the way down, until we reach a “primitive” view. (In MVC language, we could just as well call them models: the point is that “model” and “view” are, as with “syntax” and “UI”, just relative notions.) The primitive view, or model, is simply the lowest level of syntax for the language, but it is not otherwise distinguished ontologically. These syntaxes are related by semantic functions (homomorphisms), and ideally those mappings would execute both automatically and incrementally, so that changes in one syntactic structure are interpreted as changes in any dependent syntactic structures. The biggest challenge facing this paradigm is perhaps that many of these semantic functions will need to be bidirectional – i.e. functional in both directions, like Pierce’s lenses [FGMP05] – so that we can edit at our chosen perspective…and it is this that really blurs the distinction between syntax and user interface from the end-user’s perspective, providing the rich, integrated experience required for a programming tool, without compromising the modularity of the language. In fact each tier, of a system so conceived, is a separate language and each homomorphism on languages is a denotational semantics which interprets the syntax of one language in the syntax of another. But it’s syntax all the way down. These languages are live, like visual programmings languages, but they need not be visual; my preferred term is interactive programming language (IPL).

So do schematic tables belong in a programming language or a UI? The question turns out to be ill-posed, because these are really relative terms. A similar analysis can be applied to Subtext’s proposed treatment of names (or lack thereof). Theoreticians are familiar with the formal redundancy of names (or at least of bound names): they prefer to work with terms “up to alpha conversion of bound variables”, or to rely on encodings like de Bruijn indices to sidestep name-related issues. But for programming purposes, does life without names get easier or harder? There are no doubt ergonomic benefits to be had from using direct links instead of names. But there are almost certainly ergonomic benefits associated with names too. Imagine a Subtext-like language with only direct links equipped with comment-like annotations. For any non-trivial programming exercise it would probably not be long before we started embedding naming conventions into comments – in other words treating them as structured, rather than flat. We might for example adopt an informal rule whereby no two sibling program elements can have the same comment, so that we can unambiguously find an element by finding its parent and then searching the comments of its children. We might then want to be able to “repoint a link” by overtyping on the comment. At other times, we might expect to be able to maintain the identity of the referent and update the text of the comment instead.

I don’t propose any resolution to the tension between names and links here, because I suspect the ultimate requirement is going to be to have characteristics of both. Methodologically, at any rate, it seems unwise to rule out forever the use of textual names on a priori grounds. So the risk for Subtext as it stands is that if its goal of emancipating us from the world of textual names turns out to be untenable for the kinds of reasons just mentioned, it may be forced to reintroduce names on the sly, by providing various naming-like features in its UI (and effectively embedding a complex language extension). It then faces the very “creeping complexity” dilemma that I suggested arises with schematic tables. On the one hand, if we are not careful to design the language to be layered and modular, implementations become unmanageably complex and lack clarity on exactly what they must provide. Yet if we accept the challenge of designing our language to be layered and modular, and propose language extensions to deal specifically with names – new syntaxes, plus semantics which map terms in our “nominal algebra” to direct links – then we still seem to be lacking a robust implementation paradigm. (And no, MVC doesn’t touch the sides.)

As Alan Perlis said, “There’s no such thing as a free variable.” Or as the counterpart of Fred Brooks probably said in a nearby possible world, no syntactic silver bullets. Maybe we’re forced to accept that there’s a significant bootstrapping issue here: to build mature IPLs, we will need powerful IPLs. Closing the bootstrap gap is one of the focuses of my own research, so maybe one day Jonathan and I will meet in the middle.

From the sublimity of Subtext to…the Charles Petzoldiness of Charles Petzold. I have the disturbing (but in a good way) feeling that this guy is completely serious; as a precautionary measure, however, I shall be taking his observation with a suitably large pinch of monosodium glutamate. Clearly this dude is a comic genius, and the joke is on me for not spotting it. He writes:

Here’s another example of perhaps the most notorious syntactical absurdity in all of traditional C# – the for loop:


for (i=0, j=0; i < 1000; i++)
if (IsPrime(i))
j++;

In CSAML, that jumble of symbols and semicolons is abandoned for a structural beauty that can almost induce the modern programmer to weep with joy:


<ForLoop>
<ForLoop.Initializer>
<StatementExpressionList>
<Assignment LValue="i">
<Assignment.Expression>
<Literal Type="{x:Type Int32}"
Value="0" />
</Assignment.Expression>
</Assignment>
</StatementExpressionList>
</ForLoop.Initializer>
<ForLoop.Condition>
<BooleanExpression>
<LessThanExpression LeftSide ="i">
<LessThanExpression.RightSide>
<Literal Type="{x:Type Int32}"
Value="1000" />
</LessThanExpression.RightSide>
</LessThanExpression>
</BooleanExpression>
</ForLoop.Condition>
<ForLoop.Iterator>
<StatementExpressionList>
<PreIncrementExpression Identifier="i" />
</StatementExpressionList>
</ForLoop.Iterator>
<ForLoop.EmbeddedStatement>
<IfStatement>
<IfStatement.Condition>
<BooleanExpression>
<InvocationExpression MemberAccess="IsPrime">
<InvocationExpression.ArgumentList>
<Variable Identifier="i" />
</InvocationExpression.ArgumentList>
</InvocationExpression>
<BooleanExpression>
</IfStatement.Condition>
<IfStatement.EmbeddedStatement>
<StatementList>
<PreIncrementExpression Identifier="j" />
</StatementList>
</IfStatement.EmbeddedStatement>
</IfStatement>
</ForLoop.EmbeddedStatement>
</ForLoop>

I’m not sure that weeping with joy is quite the response the CSAML code would elicit (but I think I’m in agreement that it would involve some kind of bodily function).

3 comments

  1. Tom says:

    I think I’m agreeing with you (aren’t I?) when I say that the distinction between “textual” and “visual” interfaces for programming is a disingenuous one. If cultural history has demonstrated anything it’s that text is remarkably good at encoding subtle, complex, interdependent ideas; the problem here is the expressivity of prose emerges through a complex interaction with the human brain, whereas Your Favourite IDE is just too damn stupid to do anything especially complex with something so (superficially!) unstructured. And yes, nobody wants to care about actual names, but only because they don’t know how much worse it is to have to care about de Bruijn indices.

    So I’m not at all convinced that “UI” (in the conventional absolute sense, not the well-observed subtle interpretation you’re proposing here) is the solution to any of these problems. If anyone disagrees, they’re welcome to attempt a high-fidelity encoding of your blog post in the medium of modern dance, which is self-evidently more expressive than boring flat text.

    Also, the date of Charles Petzold’s CSAML post rather gives the game away, doesn’t it?

  2. Roly Perera says:

    Excellent, so the joke was on me after all…I should congratulate Charles on the Petzoldiness of his high-quality deadpan. I console myself with the thought that there’s probably someone out there who likes programming in CSAML.

    Yes, I think we are in agreement. We don’t need to partition the world into the textual and the visual. We want to build systems where we work with syntax as tangible structure, which will require significant improvement in UIs, for sure, but mainly in the direction of closing the gap between UIs and “live domain languages”, making syntax come to life and be directly manipulable.

  3. Matt Hellige says:

    Great post, and thought-provoking. Since this conversation is already spread across several blogs, I thought I would keep the trend going and post my response here.