Alas, Concepts, We Hardly Knew Ye

22 July 2009 @ 09:19
 in Programming

While catching up on recent postings to Lambda the Ultimate, I was rather surprised to discover that last week concepts were, by a vote of the committee members, completely removed from the C++0x draft specification.

On Monday, July 13th 2009 Concepts were dramatically voted out of C++0x during the C++ standards committee meeting in Frankfurt. [...] When I first heard the news, I couldn’t believe it. Concepts were supposed to be the most important addition to core C++ since 1998. Tremendous efforts and discussions were dedicated to Concepts in every committee meeting during the past five years. After dozens of official papers, tutorials and articles — all of which were dedicated to presenting and evangelizing Concepts — it seemed very unlikely that just before the finishing line, Concepts would be dropped in a dramatic voting.

C++0x is the next version of the C++ language standard. It should more properly be called C++1x, since the chances of it being released within the year are like the chances that Microsoft would contribute code to the Linux kernel — er, bad example. Anyway, the point is that it’s not going to be done this year.

While I’ve been following the development of C++0x with interest, I haven’t been immersing myself in the implementation details. In essence, I want to know what’s coming up, but I’m not too interested in committing to memory the nuts and bolts of a specification that is likely to be revised several more times before it is finalized. I’m not totally abstaining from C++0x until it’s done. I do use some features where they are useful and already implemented, for example I am using unordered_map, the standard library’s version of a hash table, in the latest iteration of my work on the lottery problem. However, my understanding of the majority of C++0x features has been relatively, for lack of a better word, “conceptual”.

The layman’s understanding that I had of the C++0x concepts feature led me to react with shock to learning of it’s removal. It sounded incredibly useful, and I didn’t see what could be such a big problem that it would have to be ripped out entirely. That changed when I read more about concepts as they (were going to) exist in C++0x.

According to the InformIT article that I linked above, the straw that broke the camel’s back was a paper by Bjarne Stroustrup entitled “Simplifying the Use of Concepts”. This paper, ironically, was intended to champion the retention of the concepts feature, and to propose solutions for some of the feature’s outstanding problems. According to that article, this paper “scared folks.” That’s a description that’s hard to interpret until you read the paper itself, which if you have any interest in C++0x or any curiosity about why concepts were removed, I strongly suggest you do. Let me give my reaction to it.

First, some background. The current C++ standard library already has concepts, and uses them heavily. The most easily demonstrable example of concepts in the current C++ standard library is iterators. You have ForwardIterators, BidirectionaIterators, RandomAccessIterators, etc, etc. These each specify behavior semantics and usage syntax for different kinds of iteration. For example, a ForwardIterator must (among other things) support operator++ and be re-usable over the same collection. Yet if you go looking for a class called ForwardIterator to use as a superclass of your iterator, you will find nothing.

The fundamental difference between the concepts currently in the standard library and the concepts that were until recently part of the C++0x draft is that the compiler is totally ignorant of concepts as they exist in C++ today. Today, concepts exist only in the documentation for the standard library; the compiler is completely ignorant of them. The only ways for a programmer to be sure that he or she has conformed to a required concept is to either read the documentation, to read the implementation code and infer the concept requirements, or to guess what the concept requirements are and to refine that guess as a result of each compiler error that arises. C++0x concepts aimed to give a language-supported and compiler-understood mechanism for defining the requirements of a concept in code.

Years ago, my first reaction to concepts, both in C++98 and C++0x forms, was “that’s neat, but why don’t they just use interfaces?” After all, generics in Java accomplish this rather elegantly through the use of interfaces, wildcards, and wildcard binding. The answer is that interfaces are inherently an object-oriented construct, and C++ is not just an object-oriented programming language.

Sure, interface-based template restriction would work fine for classes, but not requiring the use of classes when it is not necessary is a fundamental part of the multi-paradigm nature of C++. In fact, even though Java is object-oriented, its interface-based generics implementation has an obvious and annoying artifact in the completely unintuitive rule that you cannot use fundamental types (such as int) for generic type parameters. In C++, aside from wanting to avoid Java-like autoboxing, it is essential that code like this be legal:

  1. int numbers[] = {42, 27, 117, 1024, 34};
  2. std::sort(numbers, numbers + sizeof(numbers));

This uses the templated std::sort function to sort a raw C-style array. The arguments to std::sort are RandomAccessIterators, which is a concept, not an interface. If RandomAccessIterator were an interface, then we would need either need to convert numbers into a random access container object (e.g. std::vector), or we would need to wrap the pointers that we passed to std::sort in iterator objects that implemented the RandomAccessIterator interface. Instead, since RandomAccessIterator is a concept, passing two int*s to std::sort works just fine with no behind-the-scenes magic, since pointers into an array support all the syntax that RandomAccessIterators must support, and they have equivalent semantics.

The big idea to take away from this set-up is that C++ templates differ from other similar systems (like Java generics) because they rely on structural typing, rather than nominal typing. That is, template arguments are validated based on whether or not they support the required syntax (operators and/or method calls). Not only is this different from other languages, it is radically different from other parts of C++, such as function arguments, which are validated based on whether or not they derive from the appropriate class. Because of this, the idea of adding formal concepts to C++0x was necessarily more than just a way to bring documentation into code that the compiler could understand and enforce. Concepts in C++0x were supposed to unify structural and nominal typing in C++.

Formal concepts are a way to give symbolic names to structural properties. Given this, a templated function or class can specify the symbolic name(s) for the collection of structural properties that it requires of its template argument(s). These names can then by looked up by the compiler, or the programmer, or the programmer’s IDE, to determine if a given type is appropriate as a template argument to that function or class. However, concepts are not nominal types, so there is no need for the type of the template argument to itself know anything about the concept that it is structurally conforming to.

The problem is that unifying the almost diametrically opposite notions of structural and nominal typing, of course, quite hard, which implies that formalized concepts are going to be quite difficult to define in just the right way.

What Bjarne addresses in his paper (linked above) is a series of problems that are caused by what the C++0x comittee had been referring to as “explicit concepts”. These are concept definitions which types must explicitly declare their support for, rather than implicitly supporting by means of providing the appropriate strucutral properties. An explicit concept is a nominal type, no two ways about it. It’s very much like an interface, except that there is a mechanism called a concept map which can be used to specify that a type supports the explict concept without requiring the type (which may not even be a class) to modify its own definition to declare support, as one would do when inheriting from an interface.

Concept maps have other uses, but the problem in using them in this particular way is that the extremely loose coupling between a concept map and the type that it is acting on leads to tricky problems of ambiguity and inconsistency. You can read Bjarne’s paper to understand this in more detail. Furthermore, writing concept maps is frankly a chore, and the more explicit concepts there are, the more often the average programmer is going to face the choice of put up with this annoyance or avoiding concepts altogether.

It becomes clear from the paper that there are several showstopper-level design bugs in the then-current C++0x draft specification of concepts. The core of Bjarne’s proposed solution is to eliminate explicit concepts altogether, and instead rely only on explicit or implicit relationships between concepts to resolve situations where wholly implicit concepts are inappropriate.

The example that Bjarne uses is the relationship between the concepts of RandomAccessIterator and a derivative concept called ContiguousIterator, which is a RandomAccessIterator guaranteed to use contiguously-allocated memory storing only plain data. They have exactly the same semantics, and so the compiler would have no way of knowing (without a concept map) whether or not a given iterator that matches these semantics actually conforms to the additional logical requirement of being a ContiguousIterator. The existing solution was to make them both explicit concepts, but this leads to a proliferation of concept maps to distinguish between the two concepts. Bjarne’s proposed fix is to make the derivation of ContiguousIterator from RandomAccessIterator explicit, with the effect that all types which match their common syntax are treated only as RandomAccessIterators unless there is a concept map stating otherwise.

While Bjarne’s proposed fix for this problem would work just fine, what I think is missed is this: Concepts as they were defined for C++0x were effective at capturing structural properties and assigning symbolic names to them. However, in a nominal type system, the class and interface hierarchy provides more than just symbolic names that capture a collection of method signatures. A nominal type represents a semantic contract that goes above and beyond the structure of the type. For example, a Shape class and a Cowboy, may both have a draw() method, but the two methods have completely different semantics. Bjarne mentions this shortcoming and dismisses it:

[W]e have lived happily with class hierarchies and templates for decades without [this issue] reaching anyones top 100 list of traps and pitfalls, so it is not on my top 100 list of C++ problems needing solution. One reason “accidental match” hasn’t been a major problem is that we rely on simultaneous matches of both names and types. For example, only if the Cowboys draw() has the same argument and return types as Shapes draw() and the same holds for every other function in the concept/type can the problem slip past the compiler (to be caught be the simplest testing). Also, unrelated types tends to get muddled only when they are used for similarly named functions, so that overload resolution often catches the mistakes as ambiguities.

But the disturbing fact that remains is that without explicit concepts, formal concepts become unable to fully capture the notion of a semantic contract in a design-oriented manner. Explicit concepts provide a way to say “this concept has an important semantic contract”, whereas explicitly derived concepts provide only the much weaker and much more language-technical statement of “this concept’s semantic contract differs in some important way from that of its nominal parent”. That loss of expressiveness is what scared me when I read his paper. The two statements are similar from a language perspective (i.e. getting your program to compile), but from the more important design perspective (i.e. expressing the meaning of your code in your code) it doesn’t let you express things that you will very likely want to express.

Because of that deficiency, it really can’t be said that implicit concepts alone accomplish the goal of unifying structural and nominal typing. Bjarne’s “simplified” concepts are not a meaningful advancement over symbolic names for a collection of syntax requirements. Symbolic syntax naming is nice, and a mild improvement over the status quo, but it fails to fill the whole gap that concepts in C++0x were supposed to fill.

Furthermore, since they don’t fully capture semantic contracts, the explicit relationships between concepts that Bjarne proposes to avoid problems like the ContiguousIterator issue are basically just fix-ups for conflicts between concept rules and other existing C++ rules like overload resolution. This part of “simplified” concepts just doesn’t feel like an elegantly designed solution. Instead, it feels like what it is: an effort to save as much as possible of the original C++0x concept proposal while rather unceremoniously hacking off the parts of that design that led to the unacceptable explicit concept bugs.

Don’t get me wrong, I’m not defending explicit concepts. Aside from the bugs, I think they and their associated concept maps are a rather tedious way to express semantic contracts. And neither am I claiming to be smarter than Dr. Stroustrup or that I know a better way to express semantic contracts in concepts. What I mean to say is that in the end, I have to agree with the C++0x committee that concepts probably need some kind of fundamental redesign, and that a poorly-designed or incomplete concepts implementation which fails to truly unify structural and nominal typing would be markedly worse for the C++ language than maintaining, for the time-being at least, the status quo of having no formal concepts at all.

No Comments »

No comments yet.

RSS feed for comments on this post. TrackBack URL

Leave a comment