Category: Programming

Designing Painless Protocols

29 December 2009 by Tyler McHenry·2 Comments

Of the several programming jobs I’ve had in my (still relatively short) career, the one thing I’ve had to do the most frequently is implement networking protocols. I’ve implemented standard protocols defined by RFCs, I’ve implemented in-house proprietary protocols, and I’ve implemented experimental protocols for academic research. I’ve yet to be asked to design my own from the ground up, but I have developed a good feel, from an implementer’s perspective, of what works and what doesn’t when it comes to legible, extensible, and robust protocol design.

The point of this post is not to give a guideline for how to design protocols that work well, or are efficient. That is a topic of much larger scope, and if you’re interested in that, RFC 3117 is a good jumping-off point. Rather, this post aims to give a set of suggestions for how to design protocols that will be the least painful for yourself and other programmers to implement, debug, and apply. There is an underlying assumption here that you have already spent the time to decide what your protocol is trying to accomplish and that you have found a way to make it work well (assuming that it is implemented correctly).

Islanded in a Stream of Chars

11 August 2009 by Tyler McHenry·0 Comments

From the “things that really shouldn’t be difficult, but for some reason are anyway” department comes the following. Do you think you know how to program in C++? Familiar with objects and polymorphism and templates and everything? Then this should be dead easy. Should, I said.

Problem: Write a function that takes in a std::istream and a size n and returns a std::string. The string should contain the first n characters of the input stream, with all formatting (whitespace, newlines, etc) preserved.

You can ignore all concerns about multi-byte characters for the sake of this problem. Sounds simple, right? You’d be able to crank this out in ten seconds if someone asked you this in an interview, right? Okay, now try it with this caveat.

Caveat: You must do this in a purely C++ “style”. To be precise, you must do this without using any character variables or character arrays. Use only a std::string object (or some other memory-managed object in the standard library) as your input buffer.

For as much as the C++ STL tries to encourage you to use RAII-oriented containers instead of raw arrays, this seemingly trivial task requires some surprisingly baroque coding. If you want to test yourself, try writing the function before you click more.

Alas, Concepts, We Hardly Knew Ye

22 July 2009 by Tyler McHenry·0 Comments

While catching up on recent postings to Lambda the Ultimate, I was rather surprised to discover that last week concepts were, by a vote of the committee members, completely removed from the C++0x draft specification.

On Monday, July 13th 2009 Concepts were dramatically voted out of C++0x during the C++ standards committee meeting in Frankfurt. […] When I first heard the news, I couldn’t believe it. Concepts were supposed to be the most important addition to core C++ since 1998. Tremendous efforts and discussions were dedicated to Concepts in every committee meeting during the past five years. After dozens of official papers, tutorials and articles — all of which were dedicated to presenting and evangelizing Concepts — it seemed very unlikely that just before the finishing line, Concepts would be dropped in a dramatic voting.

C++0x is the next version of the C++ language standard. It should more properly be called C++1x, since the chances of it being released within the year are like the chances that Microsoft would contribute code to the Linux kernel — er, bad example. Anyway, the point is that it’s not going to be done this year.

While I’ve been following the development of C++0x with interest, I haven’t been immersing myself in the implementation details. In essence, I want to know what’s coming up, but I’m not too interested in committing to memory the nuts and bolts of a specification that is likely to be revised several more times before it is finalized. I’m not totally abstaining from C++0x until it’s done. I do use some features where they are useful and already implemented, for example I am using unordered_map, the standard library’s version of a hash table, in the latest iteration of my work on the lottery problem. However, my understanding of the majority of C++0x features has been relatively, for lack of a better word, “conceptual”.

The layman’s understanding that I had of the C++0x concepts feature led me to react with shock to learning of it’s removal. It sounded incredibly useful, and I didn’t see what could be such a big problem that it would have to be ripped out entirely. That changed when I read more about concepts as they (were going to) exist in C++0x.

Detecting C++ Libraries with Autotools

7 July 2009 by Tyler McHenry·3 Comments

Autotools, more properly known as the GNU Build System is a set of shell script-based utilities that automate parts of the configuration and compilation process for software that is distributed as source. The autotools are, in my opinion, a bit over-complex and fragile, but they remain the most portable and standardized way of allowing software to be compiled on UNIX-like (and especially Linux) systems.

One of the parts of autotools, called autoconf, is responsible for creating the “configure script” for a source package. If you’ve ever built a piece of free software from source, you have almost certainly encountered the following venerable triumvirate of commands:

./configure make sudo make install

That first command runs the script that is the output of autoconf, and it is responsible for detecting the capabilities and details of the system that the software is being compiled on. It, for example, will check that a sufficiently-recent compiler is available, and what system headers are necessary for various semi-standardized functions that the program uses. One other major function of the configure script is to detect whether or not libraries that the software depends on are available on the system.

The library detection mechanism, however, is biased heavily towards libraries written in C. Since C is of course the lingua franca of the UNIX world, this is understandable. Without cajoling, it really won’t detect libraries written in anything else, unless they have C bindings, and that includes C++. What then do you do if you have a C++ library that you wish to detect using autoconf? If you’re interested in the answer to this question, you probably already know what the problem is, and here I will give two workable solutions.

No Factories Necessary

12 June 2009 by Tyler McHenry·6 Comments

I thought of a quick addendum to Monday’s article on covariant, templatized copy constructors. The end of that article said

[T]here is a large caveat that you may have noticed. Due to the the inversion of the inheritance hierarchy around the cloning mixins, we have completely lost the ability to construct derived objects using anything but the default constructor. If you need to construct a Derived (which is really a Cloneable<Derived_>) and pass an argument to the Derived_ constructor, you cannot.

And then I go on to suggest using a factory pattern that is aware of the quirky type hierarchy described in that post. But in fact there is a relatively elegant way around this problem without using factories.

Covariant, Templatized Virtual Copy Constructors

8 June 2009 by Tyler McHenry·6 Comments

One problem that programmers new to C++ often run into when they begin to write non-trivial programs is that of slicing.

In object-oriented programming, a subclass often holds more information than its superclass. Thus, if we assign an instance of the subclass to a variable of type superclass, there is not enough space to store the extra information, and it is sliced off.

The simple way to avoid slicing is to avoid making copies. Simply pass polymorphic objects around by reference (or by pointer) rather than by value, and one never has to worry about slicing. But sometimes the need to copy is unavoidable. In particular, when the need arises to make a copy of a derived object and all you have is a base pointer, things get hairy. How do you know which derived class the pointer’s target really belongs to, in order to make a proper, deep, non-sliced copy?

Fortunately, this problem has a well-known solution pattern called the virtual copy constructor idiom. One implements a virtual clone() method that makes a proper deep, non-sliced copy. There’s nothing wrong with this idiom, but it does entail the inelegance of having to repeat near-identical code for the clone() method in each derived class.

C++ programmers encountering this pattern for the first time often get the bright idea that this code can be centralized through the use of templates. Doing this while maintaining all the advantages of a non-templatized virtual copy constructor turns out to be much trickier than it first appears. This post will explain the problem, and how to do something that I have not yet found described on the Internet: implement the virtual copy constructor idiom using templates without sacrificing covariance.

A Hierarchy of Errors

5 June 2009 by Tyler McHenry·0 Comments

In Monday’s post, “Out-of-Band Error Reporting Without Exceptions”, I described a method for elegantly obtaining three out of five major advantages of C++ exceptions without actually using them. Now I’ll tell you how to go about grabbing a fourth advantage: exception type hierarchies.

C++ exceptions are hierarchical, and their hierarchy is a type hierarchy such that a more-derived exception can be treated, if desired, as one of its base exception types. For example, a FileNotFoundException might be derived from FileIOException. Some code may choose to handle FileNotFoundExceptions distinctly from other FileIOExceptions. Other code might delegate all FileIOExceptions to a common handler. In C++ this is done with implicit typecasting in the catch statements in a try block.

Wouldn’t it be useful if we could do something similar with the ErrorReport and CheckedValue classes that were defined in Monday’s post, instead of just having an error/not-an-error indicator and a human-readable message? Well, we can. It’s easy to use but a little tricky to set up.

Out-of-Band Error Reporting Without Exceptions

1 June 2009 by Tyler McHenry·1 Comment

Let’s say, for the sake of argument, that you’re working in C++ and you happen to be in an environment where exceptions don’t work. In my case, it was an embedded environment where the miniaturized C++ standard library didn’t support them, but in your case it may be that you can’t use them for performance reasons, or they’re against policy or something else.

As opposed to the old C-style return codes, C++ exceptions offer five major advantages:

They are out of band (that is, a function returning an integer need not assign a special value, e.g. -1, to mean “error”).
They can carry information about the specific error that occurred, as opposed to simply a code with a general meaning.
They exist in a type hierarchy. A FileNotFoundException may be derived from a FileIOException, and a handler for the latter can handle the former.
If you ignore them, your program blows up. This is a good thing – an undetected error is far worse than a fatal error.
They automatically unwind the stack to find a handler.

And these are all reasons why, unless you have a good reason not to, you should prefer using exceptions. But in the case when you can’t, it turns out that point 5 is really the only advantage that you have to do without. There’s a relatively elegant method that you can use to achive the first four advantages without using exceptions at all.