Detecting C++ Libraries with Autotools

Autotools, more properly known as the GNU Build System, is a set of shell script-based utilities that automate parts of the configuration and compilation process for software that is distributed as source. The autotools are, in my opinion, a bit over-complex and fragile, but they remain the most portable and standardized way of allowing software to be compiled on UNIX-like (and especially Linux) systems.

One of the parts of autotools, called autoconf, is responsible for creating the “configure script” for a source package. If you’ve ever built a piece of free software from source, you have almost certainly encountered the following venerable triumvirate of commands:

./configure
make
sudo make install

That first command runs the script that is the output of autoconf, and it is responsible for detecting the capabilities and details of the system that the software is being compiled on. For example, it checks that a sufficiently recent compiler is available, and which system headers are needed for various semi-standardized functions that the program uses. One other major function of the configure script is to detect whether or not libraries that the software depends on are available on the system.

The library detection mechanism, however, is biased heavily towards libraries written in C. Since C is of course the lingua franca of the UNIX world, this is understandable. Without cajoling, it really won’t detect libraries written in anything else, unless they have C bindings, and that includes C++. What then do you do if you have a C++ library that you wish to detect using autoconf? If you’re interested in the answer to this question, you probably already know what the problem is, and here I will give two workable solutions.

When you want to detect a C library with autoconf, it’s easy enough. You invoke the AC_CHECK_LIB macro. According to its documentation, it works like this:

Macro: AC_CHECK_LIB (library, function, [action-if-found], [action-if-not-found], [other-libraries])

Test whether the library library is available by trying to link a test program that calls function function with the library. function should be a function provided by the library. Use the base name of the library; e.g., to check for -lmp, use `mp’ as the library argument.

The problem should jump out at you from this. AC_CHECK_LIB is going to create a test program. That test program will be in C, so the function you provide as the function argument had better be callable from C. Obviously, if your C++ library consists solely of classes, there’s nothing in it at all that is callable from C, since C doesn’t understand C++ classes. Worse yet, even if there are functions in your library that are not members of classes, these functions probably are still not callable from C.

The reason that even “free” (i.e. non class member) functions in C++ are not callable from C programs is name mangling. Since C++ was originally built to compile down to C (and later to compile on its own but still link as C), the developers of the C++ language had to devise a way to cram information about function-related C++ features that C did not support into function identifiers that C allowed. For example, C++ had to have some way to differentiate ClassA::foo() from ClassB::foo() when passing off object code to the linker, given that a C-oriented linker would have no concept of classes. Even outside of class members, C++ added support for function overloading, which was similarly alien to C linkers.

Early C++ compilers worked around these limitations of their linkers by encoding this information into a “mangled” function name. To swipe a good example from the Wikipedia article on name mangling:

Consider the following two definitions of f() in a C++ program:

int  f (void) { return 1; }
int  f (int)  { return 0; }
void g (void) { int i = f(), j = f(0); }

These are distinct functions, with no relation to each other apart from the name. If they were naively translated into C with no changes, the result would be an error — C does not permit two functions with the same name. The compiler therefore will encode the type information in the symbol name, the result being something resembling:

int  __f_v (void) { return 1; }
int  __f_i (int)  { return 0; }
void __g_v (void) { int i = __f_v(), j = __f_i(0); }

Notice that g() is mangled even though there is no conflict; name mangling applies to all symbols.

These days, C++ is its own language, but this mechanic remains as a vestige of its early days. Worse, C++ compilers are notoriously inconsistent about how names are mangled. There is no standardized conversion between a native C++ function identifier and its mangled equivalent.
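To make this concrete, here is a small sketch. The symbol names in the comments are what I would expect from g++ or clang on Linux, which follow the Itanium C++ ABI; other compilers produce different names. plain_c_name and check_overloads are hypothetical names I made up for the illustration.

```cpp
#include <cassert>

// Two overloads: the compiler must give each one a distinct symbol name.
int f(void) { return 1; }   // symbol under the Itanium ABI: _Z1fv
int f(int)  { return 0; }   // symbol under the Itanium ABI: _Z1fi

// C linkage switches mangling off, so the symbol is just "plain_c_name".
// The price: a second extern "C" function with this name is an error.
extern "C" int plain_c_name(void) { return 42; }

// Overload resolution picks the right mangled symbol at each call site.
void check_overloads() {
    assert(f() == 1);
    assert(f(0) == 0);
    assert(plain_c_name() == 42);
}
```

Compiling this with g++ -c and running nm on the object file should show those names among the symbols.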

So getting back to autotools, this leaves us with two problems when trying to detect a C++ library using AC_CHECK_LIB:

  1. C++ functions, even free functions, cannot be called from C using their C++ identifiers since their names (in the library binary) will be mangled from what they were in the C++ source code.
  2. Even if you knew what the mangled identifier produced by your compiler was, the mangling scheme is far from standardized, so checking for the mangled identifier is extremely non-portable, defeating the whole point of using autotools.

How then do we get around this? Here are two solutions:

Solution One: Find (or create) a function with C calling conventions in the library

This is by far the easiest method if you are also the developer of the library. It is possible (and easy) to create a free function in a C++ program that is not mangled. You do it like this:

extern "C" {
  void libfoo_is_present(void);
}

This declares a function called libfoo_is_present that has C-style linkage. That is, your C++ compiler will not mangle its name at all. The tradeoff is that a function with C linkage cannot be overloaded, cannot be a class member, and cannot be a template, although its body is still free to use any C++ feature. That’s fine though: all we want for this purpose is a single function with a reasonably distinctive name that we can search for in this library. Note that since autotools will try to link with the library using this function, you are going to want to give the function an implementation as well, which can be empty.
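A minimal sketch of how that might look inside the library’s own source follows. Everything named here other than libfoo_is_present is hypothetical (the foo namespace, the Widget class, check_libfoo); the marker function has an empty body because only its symbol matters.

```cpp
#include <cassert>

// Hypothetical C++-only API of libfoo: nothing in here is callable from C.
namespace foo {
class Widget {
public:
    explicit Widget(int size) : size_(size) {}
    int size() const { return size_; }
private:
    int size_;
};
}  // namespace foo

// Marker with C linkage. The definition (not only a declaration) must be
// compiled into the library, otherwise the configure-time link test fails.
extern "C" void libfoo_is_present(void) {}

// Quick self-check that both the C++ API and the marker are usable.
void check_libfoo() {
    foo::Widget w(3);
    assert(w.size() == 3);
    libfoo_is_present();   // does nothing; it exists only to be linked against
}
```

With that in place, a configure.ac could look for the library with something like AC_CHECK_LIB([foo], [libfoo_is_present]).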

If you’re not in control of the library sources, you might still be in luck. Some libraries have both C and C++ bindings. The C bindings are going to have C linkage, even if the library was built with a C++ compiler. Choose some function from the C bindings to search for with the AC_CHECK_LIB macro. Even if a library does not have explicit C bindings, you might still be able to find a function with C linkage hidden in the library. For this, you can use the nm tool (if you have a static library), or the readelf tool if you have a dynamic library. It should be fairly simple to tell C symbols from mangled C++ symbols. For example, using readelf on a C library produces

tyler@kusari /lib $ readelf --dynamic --symbols libncurses.so.5 | grep FUNC | tail
548: 000000000001e1d0 42 FUNC GLOBAL DEFAULT 11 slk_touch
549: 0000000000015ce0 15 FUNC GLOBAL DEFAULT 11 erase
550: 0000000000027ae0 120 FUNC GLOBAL DEFAULT 11 _nc_free_termtype
551: 000000000002c6f0 108 FUNC GLOBAL DEFAULT 11 _nc_outch
552: 000000000002ac50 1095 FUNC GLOBAL DEFAULT 11 _nc_tparm_analyze
553: 0000000000028940 149 FUNC GLOBAL DEFAULT 11 _nc_keypad
554: 0000000000014d70 15 FUNC GLOBAL DEFAULT 11 refresh
556: 0000000000014cf0 10 FUNC GLOBAL DEFAULT 11 touchline
558: 0000000000024c40 2 FUNC GLOBAL DEFAULT 11 _nc_freeall
559: 0000000000012d50 73 FUNC GLOBAL DEFAULT 11 beep

While using it on a C++ library produces

tyler@kusari /lib $ readelf --dynamic --symbols libboost_regex-gcc43-mt.so | grep FUNC | tail
1403: 00000000000a1f70 1498 FUNC WEAK DEFAULT 11 _ZN5boost9re_detail18basi
1404: 0000000000090cd0 62 FUNC WEAK DEFAULT 11 _ZN5boost9re_detail12perl
1405: 0000000000091f50 166 FUNC WEAK DEFAULT 11 _ZN5boost9re_detail19basi
1407: 0000000000072fe0 15 FUNC WEAK DEFAULT 11 _ZN5boost6detail17sp_coun
1408: 000000000008cbb0 381 FUNC WEAK DEFAULT 11 _ZN5boost9re_detail19basi
1409: 0000000000042320 75 FUNC WEAK DEFAULT 11 _ZN5boost9re_detail12perl
1411: 0000000000043410 15 FUNC WEAK DEFAULT 11 _ZN5boost6detail17sp_coun
1412: 0000000000083d80 239 FUNC WEAK DEFAULT 11 _ZN5boost9re_detail12perl
1414: 0000000000042680 19 FUNC WEAK DEFAULT 11 _ZN5boost9re_detail12perl
1415: 00000000000980a0 775 FUNC WEAK DEFAULT 11 _ZN5boost9re_detail12perl

The “_Z” prefix marks a symbol mangled by g++; the “N5boost” that follows encodes a nested name whose first component is the five-character namespace boost.

Not all C++ libraries will have C symbols, but if you find one, just use AC_CHECK_LIB as usual. As a real-world example, the Google Unit Testing Framework is a C++ library, but the gtest_main library that comes with it provides a main() function which always has C linkage. Therefore you can check for googletest like this:

AC_CHECK_LIB([gtest_main], [main],
  [HAVE_GTEST=1
   TEST_LIBS="$TEST_LIBS -lgtest_main"],
  [AC_MSG_WARN([libgtest is not installed.])])

This checks for googletest by way of checking for libgtest_main by trying to link against its main function. If the library is found, it sets HAVE_GTEST and adds -lgtest_main to the TEST_LIBS variable; if not, it emits a warning.

Solution Two: Roll your own test

If the library source is not in your control and it has no functions with C linkage, you still have options. Autoconf is quite extensible. Recall that the main issue causing this whole mess is that autoconf’s test programs are C programs. Well, that’s the default, but it doesn’t have to be that way. You have to do it by hand, but it is possible to make autoconf use a C++ test program by way of the AC_LANG_PROGRAM macro.

The semantics of this macro are a bit too lengthy for me to quote here, so just go read the documentation. In summary, this macro takes a bit of code, and wraps it in the usual boilerplate common to C and C++. That is, the code snippet you provide will be treated as the body of a main function. This provides a lot of flexibility. You’re no longer limited to trying to call a function. If your library has no free or static functions, that may not be possible. Instead, you might test the library by trying to instantiate one of its objects.

To use a similar real-world example to the previous solution, consider the Google C++ Mocking Framework, which contains no symbols with C linkage. Here’s how we would construct a test for gmock, starting with the call to AC_LANG_PROGRAM:

AC_LANG_PROGRAM([#include <gmock/gmock.h>], [testing::Cardinality dummy])

Simple enough. The prologue argument (the first argument) gives the include directive necessary for using the gmock library. The body argument gives a simple instantiation of a dummy object from the library. I chose the Cardinality class largely because it was default-constructible and therefore simple to use.
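To give a rough picture of the source file autoconf ends up compiling, here is a sketch with a stand-in in place of the real library. fakelib::Widget is hypothetical and defined inline so the snippet compiles by itself; in a real check the prologue would be the #include line, and the class would live in the library. I have also named the wrapper conftest_body instead of main so the snippet can sit next to other code.

```cpp
#include <cassert>

// Prologue: in a real check this would be `#include <gmock/gmock.h>`.
// fakelib::Widget is a hypothetical stand-in, not part of any real library.
namespace fakelib {
class Widget {
public:
    Widget() : value_(7) {}
    int value() const { return value_; }
private:
    int value_;
};
}  // namespace fakelib

// Autoconf wraps the body argument in main(), appends a lone ';' after it
// (which is why the body needs no trailing semicolon) and returns 0.
int conftest_body() {
    fakelib::Widget dummy   // the body argument, pasted in as-is
      ;
    assert(dummy.value() == 7);   // not part of the generated program
    return 0;
}
```

If the header cannot be found, the compile step fails; if a symbol the body needs is missing from the libraries on the link line, the link step fails. Either way the test reports the library as absent.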

To make AC_LANG_PROGRAM act like AC_CHECK_LIB, you need to wrap it in AC_LINK_IFELSE like so:

AC_LINK_IFELSE(
  [AC_LANG_PROGRAM([#include <gmock/gmock.h>],
    [testing::Cardinality dummy])],
  [TEST_LIBS="$TEST_LIBS -lgmock"
   HAVE_GMOCK=1],
  [AC_MSG_WARN([libgmock is not installed.])])

The AC_LINK_IFELSE macro simply gives us the ability to do something if the test passed, or if it failed, much like the final two arguments to AC_CHECK_LIB.

Finally, there are two details left out. First, the above AC_LINK_IFELSE macro doesn’t know to provide -lgmock to the linker. In order to do that, we have to add -lgmock to LDFLAGS, and remember to restore the original LDFLAGS afterwards.

Second, we need to tell autoconf to compile this test program with a C++ compiler, or else it’s going to fail for the same reason that using AC_CHECK_LIB would have. This is accomplished using the AC_LANG macro; the autoconf documentation for it also covers how to use multiple languages for test programs in the same configure script.

In total, the whole test for libgmock looks like this:

AC_LANG([C++])
SAVED_LDFLAGS=$LDFLAGS
LDFLAGS="$LDFLAGS -lgmock"
AC_LINK_IFELSE(
  [AC_LANG_PROGRAM([#include <gmock/gmock.h>],
    [testing::Cardinality dummy])],
  [TEST_LIBS="$TEST_LIBS -lgmock"
   HAVE_GMOCK=1],
  [AC_MSG_WARN([libgmock is not installed.])])
LDFLAGS=$SAVED_LDFLAGS

Tada!

This should, however, leave you with the same question I had when I went through the process of figuring this out. Namely: why in the hell does AC_CHECK_LIB not respect AC_LANG when creating its test programs? If you look at the config.log that the configure script generates, it always uses C programs and the C compiler to run AC_CHECK_LIB tests, even when the language has been set to C++. If this were not so, things would be much simpler, except in the case where a C++ library had absolutely no free or static functions to test for.

If anyone knows why autoconf behaves like that, please enlighten me (or better yet, explain how to make it behave otherwise!).



3 comments

  1. Thanks for the nice article. It helped me to modify AC_CHECK_LIB so that it works with a class:

    # GB_CHECK_LIB(LIBRARY, [PROLOGUE], [BODY],
    # [ACTION-IF-FOUND], [ACTION-IF-NOT-FOUND],
    # [OTHER-LIBRARIES])
    AC_DEFUN([GB_CHECK_LIB],
    [m4_ifval([$4], , [AH_CHECK_LIB([$1])])dnl
    AS_LITERAL_IF([$1],
    [AS_VAR_PUSHDEF([ac_Lib], [ac_cv_lib_$1_$3])],
    [AS_VAR_PUSHDEF([ac_Lib], [ac_cv_lib_$1''_$3])])dnl
    AC_CACHE_CHECK([for $3 in -l$1], [ac_Lib],
    [ac_check_lib_save_LIBS=$LIBS
    LIBS="-l$1 $6 $LIBS"
    AC_LINK_IFELSE([AC_LANG_PROGRAM([$2], [$3])],
    [AS_VAR_SET([ac_Lib], [yes])],
    [AS_VAR_SET([ac_Lib], [no])])
    LIBS=$ac_check_lib_save_LIBS])
    AS_IF([test AS_VAR_GET([ac_Lib]) = yes],
    [m4_default([$4], [AC_DEFINE_UNQUOTED(AS_TR_CPP(HAVE_LIB$1))
    LIBS="-l$1 $LIBS"
    ])],
    [$5])dnl
    AS_VAR_POPDEF([ac_Lib])dnl
    ])

    In your example you would call it like this:

    AC_LANG_PUSH([C++])
    GB_CHECK_LIB(gmock,
    [#include <gmock/gmock.h>],
    [testing::Cardinality dummy],
    [HAVE_GMOCK=1],
    [AC_MSG_WARN([libgmock is not installed.])])
    AC_LANG_POP([C++])

    This is based on AC_CHECK_LIB from autoconf 2.61 (and thus inherits its GPL license). Note that I could not observe the behaviour of AC_CHECK_LIB always using the C language you mentioned: AC_CHECK_LIB works fine with a C++ library which contains a free function in the global namespace if the current language is C++. The macro above is useful if the whole library is inside a namespace, or if it only contains classes.

  2. Thanks for the article: very clear and helpful. Just a note about your AC_LINK_IFELSE recipe: the -lgmock should actually be added to LIBS, not LDFLAGS. In many cases you will get away with modifying LDFLAGS, but I just encountered a problem with using libboost_filesystem where the resulting order of g++ command line arguments resulted in errors like “undefined reference to `boost::system::generic_category()”.

    Moving the “-lboost_filesystem -lboost_system” part to a temporary version of LIBS, and keeping the -L options in the temporary LDFLAGS sorted out the command line and everything worked.

    On the subject of temporary variables, I think this article on how AC_CHECK_LIB modifies the LIBS variable is also worth reading, since it seems to be a “secret” and undesirable behaviour in addition to just doing the test:
    http://blog.flameeyes.eu/2008/04/i-consider-ac_check_lib-harmful

    Andy
