Buildings plugins as Haskell shared libs

Thu, 21 May 2009 11:46:14 GMT, by duncan.
Filed under coding, industrial-haskell-group.

This post is a sneak preview about building Haskell shared libraries on Linux. We'll look at how to use ghc to make a standalone Haskell shared library that exports C functions. We could use this shared library as part of a bigger project (without having to use ghc for the final linking) or we could load it dynamically, e.g. as a plugin in some other program.

This work is being supported by the IHG and it builds on the hard work of several other people over the last few years (see the first post in this series for the history and credits)

Building GHC with shared libs support

For starters you need the latest development version of GHC. See these instructions on getting the sources and doing the configure, build and install steps.

The only non-standard thing you need to do is to use ./configure --enable-shared. Note that this has only been tested on Linux x86-64 and x86, though in the past, the shared lib support has also worked on Linux PPC and OSX PPC.

Currently what you get is a ghc that itself is statically linked but it can build programs and shared libraries that dynamically link against the runtime system and base libraries.

Building programs that use shared libs

For example, for "hello world":

$ ghc --make -dynamic Hello.hs

It is interesting to look at the output of the ldd program:

$ ldd ./Hello

I'll not paste the whole output, but here's a bit of it:

libHSbase-4.0.0.0-ghc6.11.so =>
  /opt/ghc/lib/ghc-6.11/base-4.0.0.0/libHSbase-4.0.0.0-ghc6.11.so
  (0x00007f8959aff000)

(I've simplified the ghc version slightly)

If you were to look at the full output what you would notice is that it links against each Haskell package as a separate .so file. What is more, it is able to find the shared libs even though they are not in a standard location like /usr/local/lib. This is because by default it is using the -rpath mechanism. It is also possible to build binaries in a mode that does not embed an rpath which might be more suitable for deployment.

Building shared libs

Suppose we have a module Foo.hs that uses the FFI to export a C function called foo():

module Foo where
import Foreign.C
foreign export ccall foo :: CInt -> CInt
foo :: CInt -> CInt
foo = ...

we can build it into a shared library:

$ ghc --make -dynamic -shared -fPIC Foo.hs -o libfoo.so

We need to use -dynamic, -shared and -fPIC. The -dynamic flag tells ghc at the compile step to produce code so that it can link dynamically to dependent packages. At the link step it tells ghc to actually link dynamically to dependent packages. The -shared flag tells ghc to link a shared library rather than a program. The -fPIC flag tells ghc to make code that is suitable to include into a shared library. If we were to break it down into separate compile and link steps then we would use:

$ ghc -dynamic -fPIC -c Foo.hs
$ ghc -dynamic -shared Foo.o Foo_stub.o -o libfoo.so

In principle you can use -shared without -dynamic in the link step. That would mean to statically link the rts all the base libraries into your new shared library. This would make a very big, but standalone shared library. However that would require all the static libraries to have been built with -fPIC so that the code is suitable to include into a shared library and we don't do that at the moment.

If we use ldd again to look at the libfoo.so that we've made we will notice that it is missing a dependency on the rts library. This is problem that we've yet to sort out, so for the moment we can just add the dependency ourselves:

$ ghc --make -dynamic -shared -fPIC Foo.hs -o libfoo.so \
  -lHSrts-ghc6.11 -optl-Wl,-rpath,/opt/ghc/lib/ghc-6.11/

The reason it's not linked in yet is because we need to be able to switch which version of the rts we're using without having to relink every library. For example we want to be able to switch between the debug, threaded and normal rts versions. It's quite possible to do this and it just needs a bit more rearranging in the build system to sort it out. Once it's done you'll even be able to switch rts at runtime, eg:

$ LD_PRELOAD=/opt/ghc/lib/ghc-6.11/libHSrts_debug-ghc6.11.so
$ ./Hello

Going back to our libfoo.so, now that it is linked against the rts it is completely standalone, we can link it into a C program using just gcc, or we can use dlopen() to load libfoo.so at runtime.

Assuming we've got libfoo.so in the current directory, we can link it into a C program:

$ gcc main.c -o main -lfoo -L.

If you use ldd now it'll tell you that libfoo.so is not found. Remember that the runtime linker doesn't look in the same places as the static linker. We told the static linker to look in the current directory with the flag -L.. For the dynamic linker we can either move our libfoo.so to /usr/local/lib or we can embed a path into the binary that tells the runtime linker where to look. One particularly neat way to do this is to tell it to look for the library not at an absolute path, but relative to the program itself:

$ gcc main.c -o main -lfoo -L. -Wl,-rpath,'$ORIGIN'

The Linux runtime linker understands the special variable $ORIGIN and interprets it as the location of the executable. This also works on Solaris. Windows and OS X have something similar. This makes it possible to distribute binaries along with shared libraries and have the whole lot fully relocatable.

If we want to load the library and call functions at runtime we would use C code like:

void *dl = dlopen("./libfoo.so", RTLD_LAZY);
int (*foo)(int a) = dlsym(dl, "foo");
printf("%d\n", foo(2500));

In this case we do not need to link our C program against libfoo.so (we just need -ldl for the dynamic linking functions like dlopen).

$ gcc main.c -o main -ldl

Now one thing to watch out for is that before you call any exported Haskell function, you have to start up the runtime system. If you just call foo() directly then it'll emit a helpful error message to remind you. We have to use the C API of the Haskell FFI to initialise the runtime system. This is a little tiresome. In our case it'll look like:

hs_init(&argc, &argv);
hs_add_root(__stginit_Foo);

The first line is specified by the Haskell FFI. The second is a GHC'ism. It initialises the module containing the function we're going to call.

If you're exporting a plugin API then hopefully the API will support some kind of plugin initialisation. In that case you can include the above C code to initialise the rts before any of the Haskell functions get called. We can do that by adding the above initialisation code into a C function and export that from our shared lib:

void init (void);
void init (void) { ... }

Then we would add init into our shared lib:

$ ghc -fPIC -c init.c
$ ghc -dynamic -shared Foo.o Foo_stub.o init.o -o libfoo.so \
  -lHSrts-ghc6.11 -optl-Wl,-rpath,/opt/ghc/lib/ghc-6.11/

Of course the calling program has to call init() first.

If you have to support a C API where there is no initialiser then we can use this trick:

static void init (void) __attribute__ ((constructor));
void init (void) { ... }

The constructor attribute means the function will be called on program startup or as soon as the library is loaded via dlopen.