Parallel Haskell Digest 2

Wed, 11 May 2011 20:45:09 GMT, by eric.
Filed under parallel, ph-digest.

Welcome to the second edition of the Parallel Haskell Digest, bringing you news, discussions and tasters of parallelism and concurrency in Haskell.

The digest is made possible by the Parallel GHC project. More news about how we're doing below.

News

Word of the Month

This edition of the digest is brought to you by threads, threads and more threads. In the last digest, we took a look at Haskell sparks and particularly at how they differ from threads. Now let's delve a little bit deeper into threads.

A "thread of execution" is a programming abstraction that helps the programmer separate concerns within a program. Consider a web server serving many documents to many clients simultaneously. The programmer may wish to use threads, using one thread per client making it easier to manage concurrent action.

In the Haskell world, this programming abstraction is provided by Haskell threads. To implement Haskell threads, the GHC runtime system uses what is known as an M:N threading model, where many Haskell threads are scheduled across a much smaller number of operating system threads. This model has two key benefits: Haskell threads are very cheap to create and to switch between, because they do not each need an OS thread of their own; and the programmer can use as many threads as the problem naturally calls for, leaving it to the runtime scheduler to map them onto the available OS threads and CPUs.

Don Stewart illustrated this on StackOverflow with the following diagram, citing 2011 figures of about a handful of CPUs, a dozen or so OS threads and tens of thousands of Haskell threads (plus for those of us interested in pure parallelism, millions of sparks).

Thread illustration by Don Stewart

If you want to try generating some figures for yourself, have a look at this nofib benchmark utility. The benchmark tool creates a large "pipeline" of simultaneous Haskell threads and sends a message through the pipeline, passing it from one thread to the next. On my laptop, I can create 10000 Haskell threads in 0.074 seconds and pump "Hello World" through them in 0.07 seconds. That works out to 7.4 microseconds per thread (7 microseconds for pumping). How about giving it a shot on your computer? Haskell threads are very lightweight.
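For flavour, here is a rough sketch of the same idea (not the actual nofib utility): it builds a chain of MVar-connected Haskell threads and pushes one message from end to end. The thread count is just an illustrative number.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Monad (foldM)

-- One pipeline stage: wait for a message on the input MVar and pass it on.
stage :: MVar String -> IO (MVar String)
stage input = do
  output <- newEmptyMVar
  _ <- forkIO (takeMVar input >>= putMVar output)
  return output

main :: IO ()
main = do
  start <- newEmptyMVar
  -- Chain 10000 stages together; each stage is its own Haskell thread.
  end <- foldM (\prev _ -> stage prev) start [1 .. 10000 :: Int]
  putMVar start "Hello World"
  takeMVar end >>= putStrLn
```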

For most concurrency needs, you can generally forget that operating system threads exist. Indeed, when the documentation for modules like Control.Concurrent refers to "threads", or when Haskell hackers discuss threads in Haskell code, it's a good bet that they're referring to Haskell threads rather than OS threads.

That said, there are a few occasions where you may want to be aware of OS threads. In order of importance, these are when you:

- want to run Haskell threads in parallel on multiple CPU cores;
- make foreign calls (via the FFI) that may take a long time or block;
- use foreign libraries that rely on OS-thread-local state (OpenGL, for example).

Doing any of these things requires that you use GHC's multi-threaded runtime system. GHC currently uses a single-threaded runtime system by default, but until this changes, you will have to explicitly enable the multi-threaded one by linking your program with the flag -threaded. With the multi-threaded runtime system, Haskell threads are scheduled across multiple OS threads, as opposed to being interleaved on a single one. This allows for parallelism in the first case and for concurrent foreign code in the second; the third case, bound threads, is discussed below.

For the first case, where you are specifically interested in parallelism, as well as enabling the multi-threaded RTS you also need to tell the runtime system how much parallelism to try to use. Specifically, you have to say how many OS threads will be used for scheduling your program's Haskell threads. You can do this by passing +RTS -N when you run your program (with GHC 6.12.1 and higher, the bare -N flag will cause the RTS to use a sensible default based on the number of CPU cores on your machine).
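As a small illustration (a sketch using par and pseq from the parallel package, with an arbitrary workload and a made-up file name), compiling with -threaded and running with +RTS -N is what actually lets the work spread over multiple cores:

```haskell
-- Compile with:  ghc -threaded --make ParFib.hs
-- Run with:      ./ParFib +RTS -N2    (or just -N on GHC 6.12.1 and later)
import Control.Parallel (par, pseq)

-- A deliberately naive, CPU-bound function to give the cores something to do.
fib :: Int -> Integer
fib n
  | n < 2     = fromIntegral n
  | otherwise = x `par` (y `pseq` (x + y))
  where
    x = fib (n - 1)
    y = fib (n - 2)

main :: IO ()
main = print (fib 30)
```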

If you are only concerned about making foreign calls, just enabling the multi-threaded RTS is enough. The issue with the single-threaded runtime system is that Haskell threads that make foreign calls to the operating system or C libraries will block all other Haskell threads. This means that if a foreign call takes a long time, or worse, blocks in its own right, all other Haskell threads will be stuck. With the multi-threaded runtime system, Haskell threads that make foreign calls do not block other Haskell threads, because the runtime can keep running the other threads on another OS thread while the one making the foreign call is churning away.
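Here is a small, hedged example of the difference in practice: a safe foreign call to the POSIX sleep function blocks in C for a couple of seconds while another Haskell thread keeps printing. With -threaded the ticks keep coming; with the single-threaded runtime they stall until sleep returns.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Control.Concurrent (forkIO, threadDelay)
import Foreign.C.Types (CUInt)

-- A blocking call into C (POSIX only): sleep for a number of seconds.
foreign import ccall safe "unistd.h sleep"
  c_sleep :: CUInt -> IO CUInt

main :: IO ()
main = do
  _ <- forkIO (c_sleep 2 >> putStrLn "blocking foreign call returned")
  -- With the threaded RTS these ticks appear while c_sleep is blocked;
  -- without it, everything waits for the foreign call to finish.
  mapM_ (\i -> putStrLn ("tick " ++ show i) >> threadDelay 500000)
        [1 .. 6 :: Int]
```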

But be careful! Concurrency with foreign calls can be a tricky business. If you are only using one OS thread at a time, you are effectively shielded from having to worry about concurrency issues in foreign code (by not being able to run foreign code concurrently). Enabling multi-threaded mode comes with the extra responsibility of making sure any foreign libraries you use are thread-safe, or that you have adequate mechanisms to deal with the ones that are not. Thread-safety isn't an issue with Haskell-only code because shared state is always managed with locks or atomic operations. But when concurrency and foreign calls mix, you will need to take care.
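One common way to deal with a foreign library that is not thread-safe is to funnel every call to it through a single lock. This is just a sketch of that idea using an MVar; the LibLock type and withLib are made-up names, not part of any library:

```haskell
import Control.Concurrent.MVar (MVar, newMVar, withMVar)

-- An MVar used purely as a mutex around a non-thread-safe foreign library.
newtype LibLock = LibLock (MVar ())

newLibLock :: IO LibLock
newLibLock = fmap LibLock (newMVar ())

-- Run an action while holding the lock, so at most one Haskell thread is
-- inside the foreign library at any time; withMVar releases the lock even
-- if the action throws an exception.
withLib :: LibLock -> IO a -> IO a
withLib (LibLock lock) action = withMVar lock (\() -> action)
```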

There's another issue to watch out for when mixing foreign calls with multiple OS threads. The nice thing about the M:N threading model in Haskell is that the GHC RTS scheduler will automatically move Haskell threads between OS threads to achieve a good balance of work. But this introduces a new problem for foreign libraries that use thread-local state: the Haskell thread may be calling the library from any number of OS threads, so if the foreign library uses OS-thread-local state then this state could be different from one call to the next... what a mess! To solve this problem we have to use a feature called "bound threads". Bound threads are Haskell threads that are bound to a single OS thread; they are not moved around by the scheduler. Bound threads are more expensive than unbound ones because they tie up a whole OS thread, but they are necessary for working with certain foreign libraries like OpenGL that make essential use of thread-local state.
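In code, bound threads come from forkOS (or runInBoundThread) in Control.Concurrent, which also requires the threaded runtime. The sketch below simply checks that the forked thread really is bound; the comment marks where calls into a thread-local-state library such as OpenGL would go:

```haskell
import Control.Concurrent (forkOS, isCurrentThreadBound)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

main :: IO ()
main = do
  done <- newEmptyMVar
  _ <- forkOS $ do                       -- needs -threaded
    bound <- isCurrentThreadBound
    putStrLn ("running in a bound thread? " ++ show bound)  -- True
    -- ... calls into a library that uses OS-thread-local state go here ...
    putMVar done ()
  takeMVar done                          -- wait for the bound thread to finish
```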

Summing up threads in Haskell:

- Haskell threads are very lightweight: create as many as your problem calls for with forkIO and let the M:N scheduler map them onto a handful of OS threads.
- For most concurrency work you can forget that OS threads exist.
- Link with -threaded (and run with +RTS -N) if you want parallelism across multiple cores, or if you make foreign calls that might block.
- When concurrency meets foreign code, watch out for thread-safety, and use bound threads (forkOS) for libraries that rely on OS-thread-local state.

Thanks to Paul Bone for the paragraph presenting threads as a programming abstraction and also to Andres Löh and Duncan Coutts from Well-Typed for extensive help revising this word of the month!

Parallel GHC project news

The Parallel GHC Project is an MSR-funded project to push the real-world use of parallel Haskell. Part of this project involves effort by Well-Typed to provide tools for use by the general community:

We have begun work on making the "Modified Additive Lagged Fibonacci" and perhaps other random number generators from the SPRNG library available in Haskell. As a first step towards the Haskell SPRNG reimplementation, we have developed a binding to the existing C++ library. The binding will serve as a reference implementation to test against, but it is also ready to be used now.

To complement our work on extending the GHC eventlog and ThreadScope to support multi-process or distributed Haskell systems, we have begun work on developing new visualisations for ThreadScope, including the rate of parallel spark creation, and the distribution of spark evaluation times.

Meanwhile, work continues on our project partners' side. We hope to be able to say more about it in the next edition of the digest :-)

For more information on the Parallel GHC project, see the Haskell wiki page.

Talks

Blogs, papers and packages

Parallel-Haskell and Haskell Cafe

Stack Overflow

Help and feedback

Got something for the digest? Send me an email at eric@well-typed.com. I'm particularly interested in short parallelism/concurrency puzzles and cool projects for the featured code section. Other comments and feedback (criticism and corrections especially!) would be welcome too.