Parallel Haskell Digest 1

Eric Kow – Thursday, 31 March 2011

all parallel ph-digest  

I'd like to introduce the first Parallel Haskell Digest, a newsletter aiming to show off all the work that's going on using parallelism and concurrency in the Haskell community.

The digest is a bit of an experiment. For the community in general, I hope to help you catch up with a monthly recap of news, interesting blog posts and discussions about parallelism in Haskell. For people (like me) who are new to parallelism and concurrency in Haskell, or maybe just have a passing interest, I hope to give you little nibbles, with regular features like the Word of Month, Featured Code and Parallel Puzzlers.

Finally, as a bit of disclosure, I'm writing this digest as part of my work to promote the Parallel GHC Project. So don't be surprised if I give it a little special emphasis :-)

Word of the Month

This very first edition of the PH digest is brought to you by the word spark. You may have heard of sparks and threads in the same sentence. What's the difference?

A Haskell thread is a thread of execution for IO code. Multiple Haskell threads can execute IO code concurrently and they can communicate using shared mutable variables and channels.

Sparks are specific to parallel Haskell. Abstractly, a spark is a pure computation which may be evaluated in parallel. Sparks are introduced with the par combinator; the expression (x par y) "sparks off" x, telling the runtime that it may evaluate the value of x in parallel to other work. Whether or not a spark is evaluated in parallel with other computations, or other Haskell IO threads, depends on what your hardware supports and on how your program is written. Sparks are put in a work queue and when a CPU core is idle, it can execute a spark by taking one from the work queue and evaluating it.

On a multi-core machine, both threads and sparks can be used to achieve parallelism. Threads give you concurrent, non-deterministic parallelism, while sparks give you pure deterministic parallelism.

Haskell threads are ideal for applications like network servers where you need to do lots of I/O and using concurrency fits the nature of the problem. Sparks are ideal for speeding up pure calculations where adding non-deterministic concurrency would just make things more complicated.

Parallel GHC project news

The Parallel GHC Project is an MSR-funded project to push the real-world use of parallel Haskell. Part of this project involves effort by Well-Typed to provide tools for use by the general community:

Work shall soon begin working on making the "Modified Additive Lagged Fibonacci" and perhaps other random number generators from the SPRNG library available in Haskell. Current plans are to implement the algorithms directly in Haskell, and to expose them as instances of System.Random.RandomGen. These generators are attractive for use in Monte Carlo simulations because they are splittable and have good statistical quality, while providing high performance.

Work is underway on extending the GHC EventLog and associated tools (ghc-events, ThreadScope). The aim is to support profiling of multi-process or distributed Haskell systems such as client/server or MPI programs. This involves incorporate some changes made in related projects (Eden, EdenTV). This work may have some benefits even for profiling single-process programs. It should also allow comparative profiling where multiple runs of the same program (e.g. different inputs or slightly different code) are viewed on the same timeline.

For more information on the Parallel GHC project, see the Haskell wiki page.

Featured Code

The feature code for this month is hulk, an IRC server in 1250 lines of code. Hulk was coded up by Chris Done in one evening, and it was used the next morning. (Chris has done a bit of cleaning up since).

Here's what Chris had to say about his experience writing Hulk:

Haskell's lightweight threads make it natural (and guilt-free!) to choose the design of one-thread-per-client. With a single MVar containing a pure value for the whole server-state, and the help of a monad stack of ReaderT(connection information), WriterT (replies) and StateT (user details), it was trivial to make the bulk of the code completely pure. LineBuffering on the (Handle-wrapped) sockets tripped me up; as opposed to NoBuffering, this did not behave as expected: many threads would block when only one should have. Overall, it was textbook Haskell; keep the main code pure, and only the truly impure in the outer shell.

It's up on hackage so you can install it with a quick

cabal install hulk

You can also fork his code on GitHub.

Got a cool use of Parallel Haskell or an interesting problem? Nominate it for the PH digest featured code/parallel puzzler!

Blog Posts

Parallel-Haskell and Haskell Cafe

  • Concurrency best practices? Wren Ng Thorton is using STM to "run a lot of things in parallel without the headaches of locks." It works great, but then what if you want IO? How would you enforce atomic IO operations, eg. printing log messages in two threads without getting garbage on screen? Some suggestions were the twighlight-stm package, and using STM.TChan with potential performance penalty.
  • Possible bug in Control.Concurrent. Krzysztof Skrzętnicki encountered an unexpected deadlock using Control.Concurrent. Was it a normal consequence of repeatedly reading from channel with only a single element put into it, or a bug?
  • Faster timeout but is it correct? Bas van Dijk proposed increasingly faster potential replacements for System.Timeout, including some that use the new GHC event manager. After much scrutiny, it turns out that one version is not faster and the events based one had a bug. Maybe next time!
  • SMP parallelism gains inferior than expected.. José Pedro Magalhães also tried using SMP parallelism without getting performance gains he was hoping for. Simon Marlow suggested analysing with Threadscope, checking for too many/few sparks, and checking to see if a lot of time was being spent in GC.

Stack Overflow

Help and feedback

Got something for the digest? Send me an email at eric@well-typed.com. I'm particularly interested in short parallelism or concurrency puzzles, cool projects for featured code. Other comments and feedback (criticism and corrections especially!) would be welcome too.