<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

<generator>Well-Typed Blog Generator</generator>
<icon>http://www.well-typed.com/blog/favicon.ico</icon>
<id>http://blog.well-typed.com/feed/atom/</id>
<link href="http://www.well-typed.com/blog/atom.xml" rel="self" type="application/atom+xml" />
<link href="http://www.well-typed.com/blog/" rel="alternate" type="application/xhtml+xml" />
<logo>http://www.well-typed.com/img/logo_125_20.png</logo>
<title type="text">Well-Typed Blog</title>
<subtitle type="text">Because Well-Typed Programs Don't Go Wrong</subtitle>
<updated>2012-01-12T12:44:17Z</updated>
<entry>
    <title type="text">Well-Typed are hiring: Haskell consultant</title>
    <published>2012-01-12T12:44:17Z</published>
    <updated>2012-01-12T12:44:17Z</updated>
    <link rel="alternate" type="application/atom+xml" href="http://www.well-typed.com/blog/63" />
    <summary type="text"><![CDATA[In order to keep up with customer demand, we are looking to hire a Haskell expert to work with us at Well-Typed as a Haskell consultant. This is an exciting opportunity for someone who is passionate about Haskell and who is keen to improve and promote Haskell in a professional context. The role is quite [...]]]></summary>
    <author><name>andres</name></author>
    <category term="well-typed" />
    <id>http://www.well-typed.com/blog/63</id>
    <content type="html"><![CDATA[<p
>In order to keep up with customer demand, we are looking to hire a
Haskell expert to work with us at
<a href="http://www.well-typed.com/"
  >Well-Typed</a
  > as a Haskell consultant.</p
><p
>This is an exciting opportunity for someone who is passionate about
Haskell and who is keen to improve and promote Haskell in a professional
context.</p
><p
>The role is quite general and could cover any of the projects and
activities that we are involved in as a company. The tasks may involve:</p
><ul
><li
  >working on the Haskell compilers, libraries and tools;</li
  ><li
  >Haskell application development;</li
  ><li
  >working directly with clients to solve their problems.</li
  ></ul
><p
>Well-Typed has a variety of clients. For some we do proprietary Haskell
development and consulting. For others, much of the work involves
open-source development and cooperating with the rest of the Haskell
community: the commercial, open-source and academic users.</p
><p
>At the moment, we are running the 
<a href="http://www.haskell.org/haskellwiki/Parallel_GHC_Project"
  >Parallel GHC Project</a
  >.
It is likely that
initial tasks will have some connection with parallel and/or concurrent
programming in Haskell. We are also doing quite a bit of GHC maintenance, and
some knowledge or interest in compiler internals, operating systems, the
foreign language interface, and/or deployment issues would be welcome.</p
><p
>Our ideal candidate has excellent knowledge of Haskell, whether from
industry, academia, or personal interest. Familiarity with other
languages, low-level programming, and good software engineering
practices are also useful. Good organisation and the ability to manage your
own time and reliably meet deadlines are important. You are likely to
have a bachelor's degree or higher in computer science or a related
field, although this isn't a requirement. Experience of consulting, or
running a business, is also a bonus.</p
><p
>The position is initially as a contractor for one year with a salary
of 150 GBP per day. We offer flexible hours and work from home.
Living in England is not required.</p
><p
>In the longer term there is the opportunity to become a member of the
partnership with a full stake in the business: being involved in
business decisions, and fully sharing the risks and rewards.</p
><p
>If you are interested, please apply via
<a href="mailto:info@well-typed.com"
  >info@well-typed.com</a
  >.
Tell us why you are interested and why you would be a
good fit for the job, and attach your CV. Please also indicate when you might
be able to start. We are more than happy to answer informal enquiries. Contact
<a href="http://www.well-typed.com/who_we_are"
  >Duncan Coutts, Ian Lynagh or Andres L&#246;h</a
  >
for further information, either by email or IRC.</p
><p
>The deadline for applications is Friday 27th January 2012.</p
><h3
> About Well-Typed</h3
><p
>Well-Typed LLP is a Haskell services company, providing consultancy
services, writing bespoke applications, and offering commercial
training in Haskell and related topics.</p
><p
><a href="http://www.well-typed.com/"
  >http://www.well-typed.com/</a
  ></p
>]]></content>
</entry>
<entry>
    <title type="text">Parallel Haskell Digest 7</title>
    <published>2011-12-24T20:32:44Z</published>
    <updated>2011-12-24T20:32:44Z</updated>
    <link rel="alternate" type="application/atom+xml" href="http://www.well-typed.com/blog/62" />
    <summary type="text"><![CDATA[GHC 7.4 is coming! There is loads to look forward to, but sometimes, it's the little things that count. For example, do you hate the fact that you can't just flip on an +RTS -N without having to first recompile your program, this time remembering to throw an -rtsopts on it? Duncan Coutts has relaxed [...]]]></summary>
    <author><name>eric</name></author>
    <category term="parallel" />
    <category term="ph-digest" />
    <id>http://www.well-typed.com/blog/62</id>
    <content type="html"><![CDATA[<p
>GHC 7.4 is coming! There is loads to look forward to, but sometimes,
it's the little things that count. For example, do you hate the fact
that you can't just flip on an <code
  >+RTS -N</code
  > without having to first
recompile your program, this time remembering to throw an <code
  >-rtsopts</code
  > on
it? Duncan Coutts has relaxed the requirement so that commonly used RTS
options can be used without it. This flag was originally implemented to
counter security problems for CGI or setuid programs; however, it was
also a hassle for regular users because it got in the way of common
options like <code
  >-eventlog</code
  >, <code
  >-N</code
  >, or <code
  >-prof</code
  >. The GHC 7.4 RTS will make a
better tradeoff between security and convenience, allowing a common set
of benign flags without needing <code
  >-rtsopts</code
  >.</p
><p
>That's the sort of thing that the Parallel GHC Project is about. We want
to push parallel Haskell out into the real world, first by helping real
users (our <strike>guinea pigs</strike> industrial partners) to apply it to their
work, second by making it easier to use (tools, libraries), and finally
communicating more about it (this digest).</p
><p
>In this month's digest, we'll be catching up on news from the community.
After the holidays, we'll be back with some new words of the month
exploring a bit of concurrent Haskell. In the meantime, happy hacking
and Merry Christmas!</p
><h3
> News</h3
><p
><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-November/097008.html"
  >Job Opportunity at Parallel Scientific</a
  ></p
><p
>Peter Braam wants you, parallel Haskeller!</p
><blockquote
><p
  >Parallel Scientific, LLC is a Boulder, CO based early stage, but funded
startup company working in the area of scalable parallelization for
scientific and large data computing. We are implementing radically new
software tools for the creation and optimization of parallel programs
benefiting applications and leveraging modern systems architecture. We
build on our mathematical knowledge, cutting edge programming languages
and our understanding of systems software and hardware. We are currently
working with the Haskell development team and major HPC laboratories
world wide on libraries and compiler extensions for parallel
programming.</p
  ></blockquote
><p
>Note the mandatory Haskell experience and the desirability of &#8220;in depth
knowledge of core Haskell libraries for parallel programming (NDP, REPA
etc)&#8221;.</p
><h3
> Parallel GHC Project Update</h3
><p
>The Parallel GHC Project is an MSR-funded project, run by Well-Typed,
with the aim of demonstrating that parallel Haskell can be employed
successfully in real world projects.</p
><p
>Our most recent work has been in polishing the upcoming ThreadScope
release that we previewed this September at the Haskell Implementors'
Workshop. This new release comes with goodies for users of Strategies or
the basic <code
  >par/pseq</code
  > parallelism: spark creation/conversion graphs,
visualisations showing your spark pools filling and emptying, and
histograms displaying the distribution of spark sizes. All this with the
aim of helping you gain deeper insight into not just what your program is
doing but <em
  >why</em
  >.</p
><p
>We've also done backend work to make ThreadScope even more useful
further down the road. First, we have improved the ghc-events package by
encoding the meanings of events in state machines. This makes it
possible to validate eventlogs, and doubles as an always up-to-date
source of code as documentation. Second, we have extended the GHC RTS to
emit the startup wall-clock time and Haskell thread labels to the
eventlog. The wall-clock time event allows us to synchronise logs for
simultaneous processes, bringing us a step closer to using ThreadScope on
distributed programs. Named Haskell threads make it easier to distinguish
threads from each other.</p
><p
>Finally, we have been exploring the use of Cloud Haskell for high
performance computing on clusters. To do this we would need to abstract
Cloud Haskell over different transport mechanisms, that is to develop a
robust Cloud Haskell implementation sitting on top of a swappable
transport layer. We have posted an
<a href="https://groups.google.com/d/msg/parallel-haskell/wUmoSxdAmhE/2fX7OmYtzlwJ"
  >initial design</a
  >
for this layer on the parallel-haskell list. We have taken the
substantial feedback into consideration and will be sending a revised
design and recording it in a page on the GHC wiki. Meanwhile, we are
working to further validate our design on simple models of both the
transport layer and a Cloud Haskell layer on top. Longer term, we aim to
implement some transports, an IP transport in particular and perhaps a
single-node multi-process transport using forks and pipes.</p
><h3
> Tutorials and Papers</h3
><ul
><li
  ><p><a href="http://www.well-typed.com/Hal6/"
    >Tutorial: Deterministic Parallel Programming in Haskell</a
    >
(7 Oct)</p>

<p>Well-Typed's Andres L&#246;h presented a parallel programming tutorial at
the recent Haskell in Leipzig meeting. The tutorial comes with slides,
exercises and sample code. It paints a picture of the parallel Haskell
landscape, and then focuses on one of the many possible approaches
(namely, strategies). One nice feature of the tutorial is an emphasis
on practicalities, for example, on using ThreadScope to figure out
where performance goes wrong in a program. So if you're looking for a
way to get started using parallelism to speed up your Haskell code,
give Andres' tutorial a try!</p></li
  ></ul
><ul
><li
  ><p><a href="http://malde.org/~ketil/papers/stmcluster.pdf"
    >Parallel Genome Assembly with Software Transactional Memory</a
    >
(27 Oct)</p>
 
<p>Ketil Malde wrote up some of his experiences using STM to parallelise
an inherently complicated program best solved with multiple
interacting threads. His article demonstrates that a program using STM
is able to successfully parallelize the genome scaffolding process
with a near linear speedup. Ketil would be interested in any feedback
the community may have.</p></li
  ></ul
><h3
> Blogs and Packages</h3
><h4
> Actors, actors everywhere</h4
><ul
><li
  ><p><a href="http://hackage.haskell.org/package/remote"
    >remote: Cloud Haskell is here!</a
    >
(27 Oct)</p>

<p>You may have been hearing a lot about Cloud Haskell lately, the new
Erlang-ish distributed programming library for Haskell. Now's your
chance to see what all the fuss is about! Jeff Epstein has uploaded
the remote package to Hackage, so take it for a spin by doing</p>

<pre>
cabal update
cabal install remote
</pre>

<p>Library documentation is on the Hackage page, and more details are
available in the paper
<a href="http://research.microsoft.com/en-us/um/people/simonpj/papers/parallel/remote.pdf"
    >Towards Haskell in the Cloud</a
    ></p></li
  ></ul
><ul
><li
  ><p><a href="http://jpembeddedsolutions.wordpress.com/2011/10/30/distributed-storage-in-haskell/"
    >Distributed storage in Haskell</a
    >
(30 Oct)</p>

<p>So what are people doing with Cloud Haskell? Julian Porter for one has
been working on a distributed monadic MapReduce implementation. Along
the way he's produced a general proof of concept for distributed
storage. Have a look at Julian's page for a short paper and GitHub
page.</p></li
  ></ul
><ul
><li
  ><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-October/095970.html"
    >simple-actors 0.1.0 released</a
    >
(11 Oct)</p>

<p>Brandon Simmons announced simple-actors, an EDSL-style library for
writing more structured concurrent programs, based on the Actor Model.
It was designed for local concurrency, as an alternative to ad-hoc use
of Chans, but could be extended to a distributed system by defining
appropriate <a href="http://hackage.haskell.org/package/chan-split"
    >SplitChan</a
    >
instances for some network &quot;channel&quot;.</p></li
  ></ul
><ul
><li
  ><p><a href="http://sulzmann.blogspot.com/2011/10/haskell-actors.html"
    >Haskell Actors</a
    >
(28 Oct)</p>

<p>Martin Sulzmann wishes he'd named his
<a href="http://hackage.haskell.org/package/actor"
    >actor package</a
    >
&#8220;multi-headed-actor&#8221;. With the recent interest in actor style
concurrency in Haskell, there may be some confusion about the various
packages that are out there. The point of Martin's library is being able to
pattern match over multiple events in the message queue, which makes
it easier to elegantly express ideas like a marketplace actor which
matchmakes buyer/seller messages. While Martin's library is built on
concurrent channels, it could be adapted to use distributed channels
provided by haskell-mpi or Cloud Haskell. See the paper for more
information
<a href="http://ww2.cs.mu.oz.au/~sulzmann/publications/multi-headed-actors.pdf"
    >Actors with Multi-Headed Message Receive Patterns</a
    >.</p></li
  ></ul
><h4
> More concurrency</h4
><ul
><li
  ><p><a href="http://factisresearch.blogspot.com/2011/10/stm-stats-retry-statistics-for-stm.html"
    >stm-stats: Retry statistics for STM transaction</a
    >
(9 Oct)</p>

<p>Joachim Breitner blogged about the stm-stats package, which provides
wrappers around <code
    >atomically</code
    > to track how often a transaction was
initiated and how often it was retried. The stm-stats library is used
internally by Factis research, but was recently released to the wider
Haskell community. In fact, Factis have recently hired Joachim to help
them contribute back to the Free Software community where possible.
So, thanks, Factis and congratulations, Joachim!</p></li
  ></ul
><ul
><li
  ><p><a href="http://apfelmus.nfshost.com/blog/2011/10/11-frp-concurrent-events.html"
    >How to deal with concurrent external events?</a
    >
(11 Oct)</p>

<p>Apfelmus has been scratching his head over a design problem for
event-based frameworks such as GUI libraries: how do you deal with
events that occur while you are currently handling another event?
Apfelmus gave a simple wxHaskell demonstrator illustrating the
problem: (A) reacting to an event while handling another one may
expose internal invariants but (B) reacting to an event after
finishing another one may render it &#8220;impossible&#8221;, i.e. it should not
have happened in the first place. Any thoughts on the dilemma?</p></li
  ></ul
><ul
><li
  ><p><a href="http://blog.melding-monads.com/2011/10/24/concurrency-and-foreign-functions-in-the-glasgow-haskell-compiler/"
    >Concurrency And Foreign Functions In The Glasgow Haskell Compiler</a
    >
(24 Oct)</p>

<p>Leon P. Smith posted an overview of the interaction between Haskell
concurrency and FFI calls in GHC. Leon's post walks us through some of
the basic concepts: capabilities, Haskell threads, OS threads, and
bound threads. This could be a good place to start before delving into
papers or library documentation.</p></li
  ></ul
><ul
><li
  ><p><a href="https://plus.google.com/115372308262579808851/posts/PKrA4817zJB"
    >iteratee-stm</a
    >
(4 Nov)</p>

<p>John Lato announced the new iteratee-stm library recently uploaded to
Hackage. Iteratee-stm provides an iteratee interface that uses bounded
TChans for communication. This makes it simple to run IO in a separate
thread from processing.</p></li
  ></ul
><h4
> Parallelism</h4
><ul
><li
  ><p><a href="http://kenta.blogspot.com/2011/11/qkhsskbg-automatic-deparallelization.html"
    >Automatic deparallelization</a
    >
(17 Nov)</p>

<p>Ken Takusagawa explored a different perspective on parallelism.
Instead of adding parallelism to programs, what if we started with too
much parallelism and stripped it away to fit reality?</p>

<blockquote>
Consider always writing code in a style using egregious fine grained
parallelism: assume lots of cores with no communication latency and no
overhead.

It is the compiler's job to deparallelize (unparallelize, serialize)
the program to run on the actual number of cores available, taking
into account communication latency and the overhead of parallelization
</blockquote> 

<p>Oh, and [<a href="http://kenta.blogspot.com/2004/07/document-refindingkey.html"
    >qkhsskbg</a
    >]</p></li
  ></ul
><ul
><li
  ><p><a href="http://comonad.com/reader/2010/introducing-speculation/"
    >Introducing Speculation</a
    >
(22 Jul 2010)</p>

<p>Recently, I got a chance to catch up with Edward Kmett, getting my
mind twisted into delightful funny shapes in the process. Edward
mentioned his speculation library, yet more parallelism in Haskell!
The library is based on the paper
<a href="http://research.microsoft.com/apps/pubs/default.aspx?id=118795"
    >Safe Programmable Speculative Parallelism</a
    >
by Prakash Prabhu et al. It provides a way to parallelise inherently
sequential algorithms (eg. lexing, Huffman decoding) by guessing the
value of intermediate results. You start working in parallel to build
work off the guess, only discarding it if the guess turns out to be
wrong later on. Check out Edward's blog and slides for more details.</p></li
  ></ul
><ul
><li
  ><p><a href="http://mainisusuallyafunction.blogspot.com/2011/10/quasicrystals-as-sums-of-waves-in-plane.html"
    >Quasicrystals as sums of waves in the plane</a
    >
(24 Oct)</p>

<p>Keegan McAllister posted a somewhat hypnotic animation of
quasicrystals. His post comes with complete source code for his
program using the Repa parallel arrays library. Repa was useful to
Keegan because it provides</p>
 
<ul> 
  <li>Immutable arrays, supporting clean, expressive code</li>
  <li>A fast implementation, including automatic parallelization</li>
  <li>Easy output to image files, via repa-devil</li>
  </ul></li
  ></ul
><ul
><li
  ><p><a href="https://groups.google.com/d/msg/parallel-haskell/ETUJFOjuspU/NVPKnNdL2gcJ"
    >Simple library for CAS posted</a
    >
(7 Dec)</p>

<p>Ryan Newton released IORefCAS, which provides a drop-in replacement
for atomicModifyIORef that takes advantage of the new <code
    >casMutVar#</code
    >
primop from GHC 7.2. Ryan says that &#8220;[b]ecause it's an easy change it
might be worth trying that for hot IOrefs in your parallel app.&#8221;</p>
</li
  ><li
  ><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-November/097058.html"
    >OpenCL 10.2.2</a
    >
(23 Nov)</p>
 
<p>Luis Cabellos has updated the Haskell OpenCL package with better
documentation and improved error handling using Control.Exception
instead of Either error.</p></li
  ></ul
><h3
> Mailing list discussions</h3
><h4
> Help wanted</h4
><ul
><li
  ><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-December/097428.html"
    >Alternative STM implementation</a
    >
(10 Dec)</p>

<p>Daniel Waterworth shared his
<a href="https://gist.github.com/1454995"
    >alternative STM implementation</a
    >,
written as a learning exercise. Could anybody provide Daniel with some
criticism and pointers to STM benchmarks for comparison? His
replacement should drop right in if you only use TVars.</p></li
  ></ul
><ul
><li
  ><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-December/097434.html"
    >Parallel Matrix Multiplication</a
    >
(10 Dec)</p>

<p>Mukesh Tiwari is trying to teach himself parallel Haskell (welcome!).
He's gone through Real World Haskell and the
<a href="http://research.microsoft.com/en-us/um/people/simonpj/papers/parallel/AFP08-notes.pdf"
    >tutorial</a
    >
by Simon Peyton-Jones and Satnam Singh, but now trying to implement a
parallel matrix multiplication function, he finds himself with no
sparks converted. Can anybody give Mukesh a hand?</p>
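
<p>For readers in the same position, here is a minimal sketch (not
Mukesh's code) of one common way to parallelise matrix multiplication
with Strategies: spark one computation per result row and evaluate each
row fully. The list-of-lists representation and the name
<code>mmulPar</code> are assumptions made for illustration. The sketch
needs the parallel package, and should be compiled with
<code>-threaded</code> and run with <code>+RTS -N</code>.</p>

<pre>
import Control.Parallel.Strategies (parList, rdeepseq, using)
import Data.List (transpose)

-- Hypothetical representation: a matrix is a list of rows.
type Matrix = [[Double]]

-- Multiply two matrices, evaluating each result row in its own spark.
-- rdeepseq forces a whole row; a strategy that only reaches weak head
-- normal form would leave almost all of the work undone in the spark.
mmulPar :: Matrix -&gt; Matrix -&gt; Matrix
mmulPar a b =
  [[sum (zipWith (*) row col) | col &lt;- cols] | row &lt;- a]
    `using` parList rdeepseq
  where
    cols = transpose b

main :: IO ()
main = print (mmulPar [[1, 2], [3, 4]] [[5, 6], [7, 8]])
</pre>

<p>Whether sparks get converted still depends on the rows being large
enough to be worth running in parallel, so treat this as a starting
point for experiments with ThreadScope rather than a tuned solution.</p>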

<p>Mukesh also asked about resources for Parallel Haskell, which would be
where I come in. Mukesh, have a look at the parallel Haskell portal:
<a href="http://www.haskell.org/haskellwiki/Parallel"
    >http://www.haskell.org/haskellwiki/Parallel</a
    ></p></li
  ></ul
><h4
> Cloud Haskell</h4
><ul
><li
  ><p><a href="https://groups.google.com/forum/#!msg/parallel-haskell/8YelldrF0QI/TZaoFGjwZfUJ"
    >Cloud Haskell now on Hackage</a
    >
(27 Oct)</p>

<p>Jeff Epstein's announcement that he had uploaded &#8220;remote&#8221; to Hackage
was greeted with joy and a somewhat lengthy discussion on
package/module naming. It looks like the modules will be moved from
'Remote' to 'Control.Distributed.Actor' or
'Control.Distributed.Process' to match the approach used for the
concurrency packages. The final package name seems to be
<a href="https://github.com/haskell-distributed/distributed-process"
    >distributed-process</a
    >.</p>
 
<img src="http://www.well-typed.com/blog/aux/images/ph-digest/bikeshed.jpg" width="180" height="240" alt="Anybody got a paintbrush?" title="Anybody got a paintbrush?"
     /></li
  ></ul
><ul
><li
  ><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-October/095731.html"
    >Haskell Cloud and Closures</a
    >
(1 Oct)</p>

<p>Fred Smith gave Cloud Haskell a try, using it to remotely compute the
plus function. Now he wants to be able to send a function to a remote
host, no matter if the function is locally declared or at the top
level. Erik de Castro Lopo replied that this was a known limitation
with the only known workaround being to move the required function to
the top-level. Chris Smith pointed out that while the current
restrictions may be too tight, there is good reason to have them. As
for alternative approaches to serialising functions, David Barbour
suggested maybe looking at the
<a href="http://www.haskell.org/haskellwiki/TV"
    >tangible values</a
    > work by Conal
Elliott.</p></li
  ></ul
><ul
><li
  ><p><a href="https://groups.google.com/d/msg/parallel-haskell/wUmoSxdAmhE/2fX7OmYtzlwJ"
    >Feedback on Cloud Haskell transport layer interface</a
    >
(2 Nov)</p>

<p>As I mentioned in the Parallel GHC Project update, we've been looking
quite a bit into Cloud Haskell lately. Duncan Coutts posted a request
for feedback on the design for a Cloud Haskell transport layer
interface. We're hoping one day to make use of Cloud Haskell for
high performance computing on clusters. To do this, we hope to develop
a robust Cloud Haskell implementation sitting on top of a swappable
transport layer, for example, an IP transport, or a single-node
multi-process transport using forks and pipes.</p>

<p>One issue that emerged from the discussion is how to deal with
a potential plethora of parameters (eg. buffered vs eager? ordered?
reliable?) associated with connection/endpoint creation. It doesn't
help that each connection type may have its own set of parameters. Is
it enough to be able to set and forget them during transport session
initialisation, or is it essential for Cloud Haskell to be able to set
these parameters differently for different connections in the same
session?</p></li
  ></ul
><ul
><li
  ><p><a href="https://groups.google.com/d/msg/parallel-haskell/UnClyLc8GXI/DZpIIof3bhcJ"
    >Parallel Haskell in industry</a
    >
(7 Nov)</p>

<p>S&#233;bastien Lannez also got a chance to try out Cloud Haskell. The
remote package uploaded by Jeff seems to work well and &#8212; dabblers take
note &#8212; the examples shipped with the code are very easy to adapt.
Before digging deeper, S&#233;bastien wanted to know more about</p>

<ol><li>performance limitations</li>
  <li>communication requirements/overheads</li>
  <li>stability</li>
  <li>already developed applications</li></ol><br/>

<p>Jeff cautioned that while he thinks Cloud Haskell could be a good
platform to develop distributed applications, it's still very much
research software and a work in progress. Don't stake your company on
Cloud Haskell just yet.</p>

<p>That said, Duncan Coutts added, we are pretty happy with the design
and optimistic about developing a robust implementation, because we
can build it as an ordinary Haskell library without requiring tricky
extensions to the runtime system. As for S&#233;bastien's fourth question,
a couple of Parallel GHC Project partners are rather keen on Cloud
Haskell. We are working on the implementation and will hopefully have
more to report on performance, overheads and other issues we
encounter.</p></li
  ></ul
><h4
> Multicore performance</h4
><ul
><li
  ><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-October/095845.html"
    >SMP parallelism increasing GC time dramatically</a
    >
(5 Oct)</p>

<p>It takes a village to tune a program. Tom Thorne has a program with a
function that does some fairly intensive calculations with hmatrix. When
Tom tries to get some simple parallelism on his 12 core machine,
replacing a <code
    >map</code
    > with a <code
    >parMap rdeepseq</code
    >, he finds GC time going
through the roof, from 1s (1.7%) to 248s (40.5%). Is the big scary
number just an artefact of how GC time is reported, or is something
really wrong?</p>

<p>ThreadScope is a good first response here and Tom was duly nagged by
the community. Tom promises to give it a go, although the last time he
tried, the event log output produced about 1.8GB, and then crashed.
The ThreadScope team would love to get hold of any hints about
reproducing the crash.</p>

<p>Ryan Newton observed that GC aside, the program does not appear to be
scaling; the mutator time itself isn't going down with parallelism.
Tom improved the parallelism a bit, breaking the work into chunks and
spreading it around more evenly, and provided he disables the parallel
GC, it runs much faster and outperforms the sequential version.
Having loads of RAM to play with and code that doesn't use much memory, Tom
then tried telling the RTS to perform GC less often. This worked.
Increasing the minimum allocation area size from its default 512K with
<code
    >+RTS -A32M</code
    > allows Tom to get performance with the parallel GC
comparable to that without.  Hooray! But there's still this little problem&#8230;
now Tom's program intermittently segfaults. Getting a bug report out of this
may take a while though as Tom attempts to boil it down.</p>
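
<p>To make the two tweaks concrete, here is a rough sketch (not Tom's
program) of chunking a parallel map with Strategies. The function
<code>expensive</code>, the input range and the chunk size of 100 are
placeholders chosen for illustration. The sketch needs the parallel
package.</p>

<pre>
import Control.Parallel.Strategies (parListChunk, rdeepseq, using)

-- Placeholder for a costly pure computation.
expensive :: Int -&gt; Double
expensive n = sum [sqrt (fromIntegral k) | k &lt;- [1 .. n]]

-- One spark per chunk of 100 elements, instead of one per element
-- as with parMap rdeepseq.
results :: [Double]
results = map expensive [1000 .. 3000] `using` parListChunk 100 rdeepseq

main :: IO ()
main = print (sum results)
</pre>

<p>Assuming the program is compiled with <code>-threaded -rtsopts</code>,
running it with <code>+RTS -N -A32M -s</code> enlarges the allocation
area as described above and prints GC statistics for comparison.</p>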

<p>Meanwhile, Oliver Batchelor offered his experience that enabling more
threads than he has cores makes his program get drastically slower.
Alexander Kjeldaas and Austin Seipp commented that this is due to GC
needing to co-ordinate with blocked threads, and that the problem of
oversaturating is well known. There's also the &quot;dreaded last core
slowdown&quot; bug which once affected Linux users but seems to have gone
away in recent Linux/GHC.</p></li
  ></ul
><ul
><li
  ><p><a href="https://groups.google.com/d/msg/parallel-haskell/pLxsTRJijJg/SKCne1L2tSkJ"
    >AMD Bulldozer modules and Haskell parallelism</a
    >
(13 Oct)</p>

<p>Herbert Valerio Riedel has been eyeing the AMD FX-8120
<a href="http://en.wikipedia.org/wiki/Bulldozer_(processor)"
    >Bulldozer processor</a
    >.
Bulldozer cores are not independent from each other, but grouped into
pairs. So Herbert wanted to know how this might affect Haskell
parallelism; would 8 cores <em
    >really</em
    > mean 8 or just 4 with slightly
better SMT capability? Simon Marlow does not know (benchmarks). Duncan
Coutts believes that it should be all fine as the pairing is not at
all like hyperthreading.</p></li
  ></ul
><ul
><li
  ><p><a href="https://groups.google.com/d/msg/parallel-haskell/UcWJFW-eUsI/nexQ1_6A5BcJ"
    >Estimating contention on an IORef hammered with atomicModifyIORef</a
    >
(27 Oct)</p>

<p>Ryan Newton starts us off with a hypothetical scaling bottleneck: all
threads frequently accessing a single IORef using <code
    >atomicModifyIORef</code
    >
(Data.IORef). This is commonly understood to be likely a bad idea, but
how do we go about measuring just <em
    >how</em
    > bad it is? This sort of design
appears in monad-par and, as pointed out by Johan Tibell, in the GHC IO
manager, so it would be good to know how much it really hurts. (See
also Ryan's
<a href="https://groups.google.com/d/msg/parallel-haskell/ETUJFOjuspU/NVPKnNdL2gcJ"
    >IORefCAS</a
    >
package which seems to be partly a result of this discussion)</p>

<p>One approach is to use GHC events to count operations on particular
IORefs, then put that through a model that reports whether the
IORef is being used acceptably, or is &quot;hot&quot;. Duncan Coutts suggests a
simple way to get partway there: stick something like a
<code
    >traceEvent &quot;IORef #3&quot;</code
    > on each use of <code
    >atomicModifyIORef</code
    > and do
something like a <code
    >ghc-events show | grep IORef</code
    > to at least get an
idea which <code
    >IORefs</code
    > are hotter than others and some orders of
magnitude. We'll hear back from Ryan when he's had a chance to try it.</p>
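
<p>As a rough sketch of that suggestion (not code from the thread), one
could wrap <code>atomicModifyIORef</code> so that every use emits a
labelled event. The helper name <code>tracedModify</code> and the label
are made up for illustration, and the sketch assumes a GHC whose
<code>Debug.Trace</code> exports <code>traceEventIO</code> (7.4 or
later).</p>

<pre>
import Data.IORef (IORef, atomicModifyIORef, newIORef, readIORef)
import Debug.Trace (traceEventIO)

-- Hypothetical helper: emit a user event to the eventlog, then
-- perform the atomic update as usual.
tracedModify :: String -&gt; IORef a -&gt; (a -&gt; (a, b)) -&gt; IO b
tracedModify label ref f = do
  traceEventIO ("IORef " ++ label)
  atomicModifyIORef ref f

main :: IO ()
main = do
  counter &lt;- newIORef (0 :: Int)
  mapM_ (\_ -&gt; tracedModify "#3" counter (\n -&gt; (n + 1, ())))
        [1 .. 1000 :: Int]
  readIORef counter &gt;&gt;= print
</pre>

<p>Events are only recorded when the program is run with
<code>+RTS -l</code>. The resulting log can then be filtered with
<code>ghc-events show</code> and <code>grep IORef</code> as described
above.</p>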

<p>Also for the interested, it's worth mentioning that GHC 7.4 will be
sporting a new and improved <code
    >traceEvent</code
    >, this time exported through
<code
    >Debug.Trace</code
    > and offering versions for use in pure code and IO both.</p></li
  ></ul
><ul
><li
  ><p><a href="https://groups.google.com/d/msg/parallel-haskell/b8Yo8HNRnks/noNMm-RLvgEJ"
    >Way to expose BLACKHOLES through an API?</a
    >
(7 Nov)</p>

<p>A BLACKHOLE in GHC acts as a placeholder for a thunk that is currently
being evaluated. When the thunk is forced, GHC replaces it with a
BLACKHOLE object, which it later replaces when it has the evaluation
result. In a parallel/concurrent setting, it may happen that two
threads are trying to evaluate the same thunk at the same time. In
that case, the first thread creates the blackhole, which the second
thread notices and blocks on until the evaluation result is available.</p>
 
<p>Ryan Newton observes that this blocking is implicit, whereas &#8220;[w]hen
implementing certain concurrent systems-level software in Haskell it
is good to be aware of all potentially blocking operations&#8221;. He
proposes a mechanism to expose blackholes, for example with an
<code
    >evaluateNonblocking :: a -&gt; IO (Maybe a)</code
    > that returns <code
    >Nothing</code
    > if
the value is blackholed. Simon Marlow points out that this may be
slightly problematic as thunks depend on each other and &#8220;you might be
a long way into evaluating the argument and have accumulated a deep
stack before you encounter the BLACKHOLE&#8221;. See the discussion for a
counter-proposal.</p></li
  ></ul
><h4
> Data structures and concurrency</h4
><ul
><li
  ><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-October/096343.html"
    >Efficient mutable arrays in STM</a
    >
(25 Oct)</p>

<p>Ben Franksen has large arrays (millions of elements) with mostly small
elements (Int or Double) and largely chunk-wise access patterns. The
current implementation of <code
    >Control.Concurrent.STM.TArray</code
    > as
<code
    >Array ix (TVar e)</code
    > is not nearly efficient enough for his use case. A
more efficient implementation would be most welcome, but for now Ben
is eyeing <code
    >Data.Vector.Unboxed</code
    > from the vector package instead. The
idea is to use <code
    >unsafeIOToSTM</code
    > to provide shared transactional access
to his arrays. Ben thinks he can live with the consequences: IO code
being rerun, aborting, and inconsistent views.</p>

<p>But does the STM transaction actually &quot;see&quot; that he changed part of
the underlying array so that the transaction is retried? If not, how
does he go about manually implementing this behaviour? Antoine Latter
reports that no, <code
    >unsafeIOToSTM</code
    > is not transactional - IO actions will
be performed immediately and are not rolled back, and are then
re-performed on retry. David Barbour and Ketil Malde suggested
possible implementations: either (A) keeping an extra <code
    >TVar Int</code
    > for
every chunk in the array, or (B) cleaner and safer: create a &#8220;chunked&#8221;
<code
    >TArray</code
    > that works with fixed-width immutable chunks in a spine.</p>
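
<p>As a rough sketch of the chunked idea (not code from the thread), one could keep a spine of <code>TVar</code>s, each holding an immutable unboxed chunk, so that a transaction only touches the chunks it actually reads or writes. Everything named here (<code>ChunkedArray</code>, <code>newChunked</code>, <code>readAt</code>, <code>writeAt</code>) is hypothetical, and <code>UArray</code> from the array package stands in for the unboxed vectors Ben was considering.</p>

<pre>
import Control.Concurrent.STM
import Data.Array (Array)
import qualified Data.Array as A
import Data.Array.Unboxed (UArray)
import qualified Data.Array.Unboxed as U

-- Hypothetical chunked array: a boxed spine of TVars over immutable chunks.
data ChunkedArray = ChunkedArray
  { chunkSize :: Int
  , spine     :: Array Int (TVar (UArray Int Int))
  }

-- Build nChunks chunks of the given size, every element set to x.
newChunked :: Int -&gt; Int -&gt; Int -&gt; STM ChunkedArray
newChunked size nChunks x = do
  let chunk = U.listArray (0, size - 1) (replicate size x)
  tvs &lt;- mapM (const (newTVar chunk)) [1 .. nChunks]
  return (ChunkedArray size (A.listArray (0, nChunks - 1) tvs))

-- Reading logs a single TVar, the one holding the relevant chunk.
readAt :: ChunkedArray -&gt; Int -&gt; STM Int
readAt arr i = do
  let (c, o) = i `divMod` chunkSize arr
  chunk &lt;- readTVar (spine arr A.! c)
  return (chunk U.! o)

-- Writing replaces the whole chunk with an updated copy.
writeAt :: ChunkedArray -&gt; Int -&gt; Int -&gt; STM ()
writeAt arr i x = do
  let (c, o) = i `divMod` chunkSize arr
  modifyTVar' (spine arr A.! c) (U.// [(o, x)])

main :: IO ()
main = do
  arr &lt;- atomically (newChunked 4 3 0)
  atomically (writeAt arr 5 42)
  v &lt;- atomically (readAt arr 5)
  print v
</pre>

<p>The trade-off is that every write copies a chunk, so the chunk size would need tuning against the access pattern.</p>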

<p>Another issue that came up is that transactions scale quadratically
with the number of TVars touched. Bryan O'Sullivan and Ryan Ingram
explained that this is due to choice of data structure (a list) for
the STM transaction log, and should be easy to fix.</p></li
  ></ul
><ul
><li
  ><p><a href="https://groups.google.com/forum/#!topic/parallel-haskell/9NqyXYo1VRg"
    >High performance threadsafe mutable data structures in Haskell?</a
    >
(27 Oct)</p>

<p>Ryan Newton wanted to know if anybody else was working on threadsafe
mutable data structures in Haskell. He and the monad-par team were
planning to replace their work stealing deques with something more
efficient. If anybody else is working in the same general area,
teaming up would be great!</p>

<p>Ryan will be exploring both a pure Haskell approach and one based on
wrapping foreign data structures with the FFI. Ultimately, Ryan is aiming for
an &quot;abstract-deque&quot; parameterizable interface that abstracts over many
variants (bounded/unbounded, concurrent/non-concurrent,
single/1.5/double-ended, etc). His current prototype makes use of phantom
types and the type families extension to handle all this abstraction, with the
intended end result being that someone can create a new queue by setting all
the switches on the type (eg.  <code
    >q :: Deque NT T SingleEnd SingleEnd  Bound Safe
  Int &lt;- newQ</code
    >), but this brings up a set of Haskell language and type system
questions.  More details in the thread!</p></li
  ></ul
><ul
><li
  ><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-November/096533.html"
    >Persistent Concurrent Data Structures</a
    >
(1 Nov)</p>

<p>Like Ryan, Dmitri Kondratiev is interested in concurrent mutable data
structures, but this time with persistence to boot. His goal is to
program at a higher level of abstraction, avoiding the detail bloat
that would result from directly using some data storage API (eg.
SimpleDB). Dmitri's idea: a module tree of data structures mirroring
Data.List, Data.Map, etc but with concurrency and persistence. One
would be able to configure through the type interfaces:</p>
 
<ol>
  <li>media to persist data (file? DBMS?)</li>
  <li>caching policy</li>
  <li>concurrency configuration (optimistic/pessimistic locking?).</li>
  </ol><br/>

<p>Dmitri's post prompted some suggestions for packages to look into:</p>

<ul>
  <li><a href="http://hackage.haskell.org/package/safecopy">safecopy</a>:
    addresses
    both the issues of serializing the data and migrating it when the
    datastructure changes</li>
  <li><a href="http://hackage.haskell.org/package/acid-state">acid-state</a>:
    builds on top of safecopy to add a notion of transactions to any Haskell
    data structure</li>
  <li><a href="http://hackage.haskell.org/package/TCache">TCache</a>: a transactional
    cache with configurable persistence</li>
  <li>Haskell web server frameworks (eg. Yesod, Happstack [acid-state was
    formerly happstack-state]), as some come with persistence support</li>
  </ul><br/>

<p>Jeremy Shaw and David Barbour had reservations about what Dmitri had
in mind when he said &quot;concurrent&quot;. How would he deal with transaction
boundaries, and would a concurrently modified Data.List variant still
be a list? Evan Laforge also expressed skepticism about the viability
of abstracting over data stores with potentially very different needs.</p></li
  ></ul
><h4
> Threads, blocking</h4
><ul
><li
  ><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-October/096104.html"
    >Waiting on input with <code
      >hWaitForInput</code
      > or <code
      >threadWaitRead</code
      ></a
    >
(17 Oct)</p>

<p>Jason Dusek would like to use evented I/O for a proxying application,
in particular, to fork a thread for each new connection and then to
wait for data on either socket in this thread, writing to one or the
other socket as needed. He's found two functions which could help,
<code
    >System.IO.hWaitForInput</code
    > and <code
    >Control.Concurrent.threadWaitRead</code
    > but
each comes with some difficulties. Is there something like <code
    >select()</code
    >
that works with handles rather than file descriptors?</p>

<p>Ertugrul Soeylemez suggested an alternative approach, just plain
Concurrent Haskell because &#8220;[a] hundred Haskell threads reading from
Handles are translated to one or more OS threads using whatever
polling mechanism (select(), poll(), epoll) your operating system
supports&#8221;. He pasted a small <a href="http://hpaste.org/52742"
    >echo server</a
    > to
demonstrate the idea. It wasn't entirely clear to Jason how to apply
this to a proxy server. Jason has a
<code
    >lazyBridge ::  Handle -&gt; Handle -&gt; IO ()</code
    > which writes everything it
reads from one handle into the other and vice-versa, but it blocks and
does not allow packets to go back and forth. Gregory Collins sketched
out a possible solution: how about <code
    >forkIO</code
    >ing two threads (one for
the read end, one for the write end), with a loop over lazy I/O?
<a href="http://hpaste.org/52814"
    >This works</a
    >, but is still somewhat
surprising.</p></li
  ></ul
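><p>For the curious, the two-thread bridge idea from the proxy discussion might look something like the sketch below. This is not Gregory's paste: it swaps lazy I/O for strict <code>ByteString</code> chunks, and the names <code>pump</code> and <code>bridge</code> as well as the 4096-byte chunk size are assumptions made for illustration. The <code>main</code> here only exercises one direction by echoing standard input to standard output.</p>
<pre>
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, tryPutMVar, takeMVar)
import Control.Exception (finally)
import qualified Data.ByteString as B
import System.IO (Handle, hFlush, stdin, stdout)

-- Copy everything from one handle to the other until end of input.
pump :: Handle -&gt; Handle -&gt; IO ()
pump from to = do
  chunk &lt;- B.hGetSome from 4096
  if B.null chunk
    then return ()
    else B.hPut to chunk &gt;&gt; hFlush to &gt;&gt; pump from to

-- Hypothetical bridge: one Haskell thread per direction.
-- Returns as soon as either direction has finished (or failed).
bridge :: Handle -&gt; Handle -&gt; IO ()
bridge a b = do
  done &lt;- newEmptyMVar
  _ &lt;- forkIO (pump a b `finally` tryPutMVar done ())
  _ &lt;- forkIO (pump b a `finally` tryPutMVar done ())
  takeMVar done

-- Exercises one direction only: echo stdin to stdout.
main :: IO ()
main = pump stdin stdout
</pre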
><ul
><li
  ><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-November/096597.html"
    >System calls and Haskell threads</a
    >
(3 Nov)</p>

<p>Andreas Voellmy noticed this in Kazu Yamamoto's
<a href="http://themonadreader.files.wordpress.com/2011/10/issue19.pdf"
    >Monad Reader</a
    >
article on a high performance web server.</p>

<blockquote>When a user thread issues a system call, a context switch occurs. This
means that all Haskell user threads stop, and instead the kernel is
given the CPU time.
</blockquote>

<p>Can that be right? Andreas thought, and Johan Tibell confirms, that
when a Haskell thread is blocking a particular OS thread, other
Haskell threads can continue to run concurrently on other OS threads on
other CPUs (see
<a href="http://community.haskell.org/~simonmar/papers/conc-ffi.pdf"
    >Extending the Haskell Foreign Function Interface with Concurrency</a
    >).</p>

<p>Further clarification comes from David Barbour, who points out why
Kazu's original statement was correct in the context of the article.
While Mighttpd uses Haskell threads for concurrency, it does not go
the traditional route of using the RTS <code
    >-Nx</code
    > argument to generate OS
threads. Instead it gets its parallelism from a &quot;prefork&quot; model that
creates separate processes to balance user invocations (each process
may itself be running multiple Haskell threads). This unusual approach
is chosen to avoid issues with garbage collection.</p></li
  ></ul
><ul
><li
  ><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-December/097329.html"
    >Where threadSleep is defined?</a
    >
(6 Dec)</p>

<p>Dmitri Kondratiev was looking for a function to make the current
process (executing thread) go to sleep for a given time. Felipe
Almeida Lessa pointed to the <code
    >threadDelay</code
    > function in
Control.Concurrent.</p></li
  ></ul
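><p>A minimal usage sketch of <code>threadDelay</code>, for reference. The one thing to remember is that its argument is in microseconds, and that only the calling Haskell thread is suspended.</p>
<pre>
import Control.Concurrent (threadDelay)

main :: IO ()
main = do
  putStrLn "going to sleep"
  threadDelay 500000   -- microseconds, so this is half a second
  putStrLn "awake again"
</pre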
><h3
> Stack Overflow and Reddit</h3
><ul
><li
  ><a href="http://stackoverflow.com/questions/7704580/space-analysis-for-parfib-in-monad-par-example"
    >Space analysis for parfib in monad-par example</a
    ></li
  ><li
  ><a href="http://stackoverflow.com/questions/7769996/using-the-par-monad-with-stm-and-deterministic-io"
    >Using the Par monad with STM and Deterministic IO</a
    ></li
  ><li
  ><a href="http://stackoverflow.com/questions/7862372/haskell-thread-blocked-indefinitely-in-an-stm-transaction"
    >Haskell: thread blocked indefinitely in an STM transaction</a
    ></li
  ><li
  ><a href="http://stackoverflow.com/questions/8056880/how-to-install-haskell-parallel-on-mac"
    >How to install haskell Parallel on mac?</a
    ></li
  ><li
  ><a href="http://stackoverflow.com/questions/8155929/mutable-possibly-parallel-haskell-code-and-performance-tuning"
    >Mutable, (possibly parallel) Haskell code and performance tuning</a
    ></li
  ><li
  ><a href="http://stackoverflow.com/questions/8272241/why-does-my-concurrent-haskell-program-terminate-prematurely"
    >Why does my concurrent Haskell program terminate prematurely?</a
    ></li
  ><li
  ><a href="http://stackoverflow.com/questions/8480087/parallel-matrix-multiplication"
    >Parallel Matrix Multiplication</a
    ></li
  ></ul
><ul
><li
  ><a href="http://www.reddit.com/r/haskell/comments/ldu36/yo_rhaskell_i_heard_you_like_monads/"
    >Yo /r/haskell, I heard you like monads... : haskell</a
    ></li
  ><li
  ><a href="http://www.reddit.com/r/haskell/comments/n2mws/ghc_commit_allow_the_number_of_capabilities_to_be/"
    >GHC commit: Allow the number of capabilities to be increased at runtime : haskell</a
    ></li
  ><li
  ><a href="http://www.reddit.com/r/haskell/comments/n7iv1/haswell_processor_hardware_transactional_memory/"
    >Haswell processor (hardware transactional memory) : haskell</a
    ></li
  ></ul
><h3
> Help and Feedback</h3
><p
>If you'd like to make an announcement in the next Haskell Parallel
Digest, then get in touch with me, Eric Kow, at
<a href="mailto:parallel@well-typed.com"
  ><code
    >parallel@well-typed.com</code
    ></a
  >. Please
feel free to leave any comments and feedback!</p
><p
><a href="http://www.flickr.com/photos/banlon1964/6337069654/"
  >Bikeshed image</a
  > by
banlon1964 available under a CC-NC-ND-2.0 license.
</p
>]]></content>
</entry>
<entry>
    <title type="text">Tutorial on Parallel Programming in Haskell</title>
    <published>2011-10-07T12:42:38Z</published>
    <updated>2011-10-07T12:42:38Z</updated>
    <link rel="alternate" type="application/atom+xml" href="http://www.well-typed.com/blog/61" />
    <summary type="text"><![CDATA[Today, I presented a tutorial at the Haskell in Leipzig meeting in Germany. The topic was parallel programming in Haskell, mainly using strategies and the parallel package. The slides and the code that I have used are available for download. Feedback is welcome!]]></summary>
    <author><name>andres</name></author>
    <category term="parallel" />
    <id>http://www.well-typed.com/blog/61</id>
    <content type="html"><![CDATA[<p
>Today, I presented a tutorial at the
<a href="http://portal.imn.htwk-leipzig.de/events/hal6-haskell-workshop"
  >Haskell in Leipzig</a
  >
meeting in Germany.
The topic was parallel programming in Haskell, mainly using strategies
and the <code
  >parallel</code
  > package.</p
><p
>The <a href="http://www.well-typed.com/Hal6/"
  >slides and the code that I have used</a
  >
are available for download. Feedback is welcome!</p
>]]></content>
</entry>
<entry>
    <title type="text">Parallel Haskell Digest 6</title>
    <published>2011-10-06T15:18:54Z</published>
    <updated>2011-10-06T15:18:54Z</updated>
    <link rel="alternate" type="application/atom+xml" href="http://www.well-typed.com/blog/60" />
    <summary type="text"><![CDATA[This edition of the Parallel Haskell Digest kicks off with some warm feelings from BioHaskeller Ketil Malde Blessing of the day goes to the authors of threadscope. In fact, let me look it up: Donnie Jones, Simon Marlow, and Satnam Singh. I had an old STM experiment lying around, but couldn't quite make [...]]]></summary>
    <author><name>eric </name></author>
    <category term="parallel" />
    <category term="ph-digest" />
    <id>http://www.well-typed.com/blog/60</id>
    <content type="html"><![CDATA[<p
>This edition of the Parallel Haskell Digest kicks off with some
<a href="https://plus.google.com/108473609018577106562/posts/QHYorSiZr89"
  >warm feelings</a
  >
from BioHaskeller Ketil Malde:</p
><blockquote
><p
  >Blessing of the day goes to the authors of threadscope. In fact, let me
look it up: Donnie Jones, Simon Marlow, and Satnam Singh. I had an old
STM experiment lying around, but couldn't quite make it go faster than
the single threaded version.</p
  ><p
  >Today I installed threadscope, and a glance at the output sufficed to
identify the problem, and now at least it's faster. Great tool! (But you
all knew that, of course)</p
  ></blockquote
><p
>Thanks indeed to the original authors of ThreadScope! Ketil made good
use of the original ThreadScope. Since the original release, the folks
at Well-Typed have taken over development of ThreadScope (Andres,
Duncan, and Mikolaj) and are working to extend it so that it becomes
even easier to squash those pesky performance bugs. These extensions,
much like this digest, are made possible by the
<a href="http://www.haskell.org/haskellwiki/Parallel_GHC_Project"
  >Parallel GHC Project</a
  >.
We hope you'll love the results.</p
><h3
> News</h3
><p
><a href="http://justtesting.org/video-and-slides-of-data-parallelism-in-haske"
  >Data Parallelism in Haskell</a
  >
slides and video: Manuel Chakravarty came all the way from Sydney (that's 931 km!) to
tell the Brisbane FP Group about work going on at UNSW on data
parallel programming in Haskell. The talk covers a lot of ground. It's
an hour and a half long and</p
><p
><blockquote>
[It] motivates the use of functional programming for parallel, and in
particular, data parallel programming and explains the difference
between regular and irregular (or nested) data parallelism. It also
discusses the Repa library (regular data parallelism for multicore
CPUs), the embedded language Accelerate (regular data parallelism for
GPUs), and Data Parallel Haskell (nested data parallelism for
multicore CPUs).
</blockquote>
 
Both video and slides are available, so check it out!</p
><h3
> Word of the Month</h3
><p
>The word of the month is <em
  >dataflow</em
  > as in dataflow parallelism. (And if
you've just seen Manuel's talk, not to be confused with data
parallelism). With dataflow, we will be wrapping up our now four-part
series on parallel programming in Haskell. In recent words of the month,
we've been looking at various approaches for parallelising your way to
faster code. We started with the parallel arrays library Repa, then
moved on to the Haskell <code
  >par</code
  > and <code
  >pseq</code
  > primitives, and built from
there to get Strategies. Now we'll be looking at <code
  >monad-par</code
  >, a library
for <em
  >dataflow</em
  > parallelism.</p
><p
>First, the bigger picture: why do we have another way of doing parallel
programming in Haskell when we already have parallel arrays and
Strategies? Part of the answer is that parallelism is still a research topic,
with new ideas coming out from time to time and some old ideas slowly
withering away. For another part of the answer, it may help
to think in terms of a trade-off between implicitness and performance,
in other words, between Easy and Fast.</p
><p
><img src="http://www.well-typed.com/blog/aux/images/ph-digest/parallelism.png" width="457" height="107" alt="" title="Degrees of implicitness"
   /></p
><p
>Not all problems are subject to the trade-off in the same way, and
it's not always clear to what extent they are.
As a rough sketch, problems that require
applying the same operation on a large number of simple objects can get
fast performance from a parallel arrays library. This is a form of data
parallelism. If the problem does not lend itself to data parallelism,
the next port of call is likely Strategies. If Strategies are not
adequate for the desired speedups, the problem may require an even more
explicit approach.</p
><p
>Until recently, this meant &#8220;rolling your own&#8221; parallelism by using
Concurrent Haskell: forking off threads, assigning tasks to them, and
communicating finished work between them. Using concurrency for DIY
parallelism is risky. It means venturing into IO, giving up determinism,
exposing ourselves to all manner of side-effects, not to mention subtle
concurrency bugs like deadlock. As Haskellers, we should demand better:
safer, more predictable and easier to reason about. We want to have
explicit fine-grained control without all the nasty side-effects. And
now we can have it!</p
><p
>The latest addition to the parallel Haskell quiver is the monad-par
library. Programming in monad-par looks a lot like programming in
Concurrent Haskell:</p
><pre
>data Par a
instance Monad Par
 
runPar :: Par a -&gt; a     
fork :: Par () -&gt; Par ()

data IVar a
 
new :: Par (IVar a)
put :: NFData a =&gt; IVar a -&gt; a -&gt; Par ()
get :: IVar a -&gt; Par a
</pre
><p
>Instead of using <code
  >forkIO</code
  >, we use <code
  >fork</code
  > to create threads (not sparks!)
and instead of using <code
  >MVar</code
  >'s to communicate, we use <code
  >IVar</code
  >'s. <code
  >MVar</code
  > and
<code
  >IVar</code
  > behave in somewhat similar ways, in that any attempt to read from
an <code
  >IVar</code
  > will wait until it has been filled. But this is also where
differences emerge. Unlike their concurrent counterparts, <code
  >IVar</code
  >'s may only
be written to once; subsequent writes result in an error. This aligns
with our desire for determinism. We never have to worry about the <code
  >IVar</code
  >
changing values over time. Moreover, the <code
  >Par</code
  > monad does not allow for
any <code
  >IO</code
  > (feature! not a bug) and computations in the <code
  >Par</code
  >
monad can always be extracted with a simple <code
  >runPar</code
  >.</p
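><p>As a small taste of this API, here is a sketch (the function name <code>sumSquaresAndCubes</code> and the toy workload are made up for illustration) that forks two computations, each filling its own <code>IVar</code> exactly once, and then combines the results.</p>
<pre>
import Control.Monad.Par (runPar, fork, new, put, get)

-- Two independent sums, each computed in its own forked Par thread.
sumSquaresAndCubes :: Int -&gt; Int
sumSquaresAndCubes n = runPar $ do
  vSquares &lt;- new
  vCubes   &lt;- new
  fork $ put vSquares (sum [x * x | x &lt;- [1 .. n]])
  fork $ put vCubes   (sum [x * x * x | x &lt;- [1 .. n]])
  a &lt;- get vSquares   -- waits until the first fork has called put
  b &lt;- get vCubes     -- likewise for the second
  return (a + b)

main :: IO ()
main = print (sumSquaresAndCubes 10)   -- 385 + 3025 = 3410
</pre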
><p
>But this doesn't really tell us all that much about how to use this
library. For now we come back to our word <em
  >dataflow</em
  >. The dataflow model
treats a program as a little black box: an input goes in, an output comes
out, and what happens in between is immaterial:</p
><p
><img src="http://www.well-typed.com/blog/aux/images/ph-digest/dataflow-basic.png" width="392" height="59" alt="" title="Degrees of implicitness"
   /></p
><p
>The program is represented as a directed graph (a <em
  >dataflow network</em
  >),
with each node representing an individual operation (black boxes in
their own right), and connections between the nodes expressing
dependencies. In the graph below, <code
  >g</code
  > and <code
  >h</code
  > depend on the output of
<code
  >f</code
  > (the same output - there is only one) and in turn, <code
  >j</code
  > depends on
the output of both <code
  >g</code
  > and <code
  >h</code
  >.</p
><p
><img src="http://www.well-typed.com/blog/aux/images/ph-digest/dataflow-network.png" width="605" height="131" alt="" title="Dataflow network"
   /></p
><p
>Dataflow networks are handy thinking tools because one of the things
that makes it hard to get fast parallel code is the inherent
sequentiality that dependencies introduce. A dataflow network nicely
captures which parts of the program are parallel (<code
  >g</code
  > and <code
  >h</code
  > are
independent of each other) and which parts are sequential (<code
  >g</code
  > depends
on <code
  >f</code
  >). This is a sort of parallelism topline; we can't go any faster
than what this network allows, but at the very least we should take the
shape of the network into account so we don't make matters worse and
find ourselves waiting a lot more on dependencies than is strictly
necessary.</p
><p
>Using the <code
  >Par</code
  > monad essentially consists in translating dataflow
networks into code.</p
><p
>Suppose we have functions</p
><pre
>f :: In -&gt; A
g :: A  -&gt; B
h :: A  -&gt; C
j :: B  -&gt; C -&gt; Out
</pre
><p
>For the graph above, we might say:</p
><pre
>network :: IVar In -&gt; Par Out
network inp = do
 [vf,vg,vh] &lt;- sequence [new,new,new]
 
 fork $ do x &lt;- get inp
           put vf (f x)
  
 fork $ do x &lt;- get vf
           put vg (g x)
 
 fork $ do x &lt;- get vf
           put vh (h x)
 
 x &lt;- get vg
 y &lt;- get vh
 return (j x y)
</pre
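><p>To actually run the network, it needs concrete types and functions. The version below is a self-contained sketch in which <code>In</code>, <code>A</code>, <code>B</code>, <code>C</code> and <code>Out</code> are all just <code>Int</code>, and <code>f</code>, <code>g</code>, <code>h</code> and <code>j</code> are trivial arithmetic stand-ins chosen purely for illustration.</p>
<pre>
import Control.Monad.Par

-- Stand-in types, purely for illustration.
type In  = Int
type A   = Int
type B   = Int
type C   = Int
type Out = Int

-- Stand-in node functions.
f :: In -&gt; A
f = (+ 1)

g :: A -&gt; B
g = (* 2)

h :: A -&gt; C
h = (* 3)

j :: B -&gt; C -&gt; Out
j = (+)

-- Same shape as the network in the text.
network :: IVar In -&gt; Par Out
network inp = do
  [vf, vg, vh] &lt;- sequence [new, new, new]
  fork $ do x &lt;- get inp
            put vf (f x)
  fork $ do x &lt;- get vf
            put vg (g x)
  fork $ do x &lt;- get vf
            put vh (h x)
  x &lt;- get vg
  y &lt;- get vh
  return (j x y)

main :: IO ()
main = print $ runPar $ do
  inp &lt;- new
  put inp 4
  network inp    -- f 4 = 5, g 5 = 10, h 5 = 15, j 10 15 = 25
</pre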
><p
>It's useful to note that here dataflow graphs need not be static; they
may be dynamically generated to fit the shape of the input. To see this
and also a more realistic taste of monad-par, check out Simon Marlow's
<a href="http://community.haskell.org/~simonmar/par-tutorial.pdf"
  >Parallel and Concurrent Programming in Haskell</a
  >
tutorial. It shows how a parallel type inferencer might be implemented, its
dataflow graph arising from the set of bindings it receives as input.</p
><p
>Finally, what about Strategies? How do we choose between the two? Both
are IO-free and deterministic, both require some but not a whole lot of
comfort with monads. <code
  >Par</code
  > may be easier to use correctly than
Strategies because it is strict and forces you to explicitly delineate
the dependencies in parallel computations. On the other hand, this also
makes it more verbose. Aside from being more verbose, <code
  >Par</code
  > can be a
little bit &#8220;messier&#8221; than Strategies because you don't get the same
separation between algorithm and parallelism. And since we're ultimately
concerned with performance, it's worth knowing that the <code
  >Par</code
  > monad
currently involves more overhead than the <code
  >Eval</code
  > monad from Strategies.
The latter may be more appropriate at finer granularities.</p
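><p>For comparison, here is roughly what a two-way split looks like in the <code>Eval</code> monad. This is only a sketch: <code>fib</code> and <code>pairFib</code> are made-up names, and the workload is far too small to show a real speedup.</p>
<pre>
import Control.Parallel.Strategies (runEval, rpar, rseq)

-- Naive Fibonacci as a stand-in for real work.
fib :: Int -&gt; Integer
fib n = if n &lt; 2 then toInteger n else fib (n - 1) + fib (n - 2)

pairFib :: Int -&gt; Int -&gt; (Integer, Integer)
pairFib m n = runEval $ do
  a &lt;- rpar (fib m)   -- create a spark; it may be evaluated in parallel
  b &lt;- rseq (fib n)   -- evaluate this one in the current thread
  _ &lt;- rseq a         -- make sure the sparked value is done before returning
  return (a, b)

main :: IO ()
main = print (pairFib 20 21)   -- (6765,10946)
</pre
><p>Note how the dependencies here are implicit in laziness, whereas the <code>Par</code> version has to spell them out with <code>IVar</code>'s.</p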
><p
>Perhaps one way to go about this is to start off with Strategies as they
have been around longer and may require less invasive modifications to
your code. But if you find Strategies to be difficult, or you just can't
get the performance you want, don't reach for those <code
  >forkIO</code
  >'s yet. Give
<code
  >Par</code
  > a try.</p
><p
>For more about monad-par, do visit
<a href="http://community.haskell.org/~simonmar/par-tutorial.pdf"
  >Simon's tutorial</a
  > and
the <a href="http://hackage.haskell.org/package/monad-par"
  >API on hackage</a
  >.  To dig
deeper, you may also find it instructive to read
<a href="http://research.microsoft.com/en-us/um/people/simonpj/papers/parallel/monad-par.pdf"
  >A Monad for Deterministic Parallelism</a
  > (Marlow et al) presented at this year's
Haskell Symposium.  Related to this are his CUFP
<a href="http://community.haskell.org/~simonmar/slides/CUFP.pdf"
  >monad-par tutorial slides</a
  >.</p
><p
>This concludes our tour of current approaches to Haskell parallelism. There is
some new technology on the horizon which we may cover in a later digest,
particularly Data Parallel Haskell and nested data parallelism (<em
  >data</em
  >
parallelism not to be confused with <em
  >dataflow</em
  > parallelism), but for next
month, let's take a little break from parallelism and look at something a
little different.</p
><h3
> Parallel GHC Project Update</h3
><p
>The Parallel GHC Project is an MSR-funded project, run by Well-Typed,
with the aim of demonstrating that parallel Haskell can be employed
successfully in real world projects.</p
><p
>This has been a month for releasing software. We've recently put on
Hackage updates to ThreadScope and some helper libraries:</p
><ul
><li
  >ThreadScope (0.2.0), with a new spark profiling visualisation,
bookmarks, and other enhancements</li
  ><li
  >gtk2hs (0.12.1), adding GHC 7.2 compatibility,</li
  ><li
  >ghc-events (0.3.0.1), adding support for spark events (needs GHC 7.3),</li
  ></ul
><p
>Duncan Coutts presented <a href="http://haskell.org/haskellwiki/HaskellImplementorsWorkshop/2011/Coutts"
  >our recent work on ThreadScope</a
  > at the
Haskell Implementors Workshop in Tokyo (23 Sep). He talked about the new
spark visualisation feature which shows a graphical representation of
spark creation and conversion statistics with a demo inspired by a
program we are working on for Parallel GHC.</p
><p
>Meanwhile, we have also completed a pure Haskell implementation of the
&#8220;Modified Additive Lagged Fibonacci&#8221; random number generator. This
generator is attractive for use in Monte Carlo simulations because it is
splittable and has good statistical quality, while providing high
performance. The LFG implementation will be released on Hackage when it
has undergone more extensive quality testing.</p
><h3
> Blogs, Papers, and Packages</h3
><ul
><li
  ><p><a href="http://ugcs.net/~keegan/talks/first-class-concurrency/talk.pdf"
    >First Class Concurrency in Haskell</a
    >
(PDF) (31 Aug)</p>

<p>Keegan McAllister posted slides from a talk he gave last year at the
Boston Area Haskell Users' Group. His slides give an overview of
topics in concurrent imperative programming:</p>

<ul>
	  <li>Using closures with concurrency primitives to improve your API</li>
  	  <li>Using MVar to retrieve thread results</li>
  	  <li>How System.Timeout works</li>
  	  <li>Implementing IORef and Chan in terms of other primitives</li>
	  <li>Implementing the π-calculus, and compiling the λ-calculus to it</li>
   </ul>
 
<p>You might find Keegan's slides to be a useful introduction to Haskell
Concurrency.</p></li
  ></ul
><ul
><li
  ><p><a href="http://mainisusuallyafunction.blogspot.com/2011/09/lambda-to-pi.html"
    >Lambda to pi</a
    >
(9 Sep)</p>

<p>&#8220;If the &#955;-calculus is a minimal functional language, then the
&#960;-calculus is a minimal concurrent language&#8221;. Following up on his
recent slides, Keegan wrote a post elaborating on his
pi-calculus interpreter and lambda-calculus to pi-calculus compiler.
If you want to run his blog post, just pass the
<a href="http://ugcs.net/~keegan/code/pi-calc.lhs"
    >lhs file</a
    > to GHCi.</p>

<p>(note that I had to use <code
    >-hide-package=monads-tf</code
    >)</p></li
  ></ul
><ul
><li
  ><p><a href="http://mortenlysgaard.com/?p=35"
    >Sirkle</a
    > (17 Sep)</p>

<p>Morten Olsen Lysgaard released Sirkle. It provides an implementation
of the Chord
<a href="http://en.wikipedia.org/wiki/Distributed_hash_table"
    >Distributed Hash Table</a
    >,
and a simple storage layer with fault tolerance and replication
similar to DHash. The Sirkle DHT implementation uses Cloud Haskell,
which we saw some interest about in the
<a href="http://www.well-typed.com/blog/58"
    >last edition</a
    > of this digest.
Sirkle is available on hackage and GitHub. &#8220;If you think distributed
computing is fun&#8221;, Morten invites us, &#8220;and would love a hobby project,
join me in discovering what can be done with a DHT in Haskell!&#8221;</p></li
  ></ul
><ul
><li
  ><p><a href="http://hackage.haskell.org/package/threadscope"
    >ThreadScope</a
    > (5 Sep)</p>

<p>Duncan Coutts released ThreadScope 0.2.0, the performance visualiser
that we saw Ketil use to make his STM experiment faster. The updated
ThreadScope has a new spark profiling visualisation, bookmarks, and
other enhancements. Now if only we had some sort of user
documentation&#8230;</p></li
  ></ul
><h3
> Mailing list discussions</h3
><ul
><li
  ><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-August/094814.html"
    >Performance of concurrent array access</a
    >
(23 Aug)</p>

<p>As a first step in his foray into writing concurrent parallel Haskell
programs, Andreas Voellmy wrote a bit of code that reads and
writes concurrently to an IOArray. Unfortunately, the performance with
multiple cores was not very good and using ThreadScope did not reveal
anything obvious. What did help on the other hand was to switch to
IOUArray, which puzzled Andreas. Why would this improve scalability to
multiple threads and cores? Andrew Coppin and Brandon Allbery
suggested it may have something to do with strictness. Johan Tibell
pointed out some places where Andreas would likely want to add a bit
of strictness to his original IOArray version. But this only seems to
have helped slightly. Anybody else have an idea?</p></li
  ></ul
><ul
><li
  ><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-August/094863.html"
    >(Repa) hFlush: illegal operation</a
    >
(24 Aug)</p>

<p>Michael Orlitzky is using Repa to process MRI data, but when writing
the results to file, he gets the error message
<code
    >spline3:  output.txt: hFlush: illegal operation (handle is closed)</code
    >
Trial and error is a bit hard to do here because it takes 45 minutes
to get to the point of failure. Ben Lippmeier offered to have a look
if Michael could provide some data to go with the code he posted. Ben
also pointed out that the implementation of Repa's text IO functions
is naive, and may have problems with massive files. If the data is in
some standard binary form, it may be worthwhile to write a Repa loader
for it.</p></li
  ></ul
><ul
><li
  ><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-August/094639.html"
    >how to read CPU time vs wall time report from GHC?</a
    >
(14 Aug)</p>

<p>Wishnu Prasetya is just getting started with Parallel Haskell. He's
having some trouble interpreting runtime statistics from the RTS (eg.
<code
    >8.97s  (  2.36s elapsed)</code
    >).</p>

<pre>
SPARKS: 5 (5 converted, 0 pruned)
   
INIT  time    0.02s  (  0.01s elapsed)
MUT   time    3.46s  (  0.89s elapsed)
GC    time    5.49s  (  1.46s elapsed)
EXIT  time    0.00s  (  0.00s elapsed)
Total time    8.97s  (  2.36s elapsed)
</pre>
   
<p>Why does the CPU time on the left exceed the wall clock time on the
right? Iustin Pop pointed out that CPU time accumulates over the
multiple CPUs. The roughly 4:1 ratio suggests Wish must be making 95%
use of 4 CPUs (or partial use of more CPUs). This may sound efficient
except that at the end of the day what counts is the difference in wall
clock times. Iustin points out that the MUT/GC times show Wish
spending 60% of his time in garbage collection, perhaps a sign of a
space leak or an otherwise inefficient algorithm.</p></li
  ></ul
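><p>The arithmetic behind those figures is easy to redo by hand. The snippet below simply plugs in the numbers from the RTS report above; the choice of 4 CPUs is the assumption from the thread, not something the report states.</p>
<pre>
main :: IO ()
main = do
  let cpuTotal  = 8.97 :: Double   -- "Total time", CPU seconds
      wallTotal = 2.36             -- "Total time", elapsed seconds
      gcCpu     = 5.49             -- "GC time", CPU seconds
  print (cpuTotal / wallTotal)       -- roughly 3.8 CPUs busy on average
  print (cpuTotal / wallTotal / 4)   -- roughly 0.95, i.e. 95% of 4 CPUs
  print (gcCpu / cpuTotal)           -- roughly 0.61, the GC share of CPU time
</pre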
><ul
><li
  ><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-September/095289.html"
    >Can't link OpenCL on Windows</a
    >
(12 Sep)</p>

<p>OpenCL (Open Computing Language) is a framework for writing parallel
programs that run on GPUs (more generally, on heterogeneous systems
with perhaps a mix of CPUs and other processors). Jason Dagit is
trying to get the OpenCLRaw bindings to work on Windows. At link time,
however, he gets undefined symbol errors for everything he uses in the
OpenCL API. Mystery solved: Jason later reported that the binding was
set to use ccall whereas OpenCL uses stdcall.</p></li
  ></ul
><ul
><li
  ><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-September/095344.html"
    >Turn GC off</a
    >
(14 Sep) and
<a href="http://www.haskell.org/pipermail/haskell-cafe/2011-September/095616.html"
    >Build local-gc branch of GHC</a
    >
(26 Sep)</p>
 
<p>Andreas Voellmy (who we saw above with IOArray woes) is writing
a server handling multiple long-lived TCP connections with a bit of
shared state among connections. Ideally, it'd be a case of more cores = more
clients, but the current stop-the-world garbage collection in  GHC makes this
expensive. His first thought was to completely disable garbage collection,
and just run until he is out of memory. Perhaps  there is an alternative.</p>

<p>David Barbour commented that controlling the amount of live memory is
important here, much more than avoiding allocations. Austin Seipp
pointed us to recent work on the
<a href="https://github.com/ghc/ghc/tree/local-gc"
    >local-gc branch of GHC</a
    > (see
the paper
<a href="http://research.microsoft.com/~simonpj/papers/parallel/local-gc.pdf"
    >Multicore Garbage Collection with Local Heaps</a
    >).
This work has not been merged into GHC though (as it's not clear if
the improvements are worth the complexity increase).</p>

<p>Andreas decided to give it a shot. Unfortunately, he can't get it to
build. Help?</p></li
  ></ul
><ul
><li
  ><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-September/095589.html"
    >A Missing Issue on Second Generation Strategies</a
    >
(24 Sep)</p>

<p>Burak Ekici is trying to parallelize RSA encryption and decryption
with Strategies, but all his sparks keep getting pruned! Antoine
Latter noticed that Burak's strategy wasn't actually doing anything to
its input; it simply sparks off computations (themselves evaluated
with <code
    >rdeepseq</code
    >) that nobody ever looks at. Daniel Fischer made a
similar point, providing some improved code with a refactor and clean
up of repeated code along the way.</p></li
  ></ul
><ul
><li
  ><p><a href="https://groups.google.com/d/topic/parallel-haskell/dhMkpgdswt0/discussion"
    >Parallel Haskell Digest 5 followup</a
    ></p>

<p>Following up on the last word of the month (Strategies), Kevin Hammond
observed that the first <code
    >parMap</code
    > example interweaves behavioural and
functional code, defeating the point of strategies. The issue here is
pedagogical: although I later follow up with the version using
<code
    >parList rseq</code
    >, I run the risk of people simply copying and pasting while
missing out on the benefits. Thanks for the feedback, Kevin! I'll see
if I can avoid the try-the-wrong-way-first style in the future.</p>

<p>As a more general note, the feedback people give, particularly criticism
and suggestions for improvement, is very valuable, so keep it coming,
folks!</p></li
  ></ul
><h3
> Stack Overflow and Reddit</h3
><ul
><li
  ><a href="http://stackoverflow.com/questions/7194826/concurrent-reading-and-writing-to-ioarray-in-haskell"
    >Concurrent reading and writing to IOArray in Haskell</a
    ></li
  ><li
  ><a href="http://stackoverflow.com/questions/7216149/whats-the-best-way-to-write-some-semaphore-like-code-in-haskell"
    >What's the best way to write some semaphore-like code in Haskell?</a
    ></li
  ><li
  ><a href="http://stackoverflow.com/questions/7256860/help-understanding-mvar-example-in-haskell"
    >Help understanding MVar example in Haskell</a
    ></li
  ><li
  ><a href="http://stackoverflow.com/questions/7305625/how-to-exit-from-hackstack-server-app"
    >How to exit from Hackstack Server App?</a
    ></li
  ></ul
><ul
><li
  ><a href="http://www.reddit.com/r/haskell/comments/k764g/the_most_useful_error_message_evar"
    >The most useful error message evar! : haskell</a
    ></li
  ><li
  ><a href="http://www.reddit.com/r/haskell/comments/jperf/new_ibm_cpu_has_transactional_memory_built_in/"
    >New IBM CPU has transactional memory built in : haskell</a
    ></li
  ></ul
><h3
> Help and Feedback</h3
><p
>If you'd like to make an announcement in the next Parallel Haskell
Digest, then get in touch with me, Eric Kow, at
<a href="mailto:parallel@well-typed.com"
  ><code
    >parallel@well-typed.com</code
    ></a
  >. Please
feel free to leave any comments and feedback!
</p
>]]></content>
</entry>
<entry>
    <title type="text">Upcoming Events</title>
    <published>2011-09-10T19:51:15Z</published>
    <updated>2011-09-10T19:51:15Z</updated>
    <link rel="alternate" type="application/atom+xml" href="http://www.well-typed.com/blog/59" />
    <summary type="text"><![CDATA[Here's two upcoming events that Well-Typed will be involved in: Hal6 Haskell-Workshop, 7 October 2011, Leipzig, Germany This is a one-day workshop with various talks and tutorials. I will be giving a tutorial (90 minutes) on parallel programming in Haskell. In addition, Kevin Hammond will be talking [...]]]></summary>
    <author><name>andres</name></author>
    <category term="well-typed" />
    <id>http://www.well-typed.com/blog/59</id>
    <content type="html"><![CDATA[<p
>Here are two upcoming events that Well-Typed will be involved in:</p
><h3
> Hal6 Haskell-Workshop, 7 October 2011, Leipzig, Germany</h3
><p
>This is a one-day workshop with various talks and tutorials. I will be giving a
tutorial (90 minutes) on parallel programming in Haskell.  In addition, Kevin
Hammond will be talking about parallel refactorings.</p
><p
>Most of the talks during the day will be in German. So if you speak German, you
might want to consider coming to Leipzig!</p
><p
><a href="http://portal.imn.htwk-leipzig.de/events/hal6-haskell-workshop/"
  >(More info)</a
  ></p
><h3
> FP Day, 14 October 2011, Cambridge, UK</h3
><p
>This is a one-day training event for Haskell, F#, and Clojure, aimed mainly at
participants from industry. Simon Peyton Jones and Don Syme are keynote
speakers. Well-Typed is running a hands-on introduction to Haskell (180
minutes). There are a limited number of tickets available, and registration is
open. It would be great to see you there!</p
><p
><a href="http://www.fpday.net/"
  >(More info)</a
  >
</p
>]]></content>
</entry>
<entry>
    <title type="text">Parallel Haskell Digest 5</title>
    <published>2011-08-31T11:04:42Z</published>
    <updated>2011-08-31T11:04:42Z</updated>
    <link rel="alternate" type="application/atom+xml" href="http://www.well-typed.com/blog/58" />
    <summary type="text"><![CDATA[Hello Haskellers! Eric here, reprising my role as Parallel Haskell Digester. Many thanks to Nick for good stewardship of the digest and nice new directions for the future. This month we have Erlang PhDs (and a postdoc), new partners, two Monad Reader articles in progress and some strategies. As usual, [...]]]></summary>
    <author><name>eric</name></author>
    <category term="parallel" />
    <category term="ph-digest" />
    <id>http://www.well-typed.com/blog/58</id>
    <content type="html"><![CDATA[<p
>Hello Haskellers!</p
><p
>Eric here, reprising my role as Parallel Haskell Digester. Many thanks
to Nick for good stewardship of the digest and nice new directions for
the future. This month we have Erlang PhDs (and a postdoc), new
partners, two Monad Reader articles in progress and some strategies. As
usual, this digest is made possible by the
<a href="http://www.haskell.org/haskellwiki/Parallel_GHC_Project"
  >Parallel GHC Project</a
  >.</p
><h3
> News</h3
><p
><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-August/094484.html"
  >Scalable Erlang PostDoc and 2 PhD Studentships</a
  >
(8 Aug)</p
><p
>What, Erlang?! Yes, you are reading the right digest. If you're
generally into parallelism and concurrency in functional programming
languages, you may be especially interested to know about this
announcement from Phil Trinder.
   
<blockquote>RELEASE project - A High-level Paradigm for Reliable
Large-scale Server Software - is funded by the EU Framework 7 programme for
36 months from October 2011. Its aim is to scale the radical
concurrency-oriented programming paradigm to build reliable
general-purpose software, such as server-based systems, on massively
parallel machines (100 000 cores).</blockquote></p
><h3
> Word of the Month</h3
><p
>Last month, we had <code
  >par</code
  > and <code
  >pseq</code
  > as our twin words of the month.
Let's pursue this little train of parallel thought; our next word of the
month is <em
  >strategy</em
  >. Strategies have been around since the 1993 paper
<a href="http://www.macs.hw.ac.uk/~dsg/gph/papers/abstracts/strategies.html"
  >Algorithm + Strategy = Parallelism</a
  >;
that's before we even started using monads in Haskell! They've recently
been revamped in Simon Marlow's
<a href="http://research.microsoft.com/apps/pubs/default.aspx?id=138042"
  >Seq no More</a
  >
paper, and it's this version of strategies that we'll be exploring here.</p
><p
>Strategies are built on top of the <code
  >par</code
  > and <code
  >pseq</code
  > primitives we saw in
the <a href="http://www.well-typed.com/blog/56"
  >last digest</a
  >. They provide a nice
way to express the often complicated logic we need to make the best use
of parallelism in our code. Use of strategies can also help to make
parallel code easier to read and maintain because they allow us to more
cleanly separate the core logic of our code from that which pertains to our
use of parallelism.</p
><p
>Before delving into strategies, let's take a small notational detour by
introducing the <code
  >Eval</code
  > monad. Suppose we wanted a parallel version of
the <code
  >map</code
  > function, something that would apply a function to each item
of a list. Using the <code
  >par</code
  > and <code
  >pseq</code
  > from the last digest, we might
express this function as follows:</p
><pre
>parMap :: (a -&gt; b) -&gt; [a] -&gt; [b]
parMap f [] = []
parMap f (a:as) = b `par` bs `pseq` (b:bs)
 where
  b  = f a
  bs = parMap f as
</pre
><p
>If we look carefully at the code we can observe that there is something
inherently sequential in the way we have expressed this parallel
computation: first spark off <code
  >f a</code
  > then recurse to the tail of the list,
and finally cons.</p
><p
>The <code
  >Eval</code
  > monad builds off the insight that expressing parallelism is
fundamentally (perhaps counterintuitively) about ordering things. Monads
are well-suited for expressing ordering relationships, and so they have
been pressed to work for expressing parallel computation as well.</p
><pre
>data Eval a
instance Monad Eval

runEval :: Eval a -&gt; a
rpar :: a -&gt; Eval a
rseq :: a -&gt; Eval a
</pre
><p
><code
  >Eval</code
  > is just a strict identity monad, with <code
  >rpar</code
  > and <code
  >rseq</code
  > as
counterparts to <code
  >par</code
  > and <code
  >pseq</code
  >. We use <code
  >Eval</code
  > to compose sequences of
parallel computation, whose results we extract by using the
<code
  >runEval</code
  > function. If you're not familiar with monads, you can get away
with just treating <code
  >Eval</code
  > as new notation, or rather, borrowed notation,
the same that we use for IO, parser combinator libraries, QuickCheck and a
plethora of other useful monads. It's worth noting also that despite
appearances, we are still in purely functional territory &#8212; no IO here!
&#8212; with the notion of sequencing being limited to controlling
parallelism and evaluation depth.</p
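><p
>To give a feel for the notation, here is a minimal sketch of an <code>Eval</code>
computation in action. The <code>fib</code> and <code>fibPair</code> names are
made up for this illustration; only <code>runEval</code>, <code>rpar</code> and
<code>rseq</code> come from Control.Parallel.Strategies (in the parallel
package).</p
><pre
>import Control.Parallel.Strategies (runEval, rpar, rseq)

-- Deliberately naive Fibonacci, used only to generate some work.
fib :: Int -&gt; Integer
fib n
  | n &lt; 2     = toInteger n
  | otherwise = fib (n - 1) + fib (n - 2)

-- Hypothetical helper: spark the first computation, evaluate the
-- second on the current thread, then wait for the first to finish.
fibPair :: Int -&gt; Int -&gt; (Integer, Integer)
fibPair m n = runEval $ do
  a &lt;- rpar (fib m)
  b &lt;- rseq (fib n)
  _ &lt;- rseq a
  return (a, b)

main :: IO ()
main = print (fibPair 25 26)
</pre
><p
>To actually see more than one core at work, you would compile with
<code>-threaded</code> and run with <code>+RTS -N</code>; without those the
program still gives the same answer, just sequentially.</p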
><p
>To make use of <code
  >Eval</code
  > for our <code
  >parMap</code
  > function, we could write a
version like the below. It introduces a change of type, from
returning <code>[b]</code> to <code>Eval [b]</code>. In the
general case, we could just use the <code
  >runEval</code
  > function to
get our result back, but we are not baking it into <code
  >parMap</code
  > because
we would typically want to use the function within a greater <code
  >Eval</code
  >
context anyway.</p
><pre
>parMap :: (a -&gt; b) -&gt; [a] -&gt; Eval [b]
parMap f [] = return []
parMap f (a:as) = do
  b  &lt;- rpar  (f a)
  bs &lt;- parMap f as
  return (b:bs)
</pre
><p
>As before, this function captures the basic idea of its sequential
counterpart <code
  >map</code
  >: apply function, recurse to tail, cons new head to new
tail. This is a passable parallel map, but there are still two things
which are unfortunate about it. First, we have repeated the
implementation of map, not a big deal for such a small function but a
potential maintenance problem for more complex code. Second, we have
only captured one sort of parallel evaluation, firing off sparks for all
the cons cells, whereas in practice getting parallelism right often
requires careful tuning to get the right level of granularity and account
for dependencies. For instance, what if instead of doing each cell in
parallel, we wanted to bunch them up into chunks so as to minimise the
coordination overhead? What we need is a way to express ideas about
running code in parallel (such as &#8220;break into chunks and run each chunk
simultaneously&#8221;), preferably without duplicating existing code or
otherwise making it more complicated than it needs to be.</p
><p
>This is where strategies come in. A strategy is just a function that
turns some data into an <code
  >Eval</code
  > sequence. We've already encountered two
strategies above, <code
  >rpar</code
  > which sparks off a parallel evaluation, and
<code
  >rseq</code
  > which evaluates its argument to weak head normal form. Many more
strategies are possible, particularly when they are custom designed for
specific data types. Also, if we have a strategy, we can imagine wanting
to use it. This we capture with <code
  >using</code
  >, which applies the strategy
function and runs the resulting sequence. Parallel evaluation aside, you
can pretty much think of <code>x `using` s</code> as being
equivalent to <code
  >id x</code
  >, although you'll want to watch out for a couple of
caveats described in the tutorial
<a href="http://community.haskell.org/~simonmar/par-tutorial.pdf"
  >Parallel and Concurrent Programming in Haskell</a
  >.</p
><pre
>type Strategy a = a -&gt; Eval a

using :: a -&gt; Strategy a -&gt; a
x `using` s = runEval (s x)
</pre
><p
>Now that we've put a name to the idea of building <code
  >Eval</code
  > sequences, we
can think about trying to generalise our <code
  >parMap</code
  >. One way would be to
write a higher-order strategy. This <code
  >parList</code
  > function is almost
identical to <code
  >parMap</code
  >, except that instead of applying any function to a
list we limit ourselves to just applying some strategy on it.</p
><pre
>parList :: Strategy a -&gt; Strategy [a]
parList strat []     = return []
parList strat (x:xs) = do
  x'  &lt;- rpar (x `using` strat)
  xs' &lt;- parList strat xs
  return (x':xs')
</pre
><p
>We can then define <code
  >parMap</code
  > by parameterising <code
  >parList</code
  > with <code
  >rseq</code
  > to
get a parallel list strategy which we then apply to <code
  >map f xs</code
  >.</p
><pre
>parMap :: (a -&gt; b) -&gt; [a] -&gt; [b]
parMap f xs = map f xs `using` parList rseq
</pre
><p
>We have now achieved two things. First we've improved the modularity of
our code by separating algorithm (<code
  >map f xs</code
  >) from coordination
(<code
  >parList rseq</code
  >). Notice how we are now reusing <code
  >map</code
  > instead of
reimplementing it, and notice as well how we now have a reusable
<code
  >parList</code
  > that we can apply to any situation that involves a list.
Second, by isolating the algorithm from the coordination code we have
given ourselves a lot more flexibility to switch strategies. To bunch
our list up into coarser-grained chunks, for example, we could just swap
out <code
  >parList</code
  > with <code
  >parListChunk</code
  >:</p
><pre
>parMap :: (a -&gt; b) -&gt; [a] -&gt; [b]
parMap f xs = map f xs `using` parListChunk 100 rseq
</pre
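><p
>Putting the pieces together, the following is a small self-contained sketch
that should compile against the parallel package. The names
<code>parMapEach</code> and <code>parMapChunked</code> are made up for this
example (so as not to clash with the library's own <code>parMap</code>), and
the chunk size of 100 is an arbitrary choice rather than a tuned value.</p
><pre
>import Control.Parallel.Strategies (parList, parListChunk, rseq, using)

-- Hypothetical name: one spark per list element.
parMapEach :: (a -&gt; b) -&gt; [a] -&gt; [b]
parMapEach f xs = map f xs `using` parList rseq

-- Hypothetical name: one spark per chunk of n elements.
parMapChunked :: Int -&gt; (a -&gt; b) -&gt; [a] -&gt; [b]
parMapChunked n f xs = map f xs `using` parListChunk n rseq

main :: IO ()
main = do
  let xs = [1 .. 10000] :: [Int]
  -- Both lines print the same sum; only the evaluation strategy differs.
  print (sum (parMapEach (* 2) xs))
  print (sum (parMapChunked 100 (* 2) xs))
</pre
><p
>The algorithm (<code>map f xs</code>) is identical in both functions;
swapping the strategy changes how the work is divided up, not the result.</p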
><p
>In this word of the month, we have very lightly touched on the idea of
strategies as a compositional way of expressing parallel coordination
and improving the modularity of our code. To make use of strategies, we
basically need to be aware of two or three things:</p
><p
>1.  How to apply a strategy: <code>foo `using` s</code> evaluates the
    value <code
  >foo</code
  > with strategy <code
  >s</code
  ></p
><p
>2.  Useful strategies from the library: the basics are <code
  >r0</code
  > which does
    nothing, <code
  >rseq</code
  > which evaluates its argument to weak head normal
    form, <code
  >rdeepseq</code
  > which evaluates it all the way down to normal form,
    and <code
  >rpar</code
  > which sparks the value for parallel evaluation. Be sure
    to check out
    <a href="http://hackage.haskell.org/packages/archive/parallel/3.1.0.1/doc/html/Control-Parallel-Strategies.html"
  >Control.Parallel.Strategies</a
  >
    for more sophisticated strategies such as <code
  >parListChunk</code
  >, which
    divides the list into chunks and applies a strategy to each one of
    the chunks in parallel.</p
><p
>3.  (Optionally) How to build strategies, the simplest way being to use
    the <code
  >dot</code
  > function to compose two strategies sequentially. If you
    implement your own strategies instead of combining those from the
    library, you may want to have a look at the
    <a href="http://research.microsoft.com/apps/pubs/default.aspx?id=138042"
  >Seq no More</a
  >
    paper for a little safety note.</p
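><p
>As a rough illustration of the third point, here is a sketch of building a
new strategy out of library ones with <code>dot</code>. The names
<code>parDeep</code> and <code>parListDeep</code> are invented for this
example; <code>dot</code>, <code>evalList</code>, <code>rpar</code> and
<code>rdeepseq</code> come from Control.Parallel.Strategies, and
<code>NFData</code> from the deepseq package.</p
><pre
>import Control.DeepSeq (NFData)
import Control.Parallel.Strategies
  (Strategy, dot, evalList, rdeepseq, rpar, using)

-- Hypothetical name. s2 `dot` s1 applies s1 first, then s2, so this
-- sparks a computation that evaluates its argument to normal form.
parDeep :: NFData a =&gt; Strategy a
parDeep = rpar `dot` rdeepseq

-- Hypothetical name: apply parDeep to every element of a list.
parListDeep :: NFData a =&gt; Strategy [a]
parListDeep = evalList parDeep

main :: IO ()
main = do
  let rows = [[x * y | y &lt;- [1 .. 100]] | x &lt;- [1 .. 100]] :: [[Int]]
  print (sum (map sum (rows `using` parListDeep)))
</pre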
><p
>If you want to know more about strategies, the first place to look is
probably the tutorial
<a href="http://community.haskell.org/~simonmar/par-tutorial.pdf"
  >Parallel and Concurrent Programming in Haskell</a
  >
(there is a nice example in there, implementing a parallel K-means
algorithm), followed by the Control.Parallel.Strategies API. You
could also just dive in and give it a try on some of your own code. It's
sometimes just a matter of grabbing a strategy and <code
  >using</code
  > it.</p
><h3
> Parallel GHC Project Update</h3
><p
>The Parallel GHC Project is an MSR-funded project to push forward the
use of parallel Haskell in the real world. The aim is to demonstrate
that parallel Haskell can be employed successfully in industrial
projects.</p
><p
>Speaking of industrial projects, our recent search for a new project
partner has been successful! In fact, we will be welcoming two new
partners to the project, the research and development group of Spanish
telecoms company Telef&#243;nica, and VETT, a UK-based payment processing
company. We are excited to be working with the teams at Telef&#243;nica I+D
and VETT. There may be some Cloud Haskell in our future; stay tuned for
more details about the respective partner projects.</p
><p
>Also coming up are a couple of Monad Reader articles featuring work from
the Parallel GHC project.</p
><p
>Kazu Yamamoto has been writing an article for the upcoming Monad Reader
special edition on parallelism and concurrency. He'll be revisiting
Mighttpd (pronounced &quot;Mighty&quot;), a high-performance web server written in
Haskell in late 2009. Since Mighttpd came out, the Haskell web
programming landscape has evolved considerably, with the release of GHC
7 and development of several web application frameworks. The new
Mighttpd takes advantage of GHC 7's new IO manager as well as the Yesod
Web Application Interface (WAI) and the HTTP engine Warp. Do check out
his article when it comes out for proof that &quot;we can implement a high
performance web server in Haskell, which is comparable to highly tuned
web servers written in C&quot;, or if you can't wait, try his
<a href="http://www.mew.org/~kazu/material/2011-mighttpd.pdf"
  >most recent slides</a
  >
on the topic.</p
><p
>Bernie Pope and Dmitry Astapov have been working on an article about the
Haskell MPI binding developed in the context of this project. You can
find out more about the binding on its
<a href="http://hackage.haskell.org/package/haskell-mpi"
  >Hackage</a
  > and
<a href="https://github.com/bjpop/haskell-mpi"
  >GitHub</a
  > pages.</p
><h3
> Blogs, Papers, and Packages</h3
><ul
><li
  ><p><a href="http://tumblr.justtesting.org/post/8778213236"
    >Data Parallel Haskell and Repa for GHC 7.2.1</a
    >
(11 Aug)</p>
 
<p>Ben Lippmeier has updated the Data Parallel Haskell libraries on
Hackage to go with the newly released GHC 7.2.1. While DPH is still a
technology preview, this new version is significantly more robust than
previous ones. If nested data parallelism isn't what you're after, Ben
has also updated the Repa library for parallel arrays. See
<a href="http://www.well-typed.com/blog/55"
    >PH Digest 3</a
    > for more details.</p></li
  ></ul
><ul
><li
  ><p><a href="http://hackage.haskell.org/package/thespian-0.999"
    >thespian: Lightweight Erlang-style Actors for Haskell</a
    >
(18 Aug)</p>
 
<p>Alex Constandache updated Hackage with this intriguing library. I wrote
to Alex asking about the relationship with Cloud Haskell. It
turns out Alex had somewhat similar goals in mind and may be focusing
on Cloud Haskell instead.</p></li
  ></ul
><ul
><li
  ><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-July/094203.html"
    >dbus-core 0.9</a
    >
(23 Jul)</p>
  
<p>John Millikin announced &quot;the first significant release to dbus-core in
a while&quot; with significant improvements to both the API and
implementation. In particular, dbus-client has been merged into
dbus-core. See the announcement for more details and a quick code
sample showing the new simple API in action. For the curious, D-Bus is
an inter-process communication mechanism used in desktop environments
like Gnome.</p></li
  ></ul
><h3
> Mailing list discussions</h3
><ul
><li
  ><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-July/093722.html"
    >How to ensure code executes in the context of a specific OS thread?</a
    >
(4 Jul).</p>

<p>
Jason Dagit has found that the correct handling of GUI events on OS X
requires checking for events and responding to them from the main
thread. He wanted to know if such a thing was even possible on the
threaded RTS, particularly so that he could use the library from GHCi.
David Pollak has a
<a href="https://github.com/dpp/LispHaskellIPad"
    >bit of code</a
    > that provides a
runOnMain function using an FFI call to a bit of Objective-C code. Simon
Marlow pointed out that for compiled code, the main thread is a bound
thread, bound to the main thread of the process. Simon is specifically
talking about the threaded RTS here as this is already trivially true
of the non-threaded one. The question still remains open for GHCi.
</p></li
  ></ul
><ul
><li
  ><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-July/094176.html"
    >Cloud Haskell</a
    >
(22 Jul)</p>

<p>
Tom Murphy was curious to see if anybody was using
<a href="http://research.microsoft.com/en-us/um/people/simonpj/papers/parallel/"
    >Cloud Haskell</a
    >
yet. It looks like we have a couple people who are <em
    >thinking</em
    > of using
it: Tim Cowlishaw might be using it for his master's thesis project (A
Haskell EDSL for agent-based simulation). Julian Porter, who we
mentioned in the last digest for his Monad Reader article, is planning
to make his MapReduce monad framework work on a distributed cluster.
Now how about those of us who have actually given it a try?
</p>
 
<p>
As for the Parallel GHC project, it's worth mentioning that our new
partners are planning to use Cloud Haskell in their projects. We'll be
sure to report back to the community about the results.
</p></li
  ></ul
><ul
><li
  ><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-July/094398.html"
    >A language that runs on the JVM or .NET...</a
    >
(31 Jul)</p>

<p>  
KC was hoping for a version of Haskell on the JVM or .NET, on the grounds
that a language &quot;that runs on the JVM or .NET
has the advantage of Oracle &amp; Microsoft making those layers more
parallelizable&quot;. Austin Seipp replied that it basically boils down to
it being
<a href="http://www.haskell.org/haskellwiki/GHC:FAQ#Why_isn.27t_GHC_available_for_.NET_or_on_the_JVM.3F"
    >a lot of work</a
    >.
Chris Smith also cautions that while there may be many good reasons
to want a Haskell.Net or a Jaskell, Haskell parallelism and
concurrency may not be one of them. Parallel and concurrent Haskell
code relies on generating a large volume of lightweight threads. Simply
using their JVM or CLR counterparts as they are would not do.
</p>

<p>
Are you interested in seeing a serious and long-term effort towards
Haskell on the JVM? See the
<a href="http://www.haskell.org/haskellwiki/GHC:FAQ#Why_isn.27t_GHC_available_for_.NET_or_on_the_JVM.3F"
    >FAQ</a
    >
and get in touch!
</p></li
  ></ul
><ul
><li
  ><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-August/094521.html"
    >Error in the asynchronous exception operational semantics</a
    >
(9 Aug)</p>

<p> 
Edward Z. Yang noticed at the end of
<a href="http://www.cs.missouri.edu/~harrison/papers/mpc08.pdf"
    >Asynchronous Exceptions as an Effect</a
    >
(Harrison et al.) that the authors have found an error in the
operational semantics described in
<a href="http://community.haskell.org/~simonmar/papers/async.pdf"
    >Asynchronous Exceptions in Haskell</a
    >.
Anybody know what it is?
</p> 
 
<p>Meanwhile, David Barbour took the opportunity to say:</p>
<blockquote>
  All you need to know about asynchronous exceptions is: <em
  >don't use them!</em
  >
  They're too difficult to reason about, in terms of failure modes and such.
  Use TVars or MVars instead.
</blockquote></li
  ></ul
><ul
><li
  ><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-August/094614.html"
    >Haskell Actors, Linda, publish/subscribe models?</a
    >
(13 Aug)</p>
 
<p>Dmitri O. Kondratiev needs to &quot;build a framework to coordinate task
producers / consumers distributed in the same and different address
spaces. He needs to scale a data processing application somewhat
Hadoop-like way yet in more flexible manner, without Hadoop-specific
distributed FS constraints.&quot; At this point, Ryan Newton suggested
&quot;Cloud Haskell&quot;, which was greeted with joy and great enthusiasm.
&quot;Finally Erlang actor model of communication comes to Haskell!&quot;</p>
</li
  ><li
  ><p><a href="https://groups.google.com/d/topic/parallel-haskell/97dmSXickYM/discussion"
    >More generic parfoldr than this...</a
    >
(23 Jul)</p>
  
<p>Prabhat Totoo tried to implement a parallel foldr function by breaking
the input list into chunks, mapping foldr over each chunk, and running
foldr over the list of results. This solution only works if the input
and output of the folded function have the same type. Prabhat wanted
to know how he could go about generalising his parallel fold. Also
while doing some timings, he noticed that the regular, presumably
sequential, fold improved with multiple cores. What's up with that?</p>
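
<p>For the curious, the chunk-and-combine idea can be sketched roughly as
below. This is not Prabhat's code: <code>chunksOf</code> and
<code>parFoldChunked</code> are hypothetical helpers written for this digest,
and the sketch assumes that the operation is associative and that
<code>z</code> is its identity, otherwise the result can differ from a plain
<code>foldr</code>.</p>

<pre>import Control.Parallel.Strategies (parList, rseq, using)

-- Split a list into chunks of at most n elements (n must be positive).
chunksOf :: Int -&gt; [a] -&gt; [[a]]
chunksOf _ [] = []
chunksOf n xs = front : chunksOf n rest
  where (front, rest) = splitAt n xs

-- Hypothetical helper: fold each chunk in parallel, then fold the
-- partial results. Assumes op is associative with identity z.
parFoldChunked :: Int -&gt; (a -&gt; a -&gt; a) -&gt; a -&gt; [a] -&gt; a
parFoldChunked n op z xs =
  foldr op z (map (foldr op z) (chunksOf n xs) `using` parList rseq)

main :: IO ()
main = print (parFoldChunked 1000 (+) 0 [1 .. 100000 :: Integer])
</pre>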
    
<p>Conal Elliott made the point that Prabhat's current
parallelise-by-reassociating solution is only correct for an
associative operation, which is necessarily of type (<code
    >a -&gt; a -&gt; a</code
    >). Have
a look at the rest of the thread for an exploration of parallel
fold by Kevin Hammond, Conal Elliott and Sebastian Fischer.</p></li
  ></ul
><h3
> Stack Overflow and Reddit</h3
><ul
><li
  ><a href="http://stackoverflow.com/questions/6844378/are-par-and-pseq-good-for-data-parallelism"
    >Are 'par' and 'pseq' good for data parallelism?</a
    ></li
  ><li
  ><a href="http://stackoverflow.com/questions/6915079/difference-between-tvar-and-tmvar"
    >Difference between MVar and a TVar?</a
    ></li
  ><li
  ><a href="http://stackoverflow.com/questions/6955651/how-to-sort-a-list-mvar-a-using-a-values"
    >How to sort a list [MVar a] using a values?</a
    ></li
  ><li
  ><a href="http://stackoverflow.com/questions/6847307/understanding-blockedindefinitelyonmvar-in-concurrent-code"
    >Understanding BlockedIndefinitelyOnMVar in Concurrent code</a
    ></li
  ><li
  ><a href="http://stackoverflow.com/questions/6439925/poor-performance-lockup-with-stm"
    >Poor performance / lockup with STM</a
    ></li
  ><li
  ><a href="http://www.reddit.com/r/haskell/comments/jlipx/if_one_of_haskells_goals_is_concurrency_then_why/"
    >If one of Haskell's goals is concurrency, then why is it based on the &#955;-calculus and not on a process calculus?</a
    ></li
  ></ul
><h3
> Help and Feedback</h3
><p
>If you'd like to make an announcement in the next Parallel Haskell
Digest, then get in touch with me, Eric Kow, at
<a href="mailto:parallel@well-typed.com"
  ><code
    >parallel@well-typed.com</code
    ></a
  >. Please
feel free to leave any comments and feedback!
</p
>]]></content>
</entry>
<entry>
    <title type="text">Hacking on the hackage-server at the Haskell Hackathon</title>
    <published>2011-07-25T23:45:05Z</published>
    <updated>2011-07-25T23:45:05Z</updated>
    <link rel="alternate" type="application/atom+xml" href="http://www.well-typed.com/blog/57" />
    <summary type="text"><![CDATA[Several people have been wondering recently what is up with the new hackage-server implementation. At the weekend at HacPDX-II we made quite a bit of progress in fixing things and getting more people involved. The HacPDX-II hackathon took place this last weekend in Portland Oregon. Thomas DuBuisson contacted [...]]]></summary>
    <author><name>duncan</name></author>
    <category term="community" />
    <id>http://www.well-typed.com/blog/57</id>
    <content type="html"><![CDATA[<p
>Several people have been wondering recently what is up with the
<a href="http://hackage.haskell.org/trac/hackage/wiki/HackageDB/2.0"
  >new hackage-server implementation</a
  >. At the weekend at HacPDX-II we made quite a bit of progress in fixing things and getting more people involved.</p
><p
>The <a href="http://www.haskell.org/haskellwiki/HacPDX-II"
  >HacPDX-II</a
  > hackathon took place this last weekend in Portland, Oregon. Thomas DuBuisson contacted me a week or two before to see if we could make the new hackage-server a focus for the hackathon and get several people involved. This worked out quite well: several people who attended HacPDX-II helped out. Also, just the fact that Thomas declared that this project would be the focus energised some of the rest of us who are interested in the hackage-server but who could not attend HacPDX-II in person, including Matthew Gruen and Jeremy Shaw. So before getting into the technical details I'd really like to thank Thomas for organising things and Matthew and Jeremy for giving up much of their weekends to help us out.</p
><p
>Jeremy Shaw, of Happstack fame amongst other things, sent in two excellent patches to shift hackage-server onto newer infrastructure. One patch updated everything to use the latest Happstack 6.x version, and the other converted from happstack-data to the newer acid-state package.</p
><p
>David Lazar and Thomas DuBuisson got stuck in, tried things out and found and fixed various bugs. In particular, David fixed some things to do with the interface that package maintainers can use to mark packages as deprecated, and Thomas fixed up some stuff to do with package tags.</p
><p
>Matthew Gruen, who did his GSoC project with me on the hackage-server last summer, spent much of the weekend hacking, debugging and answering questions from the rest of us. In particular he and I spent quite a while thinking about HTTP authentication issues.</p
><p
>Originally, last summer, we had the crazy idea that we could offer either HTTP basic or digest authentication based on the user account. The reason we wanted to do this is that the current ~800 user accounts on the central hackage.haskell.org server all use HTTP basic auth (stored in Apache's standard htpasswd database format) but really we'd like to migrate to HTTP digest auth because it's more secure than basic auth. Unfortunately, due to the way HTTP authentication works, having some user accounts using basic and some using digest authentication just will not work. That means we cannot do a smooth transition from basic to digest for the existing user accounts. So we had a think about what system to go for and how to clean up the mess in the current code that tries to offer different auth systems to different users.</p
><p
>For the moment what we've decided is to go for digest authentication exclusively and to think about how to do migration separately. This is partly so that we can get something working now: several people want to use the hackage-server now, and it's not just the central hackage.haskell.org that we have to think about.</p
><p
>One option for migration is to import all the old accounts and have a special re-registration page for existing users. That would authenticate them using their old passwords and then enable their new accounts with the new digest authentication system. The point is it would not be transparent, but it should not be too much of a bother.</p
><p
>Going for digest authentication means that we had to fix a few bugs in the client side HTTP implementation. Matthew tracked down the problem and I've sent the patches off to the maintainer of the <a href="http://hackage.haskell.org/package/HTTP"
  >HTTP</a
  > package that cabal-install uses.</p
><p
>I spent a few days before the hackathon working on the hackage-mirror client. This is a program included in the hackage-server package that can synchronise packages between servers. In particular it can copy packages from the old hackage.haskell.org to your own hackage-server instance. Along with the HTTP digest work, on the final day of the hackathon I managed to get it to sync packages from hackage.haskell.org to my hackage-server running on localhost and to do the uploads authenticated using digest auth. So that was quite a good conclusion to things.</p
><p
>One thing that worked quite well for the hackathon was that we used a common darcs repository and gave everyone write access. We all just worked away and darcs pushed and pulled using the common repository. This is a model where darcs works really well. Now that the hackathon is over, I'll move the changes into the main repository.</p
><p
>So there's still plenty to do, but I think we've shown that the new hackage-server implementation is nearly ready to be used, for new server instances at least. I'm going to be doing more work on the mirroring, because a client of Well-Typed wants to use it, so I expect to post an update on that sometime soon.</p
>]]></content>
</entry>
<entry>
    <title type="text">Parallel Haskell Digest 4</title>
    <published>2011-07-22T15:46:17Z</published>
    <updated>2011-07-22T15:46:17Z</updated>
    <link rel="alternate" type="application/atom+xml" href="http://www.well-typed.com/blog/56" />
    <summary type="text"><![CDATA[Hello Haskellers! It's time for the fourth edition of the Parallel Haskell Digest, bringing you a summary of the news and discussions of parallelism and concurrency in Haskell. The digest is made possible by the Parallel GHC Project. par :: a -> b -> b pseq :: a -> b -> b fib :: Int -> Int fib 0 = 0 [...]]]></summary>
    <author><name>nick</name></author>
    <category term="parallel" />
    <category term="ph-digest" />
    <id>http://www.well-typed.com/blog/56</id>
    <content type="html"><![CDATA[<p
>Hello Haskellers!</p
><p
>It's time for the fourth edition of the Parallel Haskell Digest,
bringing you a summary of the news and discussions of parallelism and
concurrency in Haskell.
The digest is made possible by the Parallel GHC Project.</p
><h2 id="news">News</h2>
<ul>
<li><p>The Monad.Reader: special issue on parallelism and concurrency</p>
<p>In the words of <a href="http://themonadreader.wordpress.com/2011/07/11/call-for-copy-issue-19-special-issue-on-parallelism-and-concurrency/">The Monad Reader</a> itself:</p>
<blockquote>
<p>Whether you're an established academic or have only just started learning Haskell, if you have something to say, please consider writing an article for The Monad.Reader! Issue 19 will be a special issue focusing on articles related to parallelism and concurrency, construed broadly. The submission deadline for Issue 19 will be: Tuesday, August 16.</p>
</blockquote></li>
<li><p>ThreadScope Implementor's Summit</p>
<p>The ThreadScope implementor's summit was held this month at Microsoft Research, Cambridge. The summit brought together developers who are currently working with ThreadScope, whether that means working on the events that GHC emits for analysis in ThreadScope, using the resulting event trace for detailed profiling information, or improving ThreadScope itself to provide better tools for parallel profile analysis.</p>
<p>The meeting was full of ideas, and covered topics such as: adding extensions to the current eventlog format so that additional information can be attached to events; improving the visualisation of information in ThreadScope; formalising the transitions of thread states into a finite state machine; and matching up executed code with corresponding source locations. With all this food for thought, we should expect plenty of interesting work in this area.</p></li>
</ul>
<h2 id="word-of-the-month">Word of the Month</h2>
<p>In this issue we have two words of the month: <em>par</em> and <em>pseq</em>.</p>
<p>Haskell provides two annotations, <code>par</code> and <code>pseq</code>, that allow the programmer to give hints to the compiler about where there are opportunities to exploit parallelism. While these annotations are not typically used directly by programmers, it is useful to understand them because they are the underlying mechanism for higher level tools such as &quot;parallel strategies&quot; and the Eval monad.</p>
<p>The two combinators have the following signatures:</p>
<pre
>par  :: a -&gt; b -&gt; b
pseq :: a -&gt; b -&gt; b
</pre
><p>While their signatures are the same, they are used to annotate different things. The <code>par</code> combinator hints to the Haskell implementation that it might be beneficial to evaluate the first argument in parallel. However, since Haskell does not impose an evaluation order, we also need <code>pseq</code>, which instructs the compiler to ensure that its first argument is evaluated before the second.</p>
<p>Let's take a look at an example inspired by <a href="http://community.haskell.org/~simonmar/papers/threadscope.pdf">Parallel Performance Tuning for Haskell</a> by Jones, Marlow, and Singh, which illustrates this nicely. Suppose you're interested in the sum of two expensive computations. To keep things simple, we'll use a naive implementation of <code>fib</code> (the point here isn't to have an efficient computation; I'm trying to show an <em>expensive</em> one):</p>
<pre
>fib :: Int -&gt; Int
fib 0 = 0
fib 1 = 1
fib n = fib (n-1) + fib (n-2)
</pre
><p>For a second expensive computation, we'll calculate the negafibonacci number, which works on negative numbers:</p>
<pre
>negafib :: Int -&gt; Int
negafib 0 = 0
negafib (-1) = 1
negafib n = negafib (n+2) - negafib (n+1)
</pre
><p>The sum of these two can be calculated by the following sequential function:</p>
<pre
>sumfib :: Int -&gt; Int
sumfib n = x + y
 where
  x = fib n
  y = negafib (-n)
</pre
><p>There's obvious room for improvement here when we have two cores: we simply calculate the expensive computations on separate cores. Annotating the code above is a fairly simple process. We first use <code>par</code> to indicate that <code>x</code> may be evaluated in parallel with the rest of the computation. Second, we use <code>pseq</code> to ensure that <code>y</code> is evaluated before <code>x + y</code>. The result is as follows:</p>
<pre
>import Control.Parallel (par, pseq)

psumfib :: Int -&gt; Int
psumfib n = x `par` (y `pseq` x + y)
 where
  x = fib n
  y = negafib (-n)
</pre
><p>We can write a simple program that outputs the result of running this computation with the following <code>main</code> function:</p>
<pre
>main :: IO ()
main = putStrLn . show . sumfib $ 37
</pre
><p>We should hope for the parallel version to work twice as fast, since the two expensive functions should take about the same time to compute. Here's the output of compiling and running the sequential version of the program:</p>
<pre
>$ ghc -rtsopts Main.hs
[1 of 1] Compiling Main             ( Main.hs, Main.o )
Linking Main ...
$ time ./Main

real        0m6.113s
user        0m6.090s
sys         0m0.010s
</pre
><p>Now replacing <code>sumfib</code> with <code>psumfib</code> produces the following results:</p>
<pre
>$ ghc -rtsopts -threaded Main.hs
[1 of 1] Compiling Main             ( Main.hs, Main.o )
Linking Main ...
$ time ./Main +RTS -N2

real        0m3.402s
user        0m6.660s
sys         0m0.040s
</pre
><p>This is obviously a very trivial example, but the point is that annotations provide a powerful way of expressing parallel algorithms. It's interesting to note that, for this simple program, the parallel version run on a single core performs as well as the sequential version compiled without threading.</p>
<p>While annotations are a simple mechanism for expressing where parallelism might be exploited in a program, beware that there are a number of pitfalls to using this technique: all that glitters is not gold! The main difficulty in using <code>par</code> and <code>pseq</code> directly is that you really need to have a clear understanding of evaluation order. In particular, unless you understand what laziness is doing to the evaluation order, then you might find that the computations you're sparking off with <code>par</code> might not occur when you expected they should.</p>
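<p>To make the evaluation-order pitfall concrete, here is a small sketch. The names <code>shallow</code> and <code>deep</code> are made up for illustration, and the example assumes the <code>parallel</code> package is installed. The key point is that <code>par</code> only evaluates its first argument to weak head normal form, so sparking a lazy list does almost no work in parallel:</p>

```haskell
import Control.Parallel (par, pseq)

-- Naive fib, used only as an expensive computation.
fib :: Int -> Int
fib 0 = 0
fib 1 = 1
fib n = fib (n-1) + fib (n-2)

-- Pitfall: the spark evaluates xs only to weak head normal form,
-- that is, to its first (:) constructor. The elements stay unevaluated
-- and are computed later by `sum`, on whichever thread demands them.
shallow :: [Int] -> Int
shallow ns = xs `par` (y `pseq` (sum xs + y))
 where
  xs = map fib ns
  y  = fib 30

-- One fix: spark a value whose weak head normal form requires all the
-- work. Here that is the sum itself, which is a plain Int.
deep :: [Int] -> Int
deep ns = s `par` (y `pseq` (s + y))
 where
  s = sum (map fib ns)
  y = fib 30

main :: IO ()
main = do
  print (shallow [25, 26, 27])
  print (deep [25, 26, 27])
```

<p>Both functions return the same value; they differ only in how much of the work the spark can actually take on.</p>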
<p>Then there are all the general difficulties that you face with parallel programming, like getting the right granularity of work to do in parallel. Using <code>par</code> is quite lightweight, so it can be used to exploit reasonably fine-grained parallelism, but it is certainly not free. Finally, parallel performance suffers when there are too many garbage collections, so keeping them to a minimum, either by using more efficient data structures or by increasing available memory, becomes an important factor.</p>
<p>Nevertheless, it's well worth having a play with <code>par</code> and <code>pseq</code>. The next step after that is to look at parallel strategies. Strategies is a layer of abstraction on top of <code>par</code> and <code>pseq</code>. You might like to read <a href="http://research.microsoft.com/apps/pubs/default.aspx?id=138042">Seq no more: Better Strategies for Parallel Haskell</a>, by Simon Marlow et al., which describes Strategies and the Eval monad. It's all available in the <a href="http://hackage.haskell.org/package/parallel">parallel library</a> on Hackage. More recently, the <code>Par</code> monad has also been introduced as yet another way of describing parallel evaluations. These key topics will no doubt feature in a future word of the month, so stay tuned!</p>
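<p>As a taster, the <code>psumfib</code> example from earlier might look roughly like this in the Eval monad. This is only a sketch, assuming the <code>rpar</code>, <code>rseq</code> and <code>runEval</code> functions exported by <code>Control.Parallel.Strategies</code> in the parallel library; the name <code>esumfib</code> is made up here:</p>

```haskell
import Control.Parallel.Strategies (runEval, rpar, rseq)

fib :: Int -> Int
fib 0 = 0
fib 1 = 1
fib n = fib (n-1) + fib (n-2)

negafib :: Int -> Int
negafib 0 = 0
negafib (-1) = 1
negafib n = negafib (n+2) - negafib (n+1)

-- rpar sparks its argument for possible parallel evaluation, and
-- rseq evaluates its argument before the computation continues.
esumfib :: Int -> Int
esumfib n = runEval $ do
  x <- rpar (fib n)
  y <- rseq (negafib (-n))
  return (x + y)

main :: IO ()
main = print (esumfib 25)
```

<p>The ordering that <code>par</code> and <code>pseq</code> expressed through nesting is written here as a sequence of steps, which some people find easier to read.</p>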
<h2 id="parallel-ghc-project-update">Parallel GHC Project Update</h2>
<p>The Parallel GHC Project is an MSR-funded project to push the real-world use of parallel Haskell. The aim is to demonstrate that parallel Haskell can be employed successfully in industrial projects. This month we're having a guest column from the team over at Los Alamos National Laboratory, one of the partners involved in the project (you can see the full details in report LA-UR 11-0341). They have been working on writing Monte Carlo physics simulations in Haskell, which has given them high levels of parallelism, along with useful tools for abstraction. So, without further ado, over to Michael Buksas from LANL:</p>
<p>Our goal is to build highly efficient Monte Carlo physics simulations using parallel Haskell. We're focusing on SMP performance through some combination of explicit threading and pure parallel annotations.</p>
<p>The Monte Carlo approach involves randomly sampling the space of solutions to generate data which contributes to the solution. For these physical problems, our samples are the tracks of particles as they move through space, interacting with a physical material as they go. Data collected from each particle trajectory is then combined into information needed to compute the solution. For example, the detailed information about the particle's interaction with the material is collected into a collective effect on the material properties.</p>
<p>To date, we have a code base which includes two approaches to the problem. One is a specific and parallel-tuned application code targeting relativistic neutrino transport in stellar atmospheres. The other is building a more general environment for creating specific applications, such as this one.</p>
<p>We recently presented to our colleagues in LANL some preliminary results on the parallel performance of the targeted application code.</p>
<p>To give a sense of the approach to parallelization in this code, consider these high-level functions from an earlier serial version:</p>
<pre
>main :: IO ()
main = do
  (n, rest) &lt;- parseCL
  let tally = runMany infMesh simpleMat n prand
  writeTally &quot;tally&quot; tally

runMany :: Mesh -&gt; Material -&gt; Word32 -&gt; RNG -&gt; Tally
runMany msh mat ntot rng = let
  ps = genParticles ntot msh rng
  tallies = map (runParticle msh mat) $ ps
  in foldl' merge emptyTally tallies
</pre
><p>And consider the following changes for the parallel version:</p>
<pre
>main :: IO ()
main = do
  (n,sz) &lt;- parseCL
  let tally = feed infMesh simpleMat n sz prand
  writeTally &quot;tally&quot; tally

feed :: Mesh -&gt; Material -&gt; Word32 -&gt; Word32 -&gt; RNG -&gt; Tally
feed msh mat ntot chunkSz rng
    | ntot &lt;= chunkSz = runMany msh mat ntot rng
    | otherwise       = t `par` (ts `pseq` (merge t ts))
    where t  = runMany msh mat chunkSz g1
          ts = feed msh mat (ntot - chunkSz) chunkSz g2
          (g1,g2) = split rng
</pre
><p>We've wrapped function <code>runMany</code> in <code>feed</code>, which partitions the collection of generated particles into groups of size <code>chunkSz</code>, and issues these particles to <code>runMany</code> in parallel.</p>
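<p>For readers who want to experiment with this recursion scheme without the simulation code, here is a self-contained sketch. Everything in it is a hypothetical stand-in: <code>Tally</code> is just an <code>Int</code>, <code>splitSeed</code> is a toy replacement for the generator's <code>split</code>, and <code>runChunk</code> plays the role of <code>runMany</code>. It is not the LANL code, only the same shape:</p>

```haskell
import Control.Parallel (par, pseq)
import Data.List (foldl')

-- Hypothetical stand-ins for the real types, for illustration only.
type Tally = Int
type Seed  = Int

-- Toy "split": derives two seeds from one. Not a real RNG split.
splitSeed :: Seed -> (Seed, Seed)
splitSeed s = (2 * s + 1, 2 * s + 2)

-- Toy replacement for runMany: "simulates" n particles from a seed.
runChunk :: Int -> Seed -> Tally
runChunk n seed = foldl' (+) 0 [ (seed + i) `mod` 7 | i <- [1 .. n] ]

-- Same recursion scheme as feed: spark one chunk, recurse on the rest,
-- then merge (here, merging is just addition).
-- chunkSz must be positive, otherwise the recursion never terminates.
feedSketch :: Int -> Int -> Seed -> Tally
feedSketch ntot chunkSz seed
  | ntot <= chunkSz = runChunk ntot seed
  | otherwise       = t `par` (ts `pseq` (t + ts))
  where
    t        = runChunk chunkSz g1
    ts       = feedSketch (ntot - chunkSz) chunkSz g2
    (g1, g2) = splitSeed seed

main :: IO ()
main = print (feedSketch 100000 10000 42)
```

<p>Because the toy tally is an <code>Int</code>, evaluating it to weak head normal form does all the work. A richer tally type would need to be forced more deeply for the spark to be useful.</p>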
<p>With this simple change, we are seeing upwards of 80% utilization of up to 8 cores, for a performance improvement greater than a factor of 6. We believe that performance can be further improved with different strategies for breaking down the work, and by looking for additional parallelization opportunities in the collection of results.</p>
<p>Our other branch of development is focused on finding useful abstractions and high-level functions to support programming a variety of Monte Carlo problems of this kind. We have identified a few such useful abstractions, and implemented them as type classes and type families.</p>
<p>For example, <code>Space</code> is a general term for the physical space and imposed symmetries in which we can perform a simulation. We express this as follows:</p>
<pre
>class Space s where
  type Position s  :: *
  type Direction s :: *
  stream    :: s -&gt; Distance -&gt; s
  position  :: s -&gt; Position s
  direction :: s -&gt; Direction s
  make      :: Position s -&gt; Direction s -&gt; s
</pre
><p>and implement specific spaces, such as one with the symmetry of the unit sphere:</p>
<pre
>instance Space Spherical1D where
    type Position  Spherical1D = Radius
    type Direction Spherical1D = Normalized Vector2
    stream (Vector2 x y) (Distance d) = Vector2 (x+d) y
    position s  = Radius $ vmag s
    direction s = normalize s
    make (Radius pos) dir = pos *| (normalized_value dir)
</pre
><p>This allows the specific space data types to be used in a variety of contexts. Using ordinary parametric polymorphism is also effective:</p>
<pre
>-- | Stream a single particle:
stream :: (p -&gt; (e,p))   -- ^ Function to produce each step. Comes from a model.
          -&gt; (e -&gt; Bool) -- ^ Check for terminal events to stop streaming
          -&gt; p           -- ^ Initial particle
          -&gt; [(e, p)]    -- ^ Resulting list of events and particle states.
stream stepper continue p = next p
  where next p =
          let (e, p') = stepper p
          in  (e, p') : if continue e then next p' else []
</pre
><p>The above is our high-level routine for generating a history from a single particle, recorded as a list of (event, particle) pairs, where the event and particle data types are provided for each problem.</p>
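<p>To see how the polymorphic <code>stream</code> function behaves, here is a toy usage sketch. The <code>Event</code> and <code>Particle</code> types and the <code>stepper</code> model are invented for this example and have nothing to do with the real physics:</p>

```haskell
-- Toy event and particle types, made up for this example.
data Event = Step | Absorbed deriving (Show, Eq)
type Particle = Int   -- remaining "energy"

-- The stream function from above, repeated so the example stands alone.
stream :: (p -> (e, p)) -> (e -> Bool) -> p -> [(e, p)]
stream step continue p = next p
  where next q =
          let (e, q') = step q
          in  (e, q') : if continue e then next q' else []

-- Each step costs one unit of energy; the particle is absorbed
-- once it has at most one unit left.
stepper :: Particle -> (Event, Particle)
stepper p
  | p <= 1    = (Absorbed, 0)
  | otherwise = (Step, p - 1)

main :: IO ()
main = print (stream stepper (/= Absorbed) 3)
-- prints [(Step,2),(Step,1),(Absorbed,0)]
```

<p>Note that the terminal event is included in the history: the predicate is checked only after each pair has been emitted.</p>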
<h2 id="blogs-papers-and-packages">Blogs, Papers, and Packages</h2>
<ul>
<li><p><a href="http://tomasp.net/blog/comprefun.aspx">Fun with parallel monad comprehensions (19 July 2011)</a></p>
<p>This blog post by Tomas Petricek featured in the Monad Reader 18, and covers some of the interesting things that can be achieved with monad comprehensions when viewed from a parallel perspective. Along the way, he deals with examples such as the parallel composition of parsers.</p></li>
<li><p><a href="http://jaspervdj.be/posts/2011-07-05-parallelizing-a-nonogram-solver.html">Parallelizing a nonogram solver (05 July 2011)</a></p>
<p>Jasper Van der Jeugt detailed his implementation of a parallel nonogram solver. Nonograms also go by the name of Paint Sudoku: the aim is to colour in a grid where a list of numbers is given for each row and column and these numbers indicate consecutive runs of filled-in squares in the corresponding row or column. For large puzzles (20x20 grids), Jasper reports that on a dual-core machine his parallel algorithm reduces execution time by 37.9% compared to its sequential counterpart.</p></li>
<li><p><a href="http://blog.ezyang.com/2011/06/the-iva-monad/">The IVar monad (29 June)</a></p>
<p>Edward Z. Yang has written a series of posts discussing IVars, which are immutable variables that are written once and read many times (these are particularly handy for communicating results from a child process to its parent). Edward's post outlines the difficulties involved in defining a monad for IVars.</p></li>
<li><p><a href="http://themonadreader.files.wordpress.com/2011/07/issue18.pdf">Map reduce as a monad (05 July 2011)</a></p>
<p>Julian Porter wrote an article for The Monad Reader 18 about how MapReduce could be expressed as a monad. The MapReduce framework finds its roots in functional programming, and this is an interesting take on the problem.</p></li>
</ul>
<h2 id="mailing-list-discussions">Mailing list discussions</h2>
<ul>
<li><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-July/093751.html">NVIDIA's CUDA and Haskell (5 July 2011)</a></p>
<p>Vasili Galchin was wondering whether or not there had been any efforts to build bridges between NVIDIA's CUDA and Haskell. Don Stewart was quick to respond with a number of links to active work in the area:</p>
<ul>
<li><a href="http://hackage.haskell.org/package/cuda">Direct access to CUDA</a></li>
<li><a href="http://hackage.haskell.org/package/language-c-quote">CUDA in Haskell</a></li>
<li><a href="http://hackage.haskell.org/package/OpenCLRaw">Direct access to OpenCL</a></li>
<li><a href="http://hackage.haskell.org/package/accelerate">The accelerate package</a></li>
</ul>
<p>Trevor McDonell noted that the accelerate package was best accessed from the source repository on github, and that the CUDA bindings hadn't yet been tested or updated for the latest toolkit release.</p></li>
<li><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-June/092661.html">Unbelievable parallel speedup (3 June 2011)</a></p>
<p>While reading Simon Marlow's <a href="http://community.haskell.org/~simonmar/par-tutorial.pdf">tutorial</a> on parallel and concurrent programming, John Ramsdell reported some remarkable (slightly superlinear!) performance gains for one of his programs. Thomas Schilling guessed that this was due to the large variance in the figures reported, but went on to describe how it might be possible to observe such performance boosts due to reduced local cache misses when using several cores. Without more information about the program in question, it's difficult to do any kind of diagnosis, but nevertheless, it's great to hear about good results from a happy Haskeller!</p></li>
<li><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-July/093689.html">Automatic Reference Counting (2 July 2011)</a></p>
<p>After hearing about the new static analysis tool in Clang that does automatic reference counting (ARC), Thomas Davie was wondering if some compiler gurus might be able to comment on the applicability of this kind of analysis to Haskell, as an alternative to garbage collection. This led to an enlightening discussion about reference counting versus garbage collection.</p></li>
<li><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-June/093231.html">Haskell on NUMA (16 June 2011)</a></p>
<p>Michael Lesniak was wondering what the state of parallel performance of Haskell on Non-Uniform Memory Access (NUMA) machines was like, since he's having problems and can't find much useful information online. Nobody seems to have answered this one. Are there any suggestions?</p></li>
<li><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-May/092389.html">Parallel compilation and execution? (26 May 2011)</a></p>
<p>Michael Rice was trying to figure out how to compile and run a simple program that outputs the result of a parallel fibonacci algorithm. After a quick reminder to use <code>pseq</code> rather than <code>seq</code> to force sequential evaluation, Daniel Fischer suggested that recompilation might be required, and that passing <code>-fforce-recomp</code> would be a good way to ensure that this occurred.</p>
<p>Michael was also keen to know whether Control.Parallel was comparable to OpenMP. Alex Mason gave a detailed reply and gave an example of parallel mergesort as a means of comparison.</p></li>
<li><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-May/091843.html">parMap doesn't work fine (12 May 2011)</a></p>
<p>After just starting out with parallel computations in Haskell, Grigory Sarnitskiy ran into troubles making parMap work with lazy structures. To resolve these issues, Brandon Moore pointed to using <code>rdeepseq</code>, and Maciej Piechotka suggested <code>deepseq</code>.</p></li>
<li><p><a href="http://groups.google.com/group/parallel-haskell/t/b34d796e6672fd6b">efficient parallel foldMap for lists/sequences (17 June 2011)</a></p>
<p>Sebastian Fischer re-posted his question about efficient parallel <code>foldMap</code> for lists to the parallel mailing list. In essence he was seeking an efficient implementation of <code>foldMap</code>, which maps each element of a list into a monoid and combines the results into a single value. Johannes Waldmann advised against using ordinary lists, and mentioned that he was using <code>Data.Vector</code> instead. Additionally, he recommended switching to a sequential fold once a parallel fold had been used to a certain depth. Christopher Brown further confirmed that it was a good idea to spark off computations only when the granularity is high enough to make it worthwhile, and also mentioned that it was best to spark computations that were evaluated to normal form.</p></li>
<li><p><a href="http://groups.google.com/group/parallel-haskell/t/f5f0946d0780b59b">Wanted: parallel Haskell tutorial/talk/demonstration in Leipzig, Germany, October 7 (8 July 2011)</a></p>
<p>Johannes Waldmann is looking for volunteers who might be able to present at their local Haskell Workshop, and welcomes submissions on parallel and distributed computing using Haskell. The submission deadline is 20 August.</p></li>
</ul>
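<p>The advice from the <code>foldMap</code> thread above (split the work in parallel down to a certain depth, then fold sequentially) can be sketched with <code>par</code> and <code>pseq</code>. The name <code>parFoldMap</code> and its depth parameter are made up for illustration; this is not the code from the thread, and it uses plain lists only to stay short:</p>

```haskell
import Control.Parallel (par, pseq)
import Data.Monoid (Sum(..))

-- Divide-and-conquer foldMap over a list: split in half, spark the left
-- half, evaluate the right half, then combine. Once the depth budget is
-- used up (or the list has at most one element), fall back to the
-- ordinary sequential foldMap.
parFoldMap :: Monoid m => Int -> (a -> m) -> [a] -> m
parFoldMap depth f xs
  | depth <= 0 || null (drop 1 xs) = foldMap f xs
  | otherwise = l `par` (r `pseq` (l `mappend` r))
  where
    (ls, rs) = splitAt (length xs `div` 2) xs
    l = parFoldMap (depth - 1) f ls
    r = parFoldMap (depth - 1) f rs

main :: IO ()
main = print (getSum (parFoldMap 3 Sum [1 .. 100000 :: Int]))
```

<p>Two caveats match the discussion. Splitting a list costs time linear in its length, which is one reason to prefer a structure such as <code>Data.Vector</code>. And the sparks only evaluate to weak head normal form: that is all the work for <code>Sum Int</code>, but for a lazy monoid such as lists the results would need forcing to normal form.</p>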
<h2 id="stack-overflow">Stack Overflow</h2>
<ul>
<li><a href="http://stackoverflow.com/questions/6444716/how-to-write-nested-loop-problem-using-parallel-strategies-in-haskell">How to write nested loop problem using parallel strategies in Haskell</a></li>
<li><a href="http://stackoverflow.com/questions/6623316/how-to-measure-sequential-and-parallel-runtimes-of-haskell-program">How to measure sequential and parallel runtimes of Haskell program</a></li>
<li><a href="http://stackoverflow.com/questions/6439925/poor-performance-lockup-with-stm">Poor performance / lockup with STM</a></li>
</ul>
<h2 id="help-and-feedback">Help and Feedback</h2>
<p>If you'd like to make an announcement in the next Parallel Haskell Digest, then get in touch with me, Nicolas Wu, at <a href="mailto:parallel@well-typed.com">parallel@well-typed.com</a>. Please feel free to leave any comments and feedback!</p>
]]></content>
</entry>
<entry>
    <title type="text">Parallel Haskell Digest 3</title>
    <published>2011-06-16T08:39:15Z</published>
    <updated>2011-06-16T08:39:15Z</updated>
    <link rel="alternate" type="application/atom+xml" href="http://www.well-typed.com/blog/55" />
    <summary type="text"></summary>
    <author><name>nick</name></author>
    <category term="parallel" />
    <category term="ph-digest" />
    <id>http://www.well-typed.com/blog/55</id>
    <content type="html"><![CDATA[<p>Hello Haskellers!</p>
<p>Welcome to the third edition of the Parallel Haskell Digest, bringing you news, discussions and tasters of parallelism and concurrency in Haskell. The digest is made possible by the Parallel GHC project. More news about how we're doing below.</p>
<p>This digest is brought to you by Eric Kow and Nicolas Wu. Nick will be taking over as maintainer of the Parallel Digest for the next few months.</p>
<p>It's tutorial season in the Parallel Haskell world, with Simon Marlow's <a href="http://community.haskell.org/~simonmar/par-tutorial.pdf">Parallel and Concurrent Programming in Haskell</a>, and Don Stewart's <a href="http://www.haskell.org/haskellwiki/Numeric_Haskell:_A_Repa_Tutorial">Numeric Haskell: A Repa Tutorial</a>. The former gives a broad tour of parallelism and concurrency techniques in Haskell and the latter focuses on the Repa parallel array library. Both are very concrete and focus on real working code. Give them a try!</p>
<h2 id="news">News</h2>
<ul>
<li><p>Haskell in the Economist! The Economist article <a href="http://www.economist.com/node/18750706?story_id=18750706">Parallel Bars</a> discusses the rise of multicore computing, and the essential obstacle that programs have to be specially written with parallelism in mind. The article gives a tour of some problems (overhead and dependencies between subtasks, programmers being trained for sequential programming, debugging parallel programs) and possible solutions, among which is functional programming:</p>
<blockquote>
<p>Meanwhile, a group of obscure programming languages used in academia seems to be making slow but steady progress, crunching large amounts of data in industrial applications and behind the scenes at large websites. Two examples are Erlang and Haskell, both of which are “functional programming” languages.</p>
</blockquote>
<p>Are we doomed to succeed?</p></li>
</ul>
<h2 id="word-of-the-month">Word of the Month</h2>
<p>The word of the month (well, phrase really!) for this digest is <em>parallel arrays</em>. Parallel array manipulation fits nicely into Haskell's arsenal of parallel processing techniques. In fact, you might have seen the Repa library, as mentioned above, in the news a while back. And now there's a new tutorial on the <a href="http://www.haskell.org/haskellwiki/Numeric_Haskell:_A_Repa_Tutorial">Haskell Wiki</a>.<br />So what's all the fuss?</p>
<p>Parallel array manipulation is a particularly nice way of writing parallel programs: it's pure and it has the potential to scale very well to large numbers of CPUs. The main limitation is that not all programs can be written in this style. Parallel arrays are a way of doing <em>data parallelism</em> (as opposed to <em>control parallelism</em>).</p>
<p>Repa (REgular Parallel Arrays) is a Haskell library for high performance, regular, multi-dimensional parallel arrays. All numeric data is stored unboxed and functions written with the Repa combinators are automatically parallel. This means that you can write simple array code and get parallelism for free.</p>
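<p>To give a flavour of what this looks like, here is a minimal sketch. It assumes the Repa 3.x API (<code>fromListUnboxed</code>, <code>computeP</code>, <code>sumAllP</code>); the library's interface has changed between major versions, so the names may differ in the version you have installed:</p>

```haskell
import Data.Array.Repa as R

-- A small one-dimensional unboxed array of Doubles.
xs :: Array U DIM1 Double
xs = fromListUnboxed (Z :. 5) [1, 2, 3, 4, 5]

main :: IO ()
main = do
  -- R.map builds a delayed array; computeP evaluates it in parallel
  -- into an unboxed (manifest) array.
  ys <- computeP (R.map (* 2) xs) :: IO (Array U DIM1 Double)
  -- sumAllP reduces all the elements in parallel.
  total <- sumAllP ys
  print (toList ys)
  print total
```

<p>To actually use several cores, the program would need to be compiled with <code>-threaded</code> and run with <code>+RTS -N</code>, as with any other parallel Haskell program.</p>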
<p>As an example, Repa has been used for real time edge detection using a <a href="http://en.wikipedia.org/wiki/Canny_edge_detector">Canny edge algorithm</a>, which has resulted in performance comparable to the standard OpenCV library, an industry standard library of computer vision algorithms written in C and C++ (a single threaded Haskell implementation is about 4 times slower than the OpenCV implementation, but is on par when using 8 threads on large images).</p>
<div class="figure">
<img src="http://2.bp.blogspot.com/-k6mYbRMrKTM/TZGhiwGTWvI/AAAAAAAAAB4/aX9H50xgY-k/s400/beholder.jpg" alt="Repa performing Canny edge detection" /><p class="caption">Repa performing Canny edge detection</p>
</div>
<p>While the Canny algorithm written with Repa doesn't quite match the speed of its procedural counterparts, it benefits from all of Haskell's built in support for purity and type safety, and painlessly scales to multiple cores. You can find more details on Ben Lippmeier's <a href="http://disciple-devel.blogspot.com/2011/03/real-time-edge-detection-in-haskell.html">blog</a>.</p>
<h2 id="parallel-ghc-project-news">Parallel GHC project news</h2>
<p>The Parallel GHC Project is an MSR-funded project to promote the real-world use of parallel Haskell. Part of this project involves effort by Well-Typed to provide tools for use by the general community.</p>
<p>Last week, Well-Typed were excited to <a href="http://www.well-typed.com/blog/54">announce</a> that we have capacity to support another partner for the parallel project for at least another 12 months. So, if you have a project that involves scaling up to multiple cores, or that deals with some heavy duty concurrency, then we'd love to hear from you. In return for your time and commitment to the parallel project, we'll unleash our team of expert programmers to help build tools and provide support that will help you and the community towards making parallel Haskell even more accessible.</p>
<p>For more information on the Parallel GHC project, see the <a href="http://www.haskell.org/haskellwiki/Parallel_GHC_Project">Parallel GHC Project wiki page</a>.</p>
<h2 id="blogs-papers-and-packages">Blogs, papers and packages</h2>
<ul>
<li><p><a href="http://community.haskell.org/~simonmar/par-tutorial.pdf">Parallel and Concurrent Programming in Haskell (19 May)</a></p>
<p>Simon Marlow released version 1.1 of his tutorial introducing the main programming models available for concurrent and parallel programming in Haskell. &quot;This tutorial takes a deliberately practical approach: most of the examples are real Haskell programs that you can compile, run, measure, modify and experiment with&quot;. The tutorial covers a lot of ground, including use of the new <code>Par</code> monad, so if you're still not sure where to start, this might be the place.</p></li>
<li><p><a href="http://tomasp.net/blog/speculative-par-monad.aspx">Explicit speculative parallelism for Haskell's Par monad</a></p>
<p>Tomas Petricek builds on the <code>Par</code> monad to provide a way to write speculative computations in Haskell. If there are two different ways to compute something and each of the ways is faster for different kinds of input (but we don't know exactly which), one useful trick is to run both ways in parallel and return the result of the one that finishes first. Tomas' speculative <code>Par</code> monad lets you do just that, starting computations in parallel and cancelling unwanted ones. You can check his code out on <a href="https://github.com/tpetricek/Haskell.ParMonad">GitHub</a>.</p></li>
<li><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-April/090736.html">Etage-Graph, data-flow based graph algorithms (3 Apr)</a></p>
<p>Mitar announced the Etage-Graph package, which implements graph algorithms on top of his data-flow framework. He invites us to have a look and see how one might implement known control-flow algorithms in a data-flow manner, which has the appeal of being easy to parallelise.</p></li>
<li><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-April/090743.html">stm-chans (4 Apr)</a></p>
<p>The stm-chans library offers a collection of channel types, similar to Control.Concurrent.STM.TChan but with additional features. It offers bounded, closable, and bounded closable FIFO channels, along with a partial TChan compatibility layer.</p></li>
<li><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-April/091520.html">timeplot 0.3.0, Eugene Kirpichov (30 Apr)</a></p>
<p>Eugene Kirpichov announced the release of timeplot 0.3.0, a tool that turns arbitrary time-dependent data, e.g. log files, into big-picture visualisations. Eugene also linked to a presentation with plenty of graphical examples for the tool and its sister &quot;splot&quot;, mostly on cluster computing use cases. The tools follow the basic philosophy of depending neither on the log format (the expected workflow is like <code>cat log | awk foo | plot</code>) nor on the type of data (Eugene wants to visualise arbitrary signals).</p>
<p>You can also find more information in Eugene's <a href="http://jkff.info/presentations/two-visualization-tools.pdf">presentation</a>.</p></li>
</ul>
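<p>To make the speculative idea from the <code>Par</code> monad item above a little more concrete, here is a minimal sketch of &quot;run both, keep the first result, cancel the other&quot;. It is not Tomas' library and does not use the <code>Par</code> monad at all: it assumes plain <code>forkIO</code> and <code>MVar</code> from <code>base</code>, the name <code>speculate</code> is hypothetical, and it ignores exceptions raised in the workers.</p>
<pre><code>import Control.Concurrent (forkIO, killThread, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Hypothetical helper: run two actions concurrently, return the result of
-- whichever finishes first, and cancel the other one.
speculate :: IO a -&gt; IO a -&gt; IO a
speculate left right = do
  result &lt;- newEmptyMVar
  t1 &lt;- forkIO (left  &gt;&gt;= putMVar result)
  t2 &lt;- forkIO (right &gt;&gt;= putMVar result)
  winner &lt;- takeMVar result   -- blocks until the first worker reports
  killThread t1               -- killing an already finished thread is harmless
  killThread t2
  return winner

main :: IO ()
main = do
  -- Two stand-in strategies with very different running times.
  r &lt;- speculate (threadDelay 200000 &gt;&gt; return &quot;slow strategy&quot;)
                 (threadDelay 1000   &gt;&gt; return &quot;fast strategy&quot;)
  putStrLn r
</code></pre>
<p>For pure computations the workers would also need to force their results (for example with <code>Control.Exception.evaluate</code>) before reporting them; otherwise the race would be decided by who hands over an unevaluated thunk first.</p>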
<h2 id="mailing-list-discussions">Mailing list discussions</h2>
<ul>
<li><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-April/090673.html">Weird multi-threading runtime behaviour of single-threaded program with GHC-7.0.3 (1 Apr)</a></p>
<p>Herbert Valerio Riedel has a program which parses a roughly 8 MiB JSON file (with the aeson library) without any use of parallelism or concurrency constructs: just a straightforward, demanding, single-threaded computation. He compiled the program with <code>-threaded</code> and ran it with <code>+RTS -N12</code> on a 12-core machine. By rights, only one core should be doing work, but when he looks at the output of <code>top</code>, all of them seem to consume a substantial number of CPU cycles. What's up with the remaining 11?</p></li>
<li><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-April/091325.html">A parallelization problem (25 Apr)</a></p>
<p>Wren Ng Thornton has some loopy code that he would like to parallelise. He was hoping to use pure parallelism (<code>par</code> and friends) instead of explicit concurrency because he's concerned that &quot;lightweight as Haskell's threads are [...] the overhead would swamp the benefits.&quot; Wren's code has an outer and inner loop fleshing out a map of maps. Wren wants to speed up his inner loop, which he thinks should be possible because the keys in the inner map are independent of each other.</p></li>
<li><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-April/090708.html">Multi agent system (2 Apr)</a></p>
<p>Yves Parès wants to build a simple game with multiple agents that communicate with each other. He asked if it was reasonable to launch one Haskell thread per agent and have them communicate over Chans (he will have something like 200 agents at a time). Felipe Lessa responded that hundreds of agents should be no problem but hundreds of thousands of agents might not work.</p></li>
<li><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-April/090970.html">Using DPH (12 Apr)</a></p>
<p>When experimenting with Repa and DPH, Wilfried Kirschenmann encountered strange compiler errors when parallelising his code; in particular, a strange pattern match failure in a do expression. Wilfried supplied a small demonstrator computing the norm of a vector. Ben Lippmeier responded with help and also a reminder that Repa and DPH are separate projects (the latter still being in the &quot;research prototype&quot; stage). Ben's version with more tweaks from Wilfried was 15x faster than his original (non-parallel?) version but still 15x slower than the C version. Parallelising the Repa sum and fold functions should help.</p></li>
<li><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-April/091014.html">Service Component Architecture and Distributed Haskell (13 Apr)</a></p>
<p>James Keats noticed the recent talk about distributed Haskell and wished to see &quot;interest from an experienced member of the Haskell community in implementing Haskell components with SCA/<a href="http://en.wikipedia.org/wiki/Apache_Tuscany">Tuscany</a>&quot;, as this would help him to use Haskell in the same project with Java and Scala.</p></li>
<li><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-April/091039.html">parallel-haskell mailing list (14 Apr)</a></p>
<p>Eric Kow announced the parallel-haskell mailing list and provided instructions for subscribing to the list. Whereas the Haskell Cafe list and Stack Overflow are the best places to ask your parallelism and concurrency questions, the parallel-haskell list is a great place to follow and discuss the research.</p></li>
<li><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-April/091040.html">Questioning seq (14 Apr)</a></p>
<p>Among other questions about <code>seq</code>, Andrew Coppin asked how <code>pseq</code> differs from it. Austin Seipp responded that while <code>pseq</code> and <code>seq</code> are semantically equivalent, (A) <code>pseq</code> is only strict in its first argument whereas <code>seq</code> is strict in both, and (B) <code>pseq</code> guarantees the order of evaluation, so that <code>pseq a b</code> means <code>a</code> will be evaluated before <code>b</code>, whereas <code>seq</code> makes no such guarantee, which leaves room for potential optimisations. Albert Y. C. Lai provided a short example showing that &quot;there are non-unique evaluation orders to fulfill the mere strictness promise of <code>seq</code>&quot;.</p></li>
<li><p><a href="http://www.haskell.org/pipermail/haskell/2011-April/022730.html">select(2) or poll(2)-like function? (17 Apr)</a></p>
<p>Matthias Kilian wanted to know if the Haskell standard libraries had anything like a <code>select</code> or <code>poll</code> function. Answering the immediate question, Don Stewart pointed to <a href="http://hackage.haskell.org/packages/archive/base/latest/doc/html/Control-Concurrent.html#v:threadWaitRead">Control.Concurrent.threadWaitRead</a>. Meanwhile, the thread grew into a long discussion on using high-level Haskell concurrency constructs vs. low-level polling.</p>
<p>For background, <code>select</code> (along with variants like <code>poll</code>, <code>epoll</code>, <code>kqueue</code>, etc.) is a system call which is often used for asynchronous IO, that is, doing some processing work while an IO operation is underway (Matthias' use case is &quot;networking stuff listening on v4 and v6 sockets at the same time&quot;). A typical way of using <code>select</code> or its cousins would be to implement an event-driven programming model with a &quot;select loop&quot;: sleeping until some condition occurs on a file descriptor and, when woken, executing the appropriate bit of code depending on the file descriptor and the condition.</p>
<p>Some programmers opt for this approach because they believe using processes and OS threads would be too inefficient. Ertugrul Soeylemez and others argue that this is unnecessary in Haskell because Haskell threads are so lightweight (epoll under the hood); we can just stick with high-level concurrency constructs and trust the compiler and RTS to be smarter than us.</p>
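<p>As a rough illustration of that high-level style, here is a minimal sketch in which each input source gets its own Haskell thread and all events are funnelled into a single <code>Chan</code>, so no thread ever runs an explicit select loop. The <code>Event</code> type and the <code>listener</code> function are hypothetical, and <code>threadDelay</code> stands in for a blocking socket read so that the example only needs <code>base</code>.</p>
<pre><code>import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Monad (forM_, replicateM)

-- Hypothetical event type: the name of the source and the value it produced.
data Event = Event String Int deriving Show

-- Stand-in for a blocking read loop on one socket. Each listener is an
-- ordinary Haskell thread that blocks (here: sleeps) and then reports.
listener :: Chan Event -&gt; String -&gt; Int -&gt; [Int] -&gt; IO ()
listener events name delayMicros msgs =
  forM_ msgs $ \m -&gt; do
    threadDelay delayMicros
    writeChan events (Event name m)

main :: IO ()
main = do
  events &lt;- newChan
  _ &lt;- forkIO (listener events &quot;v4&quot; 1000 [1, 2, 3])
  _ &lt;- forkIO (listener events &quot;v6&quot; 1500 [10, 20, 30])
  -- The consumer simply blocks on the channel until something arrives.
  received &lt;- replicateM 6 (readChan events)
  mapM_ print received
</code></pre>
<p>The order in which the two sources interleave is decided by the scheduler, which is exactly the part a hand-written select loop would otherwise have to manage.</p>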
<p>On the other hand, Mike Meyer is sceptical about the scalability and robustness of this solution. The sort of code Mike wants to write is for systems &quot;that run 7x24 for weeks or even months on end [where] [h]ardware hiccups, network failures, bogus input, hung clients, etc. are all just facts of life.&quot; In his experience, using <code>select</code> loops is not about efficiency, but about avoiding the problems that arise from using high-level concurrency constructs in other languages. He prefers to achieve concurrency with a &quot;more robust and understandable&quot; approach using separate Unix processes and an event-driven programming model. So far the only high-level approach Mike has found that provides this sort of scalability is Eiffel's <a href="http://scoop.origo.ethz.ch/">SCOOP</a>.</p>
<p>How does Haskell stack up? And what about Haskell for the Cloud?</p></li>
<li><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-April/091089.html">Killing threads in foreign calls. (17 Apr)</a></p>
<p>Jason Dusek is building an application using Postgres for storage (with the help of the libpq binding on Hackage). He would like to kill off threads for queries that take too long to run, but trying <code>Control.Concurrent.killThread</code> seems not to work. This turns out to be a fairly non-trivial problem: most C code is not written to gracefully handle things like its underlying OS thread being killed off. The solution is to use whatever graceful cancellation mechanism is supplied by the foreign library; the <code>PQcancel</code> function in Jason's case.</p></li>
<li><p><a href="http://www.haskell.org/pipermail/haskell-cafe/2011-April/091134.html">Painless parallelization. (19 Apr)</a></p>
<p>Grigory Sarnitskiy believes that parallel programming is actually easier than sequential programming because it allows one &quot;to avoid sophisticated algorithms that were developed to gain performance on sequential architecture&quot;. So what are his options (preferably &quot;release state&quot;) for writing pure functional parallel code with a Haskellish level of abstraction? Mats Rauhala pointed Grigory to the use of <code>par</code> and <code>pseq</code>, and the RWH chapter on parallelism and concurrency. Don Stewart followed up with a list of options:</p>
<ul>
<li>the &quot;parallel&quot; package</li>
<li>Repa (parallel arrays)</li>
<li>DPH (experimental)</li>
<li>using concurrency</li>
</ul>
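<p>For a flavour of the <code>par</code>/<code>pseq</code> style that Mats mentioned, here is a minimal sketch. It assumes the two combinators are imported from <code>GHC.Conc</code> so that it builds with <code>base</code> alone (the &quot;parallel&quot; package exports the same names from <code>Control.Parallel</code>); <code>fib</code> and <code>parSum</code> are hypothetical placeholders for real work.</p>
<pre><code>import GHC.Conc (par, pseq)

-- Naive Fibonacci, used only as a CPU-bound placeholder workload.
fib :: Int -&gt; Integer
fib n
  | n &lt; 2     = toInteger n
  | otherwise = fib (n - 1) + fib (n - 2)

-- Spark x for possible parallel evaluation, evaluate y on the current
-- thread, and only then combine the two results.
parSum :: Int -&gt; Int -&gt; Integer
parSum a b = x `par` (y `pseq` (x + y))
  where
    x = fib a
    y = fib b

main :: IO ()
main = print (parSum 25 24)
</code></pre>
<p>The use of <code>pseq</code> rather than <code>seq</code> matters here: it ensures <code>y</code> is evaluated before the sum demands <code>x</code>, which gives the spark a chance to run on another core. To see any parallelism, the program needs to be compiled with <code>-threaded</code> and run with something like <code>+RTS -N2</code>.</p>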
<p>Grigory's question also prompted Eric Kow to prioritise the <a href="http://haskell.org/haskellwiki/Parallel">Parallel Haskell portal</a>. Hopefully it will give us a standard resource to point new parallel Haskellers like Grigory to.</p></li>
</ul>
<h2 id="stack-overflow">Stack Overflow</h2>
<ul>
<li><a href="http://stackoverflow.com/questions/6185189/howto-kill-a-thread-in-haskell">Howto kill a thread in Haskell</a></li>
<li><a href="http://stackoverflow.com/questions/5926501/comparing-haskell-threads-to-kernel-threads-is-my-benchmark-viable">Comparing Haskell threads to kernel threads - is my benchmark viable?</a></li>
<li><a href="http://stackoverflow.com/questions/5942615/fair-concurrent-map-function-in-haskell">Fair concurrent <code>map</code> function in haskell?</a></li>
<li><a href="http://stackoverflow.com/questions/5935852/program-crashing-when-trying-to-create-multiple-threads">Program crashing when trying to create multiple threads</a></li>
<li><a href="http://stackoverflow.com/questions/5879128/a-way-to-form-a-select-on-mvars-without-polling">A way to form a 'select' on MVars without polling.</a></li>
<li><a href="http://stackoverflow.com/questions/5847642/haskell-lightweight-threads-overhead-and-use-on-multicores">Haskell lightweight threads overhead and use on multicores</a></li>
<li><a href="http://stackoverflow.com/questions/6242442/haskell-repa-mapping-with-indices">Haskell repa - mapping with indices</a></li>
<li><a href="http://stackoverflow.com/questions/6260263/do-accelerate-and-repa-have-different-use-cases">Do Accelerate and Repa have different use cases?</a></li>
</ul>
<h2 id="help-and-feedback">Help and feedback</h2>
<p>Got something for the digest? Send us an email at <script type="text/javascript">
<!--
h='&#x77;&#x65;&#108;&#108;&#x2d;&#116;&#x79;&#112;&#x65;&#100;&#46;&#x63;&#x6f;&#x6d;';a='&#64;';n='&#112;&#x61;&#114;&#x61;&#108;&#108;&#x65;&#108;';e=n+a+h;
document.write('<a h'+'ref'+'="ma'+'ilto'+':'+e+'">'+'<code>'+e+'</code>'+'<\/'+'a'+'>');
// -->
</script><noscript>&#112;&#x61;&#114;&#x61;&#108;&#108;&#x65;&#108;&#32;&#x61;&#116;&#32;&#x77;&#x65;&#108;&#108;&#x2d;&#116;&#x79;&#112;&#x65;&#100;&#32;&#100;&#x6f;&#116;&#32;&#x63;&#x6f;&#x6d;</noscript>. We're particularly interested in short parallelism/concurrency puzzles, and cool projects for featured code. Other comments and feedback (criticism and corrections especially!) would be welcome too.</p>
]]></content>
</entry>
<entry>
    <title type="text">Parallel GHC project: new opportunity for an organisation to participate</title>
    <published>2011-06-08T18:55:48Z</published>
    <updated>2011-06-08T18:55:48Z</updated>
    <link rel="alternate" type="application/atom+xml" href="http://www.well-typed.com/blog/54" />
    <summary type="text"><![CDATA[GHC HQ and Well-Typed are pleased to announce a new opportunity for an organisation to take part in the Parallel GHC Project. The project started in November 2010 with four industrial partners, and consulting and engineering support from Well-Typed. Each organisation is working on its own particular [...]]]></summary>
    <author><name>duncan</name></author>
    <category term="parallel" />
    <category term="well-typed" />
    <id>http://www.well-typed.com/blog/54</id>
    <content type="html"><![CDATA[<p
>GHC HQ and Well-Typed are pleased to announce a new opportunity for an
organisation to take part in the <a href="http://haskell.org/haskellwiki/Parallel_GHC_Project"
  >Parallel GHC Project</a
  >.</p
><p
>The project started in November 2010 with four industrial partners, and
consulting and engineering support from Well-Typed. Each organisation is
working on its own particular project making use of parallel Haskell.
The overall goal is to demonstrate successful use of parallel Haskell
and along the way to apply engineering effort to any problems with the
tools that the partner organisations might run into.</p
><p
>We have capacity to support another partner organisation for the
remaining duration of the project (at least another 12 months).
Organisations do not need to contribute financially but should be
prepared to make a significant commitment of their own time. Familiarity
with Haskell would be helpful, but Haskell expertise is not needed.
Partner organisations' choice of projects is similarly open-ended and
could be based on anything from pre-existing code bases to green field
endeavours.</p
><p
>We would welcome organisations interested in pure parallelism,
concurrency and/or distributed Haskell. Presently, two of our partner
organisations are using mainly pure parallelism and two are using
concurrency. What would be especially interesting for us is to diversify
this mix further by working with an organisation interested in making
use of distributed Haskell, in particular the work highlighted in the
recent paper <a href="http://research.microsoft.com/en-us/um/people/simonpj/papers/parallel/remote.pdf"
  >&quot;Haskell for the Cloud&quot; (pdf)</a
  >.</p
><p
>To help give an idea what participating in the Parallel GHC Project is
like, here is some of what our current partner organisations have
to say:</p
><blockquote
><p
  >The Parallel GHC Project has enabled us to make steady progress
towards our goals. Well-typed has provided support in the form
of best practice recommendations, general engagement with the
project, and directly coding up key components.</p
  ><p
  >I have been getting lots of help from Well-Typed, and enjoy
our weekly meetings.</p
  ><p
  >&#8213; Finlay Thompson, Dragonfly</p
  ></blockquote
><blockquote
><p
  >My organization is now trying to implement highly concurrent Web
servers. After GHC 7 was released we faced several critical bugs
in the new IO manager and one guy at Well-Typed kindly fixed all
the bugs. This has been a big benefit for our organization.</p
  ><p
  >Another benefit is feedback/suggestions from Well-Typed.
Well-Typed and our organization have meetings every other week
and we report progress to each other. During the discussions, we
can set the right direction to go in.</p
  ><p
  >&#8213; Kazu Yamamoto, IIJ Innovation Institute Inc.</p
  ></blockquote
><p
>Well-Typed is coordinating the project, working directly with the participating
organisations and the Simons at GHC HQ. If you think your organisation may be
interested then get in touch with me, Duncan Coutts,
via <a href="mailto:info@well-typed.com"
  >info@well-typed.com</a
  >.</p
>]]></content>
</entry>
</feed>

