Haskell Performance and Optimization

2–4 days

A course for developers who have made their first steps in Haskell and written a few programs, and who now want to take a look beyond the surface of Haskell and really get the details of how things are implemented.

It is part of the beauty of Haskell that you typically do not have to think too hard about performance. You can write code in a concise and declarative way, and still have it perform fast.

Sometimes, however, it pays off to know more about how programs run. Choosing the right data structure can make the difference between a program that runs instantaneous or one that runs forever. Making programs either too strict or too lazy can cause space leaks, which may not only result in too much memory being used (which can be very bad for long-running processes), but also result in more work for the garbage collector and make everything slower.

In this course, we’ll take a look at how Haskell (and GHC) implement things internally. We’ll discuss the internal representation of data on the heap, what exactly lazy evaluation means and how it works, how the compiler translates Haskell code to a target language via several internal representations, what you can and cannot reasonably expect the compiler to do, and how you can tweak the optimizer behaviour by using compiler pragmas such as inlining annotations and rewrite rules.

Despite all the low-level concept discussions, we will keep track of the high-level goals: to wrap certain performance-critical parts of the code in a high-level interface, so that you do not need to expose ugly details on the surface, but can write beautiful programs that scale. For this, the course will look at existing widely used and highly optimized libraries and take a look at how they are implemented internally. The course will also work through several carefully crafted hands-on exercises and examples to deepen the understanding of the concepts discussed and to illustrate common pitfalls and corner cases.

This course is designed such that Fast Track to Haskell covers all its prerequisites. However, the course is likely to be enjoyed more by participants who already have gained some practical experience in writing Haskell code.

Topics

The course includes the following topics:

  • Common Haskell data structures and their performance characteristics, such as lists, sequences, finite maps, hash maps, byte strings, text and arrays.

  • Unboxed types and the tradeoffs between boxed and unboxed types.

  • Lambda calculus and evaluation strategies; the differences between call-by-value, call-by-name and call-by-need (lazy evaluation).

  • GHC’s compilation process and its intermediate language: Core, STG and C--, with an emphasis on understanding Core.

  • Strictness analysis and space leaks.

  • Debugging tools such as time and space profiling.

  • How the inliner works and how it can be tweaked. How custom rewrite rules work.

  • Some interesting optimizations such as lambda lifting, (stream) fusion and worker-wrapper.

Duration

The minimal duration of this course is two full days. The course can easily be extended to up to four days, by giving more time for additional exercises and examples, covering some of topics in more detail, and adding some additional topics.

If delivered remotely, it is typically advisable to spread out the course sessions over a few more days, to provide the opportunity for participants to work on exercises and get feedback on their work in between sessions.

Cost

The base price of this course is GBP 3000 (one lecturer, two days, on-site). The base price excludes VAT and any other applicable taxes as well as travel costs which depend on the location of the course venue.

Prices for extending the course or adding additional lecturers (for larger on-site groups) on request.

We offer on-site consulting in combination with on-site courses at a reduced daily rate.

If you are interested in this course, or for more information, please email us with as many details as possible.