We enjoyed taking part in the Haskell Implementors’ Workshop (HIW 2021) this year, as part of ICFP 2021. Many thanks to the program chair, Ningning Xie, and the other organisers!

This year Andres served on the programme committee and Edsko co-hosted the workshop, with both of them chairing sessions on the day. Edsko and Douglas gave talks, and Ben helped deliver the annual GHC status update along with Simon Peyton Jones.

The full videos from the day are now available. Read on for more details of contributions to HIW from the Well-Typed team.

Avoiding quadratic GHC core code size

Edsko de Vries, Andres Löh

VideoSlides

There are (at least) two important sources of quadratic code size in GHC core: (1) we have no ability to inspect or update part of a record (or indeed type class dictionary) without referring to every field in the record individually, and (2) it is difficult or impossible to control sharing at the type level. This lack of type level sharing means for example that a chain of n applications of the Applicative <*> operator is O(n^2) in size.

This does not matter for small chains and small types, but it can make compilation slow and memory hungry for large types. In this talk we will take a closer look at the various sources of quadratic core code size, and then introduce the large-records library. This library makes it possible to define large records with an O(n) core size, at the cost of using an untyped internal representation. We discuss how the library works and the API that users are provided for working with these records, including an generics interface; the latter is styled on generics-sop but differs quite in a bit in the details to avoid all O(n^2) pitfalls. For a module containing just a single record with 100 fields, using large-records reduces the GHC core code size by more than two orders of magnitude.

Improvements to GHC’s parallel garbage collector

Douglas Wilson

VideoSlides

GHC’s parallel garbage collector periodically “stops the world” to perform collections. This requires synchronisation between all threads in the program, which must which must all be stopped before a collection can begin. This synchronisation imposes a significant cost on the runtime on both GHC and all parallel programs build with GHC. In its pre-9.2 implementation, this cross-thread synchronisation is achieved through spin locks and (on Linux) calls to sched_yield. We have modified this implementation to use mutexes and condition variables. These modifications will be available in GHC 9.2 We present benchmarks showing dramatic performance improvements for GHC itself while building the Cabal library, and provide some explanation for the surprising magnitude of the improvements.

The state of GHC

Simon Peyton Jones (Microsoft Research), Ben Gamari

VideoSlides

This presentation (a regular fixture at HIW) summarised the latest news about developments in GHC. We were particularly happy to see improvements in GHC’s compile-time performance being reported.