This is the tenth edition of our GHC activities report, which describes the work on GHC and related projects that we are doing at Well-Typed. The current edition covers roughly the months of December 2021 and January 2022.

You can find the previous editions collected under the ghc-activities-report tag.

A bit of background: One aspect of our work at Well-Typed is to support GHC and the Haskell core infrastructure. Several companies, including IOHK, Facebook, and GitHub via the Haskell Foundation, are providing us with funding to do this work. We are also working with Hasura on better debugging tools. We are very grateful on behalf of the whole Haskell community for the support these companies provide.

If you are interested in also contributing funding to ensure we can continue or even scale up this kind of work, please get in touch.

Of course, GHC is a large community effort, and Well-Typed’s contributions are just a small part of this. This report does not aim to give an exhaustive picture of all GHC work that is ongoing, and there are many fantastic features currently being worked on that are omitted here simply because none of us are currently involved in them in any way. Furthermore, the aspects we do mention are still the work of many people. In many cases, we have just been helping with the last few steps of integration. We are immensely grateful to everyone contributing to GHC. Please keep doing so (or start)!

Team

The current GHC team consists of Ben Gamari, Andreas Klebinger, Matthew Pickering, Zubin Duggal and Sam Derbyshire.

Many others within Well-Typed, including Adam Gundry, Alfredo Di Napoli, Alp Mestanogullari, Douglas Wilson and Oleg Grenrus, are contributing to GHC more occasionally.

Releases

  • Ben continued work preparing 9.2.2.
  • Zubin released GHC 9.0.2.
  • Matt reworked CI to make sure that binary distributions don’t depend on libnuma. This had regressed recently with the Debian 10 release bindists.
  • Matt, Zubin and then Matt again tag-teamed on some patches to allow GHC’s Hadrian build system to be bootstrapped without cabal-install. This allows GHC to be built by just depending on GHC and normal system dependencies.
  • Matt mastered the art of the ReadP parser in order to remove the parsec (and hence text) dependency from GHC.
  • Matt added Debian 11 release jobs to GHC’s CI pipeline.
  • Zubin is close to finishing a patch which makes the GHC library reinstallable. This allows users of the GHC API to rebuild it like any ordinary Haskell package, rather than being forced to use the versions of the boot libraries that shipped with their GHC installation. It also paves the way for potentially smaller GHC binary distributions, as profiled libraries and other custom configurations could be built by the user.

Typechecker

  • Sam has fixed a bug in typeclass instance resolution, in which the order in which GHC encountered instances unexpectedly mattered when resolving overlap.
  • Matt worked with Simon to better track the provenance of type variables in order to improve error messages. Now users should never see the dreaded “no skolem info” error. Along the way we added a bunch of new assertions, which should mean fewer confusing bugs in future.
  • Matt finished implementing the instance environment rough map, which provides a pre-filter before performing instance matching. The patch was started by Ben in order to improve type family instance matching, but Matt added an observation which also greatly improved type class instance matching. In some profiles up to 50% of typechecking time was spent looking up instances; with this patch that cost has been greatly reduced (#20933, #9805).
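
To illustrate the idea of a pre-filter before instance matching, here is a minimal sketch of a trie keyed on the head type constructor of each class argument. It is not GHC's implementation: the names RoughTc, RoughMap, insertRM and lookupRM are hypothetical, type constructors are plain strings, and the lookup is deliberately conservative (it returns a superset of the instances that could match, which a real matcher would then check properly).

```haskell
module Main (main) where

import qualified Data.Map.Strict as Map
import Data.Maybe (fromMaybe)

-- | Rough key for one argument position: 'Just' the head type constructor,
-- or 'Nothing' when the head is a type variable (acts as a wildcard).
type RoughTc = Maybe String

-- | A trie indexed by lists of rough keys (hypothetical, simplified).
data RoughMap a = RoughMap
  { rmHere     :: [a]                          -- values whose key ends here
  , rmWildcard :: Maybe (RoughMap a)           -- child for a 'Nothing' position
  , rmKnown    :: Map.Map String (RoughMap a)  -- children for 'Just tc' positions
  }

emptyRM :: RoughMap a
emptyRM = RoughMap [] Nothing Map.empty

-- | Insert a value under its rough key.
insertRM :: [RoughTc] -> a -> RoughMap a -> RoughMap a
insertRM [] x rm = rm { rmHere = x : rmHere rm }
insertRM (Nothing : ks) x rm =
  rm { rmWildcard = Just (insertRM ks x (fromMaybe emptyRM (rmWildcard rm))) }
insertRM (Just tc : ks) x rm =
  rm { rmKnown = Map.alter (Just . insertRM ks x . fromMaybe emptyRM) tc (rmKnown rm) }

-- | Return every value whose key is compatible with the query.
-- Wildcard entries are always candidates; an unknown query position
-- ('Nothing') keeps all children, so the result is a safe over-approximation.
lookupRM :: [RoughTc] -> RoughMap a -> [a]
lookupRM [] rm = rmHere rm
lookupRM (k : ks) rm = wild ++ known
  where
    wild  = maybe [] (lookupRM ks) (rmWildcard rm)
    known = case k of
      Just tc -> maybe [] (lookupRM ks) (Map.lookup tc (rmKnown rm))
      Nothing -> concatMap (lookupRM ks) (Map.elems (rmKnown rm))

-- | A toy instance environment for a single-parameter class.
instances :: RoughMap String
instances = foldr (uncurry insertRM) emptyRM
  [ ([Just "Int"],   "Show Int")
  , ([Just "Bool"],  "Show Bool")
  , ([Just "Maybe"], "Show (Maybe a)")
  , ([Nothing],      "Show a")
  ]

main :: IO ()
main = do
  -- Only the Int instance and the wildcard instance survive the pre-filter.
  print (lookupRM [Just "Int"] instances)
  -- An unknown query head cannot rule anything out.
  print (length (lookupRM [Nothing] instances))
```

The point of the sketch is that a query with a known head constructor skips every instance filed under a different constructor without running the full matcher on it.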

GHCi

  • Ben worked closely with GHC contributor @nineonine to understand and fix a bug in GHC’s bytecode interpreter triggered by use of unlifted data types (#20194).

Typechecker plugins

  • Sam has fixed a bug in which unsolved Wanted constraints emitted by typechecking plugins would be reported to the user with an incorrect source location.

Code generation

  • Ben is engaged in a multi-week effort to update GHC’s native Windows toolchain to use Clang and LLVM (#21019). This effort should solve a number of issues:

    • Fixes a number of code generation issues traceable to the current GNU toolchain (e.g. #16780)
    • Significantly improves compilation, and more importantly, linking times (#16084)
    • Eliminates a class of compatibility issues faced by users of the LLVM backend (#16354)
    • Fixes some numerical precision issues due to the old msys runtime currently used (#15670)
    • Enables address space layout randomization for Haskell code

    However, this has required a significant investment of effort, involving changes in the linker, code generator, runtime system, compilation driver, build system, and packaging infrastructure. Happily, much of this effort, particularly that in the runtime system, will benefit other platforms as well due to improvements in linker robustness and various refactorings.

  • Ben refactored the compiler’s backend to consistently account for alignment of memory accesses, ensuring that the indexWord8ArrayAs*# family of primops are correctly lowered on platforms which lack support for unaligned memory accesses (#21015).
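
As a rough illustration of why alignment matters for the indexWord8ArrayAs*# primops: on a platform without unaligned loads, a multi-byte read at an arbitrary byte offset has to be assembled from individual byte reads. The sketch below shows that assembly in ordinary Haskell over a list of bytes. It is not how GHC's code generator is written; readWord16LE is a hypothetical helper, and little-endian byte order is an assumption made for the example.

```haskell
module Main (main) where

import Data.Bits (shiftL, (.|.))
import Data.Word (Word16, Word8)

-- | Hypothetical helper: read a 16-bit little-endian value starting at any
-- byte offset by combining two single-byte reads. No aligned 16-bit load is
-- ever needed, so the offset may be odd.
readWord16LE :: [Word8] -> Int -> Word16
readWord16LE bytes off = fromIntegral lo .|. (fromIntegral hi `shiftL` 8)
  where
    lo = bytes !! off        -- least significant byte comes first
    hi = bytes !! (off + 1)  -- most significant byte follows

main :: IO ()
main = do
  let buf = [0x00, 0x34, 0x12, 0xff] :: [Word8]
  -- Offset 1 is not 2-byte aligned, but the byte-wise read still works.
  print (readWord16LE buf 1 == 0x1234)
```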

Primops

  • Sam has added support for levity-polymorphic arrays and mutable variables. The primitive array types (Array#, MutableArray#, SmallArray#, SmallMutableArray#) can now hold unlifted values, with support for all the relevant operations (e.g. writeArray#, indexArray#), so the untyped ArrayArray# interface is no longer needed. Mutable variable types such as MutVar#, MVar# and TVar# can now hold unlifted values too. This completes the implementation of the BoxedRep proposal.

  • In response to ticket #20769, Ben debugged a memory safety issue in the shortbytestring library. In the course of doing so, he also implemented optional dynamic bounds-checking functionality for the various array primops provided by GHC, filling a long-present need (#15092).

Runtime system

  • Doug fixed several statistics accounting problems in the garbage collector.
  • Doug fixed all “fail to inline” warnings in the RTS.
  • Ben investigated and fixed a bug in the linker’s m32 memory allocator, which resulted in compiler errors on OpenBSD (#20734).
  • Ben finished a refactor of the runtime’s eventlog infrastructure, reducing initialization overhead and fixing a few subtle races that arise when the number of capabilities changes (#18948, !4477).
  • Ben started implementing support for finalizers in the runtime system linker. This is a first step on the road to robust C++ support in GHC, as now required by text (#20494).
  • Ben continued work characterising a subtle potential memory-ordering bug affecting blackholing of CAFs (#20129).
  • Ben fixed a number of bugs in WinIO traceable to inconsistencies in the treatment of deadlock detection for IO requests (#18382).

Template Haskell and Plugins

  • Zubin has been investigating the causes of segfaults which crop up when using different GHC installations, such as a reinstalled GHC binary, or a static HLS binary compiled on a different machine and distributed via GHCup or VSCode (#20742, #19896).

Error messages

  • Sam has migrated the reporting of unsolved Wanted constraints to use the new rich diagnostic infrastructure. This means tooling such as Haskell Language Server can give more fine-grained structured information, instead of having to parse the error message text.

Parsing

  • Sam fixed some issues surrounding the parsing and pretty-printing of unboxed sum types. This means that GHC can now parse and display unsaturated unboxed sum type constructors such as (# | | #).

Driver

  • Matt finalised and merged the support for multiple home units. The next step is to add support to Cabal and HLS.

  • Matt looked into adding support for mold (a new modern linker), but some shortcomings in mold blocked further progress.

Profiling

  • Andreas fixed a minor bug for the -fcaller-cc mode of adding profiling cost-centres (#20854).
  • Andreas is working on a new mode for automatic cost-centre annotations in !7242, which has been made possible through generous funding from Hasura. In this mode GHC inserts cost-centres after optimization. Generally this results in more accurate profiles, which correspond much better to the characteristics of non-profiled builds. However, profiles can sometimes be harder to interpret, as GHC-internal names for functions will leak into the profile. For GHC versions 9.2 and 9.0 there is an experimental plugin available here which offers similar functionality.

Libraries

  • Ben fixed a number of bugs and regressions in the process library (#219, #227, #224).

Compiler Performance

  • Adam and Sam are finalising their work on directed coercions, which is expected to greatly speed up programs involving many type family reductions, such as the programs reported in #8095. The idea is to introduce a new representation for coercions, called directed coercions, which store fewer redundant types. This avoids producing coercions whose size is quadratic in the number of type family reduction steps.

  • Matt noticed that the Core lint phase was very slow and that we were running some performance tests with Core lint enabled. Now the in-tree performance tests and head.hackage tests collect more accurate performance statistics.

  • Matt backported a number of space-leak and performance fixes to the 9.2 branch.

  • Matt thought very carefully about the mkCoreUnfolding function. Being slightly smarter about when things are forced led to a small improvement in maximum residency (#20905).

Infrastructure

  • Matt fixed some bugs with the merge bot; we now correctly use the merge base commit to calculate the difference in performance tests and search all commits in the batch for possible performance changes.

  • Matt and Ben worked on adding a “Notes” linter to ensure that notes are referenced accurately. As part of this, Matt made it possible to run linters via the testsuite so that it is easier to accept changes locally.

  • Zubin improved the GHC testsuite so that it can ask Hadrian to build the dependencies needed by particular tests. These dependencies are now built on demand, only when the tests that need them are actually requested to run.

  • Zubin fixed the GHC testsuite’s support for the stage 1 compiler as well as out-of-tree compilers, so that partial builds of GHC and compiler binaries from elsewhere can be easily tested.

  • Zubin taught Hadrian about custom Setup.hs scripts and used this to move logic required for reinstallable lib:ghc into the custom Setup.hs scripts and avoid duplication.