This is the ninth edition of our GHC activities report, which describes the work on GHC and related projects that we are doing at Well-Typed. The current edition covers roughly the months of October and November 2021.

You can find the previous editions collected under the ghc-activities-report tag.

A bit of background: One aspect of our work at Well-Typed is to support GHC and the Haskell core infrastructure. Several companies, including IOHK, Facebook, and GitHub via the Haskell Foundation, are providing us with funding to do this work. We are also working with Hasura on better debugging tools. We are very grateful on behalf of the whole Haskell community for the support these companies provide.

If you are interested in also contributing funding to ensure we can continue or even scale up this kind of work, please get in touch.

Of course, GHC is a large community effort, and Well-Typed’s contributions are just a small part of this. This report does not aim to give an exhaustive picture of all GHC work that is ongoing, and there are many fantastic features currently being worked on that are omitted here simply because none of us are currently involved in them in any way. Furthermore, the aspects we do mention are still the work of many people. In many cases, we have just been helping with the last few steps of integration. We are immensely grateful to everyone contributing to GHC. Please keep doing so (or start)!

Team

The current GHC team consists of Ben Gamari, Andreas Klebinger, Matthew Pickering, Zubin Duggal and Sam Derbyshire.

Many others within Well-Typed, including Adam Gundry, Alfredo Di Napoli, Alp Mestanogullari, Douglas Wilson and Oleg Grenrus, are contributing to GHC more occasionally.

Releases

  • Ben released GHC 9.2.1 and started work on preparing 9.2.2.
  • Ben and Zubin continue to finalize the 9.0.2 release.

Typechecker

  • Sam introduced Concrete# constraints, which allow GHC to enforce the representation-polymorphism invariants during typechecking. This brings GHC in line with the French approach (generate constraints then solve them).

  • Matt has started to refactor how the provenance of type variables is tracked, in order to solve the dreaded “no skolem info” error messages (#20732).

  • Sam rectified some oddities surrounding defaulting in type families, fixing #17536.

  • Sam improved the way GHC understands class declarations with no methods in hs-boot files (#20661).

  • Sam fixed various bugs in the typechecker, such as incoherence stemming from GHC confusing Type and Constraint in some circumstances (#20521) and a panic involving implication constraints (#20043).

  • Sam ensured that GHC rejects GADT pattern matches in arrow notation, as Alexis King discovered that the current implementation suffers from severe problems (#20469, #20470).
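
To make the representation-polymorphism invariants mentioned above a little more concrete, here is a minimal sketch. It only illustrates the user-visible rule (binders and function arguments must have a known runtime representation); it does not show the Concrete# mechanism itself. The names runThunk and rejected are hypothetical and chosen for illustration.

```haskell
{-# LANGUAGE DataKinds, KindSignatures, MagicHash, PolyKinds, RankNTypes #-}
module Main where

import GHC.Exts (Int (I#), TYPE, (+#))

-- Accepted: only the *result* type is representation-polymorphic.
-- No variable binder or function argument has an unknown representation.
-- (runThunk is a hypothetical name, not a function from base.)
runThunk :: forall r (a :: TYPE r). (() -> a) -> a
runThunk f = f ()

-- Rejected by GHC if uncommented: the binder x would have a
-- representation-polymorphic type, so no code could be generated for it.
--
-- rejected :: forall r (a :: TYPE r). a -> a
-- rejected x = x

main :: IO ()
main =
  -- Instantiate a to the unlifted type Int#, then box the result.
  print (I# (runThunk (\_ -> 3# +# 4#)))
```

Here runThunk is instantiated at an unlifted type, which is fine because the argument it binds (a function) always has a lifted representation.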

Code generation

GHCi

  • Ben fixed a desugarer bug that caused modules to fail to load in GHCi (#20570), enabling the entirety of GHC to be loaded into GHCi.

Runtime system

  • Ben debugged a runtime crash, identifying the cause as an interaction between the garbage collector’s treatment of CAFs and the linker’s code unloading logic (#20649).

  • Ben debugged and fixed a missing non-moving write barrier in the MVar implementation (#20399).

Error messages

  • Sam has been improving the error messages reported to the user in the case of unsolved constraints. For example, GHC can now remind the user about how overlapping instances work (#20542).
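
As a reminder of the rules such a message refers to, the following sketch shows how overlapping instances are resolved. The class Describe and its instances are hypothetical and exist only for this illustration; the exact wording of GHC's error message is not reproduced here.

```haskell
{-# LANGUAGE FlexibleInstances #-}
module Main where

-- Hypothetical class used only for this illustration.
class Describe a where
  describe :: a -> String

-- General instance: may be overridden by a more specific one.
instance {-# OVERLAPPABLE #-} Describe [a] where
  describe _ = "some list"

-- More specific instance: chosen whenever the type is known to be [Char].
instance {-# OVERLAPPING #-} Describe [Char] where
  describe s = "the string " ++ show s

-- Rejected if uncommented: for an unknown a, the [Char] instance
-- unifies with [a] but does not match it, so GHC cannot commit to
-- either instance and reports an unsolved Describe [a] constraint.
--
-- describeAny :: [a] -> String
-- describeAny xs = describe xs

main :: IO ()
main = do
  putStrLn (describe "hi")    -- uses the [Char] instance
  putStrLn (describe [True])  -- uses the general [a] instance
```

The commented-out definition is the kind of program for which a reminder about overlapping instances is helpful.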

Driver

  • Matt has finished the multiple home units patch (!6805), which is now waiting for review. The patch allows multiple packages to be compiled in one GHC session. The largest example tried so far is loading the whole of head.hackage at once, which amounts to 4700 modules and 450 packages in a single session.

  • Ben investigated a compilation failure on Windows (#20682), which ended up being due to an incorrect object merging implementation in Cabal. To avoid this, he implemented a new GHC mode, --merge-objs, which tools like Cabal can use to avoid repeating subtle linking logic throughout the ecosystem.

  • Matt fixed and clarified the logic around regenerating interface files (!6846) in --make mode. This only affects projects which have hs-boot files, but it was causing confusing Core Lint errors for packages such as Agda when built with HEAD.

  • Matt corrected some more bugs in -dynamic-too recompilation checking and tried to tidy up some loose ends to do with -dynamic-too. The testsuite coverage is also now much better for this feature (!6583).

  • Matt enabled support in GHCi for CApi FFI calls (!6904).
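
For readers unfamiliar with the CApi calling convention mentioned in the last item, here is a minimal sketch of such a foreign import. It binds the standard C function abs from stdlib.h; the Haskell name c_abs is our own choice. Whether a module like this can be loaded in GHCi depends on using a GHC version that includes !6904.

```haskell
{-# LANGUAGE CApiFFI #-}
module Main where

import Foreign.C.Types (CInt (..))

-- With the capi calling convention, GHC generates a C wrapper that
-- includes the named header, so the call goes through the C API
-- (macros and all) rather than assuming a particular ABI.
foreign import capi "stdlib.h abs" c_abs :: CInt -> CInt

main :: IO ()
main = print (c_abs (-5))
```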

Profiling

  • Matt and Andreas have been working on improvements to ticky profiling. Each ticky counter is now given a source location using the info table map. This makes it much easier to work out which part of your program each counter has come from. We have also modernised how to inspect a ticky profile by adding support to eventlog2html. The profile is now rendered as an interactive, searchable, sortable table rather than the fixed textual format.

  • Ben fixed a data race in the ticky profiler which can result in runtime hangs during profile generation (#20451).

  • Ben introduced support for running hpc code coverage collection on GHC and is working to significantly improve the performance of the hpc library.

Libraries

  • Andreas revived an older patch for adding HasCallStack constraints to some notorious functions in base. The patch was initially written by Oleg Grenrus, and the proposal was discussed in #17040.

  • Matt has finished the revert of the Data.List specialisation patch which caused a large amount of unexpected ecosystem churn. The proposal now awaits a new plan from the revamped CLC.

  • Ben integrated text-2.0 into the compiler, fixing a number of issues pertaining to C++ linkage in the process (#20346).

  • Ben debugged a link failure when building foreign libraries on Windows and submitted a fix upstream to Cabal (#20520).
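
To illustrate what a HasCallStack constraint buys, as in the base patch mentioned at the start of this section, here is a minimal sketch. The function headWithStack is hypothetical and is not the definition used in base or in the patch; it only shows the general technique.

```haskell
module Main where

import GHC.Stack (HasCallStack)

-- Hypothetical partial function for illustration only.
-- Because of the HasCallStack constraint, a call to error inside it
-- records the location of the *caller* of headWithStack as well as
-- the location of the error call itself.
headWithStack :: HasCallStack => [a] -> a
headWithStack (x : _) = x
headWithStack []      = error "headWithStack: empty list"

main :: IO ()
main = do
  print (headWithStack [1, 2, 3 :: Int])
  -- Uncommenting the next line makes the program fail with an error
  -- whose call stack points at this call site in main.
  -- print (headWithStack ([] :: [Int]))
```

Without the constraint, the error would only report the location inside the partial function, which is rarely the information needed to track down the offending call.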

Compiler Performance

  • Matt continued to look into memory usage and found some large improvements for GHCi users. Now the memory overhead when reloading packages should be lower (!6773) and some leaks when using -fno-code were sorted out (!6775). Peak memory usage when loading Agda into GHCi is reduced by half, from 5GB to 2.5GB.

  • Matt, Sam and Zubin worked together to identify that part of the backpack implementation had adverse memory usage consequences for standard users (!6763).

  • Matt wrote down (!6758) some heap structure invariants which we think should hold. These invariants are things which can be checked using ghc-debug.

  • Adam and Sam have continued to work on directed coercions. These store less information than ordinary coercions, which helps avoid generating quadratically-large Core when reducing type families (which is one of the main causes behind slow compilation of programs using type families, #8095).

Runtime Performance

  • Andreas looked at some issues regarding the CmmSink optimization in GHC (#20679, #20334). They were partially resolved by !6981. The remaining issues are discussed in #20679 and there is a WIP patch in !6988 which should fix these for good. This improves register allocation in certain edge cases involving unlifted data types or records with a large number of fields.

Infrastructure

  • Matt stabilised the CI performance tests by realising they were sensitive to the size of the environment and hence the length of a commit message (!6612).

  • Ben worked to fix a number of remaining testsuite failures in the statically-linked Alpine build (#20574, #20523, #20706).

  • Ben reworked the provisioning of the FreeBSD CI runners, started work to add FreeBSD 13 targets, and worked to debug many of the testsuite failures present on FreeBSD (#20095, #19723, #20354).

  • Ben reworked the infrastructure for managing the Linux runners provided by Azure.

  • Ben fixed a number of packaging issues (#19963, #20592, #20707).

  • Ben started collecting patches to remove the make build system from GHC, in preparation for a full migration to Hadrian.