This is the nineteenth edition of our GHC activities report, which describes the work on GHC and related projects that we are doing at Well-Typed. The current edition covers roughly the months of June and July 2023. You can find the previous editions collected under the ghc-activities-report tag.

Many thanks to our sponsors who make this work possible: Anduril, Hasura and Juspay. In addition, we are grateful to Mercury for funding specific work on improved performance for GHC, HLS and related projects. However, we need more sponsorship to sustain the team! If your company might be able to contribute funding to sustain this work, please read about how you can help or get in touch.

Of course, GHC is a large community effort, and Well-Typed’s contributions are just a small part of this. This report does not aim to give an exhaustive picture of all GHC work that is ongoing, and there are many fantastic features currently being worked on that are omitted here simply because none of us are currently involved in them in any way. Furthermore, the aspects we do mention are still the work of many people. In many cases, we have just been helping with the last few steps of integration. We are immensely grateful to everyone contributing to GHC!

Team

The GHC team at Well-Typed consists of Ben Gamari, Andreas Klebinger, Matthew Pickering, Zubin Duggal, Sam Derbyshire and Rodrigo Mesquita, with Jaro Reinders joining the team for an internship. Many others within Well-Typed are contributing to GHC more occasionally.

Releases

  • Ben has been working on the GHC 9.8 release, and prepared the first alpha.

  • Zubin prepared and released GHC 9.4.6.

  • Ben rewrote the Wiki documentation describing GHC’s release status to improve the transparency of GHC’s release process.

Typechecker & renamer

  • Sam and Matthew refactored the Zonker and TcLclEnv respectively in order to decouple various parts of the compiler, greatly reducing the module dependencies of the Haskell syntax tree and parser.

  • Sam improved pattern-match checking in do-notation, which means that GHC will no longer spuriously emit an incomplete pattern-match warning in code such as do { x@(Right {}) <- ... ; let z = case x of { Right y -> y }; ... }

  • Jaro converted the underlying representation of Unique from Int to Word64 (!10568). This avoids the problem of running out of uniques on 32-bit platforms, which plagued some large packages such as pandoc or Agda.

  • Sam overhauled the treatment of representation-polymorphism checks on representation-polymorphic identifiers. This allows us to perform representation-polymorphism checks on types that are not in argument position, as necessary for e.g. the catch# primop (see #21906 and #21868). This also allows a few more programs to be accepted, e.g. (# , #) @(RR Int) x y can be accepted, where RR is a type family such that RR Int reduces to a concrete RuntimeRep (e.g. IntRep).
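
To make the do-notation item above concrete, here is a minimal sketch in the list monad. The function name keepRights is hypothetical and chosen only for illustration; the shape of the bind and the case expression follows the snippet quoted in the report. The as-pattern in the bind guarantees that x is a Right, so the single-alternative case is in fact complete, and a GHC with the improved checker should no longer warn about it.

```haskell
{-# OPTIONS_GHC -Wincomplete-patterns #-}
module Main where

-- Hypothetical example function, not taken from GHC or the report.
-- In the list monad, a failed pattern match in a bind calls 'fail',
-- which is the empty list, so 'Left' elements are simply skipped.
keepRights :: [Either String Int] -> [Int]
keepRights xs = do
  -- The as-pattern only matches 'Right' values ...
  x@(Right {}) <- xs
  -- ... so this case on 'x' cannot fall through, even though it
  -- lists only the 'Right' alternative.
  let z = case x of { Right y -> y }
  pure z

main :: IO ()
main = print (keepRights [Left "skipped", Right 1, Right 2])
```

On compilers without the fix the program still compiles and runs; the only difference is the spurious warning.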

Records

  • Sam fixed a regression in GHC 9.6 which caused incomplete record update warnings to no longer be emitted.

  • Sam fixed a regression in the deprecation of record fields (!10761).

  • Sam ensured that record fields with NoFieldSelectors do not get imported by top-level variable imports in an import list (!10759). He also made it so that misspelled items in import lists now give rise to similar name suggestions.

Error messages and warnings

  • Matthew used the new error message infrastructure to print different error messages in ghci (!10305), to avoid making suggestions such as :set -package text that don’t make sense outside of ghci.

  • Matthew fixed some issues with pretty-printing and displaying of WARNINGs (!10595, !10752).

  • Sam finished migrating error messages to the new diagnostic infrastructure (!10908).

  • Sam added the ability for GHC to list out all the error codes that it can emit (!10857). This allowed writing a test that checks for the coverage of error codes in the testsuite.

  • Sam fixed a regression, present since GHC 9.6, in which untouchable type variable information was no longer included in error messages (!10800).

Simplifier

  • Matthew investigated a tricky issue involving polymorphic specialisation and type-checking plugins (#23469). In the short term, he prepared a patch that disables polymorphic specialisation by default (!10719), until we can arrive at a more robust solution.

  • Matthew investigated a compile-time loop (#23553), identifying that it was caused by a missing backport of the fix to #22272 (!9160).

  • Matthew identified and fixed a bug in RULE matching that was causing a compiler crash (#23630, !10891).

Driver

  • Matthew fixed the -S flag when compiling .cmm files (!10817). This combination is most likely never used by normal users but it was used in the test-primops testsuite.

  • Zubin tweaked the GHC API to allow clients such as HLS more control over the FinderCache (#23604, !10897).

Runtime system

  • Matthew implemented some missing functionality in ticky-ticky profiling, which should help debug regressions due to changes in stack allocation (!10483).

  • Ben ensured that -Werror is correctly passed to the C compiler when building the runtime system, and fixed the errors that turned up as a result (!9579).

  • Ben replaced the old-style explicit barrier load/store/write functions in the RTS with C11-style acquire/release operations. In particular, this adds C11-style fence operations to GHC (!10628).

  • Ben fixed a segfault by ensuring that allocatePinned takes into account any padding required for alignment, so we don’t allocate beyond the block size (!10524).

  • Ben fixed a deadlock which was caused by a call to exit from within a signal handler, which is not safe; instead, we call _exit which performs only minimal cleanup (#23417, !10511).

  • Ben introduced support for capturing info tables of entered thunks on the stack, enabled by the -forig-thunk-info flag. This makes it easier to debug why a thread is blocked on a blackhole (#23255, !10271).

  • Ben fixed a darwin bug in which the FFI_GO_CLOSURES macro was not defined despite being relied upon (#23568, !10750).

Code generation

  • Andreas identified the root cause of an issue where the AVX instruction set was incorrectly enabled on certain platforms (#23718).

  • Ben and Rodrigo ensured that data constructor wrappers are given the correct LambdaFormInfo, in order to avoid a crash (#23146, !10165).

  • Ben fixed a correctness bug in handling of the MulMayOflo primop on AArch64 (#23721) and improved testing infrastructure for this operation in test-primops.

  • Ben started merging and backporting the various memory ordering fixes that he has collected over the past months.

Linking

  • Matthew debugged a GHCi linking issue (#23580), tracing it to lack of support for Addr# literals. Support for these had already been added, and backporting that change fixed the problem.

  • Ben implemented several correctness fixes related to split-sections, allowing this flag to be re-enabled on Windows (!9810, !10959).

  • Ben characterised and identified the cause of slow dynamic linking performance on Darwin (#23415).

  • Ben worked on fixing the runtime system linker’s support for AArch64 toolchains using out-of-line atomics (#22012).

  • Ben worked on fixing bytecode generation for nullary data constructor workers (#23210).

Packaging

  • Matthew made GHC emit the build metadata that ghcup needs in order to support nightly GHC releases (!10473, !10705, !10808).

  • Matthew fixed the version of emsdk used in the build images after breakage was introduced by an inadvertent upgrade to the toolchain (#23641).

  • Matthew upgraded the i386 CI runners to use Debian 10 to build bindists (!10769).

GHC build system

  • Rodrigo has made significant progress towards a runtime-retargetable GHC with his work on ghc-toolchain (!9263). We plan to publish a blog post with more details soon.

  • Ben fixed the behaviour of Hadrian in the presence of symlinks by canonicalising topDirectory (#22451, !10559).

  • Ben fixed a couple of issues with Hadrian on CentOS platforms (!10760).

  • Ben worked around ld.gold producing invalid static constructor tables on i386 by refusing to use ld.gold (!10764).

  • Ben fixed a Hadrian bug where dependency information for C sources could be generated using different flags than compilation, resulting in broken recompilation avoidance.

Haddock

  • Finley finalised the hi-haddock work, allowing Haddock to re-use interface files created by GHC in order to generate documentation (!10469). He also implemented other memory usage improvements to Haddock.

  • Matthew fixed a bug where default methods were displayed incorrectly in Haddock output (#23616).

  • Finley ensured that Haddock comments are properly inserted into the AST for Backpack signature modules (!10370).

CI & testing

  • Matthew worked out that some performance tests were sensitive to small changes due to sitting on a stack overflow boundary, and added some options in order to stabilise these tests (!10507).

  • Matthew declared dependencies on the python3-dev and wheel-packages to fix issues with the build environment of CI machines running linting jobs (ci-images MR 121).

  • Ben added a test for the interface stability of the base package (!9816). This will ensure that contributions to GHC don’t inadvertently change the interface of base. To improve the developer workflow, he also made CI jobs able to save unexpected test outputs as a CI artifact (!10535).

  • Ben improved the portability of the test-primops testsuite and enabled support for testing of i386 code generation on x86-64 (test-primops MR 10).

  • Matthew worked on integrating the test-primops testsuite into the default CI pipelines (!10910). The goal is to be able to run the testsuite on merge requests that might affect code generation. Between starting and finishing the patch, some patches were merged which introduced bugs caught by test-primops!

  • Matthew slimmed down the merge request validation pipeline. There have been several instances recently where there has been a big queue to start CI for merge requests, and we want to reduce this latency as much as possible. Therefore, fewer jobs now run by default on each merge request, but developers can request full CI treatment by applying the ~full-ci label. Full CI is still run on each merge batch to prevent any bad commits entering the tree.

head.hackage

  • Matthew improved the performance of the Grafana head.hackage allocations dashboard.

Core libraries

  • Ben implemented CLC proposal 144 and CLC proposal 153, rejecting FilePaths containing interior NULs and providing variants of {new,with}CStringLen which NUL-terminate their output. This ensures we don’t silently truncate any FilePaths (#13660, !10110).

  • Ben continued work implementing the joint CLC/GHC plan for separating GHC’s internals from base and introducing a ghc-experimental package for experimental interfaces.

  • Adam identified a soundness bug (#23454) resulting from missing role annotations on SNat, SChar and SSymbol. Matthew opened a CLC proposal to add the missing annotations.

  • Ben added the unsafeThawByteArray# primop, to go with the existing unsafeFreezeByteArray# primop (#22710, !9739).
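
The interior-NUL item above can be illustrated with a small sketch. This is not the code from base or from !10110; validateFilePath and asSeenByC are hypothetical helpers written only for this example. The idea is that a consumer of NUL-terminated C strings stops reading at the first NUL, so a Haskell FilePath with an interior NUL would be silently truncated unless it is rejected up front.

```haskell
module Main where

-- Hypothetical helper (not part of base): reject any path containing a
-- NUL character instead of letting it be truncated later.
validateFilePath :: FilePath -> Either String FilePath
validateFilePath path
  | '\0' `elem` path = Left ("file path contains an interior NUL: " ++ show path)
  | otherwise        = Right path

-- Hypothetical helper: the prefix that a consumer of a NUL-terminated
-- C string would effectively see.
asSeenByC :: FilePath -> FilePath
asSeenByC = takeWhile (/= '\0')

main :: IO ()
main = do
  let suspicious = "report.txt\0.exe"
  -- Without validation, everything after the NUL is silently dropped.
  putStrLn (asSeenByC suspicious)
  -- With validation, the problem is reported instead.
  print (validateFilePath suspicious)
  print (validateFilePath "report.txt")
```

Rejecting the path, rather than truncating it, means the caller finds out that the name it supplied is not the name that would have been used.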

GHC Proposals

  • The GHC Steering Committee accepted Adam’s proposal 581: Namespace-specified imports (#23781). This will make it easier to distinguish between the type and data namespaces in import and export lists, and will fix various issues with the design of the ExplicitNamespaces extension. It builds on efforts by Artyom Kuznetsov and others in previous proposals.
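
For context, the sketch below shows the existing ExplicitNamespaces extension that the proposal builds on. It deliberately does not use the new syntax from proposal 581. The type keyword in the import list says that (+) refers to the type-level operator from GHC.TypeLits rather than a term-level function; the choice of GHC.TypeLits and natVal is ours, made only to keep the example small.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE ExplicitNamespaces #-}
{-# LANGUAGE TypeOperators #-}
module Main where

import Data.Proxy (Proxy (..))
-- 'type (+)' selects the type-level addition family; 'natVal' is an
-- ordinary term-level function from the same module.
import GHC.TypeLits (natVal, type (+))

-- Type-level arithmetic reflected back to a term-level Integer.
five :: Integer
five = natVal (Proxy :: Proxy (2 + 3))

main :: IO ()
main = print five
```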