This is the fifteenth edition of our GHC activities report, which describes the work on GHC and related projects that we are doing at Well-Typed. The current edition covers roughly the months of October and November 2022. You can find the previous editions collected under the ghc-activities-report tag.

One aspect of our work at Well-Typed is to support GHC and the Haskell core infrastructure. Several companies, including IOG and GitHub via the Haskell Foundation, are providing us with funding to do this work. We are also working with Hasura and Mercury on specific improvements. We are very grateful on behalf of the whole Haskell community for the support these companies provide.

If you are interested in also contributing funding to ensure that we can continue or even scale up this kind of work, please read about how you can help or get in touch.

Of course, GHC is a large community effort, and Well-Typed’s contributions are just a small part of this. This report does not aim to give an exhaustive picture of all GHC work that is ongoing, and there are many fantastic features currently being worked on that are omitted here simply because none of us are currently involved in them in any way. Furthermore, the aspects we do mention are still the work of many people. In many cases, we have just been helping with the last few steps of integration. We are immensely grateful to everyone contributing to GHC!

Team

The current GHC team consists of Ben Gamari, Andreas Klebinger, Matthew Pickering, Zubin Duggal and Sam Derbyshire. Many others within Well-Typed are contributing to GHC more occasionally.

Releases

  • Zubin prepared and released GHC 9.2.5.

  • Ben began preparations to fork GHC 9.6.

Driver

  • Ben started a comprehensive rework of how GHC handles native toolchain interactions. This is one of the last prerequisites for making GHC a runtime-retargetable compiler, and it is planned to land in GHC 9.8. (#19877)

  • Matt worked on a prototype for adding parallelism to the simplifier. At the moment it is showing modest improvements in compile time, but we hope to integrate it into the -jsem work for bigger gains. (!9356)

  • Matt modified the driver to capture timing information when building a project, so that simple statistics such as the longest path through the build graph can be calculated (!9435). This is intended to help users catch cases where they introduce parallelism-limiting import chains into their module structure.

  • Sam wrote up GHC proposal #540, which proposes a semaphore-based mechanism for finer-grained control of parallelism between ghc and cabal. Sam implemented this mechanism in GHC, based on a prototype from Douglas Wilson and with help from Matt.

    Figure: Core utilisation when compiling pandoc and all dependencies using -jsem with 8 capabilities
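
To make the longest-path statistic mentioned above concrete, here is a minimal sketch of how such a number could be computed from per-module timings. It is not the code from !9435: the ModuleInfo type, the function names and the example timings are all hypothetical, and the import graph is assumed to be acyclic.

```haskell
import qualified Data.Map as Map
import Data.Map (Map)

-- Hypothetical record: how long a module took to compile and what it imports.
data ModuleInfo = ModuleInfo
  { compileTime   :: Double
  , moduleImports :: [String]
  }

type BuildGraph = Map String ModuleInfo

-- Earliest possible finish time of each module, assuming unlimited cores.
-- The lazy map refers back to itself, so each entry is computed at most once.
-- This only terminates if the import graph is acyclic.
finishTimes :: BuildGraph -> Map String Double
finishTimes graph = times
  where
    times = Map.map finish graph
    finish info =
      compileTime info
        + maximum (0 : [ Map.findWithDefault 0 dep times | dep <- moduleImports info ])

-- Length of the longest (critical) path: a lower bound on the wall-clock
-- build time, no matter how many cores are available.
criticalPathLength :: BuildGraph -> Double
criticalPathLength = maximum . (0 :) . Map.elems . finishTimes

-- Made-up timings: A imports B and C, which both import D.
exampleGraph :: BuildGraph
exampleGraph = Map.fromList
  [ ("A", ModuleInfo 2 ["B", "C"])
  , ("B", ModuleInfo 3 ["D"])
  , ("C", ModuleInfo 1 ["D"])
  , ("D", ModuleInfo 4 [])
  ]
```

In this made-up graph the total work is 10 time units but the chain D, B, A alone takes 9, so extra cores can save at most one unit. A long import chain of this kind is what limits parallelism.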

Compiler performance

  • Andreas helped land !4140, allowing GHC to unpack sum types. Most of the work was originally done by Ömer Sinan Ağacan, formerly of Well-Typed, and most of the rebasing has been done by contributors from Obsidian Systems.

  • Matt investigated some space leaks when doing partial reloads in GHCi. Due to a combination of reasons, GHCi would retain old copies of modules which should have been collected. Fixing these leaks halved the amount of memory necessary to develop packages. For example, Agda memory usage was reduced from 2.4G to 1.3G. (#22530)

  • Zubin fixed a memory leak that manifested when using GHCi with -haddock and without -fwrite-interface, where excessive laziness resulted in retaining intermediate compilation artifacts for the entire GHCi session (!9494).
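
As a rough illustration of what unpacking a sum type means at the source level, consider a strict constructor field whose type is a sum type such as Maybe Int. The Sample type and sampleTotal function below are hypothetical, and the sketch assumes a GHC version that includes the change from !4140. Older compilers should still accept the code but ignore the pragma on the Maybe field, typically with a warning.

```haskell
-- Hypothetical example type. The first field is a sum type (Maybe Int);
-- with sum-type unpacking, GHC may store it inline in the Sample
-- constructor rather than behind a pointer to a separate heap object.
data Sample = Sample {-# UNPACK #-} !(Maybe Int) {-# UNPACK #-} !Int

-- Ordinary pattern matching; whether the field was unpacked does not
-- change how the type is used.
sampleTotal :: Sample -> Int
sampleTotal (Sample Nothing n)  = n
sampleTotal (Sample (Just m) n) = m + n
```

The pragma only affects the runtime representation, so code using Sample stays the same whether or not the field is unpacked.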

Typechecker and renamer

  • Sam has been working on an overhaul of the renamer, simplifying how it handles record fields and making it possible to migrate record disambiguation logic to the renamer from the typechecker. This will fix many bugs (#22125, #21898, #21959, #21443).

  • Sam has been helping GHC contributor Soham Chowdhury improve the treatment of import and export lists, in order to improve error messages (see #21826).
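
For readers unfamiliar with record field disambiguation, the sketch below shows the kind of program involved: two record types in one module sharing a field name. The types and functions are hypothetical and are not taken from any of the tickets above. The example sticks to pattern matching, where the constructor makes clear which field is meant.

```haskell
{-# LANGUAGE DuplicateRecordFields #-}

-- Two hypothetical records that both declare a field called 'name'.
data Person  = Person  { name :: String }
data Company = Company { name :: String }

-- The constructor in the pattern determines which 'name' field is meant,
-- so no type information is needed to pick the right one.
personName :: Person -> String
personName Person { name = n } = n

companyName :: Company -> String
companyName Company { name = n } = n
```

Using name on its own as a selector function in this module would be ambiguous, which is where disambiguation rules come into play.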

Error messages

  • Sam helped GHC contributor Andrei Borzenkov migrate error messages to the new diagnostic infrastructure in GHC.Rename.Expr.

Code generation

  • Sam fixed some bugs in Cmm involving the interaction of unboxed sums and SIMD vectors. (#22187, #22296)

  • Andreas fixed a bug where certain uses of unboxed sums would cause GHC to panic with “Can’t find slot.” (#22208)

  • Ben introduced code generation support for ThreadSanitizer (!6232), giving the sanitizer full visibility into the memory accesses performed by GHC-compiled programs. This revealed a number of potential memory ordering issues (e.g. #22468) which have now been resolved (!9372).

  • Ben finished and merged a patch deduplicating string unpacking thunks, significantly reducing binary sizes of programs containing many String literals. (#16014)

  • Ben fixed a correctness issue in the AArch64 NCG in the handling of operations requiring sign extension. (#22282)

Core-to-Core pipeline

  • Andreas fixed a bug in 9.4.3 and 9.2.5 in which excessive eta-expansion would sometimes cause programs to stop sharing work, resulting in catastrophic slowdowns from repeatedly performing the same work. (#22425)
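
To see why eta-expansion can destroy sharing, compare the two hypothetical definitions below. This is only a sketch of the general phenomenon, not a reproduction of #22425. Both functions return the same results, but they differ in how often the expensive sum is evaluated when the function is partially applied and then called many times.

```haskell
-- Written so that a partial application shares the expensive part:
-- 'total' is computed at most once per application to n, however many
-- times the resulting function is called.
addTotal :: Int -> Int -> Int
addTotal n =
  let total = sum [1 .. n]
  in \x -> total + x

-- The eta-expanded form: 'total' now sits under both lambdas, so it is
-- recomputed on every call.
addTotalEta :: Int -> Int -> Int
addTotalEta n x =
  let total = sum [1 .. n]
  in total + x

-- One partial application, reused for several arguments.
sharedResults :: [Int]
sharedResults = map (addTotal 100) [1, 2, 3]

unsharedResults :: [Int]
unsharedResults = map (addTotalEta 100) [1, 2, 3]
```

If a compiler rewrites the first form into the second when the shared computation is not cheap, a program that calls the partial application in a loop repeats that computation on every iteration.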

Runtime system

  • Andreas looked into some inconsistencies in the behaviour of isByteArrayPinned#, potentially causing segfaults at runtime when adding large byte arrays into a compact-normal-form object.

  • Andreas fixed a race condition when using spark-based parallelisation. When it triggered, some sparks were deemed to be dead and were inappropriately collected, resulting in massively reduced parallelism. (#22528)

  • Ben investigated a number of non-moving GC issues, and in the process introduced some significant latency optimisations. (#22264)

  • Ben continued work, started by contributor John Ericson, to move the runtime system’s configure script logic out of GHC’s top-level configure script. The goal of this work is to clean up the dependencies between GHC, its bootstrap toolchain, and the runtime system, aiding the cross-compilation effort.

  • Ben refactored the representation of info-table provenance information, significantly reducing its size and impact on linking times. (#22077)
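
As background for the spark fix above, here is a minimal sketch of spark-based parallelism using par and pseq as exported from GHC.Conc in base. The function name is hypothetical. The program only runs in parallel when built with -threaded and run with more than one capability; otherwise the spark is simply evaluated by the main thread when its value is needed.

```haskell
import GHC.Conc (par, pseq)

-- Hypothetical helper: sum two lists, offering the first sum to the
-- runtime as a spark while the current thread computes the second.
parSumBoth :: [Int] -> [Int] -> Int
parSumBoth xs ys = left `par` (right `pseq` (left + right))
  where
    left  = sum xs   -- sparked: may be picked up by an idle capability
    right = sum ys   -- evaluated by the current thread first
```

A spark is only a hint. If the runtime discards a spark that is still needed, the result is still correct, but the work falls back to the thread that demands the value, which is how such a bug shows up as lost parallelism.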

Libraries

  • Matt fixed a bug where some wired-in identifiers were given the wrong module in Typeable evidence. (!9459)

  • Ben continued work on polishing up the implementation of the exception provenance proposal (!8869). While sadly the revised proposal itself will not be through the GHC Steering Committee in time for GHC 9.6, we plan to have this work landed well ahead of GHC 9.8.

Testsuite

  • Matt implemented a quality-of-life change in the testing framework, improving the error messages when certain tests fail. (!9249)

  • Ben started introducing support for testing of cross-compilers into GHC’s testsuite driver. (!9184)

ghc-debug

  • Matt finished support for tracing static reference tables. As a result, it is now possible to investigate leaks caused by retained static objects.

  • Matt improved the progress reporting in the terminal UI of ghc-debug.

  • Matt added some new analysis scripts for debugging duplicated strings and byte arrays.

  • Andres revamped ghc-debug’s terminal UI with a new colour scheme and other rendering improvements.

  • Finley fixed a long-standing bug where it was only possible to connect to an instrumented process once. It is now possible to connect to and disconnect from a process as many times as you wish.