One aspect of our work at Well-Typed is to support GHC and the Haskell core infrastructure. Several companies are providing us with funding to do this work, for which we are extremely grateful.
If you are interested in also contributing funding to ensure we can continue or even scale up this kind of work, please get in touch.
Starting immediately, we will try to provide monthly updates on the work we have been doing. In this first edition, we will cover roughly two months, June and July 2020.
Of course, GHC is a large community effort, and Well-Typed’s contributions are just a small part of this. This report is not aiming to give an exhaustive picture of all GHC work that is ongoing, and there are many fantastic features currently being worked on that are omitted here simply because none of us are currently involved in them in any way. Furthermore, the aspects we do mention are still the work of many people. In many cases, we have just been helping with the last few steps of integration. We are immensely grateful to everyone contributing to GHC. Please keep doing so (or start)!
Historically, GHC has performed IO on Windows via the POSIX-compatible interfaces provided by libc. While simple, this implementation strategy has led to numerous bugs due to the impedance mismatch between the POSIX and Windows IO models. WinIO is an overhaul of GHC’s Windows IO subsystem providing GHC with a proper event manager implementation built around Windows’ I/O completion port mechanism.
This brings a number of benefits:
- Unicode support will be more consistent, particularly in console applications (see #18307, #16917, #12869, #10542).
- Server applications will be able to benefit from native asynchronous I/O support.
- Interruption of console applications will work more reliably (see #13516, #13396).
- I/O operations with timeouts (e.g. hWaitForInput) will work consistently across file handle types (see #12873).
- Many more, as summarized in #11394.
The WinIO effort was started and implemented by Tamar Christina, and the vast majority of the work is due to his tireless efforts. Thanks primarily to our client IOHK, we had the funding to help finish this work and integrate it into the
These final steps have involved a great deal of debugging, a final code review, merging work in the haskeline submodules upstream, rebasing and clean-up of the version control history, and merging to master (!1224, !3669).
With this work behind us, we can all look forward to more robust I/O support on Windows in GHC 9.0. However, there remains plenty of work to be done. In particular, taking advantage of the new I/O manager in Haskell’s foundational network library could bring significant performance benefits. This work may require some rearchitecting of network’s implementation. You can follow this work in the corresponding network issue #364.
We have been working on several performance-related changes.
One of the significant costs of GHC/Haskell’s lazy evaluation model arises from the need for checks of whether a value has been evaluated. In GHC this takes the form of a check of a pointer’s “tag bits” [Marlow2007], resulting in frequent conditional branches when scrutinizing lifted values. However, there are many cases where these checks are in principle redundant. For instance, given the program
the compiler should in principle be able to exploit the fact that a_field is strict and of a single-constructor type.
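To make the situation concrete, here is a minimal sketch of the kind of program in question. Only the field name a_field comes from the text above; the types A and T, their constructors, and the function getInt are hypothetical names chosen for illustration.

```haskell
module Main where

-- Hypothetical single-constructor type.
data A = MkA Int

-- Hypothetical record whose a_field is marked strict with a bang.
data T = MkT { a_field :: !A }

-- Scrutinising the field. Because a_field is strict, the value stored in
-- a MkT is already evaluated when the constructor is built, so a check of
-- its tag bits before matching on MkA is in principle redundant.
getInt :: T -> Int
getInt t = case a_field t of
  MkA n -> n

main :: IO ()
main = print (getInt (MkT (MkA 42)))
```

Whether the check is actually elided depends on the code generator; the sketch only shows the shape of code where the invariant could be exploited.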
Well-Typed GHC contributor Andreas Klebinger has recently been looking into tightening up GHC’s invariants surrounding code generation of strict fields and introducing an STG pass to exploit these invariants to elide tag checks.
This month we rebased this work and began a set of performance measurements in preparation for potentially merging for 9.2.
Another performance improvement opportunity in GHC revolves around the compiler’s treatment of dictionary arguments. GHC has long had a flag, -fdicts-strict, which allows GHC’s demand analysis to assume that ad-hoc overloaded functions place strict demands on their dictionary arguments (see #17758). This allows more aggressive application of the worker-wrapper transformation, thereby reducing allocations and tag checks, as mentioned above.
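As a rough illustration, the flag can be tried on a single module with an OPTIONS_GHC pragma. The function sumSquares below is a hypothetical example of an ad-hoc overloaded function; after desugaring it takes a Num dictionary as an extra argument, which is the argument the flag lets demand analysis treat as strict. Since demand analysis only runs with optimisation enabled, we assume the module is compiled with -O.

```haskell
{-# OPTIONS_GHC -fdicts-strict #-}
module Main where

-- An overloaded function: the Num a constraint becomes a dictionary
-- argument that is passed along at runtime unless GHC specialises it away.
sumSquares :: Num a => [a] -> a
sumSquares = go 0
  where
    go acc []     = acc
    go acc (x:xs) = go (acc + x * x) xs

main :: IO ()
main = print (sumSquares [1, 2, 3 :: Int])
```

The flag does not change the meaning of this program; any difference would show up only in the generated code and its allocation behaviour.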
Last month Andreas carried out a characterisation of the effect of enabling -fdicts-strict by default. These measurements confirmed that there is indeed a good improvement in runtime performance to be had by enabling strict dictionaries. However, they also showed that there can be non-negligible regressions in compiler performance for certain cases (e.g. Generics-heavy code, due to more aggressive simplification). This leads us to believe that this is best enabled with
Improving demand analysis for recursive products
We analyzed the source of a demand analysis issue for recursive products which caused #18304. Based on this analysis, Simon Peyton Jones committed a patch that fixes the issue but has the potential to pessimize nested data types.
We are looking into reducing the potential impact of this change for demand analysis for such types while avoiding the looping behavior noted in #18304.
Performance regression testing
We have been addressing inconsistencies in the performance testsuite driver prompted by a recent spate of seemingly-spurious CI failures due to performance tests. We have identified that these failures are in part due to the behavior of the logic which determines the baselines which serve as the basis for comparison of performance metrics. We are currently working on refactoring the driver to make this logic more predictable.
Enabling large address-space support on Windows
GHC has long used a two-step address-space allocation strategy on Linux. This allows the Haskell heap to be allocated into a contiguous block of memory, greatly improving garbage collection efficiency. In the past, we have been unable to enable this scheme on Windows due to platform limitations. This situation has changed in the past few years. For this reason we re-evaluated enabling large address-space support on Windows and found that not only did it not regress in the ways that it did in the past, but it gave quite significant performance improvements. We will be enabling large address-space support by default in GHC 9.0.
List fusion for
There was a buggy rewrite rule that prevented list fusion from working properly in the rather common case where elem is called with a constant list as its second argument (see !2580).
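For reference, this is the shape of code affected. The functions isVowel and countVowels are hypothetical examples; the point is only that the second argument of elem is a constant list.

```haskell
module Main where

-- elem applied to a constant list as its second argument.
isVowel :: Char -> Bool
isVowel c = c `elem` "aeiou"

-- A typical caller that tests many characters against the same constant list.
countVowels :: String -> Int
countVowels = length . filter isVowel

main :: IO ()
main = print (countVowels "list fusion")
```

Whether fusion actually fires for a given call depends on the optimisation level and the GHC version in use.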
Improvements to the linear register allocator
The linear register allocator now remembers past assignments, with benchmarks indicating close to 1% improvement in both run and compile time.
Better code layout for loops
We have been reworking the way GHC implements rebindable syntax (see #17582 for an overview of the existing and new approaches), starting with a patch (!2960) that makes GHC rebind if expressions with the new approach, including some general infrastructure that we will be able to reuse for other constructs. The aforementioned patch has just been merged; the next step is to move the treatment of rebindable monad operations over to the new approach.
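For readers unfamiliar with the feature, here is a small sketch of what rebinding if means from the user's side. With RebindableSyntax enabled, an if expression refers to whichever ifThenElse function is in scope. The Int-based ifThenElse and the describe function below are hypothetical examples, not part of the patch.

```haskell
{-# LANGUAGE RebindableSyntax #-}
module Main where

-- RebindableSyntax implies NoImplicitPrelude, so import it explicitly.
import Prelude

-- A user-supplied ifThenElse that treats 0 as false and anything else as true.
ifThenElse :: Int -> a -> a -> a
ifThenElse n t e
  | n == 0    = e
  | otherwise = t

-- The condition is an Int here, which only typechecks because the if
-- expression is rebound to the ifThenElse defined above.
describe :: Int -> String
describe n = if n then "non-zero" else "zero"

main :: IO ()
main = do
  putStrLn (describe 0)
  putStrLn (describe 5)
```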
We have been working on moving GHC’s error representation from textual documents to properly structured ADT values, as described in this wiki page, in order to make the life of tooling authors easier when it comes to extracting information out of errors (expressions, types, suggestions, …) when using the GHC API. The beginning of the plan described in the wiki page has been implemented and recently pushed as !3691.
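The difference for tooling authors can be sketched as follows. The CompileError type and its constructors are hypothetical and heavily simplified; they are not GHC's actual representation and are not taken from the wiki page or from !3691.

```haskell
module Main where

-- Hypothetical structured error type, for illustration only.
data CompileError
  = TypeMismatch String String   -- expected type, actual type
  | NotInScope String [String]   -- unknown name, suggested alternatives

-- Rendering to text becomes just one consumer of the structured value.
render :: CompileError -> String
render (TypeMismatch expected actual) =
  "Couldn't match expected type " ++ expected ++ " with actual type " ++ actual
render (NotInScope name suggestions) =
  "Variable not in scope: " ++ name
    ++ concatMap ("\n  Perhaps you meant " ++) suggestions

-- Tooling can pattern match on the value instead of parsing the message.
quickFixes :: CompileError -> [String]
quickFixes (NotInScope _ suggestions) = suggestions
quickFixes _                          = []

main :: IO ()
main = do
  let err = NotInScope "lenght" ["length"]
  putStrLn (render err)
  print (quickFixes err)
```

With a purely textual representation, a function like quickFixes would have to recover the suggestions by parsing the rendered message.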
We currently have three releases in-flight:
- GHC 8.8.4, which brings a few important fixes, primarily on Windows, and was released on the 15th of July,
- GHC 8.10.2, which should bring fixes for Windows as well as the new non-moving garbage collector,
- the next major release of GHC, 9.0.1, has been branched and alpha releases will start shortly.
Other open-source work
In addition to our regular work on GHC, we are also performing some open-source work on other Haskell tools, such as cabal-install and Liquid Haskell. We may report on these in future blog posts. We are always interested in improving Haskell and its ecosystem. If you have a project for us, please let us know.
[Marlow2007] Marlow, Simon; Rodriguez Yakushev, Alexey; Peyton Jones, Simon: Faster Laziness Using Dynamic Pointer Tagging. In: ICFP’07, 2007, pp. 277–288. https://www.microsoft.com/en-us/research/wp-content/uploads/2007/10/ptr-tagging.pdf