The lack of type-level sharing in Haskell is a long-standing problem. It means, for example, that this simple Haskell code

```
apBaseline :: Applicative f => (A -> B -> C -> r) -> f r
=
apBaseline f pure f
<*> pure A
<*> pure B
<*> pure C
```

results in core code that is quadratic in size, due to type arguments; in pseudo-Haskell, it will look something like

```
apBaseline :: Applicative f => (A -> B -> C -> r) -> f r
=
apBaseline f pure f)
(<*> @A @(B -> C -> r) (pure A)
<*> @B @( C -> r) (pure B)
<*> @C @( r) (pure C)
```

Each application of `(<*>)`

records both the type of the argument that we are
applying, *as well as the types of all remaining arguments*. Since the latter is
linear in the number of arguments, and we have a linear number of applications
of `<*>`

, the core size of this function becomes quadratic in the number of arguments.

We recently discovered a way to solve this problem, in `ghc`

as it is today
(tested with 8.8, 8.10, 9.0 and 9.2). In this blog post we will describe the
approach, as well as introduce a new typechecker plugin (available from
Hackage or GitHub) which makes the process
more convenient.

It is exciting that we now have *an* answer to a previously unsolved problem,
but it must be admitted that the resulting code is not very elegant, and fairly
unpleasant to write. It will be necessary to explore this new approach, find new
ways to use it and adapt it to improve the user experience. We are nonetheless
releasing this blog post, and the plugin, in the hope that others might feel
inspired to experiment with it and devise new ways in which it can be used.

### Vanilla `ghc`

Before we introduce the new typechecker plugin, we will first demonstrate the
concept in vanilla `ghc`

. Here’s the main idea: we will represent a type-level
`let`

-binding as an existentially quantified type variable, along with an
equality that specifies the value of that variable; the equality will be opaque
to `ghc`

until we reveal it.

That probably sounds rather abstract, so let’s make this more concrete:

```
data a :~: b where -- defined in base (Data.Type.Equality)
Refl :: a :~: a
data LetT :: a -> Type where -- new
LetT :: (b :~: a) -> LetT a
```

The `LetT`

constructor has two type variables, `a`

and `b`

; `b`

is the
existential type variable mentioned above, while `a`

is a regular type variable,
and will correspond to the *value* of the type variable we are “let-binding”. In
other words, think of this as a type-level assignment `b := a`

. The argument `b :~: a`

records the equality between `a`

and `b`

; it is opaque to `ghc`

in the
sense that `ghc`

will not be aware of this equality until we pattern match on
the `Refl`

constructor.

When we construct a let-binding, `a`

and `b`

will (by definition) have the same
value, and so we can introduce a helper function:

```
{-# NOINLINE letT #-}
letT :: LetT a
= LetT Refl letT
```

This gives us a let-binding with value `a`

, for an existential variable that we
will discover when we pattern match on the `LetT`

constructor.^{1}

This is all probably *still* quite abstract, so let’s see a simple example of
how we might use this:

```
castSingleLet :: Int -> Int
=
castSingleLet x case letT of { LetT (p :: b :~: Int) -> -- (1)
let x' :: b
= case p of Refl -> x -- (2)
x' in case p of Refl -> x' -- (3)
}
```

In `(1)`

, we introduce a type-level let-binding `b := Int`

. Then in `(2)`

we
define a value `x'`

of type `b`

; *we* know that `b := Int`

, but `ghc`

doesn’t,
and so we explicitly pattern match on the equality proof. Finally, in `(3)`

we
want to use `x'`

as the result of the function; for this we need to cast *back*
from `b`

to `Int`

.

Of course, this example is a bit pointless, so let’s consider how we might actually use this to solve a problem.

### Heterogenous lists

We will come back to the applicative example from the introduction a bit later, but let’s consider a slightly simpler example first. Recall this definition of heterogenous lists:

```
data HList :: [Type] -> Type where
HNil :: HList '[]
HCons :: x -> HList xs -> HList (x : xs)
```

Without type-level sharing, we cannot construct values of type `HList`

without
resulting in quadratic core code size, for much the same reason as before.
For example,

```
hlistBaseline :: HList '[A, B, C]
=
hlistBaseline HCons A
$ HCons B
$ HCons C
$ HNil
```

will be expanded with type variables to something like

```
hlistBaseline :: HList '[A, B, C]
=
hlistBaseline HCons @'[A, B, C] A
$ HCons @'[ B, C] B
$ HCons @'[ C]
$ HNil
```

where we again have a linear number of calls to `HCons`

, each of which has a
list of type arguments which is itself linear; hence, this value is quadratic
in size.

Let’s fix that. Instead of *repeating* the list each time, we will introduce
type-level sharing so that we can express, “this list is like that other list
over there, but with an additional value at the front”. Let’s first define the
various type-level lists:

```
hlist1 :: HList '[A, B, C]
=
hlist1 case letT of { LetT (p2 :: r2 :~: (C : '[])) -> -- r2 := C : []
case letT of { LetT (p1 :: r1 :~: (B : r2 )) -> -- r1 := B : r2
case letT of { LetT (p0 :: r0 :~: (A : r1 )) -> -- r0 := A : r1
...
}}}
```

With the type-level lists defined, we can now define the corresponding values.
Just like before, we need to cast explicitly. For example, the list `HCons C HNil`

has type `HList (C : '[])`

; *we* know that this is the same as `HList r2`

, but
to convince `ghc`

of that fact, we need to appeal to the explicit equality.

```
let xs2 :: HList r2
xs1 :: HList r1
xs0 :: HList r0
= case p2 of Refl -> HCons C HNil
xs2 = case p1 of Refl -> HCons B xs2
xs1 = case p0 of Refl -> HCons A xs1 xs0
```

Finally, we need to cast *back* from `HList r0`

to `HList '[A, B, C]`

; we will
need to appeal to *all* equalities in order to do so. The full function is:

```
hlist1 :: HList '[A, B, C]
=
hlist1 case letT of { LetT (p2 :: r2 :~: (C : '[])) ->
case letT of { LetT (p1 :: r1 :~: (B : r2 )) ->
case letT of { LetT (p0 :: r0 :~: (A : r1 )) ->
let xs2 :: HList r2
xs1 :: HList r1
xs0 :: HList r0
= case p2 of Refl -> HCons C HNil
xs2 = case p1 of Refl -> HCons B xs2
xs1 = case p0 of Refl -> HCons A xs1
xs0
in case p0 of { Refl ->
case p1 of { Refl ->
case p2 of { Refl ->
xs0
}}}
}}}
```

### Unpacking equalities

It is critical that `ghc`

cannot see the equalities we introduce, because if it
did, it would just unfold the definition and we’d lose the sharing we worked so
hard to introduce. Nonetheless, the need to match on all these equality proofs
in order to cast values to the right type is certainly inconvenient. It is also
easy to get wrong; we will discuss that problems in this section, but
fortunately we can avoid the need for pattern matching altogether when we use
the `typelet`

type checker plugin; we will introduce this plugin in the next
section.

The reason that the pattern matches are easy to get wrong is that we need to
match *in the right order*. Concretely, if instead of the order above, we
instead did

```
case p2 of { Refl ->
case p1 of { Refl ->
case p0 of { Refl ->
xs0 }}}
```

we would end up with quadratic code again.

This is due to the shape of the equality proof that `ghc`

constructs: `xs0`

has
type `HList r0`

, but we want to use it at type `HList '[A, B, C]`

. There are
sufficient equalities in scope to enable `ghc`

to prove that these two types
are in fact the same, which is why the program is accepted, but the resulting
core code will look like

``cast` {- .. proof that HList r0 ~ HList '[A, B, C] .. -} xs0 `

In the linear version (where we pattern match on `p0`

first), we end up with
the proof

```
let co2 :: r2 ~ (C : [])
co1 :: r1 ~ (B : r2)
co0 :: r0 ~ (A : r1)
= ..
co2 = ..
co1 = ..
co0
in .. co2 .. co1 .. co0 ..
```

which mirrors our own definitions very closely. However, if we match in the wrong order, we get this proof instead:

```
let co2 :: r2 ~ '[ C]
co1 :: r1 ~ '[ B, C]
co0 :: r0 ~ '[A, B, C]
= ..
co2 = .. co2 ..
co1 = .. co1 ..
co0
in .. co0 ..
```

When we unpack the equalities in the right order, `ghc`

first learns that `r0 ~ (A : r1)`

, without yet knowing what `r1`

is, and so it just constructs a proof
for that; similarly, on the next equality, it learns that `r1 ~ (B : r2)`

,
without knowing what `r2`

is, and so it constructs the corresponding proof
(without modifying the proof it generated previously). When we do things in the
opposite order, `ghc`

first learns that `r2 ~ (C : '[])`

; then when it learns
that `r1 ~ (B : r2)`

, it already knows what `r2`

is, and so it constructs a
proof for `r1 ~ (B : C)`

, and we have lost sharing.

Of course, we don’t really want these proofs *at all*, and indeed, when we use
the plugin, we won’t get them.

### The `typelet`

typechecker plugin

To use the `typelet`

typechecker plugin, just add

`{-# OPTIONS_GHC -fplugin=TypeLet #-}`

at the top of your module. When we use type plugin, we can write the `HList`

example as

```
hlistLet :: HList '[A, B, C]
=
hlistLet case letT (Proxy @(C : '[])) of { LetT (_ :: Proxy r2) ->
case letT (Proxy @(B : r2)) of { LetT (_ :: Proxy r1) ->
case letT (Proxy @(A : r1)) of { LetT (_ :: Proxy r0) ->
let xs2 :: HList r2
xs1 :: HList r1
xs0 :: HList r0
= castEqual (HCons C HNil)
xs2 = castEqual (HCons B xs2)
xs1 = castEqual (HCons A xs1)
xs0
in castEqual xs0
}}}
```

We still need to be explicit about *when* we want to cast, but we don’t need to
be explicit anymore about *how* we want to cast. As an additional bonus, the
resulting core code also doesn’t have any coercion proofs. (We will see below
how we can rewrite this example more compactly using another combinator from the
library.)

In the remainder of this blog post we will discuss the type system extension
provided by the plugin. We will *not* discuss how it works internally; it is
a reasonably simple type checker plugin and a discussion of the implementation
is not relevant for our goal here, which is type-level sharing.

`Let`

and `Equal`

The `typelet`

library introduces two new classes, `Let`

and `Equal`

. `Let`

is defined as

`class Let (a :: k) (b :: k)`

A constraint `Let x t`

, where `x`

is an existentially bound type variable,
models a type-level let-bnding `x := t`

. Only constraints of this shape (with
`x`

a variable^{2}) are valid, and let-bindings cannot be recursive; if
either of these conditions are not met, the plugin will emit a type error.

`Let`

has a single instance for reflexivity (much like the use of `Refl`

in
`letT`

above):

`instance Let a a`

In order to introduce the existential type variable, we define a `LetT`

type
much like we did above, but now carrying evidence of a `Let`

constraint:

```
data LetT (a :: k) where
LetT :: Let b a => Proxy b -> LetT a
letT :: Proxy a -> LetT a
= LetT p letT p
```

Of course, introducing `let`

bindings is only one half of the story. We must
also be able to *apply* them. This is where the second class, `Equal`

, comes
in:

```
class Equal (a :: k) (b :: k)
castEqual :: Equal a b => a -> b
= unsafeCoerce castEqual
```

`Equal`

is a class without any instances; constraints `Equal a b`

are instead
solved by the plugin. Function `castEqual`

allows to coerce from `a`

to `b`

whenever the plugin can prove that `a`

and `b`

are equal^{3}.
Formally:

In order words, the `Let`

constraints define an (idempotent) substitution, and
an `Equal a b`

constraint is solvable whenever `a`

and `b`

are nominally equal
types after applying that substitution.

For a trivial example, two types that are already nominally equal will also
be `Equal`

:

```
castReflexive :: Int -> Int
= castEqual castReflexive
```

The following example is slightly more interesting, and is the equivalent of
`castSingleLet`

that we already saw above, but now using the plugin:

```
castSingleLet :: Int -> Int
=
castSingleLet x case letT (Proxy @Int) of
LetT (_ :: Proxy t1) ->
let y :: t1
= castEqual x
y in castEqual y
```

We saw a more realistic example above, in the definition of `hlistLet`

.

### Combining a type-level let with a cast

In `castSingleLet`

we define a type variable `t1`

, and then immediately
cast a value to that type. That is such a common idiom that the `typelet`

library provides a custom combinator for it:

```
data LetAs f (a :: k) where
LetAs :: Let b a => f b -> LetAs f a
letAs :: f a -> LetAs f a
= LetAs x letAs x
```

Most of the time, we don’t want to hide the *entire* type of some value, because
then that value would become unuseable without a cast (we’d have no information
about its type). That’s why `LetAs`

is parameterised by some functor `f`

; when
we have a value of type `f a`

, `letAs`

introduces a let-binding `b := a`

, and
then casts the value to `f b`

. Here is a simple example:

```
castSingleLetAs :: Identity Int -> Identity Int
=
castSingleLetAs x case letAs x of
LetAs (x' :: Identity t1) ->
castEqual x'
```

For a more realistic example, let’s consider the `HList`

example once more. Here
too we see the same idiom: we introduce a type-level let binding for the
type-level list, and then cast a term-level value. Using `letAs`

we can do that
in one go:

```
hlistLetAs :: HList '[A, B, C]
=
hlistLetAs case letAs (HCons C HNil) of { LetAs (xs02 :: HList t02) ->
case letAs (HCons B xs02) of { LetAs (xs01 :: HList t01) ->
case letAs (HCons A xs01) of { LetAs (xs00 :: HList t00) ->
castEqual xs00 }}}
```

### CPS

Both `letT`

and `letAs`

introduce a data constructor, only for us to then
directly pattern match on it again. The obvious question then is whether we
might be able to avoid this using CPS form. Indeed we can, but we do have to be
careful. The library defines CPS forms of both `letT`

and `letAs`

:

```
letT' :: forall r a. Proxy a -> (forall b. Let b a => Proxy b -> r) -> r
= case letT pa of LetT pb -> k pb
letT' pa k
letAs' :: forall r f a. f a -> (forall b. Let b a => f b -> r) -> r
= case letAs fa of LetAs fb -> k fb letAs' fa k
```

The problem is that these abstractions introduce a type variable for the
continuation (`r`

), which may itself require sharing. The “obvious but wrong”
translation of `hlistLetAs`

, above, is

```
hlistLetAsCPS_bad :: HList '[A, B, C]
=
hlistLetAsCPS_bad HCons C HNil) $ \(xs02 :: HList t02) ->
letAs' (HCons B xs02) $ \(xs01 :: HList t01) ->
letAs' (HCons A xs01) $ \(xs00 :: HList t00) ->
letAs' ( castEqual xs00
```

This is wrong, because the continuation variable `r`

for each call to `letAs'`

is `HList '[A, B, C]`

: the type of the final result. But this means that we
have `n`

occurrences of the full `n`

elements of the list, and so we are back
to code that is `O(n²)`

in size. If we do want to use CPS form, we have to
introduce one additional `let`

binding for the final result:

```
hlistLetAsCPS :: HList '[A, B, C]
=
hlistLetAsCPS Proxy @'[A, B, C]) $ \(_ :: Proxy r) -> castEqual $
letT' (@(HList r) (HCons C HNil) $ \(xs02 :: HList t02) ->
letAs' @(HList r) (HCons B xs02) $ \(xs01 :: HList t01) ->
letAs' @(HList r) (HCons A xs01) $ \(xs00 :: HList t00) ->
letAs' castEqual xs00
```

### Applicative

For the `HList`

example, we can give a type-level let for the tail of the list
`(C ': [])`

, and at the same type provide a value of that type. The fact that we
do things “in the same order” at the type level is what made it possible to use
the `letAs`

combinator.

Unfortunately, that is not always the case. For example, in the applicative chain example from the introduction the order doesn’t quite work out: we must give type-level let bindings starting at the end

```
l02 := C -> r
l01 := B -> l01
l00 := A -> l00
```

but we need to apply arguments in the reverse order (`A`

, then `B`

, then `C`

).
Put another way, associativity at the type-level and at the term-level match for the
`HList`

example (both right-associative), but *mismatch* for the applicative
example (right associative for the function type and left associative for
function application).

Perhaps we will discover further combinators that help with this example, but for now it means that we need to use the more verbose option to write the function:

```
apLet :: forall f r. Applicative f => (A -> B -> C -> r) -> f r
=
apLet f case letT (Proxy @(C -> r)) of { LetT (_ :: Proxy l02) ->
case letT (Proxy @(B -> l02)) of { LetT (_ :: Proxy l01) ->
case letT (Proxy @(A -> l01)) of { LetT (_ :: Proxy l00) ->
let f00 :: f l00
f01 :: f l01
f02 :: f l02
res :: f r
= pure (castEqual f)
f00 = castEqual f00 <*> pure A
f01 = castEqual f01 <*> pure B
f02 = castEqual f02 <*> pure C
res
in res
}}}
```

### Benchmarks

Is all this really worth it? It depends. For the applicative chain, the difference is not huge, and the let-bindings add some overhead:

For GADTs, however, the difference is much more dramatic. For the `HList`

example, after desugaring:

and after the simplifier:

This is all with `-O0`

; the primary goal here is to optimize compilation time
during development. Talking of compilation time, let’s measure that too.
Unfortunately, the performance of the `HList`

example using `letAs`

(rather than
the CPS version) depends critically on the performance of `ghc`

’s pattern match
exhaustiveness checker, which differs quite a bit between `ghc`

versions. Let’s
first disable that altogether:

`-Wno-incomplete-patterns -Wno-incomplete-uni-patterns -Wno-overlapping-patterns`

With these options, compilation time for the three `HList`

variations (baseline
without sharing, `letAs`

, and the CPS version `letAs'`

) are similar across
`ghc`

8.8, 8.10, 9.0 and 9.2, and look something like (compilation time in ms
versus number of entries in the list):

With the exhaustiveness checker enabled, times vary wildly between `ghc`

versions for the non-CPS version (note: these graphs have different ranges
on their y-axes):

8.8 | 8.10 |

9.0 | 9.2 |

The non-CPS version is up to 1.6x faster than the baseline in 8.8, but up to 44x
*slower* in 8.10. The situation improves a bit in 9.0, but it’s still up to
10x slower than the baseline, until sanity is restored in 9.2 and the non-CPS
version is up to 3x faster than the baseline.

The CPS version meanwhile is more stable across versions: up to 3x faster in 8.8 and 8.10, a slightly less impressive improvement of up to 2.4x faster in 9.0, and then back to up to 3.4x faster in 9.2.

### Conclusions

The `typelet`

type checker plugin offers an API that makes it possible to
introduce type-level sharing in `ghc`

as it is today (tested with 8.8, 8.10, 9.0
and 9.2). This is pretty exciting, but like any new abstraction, we need to
experiment with it to discover the best way it can be used. We are releasing the
plugin as well as this blog post at this early stage in the hope that others
will feel inspired to try it and share their discoveries.

Our own motivation for developing this now is our continued efforts on behalf of Juspay to improve their compile times. In particular, we are currently looking at how we could support large anonymous records that compile in less-than-quadratic core space. Of course, type level sharing is only one weapon in our arsenal if we want to optimize for compilation time; my previous two blog posts in this series (Avoiding quadratic core code size with large records and Induction without core-size blow-up) discuss many others, and the search is not yet over.

For this version (without the plugin), it is crucial that

`letT`

is marked as`NOINLINE`

, because if it isn’t, even with`-O0`

the optimizer will inline it, evaluate the`case`

expressions, and we lose all sharing again.↩︎In fact,

`x`

must be a skolem variable: one that cannot unify with anything.↩︎It is possible that the lack of a explicit dependency of

`castEqual`

on the evidence for`Equal`

(and then, transitively, the lack of an explicit dependency of the evidence for`Equal`

on the`Let`

constraints that justify it), may under certain circumstances result in`ghc`

’s optimizer floating the`unsafeCoerce`

too far up. It is not clear to me at present whether this can actually happen. Although the`Equal`

evidence is trivial, it does at least record the full types of the left-hand side and right-hand side of the equality; we have also marked`castEqual`

in an attempt to make sure that the`unsafeCoerce`

does not escape the scope of the`Equal`

evidence. Nonetheless, it is conceivable that we might have to revisit this.↩︎