Edsko will be talking about the problems discussed in this blog post in his Haskell Implementors’ Workshop talk this Sunday, Aug 22. The talk will be broadcast live on YouTube.

Consider a module that contains nothing but the definition of a single large record and some type class instances:

```
{-# OPTIONS_GHC -fplugin=RecordDotPreprocessor #-}
module Example where
import Data.Aeson
import Generics.SOP.JSON
import Generics.SOP.TH
import GHC.TypeLits
newtype T (i :: Nat) = MkT Word
deriving (Show, Eq, ToJSON)
data R = MkR {
f00 :: T 00
f01 :: T 01
,-- .. lots more ..
f98 :: T 98
, f99 :: T 99
,
}deriving (Eq, Show)
'R
deriveGeneric '
instance ToJSON R where
= gtoJSON defaultJsonOptions toJSON
```

As it stands—using `ghc`

’s standard representation for records, along with the code generated by the `RecordDotPreprocessor`

plugin and the `Generic`

instance generated by `generics-sop`

—this results in a `core`

representation of a whopping 450,000 terms/types/coercions, takes 3 seconds to compile, and requires 500M of RAM to compile.^{1}

If we change this module to

```
module Example where
import Data.Aeson (ToJSON(..))
import Data.Record.Generic.JSON
import Data.Record.TH
import Test.Record.Size.Infra
largeRecord defaultLazyOptions [d|
data R = MkR {
f00 :: T 00
, f01 :: T 01
-- .. lots more ..
, f98 :: T 98
, f99 :: T 99
}
deriving (Eq, Show)
|]
instance ToJSON R where
= gtoJSON toJSON
```

we get a module with essentially the same functionality, *but* with a `core`

size of a mere 14,000 terms/types/coercions, which compiles within 1 second and requires roughly 100M of RAM.

In this blog post we describe why this simple module generates so much code, and how the `large-records`

library manages to reduce this by more than an order of magnitude.

We wrote this library because Juspay recently engaged Well-Typed’s services, and one of their requests to us was to try and improve compilation time and compilation memory requirements for their code base. Juspay very generously allowed us to make `large-records`

open source, and it is now available on `Hackage`

.

### Quadratic code size at every level

The reason the `core`

representation of our example module is so large is that unfortunately there are many examples of `ghc`

and other libraries being accidentally quadratic. Before we look at some concrete examples, let’s first investigate where this quadratic code size is coming from. As we will see, it arises at every level: terms, types, type classes, and type level programming.

#### Warmup: terms

For our running example, we will want to have a record with lots of fields. To avoid some “accidental optimizations”, we’ll give each of those fields a different type. To make that a little bit easier, we’ll just introduce a single type that is indexed by a natural number, so that this one type definition gives us as many different types as we need:

`data T (n :: Nat) = MkT Int`

That out of the way, consider a record with lots of fields, such as

```
data R = MkR {
f00 :: T 00
f01 :: T 01
, f02 :: T 02
,-- .. lots more ..
f98 :: T 98
, f99 :: T 99
, }
```

When we define a record, `ghc`

will generate field accessors for all fields in the record. In other words, it will derive functions such as

`f00 :: R -> T 0`

These functions are not difficult to generate, of course. Each function is just a simple `case`

statement:

```
= \(r :: R) ->
f00 case r of
MkR x00 x01 x02 x03 x04 x05 x06 x07 x08 x09
x10 x11 x12 x13 x14 x15 x16 x17 x18 x19-- .. lots more ..
->
x90 x91 x92 x93 x94 x95 x96 x97 x98 x99 x00
```

Although simple, this case statement mentions a lot of unused variables (99 of them, in fact). Moreover, each of those variables is annotated with their type. This means that this one function is actually rather large; `ghc`

reports that it contains 5 terms and 202 types. The size of this function is clearly linear in the number `n`

of fields we have; moreover, `ghc`

will generate one function for each of those `n`

fields; that means that simply declaring the record will already generate code that is `O(n²)`

in size.

#### More subtle: types

Suppose we define an applicative “`zip`

” function for `R`

, something like

```
zipMyRecordWith ::
Applicative f
=> (forall n. T n -> T n -> f (T n))
-> R -> R -> f R
=
zipMyRecordWith f r r' pure MkR
<*> f (f00 r) (f00 r')
<*> f (f01 r) (f01 r')
<*> f (f02 r) (f02 r')
-- .. lots more ..
<*> f (f98 r) (f98 r')
<*> f (f99 r) (f99 r')
```

Clearly the size of this function is at least linear in the number of record fields, that much is expected. However, `-ddump-simpl`

tells us that this function contains *50,818 types*! Where are all of those coming from?

Recall the type of `(<*>)`

:

`(<*>) :: Applicative f => f (a -> b) -> f a -> f b`

Those type variables need to be instantiated; `f`

is always instantiated to the `f`

type parameter passed to `zipMyRecordWith`

, `a`

is the type of the next field we’re applying, but what about `b`

? Let’s annotate `zipMyRecordWith`

with the types of `a`

and `b`

in pseudo-Haskell:

```
zipMyRecordWith ::
Applicative f
=> (forall n. T n -> T n -> f (T n))
-> R -> R -> f R
=
zipMyRecordWith f r r' pure MkR
<*> @(T 00) @(T 01 -> T 02 -> T 03 -> .. -> T 99 -> R) f (f00 r) (f00 r')
<*> @(T 01) @( T 02 -> T 03 -> .. -> T 99 -> R) f (f01 r) (f01 r')
<*> @(T 02) @( T 03 -> .. -> T 99 -> R) f (f02 r) (f02 r')
-- .. lots more ..
<*> @(T 98) @( T 99 -> R) f (f98 r) (f98 r')
<*> @(T 99) @( R) f (f99 r) (f99 r')
```

The first instantiation of `(<*>)`

mentions the type of every single field; the second mentions the types of all-but-one field, the next of all-but-two, etc. This means that the size of this *single* function is once again `O(n²)`

in the number of record fields.

#### Type class dictionaries

Suppose we wanted to capture the concept “some constraints `c`

applied to the types of all fields in our record”:

```
class (
T 00)
c (T 01)
, c (T 02)
, c (-- .. lots more ..
T 98)
, c (T 99)
, c (=> Constraints_R c )
```

This should be fine right? Right? Wrong.

When we declare a type class, we’re effectively constructing a record type with fields for each of the methods of the class; this record is known as the “dictionary” for the class. Superclass constraints translate to “subdictionaries”: when a class such as `Ord`

has a superclass constraint on `Eq`

, then the dictionary for `Ord`

will have a field for an `Eq`

dictionary.

This means that this definition of `Constraints_R`

is actually of a very similar nature to the definition of `R`

itself: it defines a record with 100 fields. And just like for the record, `ghc`

will generate “field accessors” to extract the fields of this dictionary; put another way, those field accessors “prove” that if we know `Constraints_R c`

, we also know `c (T 00)`

, `c (T 01)`

, etc. What do those field accessors look like? You guessed it, a big pattern match; in pseudo-Haskell:

```
$p1Constraints_R :: Constraints_R c => c (T 0)
$p1Constraints_R = \dict ->
case dict of
Constraints_R d00 d01 d02 d03 d04 d05 d06 d07 d08 d09
d10 d11 d12 d13 d14 d15 d16 d17 d18 d19-- .. lots more ..
->
d90 d91 d92 d93 d94 d95 d96 d97 d98 d99 d00
```

Since `ghc`

generates a projection like this for each superclass constraint, this once again results in code of size that is `O(n²)`

in the number of record fields.

#### Type level induction

So far all our examples have been simple Haskell; for our next example, we’ll get a bit more advanced. Just like we can have list *values*, we can also have list *types*; for example, here is a type-level list of the indices of the `T`

types used inside our running example record `R`

:

```
type IndicesR = '[
00, 01, 02, 03, 04, 05, 06, 07, 08, 09
10, 11, 12, 13, 14, 15, 16, 17, 18, 19
, -- .. lots more ..
90, 91, 92, 93, 94, 95, 96, 97, 98, 99
, ]
```

There are many use cases for type level lists. For example, we can define a type `NP`

such that `NP f [x, y, .., z]`

is basically the same as `(f x, f y, .., f z)`

:

```
data NP (f :: k -> Type) (xs :: [k]) where
Nil :: NP f '[]
(:*) :: f x -> NP f xs -> NP f (x ': xs)
```

If we have a `T`

for every index in `IndicesR`

, we can construct a value of our record:

```
npToR :: NP T IndicesR -> R
:* f01 :* f02 :* f03 :* f04 :* f05 :* f06 :* f07 :* f08 :* f09
npToR ( f00 :* f10 :* f11 :* f12 :* f13 :* f14 :* f15 :* f16 :* f17 :* f18 :* f19
-- .. lots more ..
:* f90 :* f91 :* f92 :* f93 :* f94 :* f95 :* f96 :* f97 :* f98 :* f99
:* Nil ) = MkR {..}
```

The compiled size of `npToR`

is large, *but* it is linear in size (total size of 4441 terms, types and coercions for 100 fields, and a total size of 2241 for 50 fields). So far so good.

In order to get to the problem I’d like to illustrate in this section, we need one more concept. Suppose wanted to write a function that can construct a value of `NP T xs`

for *any* `xs`

:

`mkNP :: NP T xs`

This should be possible, since `T`

is just a wrapper around an `Int`

, and so all we need to do is generate as many `T`

s as there are elements in the type level list `xs`

. However, `xs`

is a *type level* list, and we cannot pattern match on types in Haskell; indeed, they do not even exist at all at run-time.

Therefore we somehow need to *reflect* the type level list at the term level: we need a value that corresponds exactly to the type. We do this by introducing a new type, indexed by a type level list, so that given a type level list of a particular length, our new type has exactly *one* value. Such a type—a type with exactly one value—is known as a *singleton type*:

```
data SList :: [k] -> Type where
SNil :: SList '[]
SCons :: SList xs -> SList (x ': xs)
```

`SList`

gives us a value that we can pattern match on, and when we do we discover something about the shape of the type level list `xs`

:

```
mkNP' :: SList xs -> NP T xs
SNil = Nil
mkNP' SCons s) = MkT 0 :* mkNP' s mkNP' (
```

The closest we can come to `mkNP`

is to make this singleton value implicit:

```
class SListI (xs :: [k]) where
sList :: SList xs
mkNP :: SListI xs => NP T xs
= mkNP' sList mkNP
```

If we now try to use `mkNP`

to construct a value of our record

```
r0 :: R
= npToR mkNP r0
```

we will of course find that we need an instance of `SListI`

for `IndicesR`

. Our first instinct might be to write something like

```
instance SListI IndicesR where
=
sList SCons
$ SCons
$ SCons
-- .. lots more ..
$ SCons
$ SCons
$ SNil
```

but if we do that, we will soon discover that the compiled code is quadratic in size. We could have predicted that: it’s the same problem as in the “Types” section above, with `($)`

playing the role of `(<*>)`

. But even if we write it as

```
instance SListI IndicesR where
=
sList SCons (
SCons (
SCons (
-- .. lots more ..
SCons (
SCons (
SNil
{- .. lots more brackets .. -} )) )))
```

we’re still in trouble: each of those `SCons`

applications has two type argument `x`

and `xs`

(the type level list of the tail). So with some type annotations, this code is

```
instance SListI IndicesR where
=
sList SCons @00 @'[01, 02, 03, .., 99] (
SCons @01 @'[ 02, 03, .., 99] (
SCons @02 @'[ 03, .., 99] (
-- .. lots more ..
SCons @98 @'[ 99] (
SCons @99 @'[ ] (
SNil
{- .. lots more brackets .. -} )) )))
```

So this code is again `O(n²)`

in size (actually, the real code generated by `ghc`

is much worse than this, due to the fact that `SList`

is a GADT; after desugaring, the function has a total size of 15,352, and after the simplifier runs (in `-O0`

) that expands to a whopping 46,151).

Experienced type-level Haskellers might be surprised that we’d try to write this `SListI`

instance by hand. After all, the *definition* of a singleton type is that it is a type with only a single value, and so we should be able to just derive it automatically. Indeed we can:

```
instance SListI '[] where
= SNil
sList
instance SListI xs => SListI (x ': xs) where
= SCons sList sList
```

Surely we should be good now, right? These definitions are small, and don’t deal with concrete large lists, and so we avoid quadratic code size. Right? Wrong.

Although it is true that the two instances for `SListI`

are unproblematic, the moment that we use `npToR mkNP`

, `ghc`

needs to prove `SListI '[00, 01, .. 99]`

. In other words, it must generate code that produces a *dictionary* for `SListI [00, 01, .. 99]`

. Since `SListI`

for `(x ': xs)`

has a superclass constraint `SListI xs`

, the dictionary for `SListI [00, 01, .., 99]`

will have a field for the dictionary for `SListI [01, .., 99]`

, all the way down to the empty type level list. This means that `ghc`

will generate 100 dictionaries; each of those dictionaries contains an `SCons`

application with the same type annotation as in hand-written code above. This means that we *still* have code that is `O(n²)`

in size.

### Concrete examples

In the previous section we discussed the ways in which we might end up with accidentally quadratic code size. In this section we will consider some examples of code generated by specific libraries. We will start with GHC generics, which is actually a *good* example: it generates code of size `O(n log n)`

rather than `O(n²)`

. After that we will discuss `record-dot-preprocessor`

and `generics-sop`

, both of which do generate code of `O(n²)`

size.

#### GHC Generics

The goal of *generic programming* is to be able to write a single function that can be applied to values of lots of different types. Generics libraries such as `GHC.Generics`

and `generics-sop`

(discussed below) do this by translating the value to a *representation type*; since every type can be translated to only a handful of different representation types, it suffices to write a function over all of those representation types.

Here we will discuss a simplified form of `GHC.Generics`

that still illustrates the same point. The generic representation of a record such as the one above is essentially just a large nested tuple. For the GHC library itself it does not actually matter terribly *how* this tuple is created; for example, this would work:

```
type family GHC_Rep (a :: Type) :: Type
type instance GHC_Rep R = (T 00, (T 01, (T02, ... (T98, T99))))
```

Although it would work, it would not be great from a code size perspective. Consider the function that would translate `R`

to `GHC_Rep R`

:

```
ghcTo :: R -> GHC_Rep R
MkR{..} =
ghcTo @(T 00) @(T 01, (T02, ... (T98, T99))) f00 (
(,) @(T 01) @(T02, ... (T98, T99)) f01 (
(,) @(T 02) @(... (T98, T99)) f02 (
(,) -- .. lots more ..
@(T 99) @(()) f99 (
(,) () ))))
```

This pattern is starting to get familiar at this point; with this representation, `ghcTo`

would be `O(n²)`

in size. Fortunately, GHC generics avoids this problem by instead generating a *balanced* representation, something like

```
type instance GHC_Rep R =
T 00, ( T 01, T 02 ) )
( ( ( ( ( ( T 03, ( T 04, T 05 ) )
, (
)-- .. lots more ..
T 46, T 47 )
, ( ( T 48, T 49 )
, (
) ) ) ) )T 50, ( T 51, T 52 ) )
, ( ( ( ( ( T 53, ( T 54, T 55 ) )
, (
)-- .. lots more ..
T 96, T 97 )
, ( ( T 98, T 99 )
, ( ) ) ) ) ) )
```

With this representation, the number of branches is still the same, but in the translation function the type annotations are now *halved* in size at each branch, rather than reduced by 1:

```
ghcTo :: R -> GHC_Rep R
MkR{..} =
ghcTo
( ( ( ( ( ( f00, ( f01, f02 ) )
, ( f03 , ( f04, f05 ) )
)-- .. lots more ..
, ( ( f96, f97 )
, ( f98, f99 ) ) ) ) ) ) )
```

As a consequence, the size of *this* version, and the cost of GHC generics in general, is actually `O(n log n)`

in the number of record fields, although the constant factor is reasonably high.

It is worth emphasizing how much better `O(n log n)`

is than `O(n²)`

. Here is a plot of the cost of GHC generics (in terms of AST size: terms, types and coercions) as we vary the number of record fields from 100 fields to 1000 fields:

This almost looks linear. It isn’t; the cost per field is roughly 460 terms/types/coercions when we have 100 fields, and that increases to roughly 625 when we get to 1000 fields, but the cost only goes up very slowly.

#### The `RecordDotSyntax`

preprocessor

The `record-dot-preprocessor`

is a preprocessor and GHC plugin for the RecordDotSyntax GHC proposal. The preprocessor interprets specialized “record syntax”; for example, it translates

`.lbl2 = val} expr{lbl1`

to

`@"lbl1" expr $ setField @"lbl2" (getField @"lbl1" expr) val setField `

These two functions `getField`

and `hasField`

come from a `HasField`

class currently provided by `record-hasfield`

(although now that the Add `setField`

to `HasField`

proposal is accepted, this should eventually move to `base`

):

```
class HasField x r a | x r -> a where
hasField :: r -> (a -> r, a)
```

When you include

`{-# OPTIONS_GHC -fplugin=RecordDotPreprocessor #-}`

at the top of your Haskell file, the `RecordDotPreprocessor`

plugin will generate `HasField`

instances for you. They look innocuous enough:

```
instance HasField "f00" R (T 00) where
= (\x -> r { f00 = x }, f00 r) hasField r
```

Unfortunately, once we get to `ghc`

’s internal representation, this is much less innocent:

```
hasField_f00 :: R -> (T 0 -> R, T 0)
= (
hasField_f00 r -> case r of
\new MkR x00 x01 x02 x03 x04 x05 x06 x07 x08 x09
x10 x11 x12 x13 x14 x15 x16 x17 x18 x19-- .. lots more ..
->
x90 x91 x92 x93 x94 x95 x96 x97 x98 x99 MkR new x01 x02 x03 x04 x05 x06 x07 x08 x09
x10 x11 x12 x13 x14 x15 x16 x17 x18 x19-- .. lots more ..
x90 x91 x92 x93 x94 x95 x96 x97 x98 x99case r of
, MkR x00 x01 x02 x03 x04 x05 x06 x07 x08 x09
x10 x11 x12 x13 x14 x15 x16 x17 x18 x19-- .. lots more ..
->
x90 x91 x92 x93 x94 x95 x96 x97 x98 x99
x00 )
```

We saw this before when we discussed the record field accessors; the same linear cost that a field accessor induces is induced here as well, with a larger constant factor. As for field accessors, we need to generate a `HasField`

instance for every field, and hence altogether we once again have code that is `O(n²)`

in size.

This really matters: a module containing *just* the declaration of our record size has a total size of 22,065 terms/types/coercions after desugaring, and 22,277 after the simplifier (with `-O0`

). This is already much bigger than it should be, due to the quadratic nature fo the field accessors. Generating `HasField`

instances for all fields results in a total code size of 58,665 after desugaring and 78,977 after the simplifier. And all we’ve done is define the record! (For comparison, with `large-records`

, a module containing a single record with 100 fields has a mere total size of 8,305 after desugaring, expanding to 13,958 after the simplifier, and that is *including support for generics*).

#### SOP generics

The `generics-sop`

library is similar in nature to GHC generics, but it uses a different generic representation. It is described in detail in the paper True Sums of Products; here we give a simplified presentation.

In fact, we have already seen most ingredients. The `generics-sop`

representation for a record is essentially the `NP`

type that we discussed in “Type level induction”, above. In that section we saw that the function `npToR`

that translates *from* the generic representation *to* `R`

is linear in size. Unfortunately, the same is not true for the inverse function:

```
npFromR :: R -> NP T IndicesR
MkR{..} = (
npFromR :* f01 :* f02 :* f03 :* f04 :* f05 :* f06 :* f07 :* f08 :* f09
f00 :* f10 :* f11 :* f12 :* f13 :* f14 :* f15 :* f16 :* f17 :* f18 :* f19
-- .. lots more ..
:* f90 :* f91 :* f92 :* f93 :* f94 :* f95 :* f96 :* f97 :* f98 :* f99
#endif
:* Nil
)
```

That wildcard pattern match `MkR{..}`

will expand to a pattern match for a variable for every field, but that doesn’t matter here: we only generate *one* translation function, and it’s fine if that is linear in size. The problem however is in the body of this function. After the previous examples, perhaps you can spot the problem already: `(:*)`

has a bunch of type arguments, and one of those is the list of indices at the tail; so this code looks something like

```
MkR{..} =
npFromR :*) @00 @'[1, 2, .., 98, 99] f00 (
(:*) @01 @'[ 2, .., 98, 99] f01 (
(:*) @02 @'[ .., 98, 99] f02 (
(-- .. lots more ..
:*) @98 @'[ 99] f98 (
(:*) @99 @'[ ] f99 (
(Nil )))))
```

a depressingly familiar sight at this point (and again, the real code is worse, due to the fact that `NP`

is a GADT). This matters: the size of `npToR`

is 4,441 terms/types/coercions, the size of `npFromR`

is 46,459.

The `generics-sop`

library suffers from quadratic code size in other places as well. It makes heavy use of type-level lists, in a similar style to `SListI`

above, and with the same kinds of problems. It also represents *metadata* at the type-level, rather than just at the term level, which are more type-level lists. These are not fundamental problems; `generics-sop`

simply wasn’t designed with the goal to optimize for code size reduction in mind. As we saw in the section on GHC generics, these costs can probably be brought down to `O(n log n)`

, though this will require careful thought.

### The `large-records`

library

When you use the `large-records`

library and define

```
largeRecord defaultLazyOptions [d|
data R = MkR {
f00 :: T 00
, f01 :: T 01
, f02 :: T 02
-- .. lots more ..
, f98 :: T 98
, f99 :: T 99
} |]
```

You get the definition of a type `R`

with field accessors, `HasField`

instances for every field, and a `Generic`

instance (albeit for a custom generics library), *but* the code will be entirely linear—`O(n)`

—in the size of the record. In this section we will see how `large-records`

achieves this for the basic definitions; we will discuss generics separately in the next section.

#### Representation

As we saw, the moment that we declare a record, `ghc`

will generate field accessors for each field of the record, resulting in code that is `O(n²)`

in size. It follows that we cannot use the normal representation of the record. Instead, `large-records`

generates the following:

`newtype R = LR__MkR { vectorFromR :: Vector Any }`

That is, we are representing the record basically by an untyped vector which will have an entry for every field in the record.^{2} Of course, typically users will never deal with this untyped representation directly, but use the field accessors or `HasField`

instances, which we will discuss next.

#### Field accessors

Along with the definition of `R`

, `large-records`

generates an unsafe function that can return any element of the vector at any type:

```
unsafeGetIndexR :: Int -> R -> a
= noInlineUnsafeCo $ vectorFromR t n unsafeGetIndexR n t
```

where `noInlineUnsafeCo`

is a non-inlinable form of `unsafeCoerce`

^{3}. Just like the internal representation of `R`

, this function is not intended for normal use. Instead, it is used to define field accessors for each field. For example, here is the definition of the accessor `f00`

:

```
f00 :: R -> T 0
= unsafeGetIndexR 0 f00
```

One of these accessors is generated for every field, but the size of each accessor is constant (and tiny), so the generation of all accessors results in code that is `O(n)`

in size.

`HasField`

instance

The `HasField`

instance is very similar. Along with the unsafe accessor, we also define an unsafe update function:

```
unsafeSetIndexR :: Int -> R -> a -> R
= LR__MkR $
unsafeSetIndexR n r x unsafeUpd (vectorFromR r) [(n, noInlineUnsafeCo x)]
```

The `HasField`

instance is now easy, and once again constant in size and tiny:

```
instance HasField "f00" R (T 0) where
= (unsafeSetIndexR 0 r, unsafeGetIndexR 0 r) hasField r
```

#### Pattern synonym

By default `large-records`

does *not* generate a pattern synonym for `R`

. It *can* do, if requested:

```
= True}) [d|
largeRecord (defaultLazyOptions {generatePatternSynonym
data R = MkR {
f00 :: T 00
, f01 :: T 01
, f02 :: T 02
-- .. lots more ..
, f98 :: T 98
, f99 :: T 99
} |]
```

With the `generatePatternSynonym`

option, `large-records`

generates two new definitions. First, a function that constructs a tuple containing all fields of the record:

```
tupleFromR :: R
-> ( (T 00, T 01, T 02, {- .. lots more .. -}, T 60, T 61)
T 62, T 63, T 64, {- .. lots more .. -}, T 98, T 99)
, (
)= (
tupleFromR r 00 r
unsafeGetIndexR 01 r
, unsafeGetIndexR 02 r
, unsafeGetIndexR -- .. lots more ..
60 r
, unsafeGetIndexR 61 r
, unsafeGetIndexR
)62 r
, ( unsafeGetIndexR 63 r
, unsafeGetIndexR 64 r
, unsafeGetIndexR -- .. lots more ..
98 r
, unsafeGetIndexR 99 r
, unsafeGetIndexR ) )
```

(It was careful to use a nested tuple, because `ghc`

does not support tuples with more than 62 fields.) It then uses this function as a view pattern in an explicitly bidirectional pattern synonym:

```
pattern MkR :: T 0 -> T 1 -> T 2 -> {- .. lots more .. -} -> T 98 -> T99 -> R
pattern MkR{f00, f01, f02, {- .. lots more .. -}, f98, f99} <-
-> ( ( f00, f01, f02, f03, f04, f05, f06, f07, f08, f09
tupleFromR
, f10, f11, f12, f13, f14, f15, f16, f17, f18, f19-- .. lots more ..
, f60, f61
)
, ( f62, f63, f64, f65, f66, f67, f68, f69-- .. lots more ..
, f90, f91, f92, f93, f94, f95, f96, f97, f98, f99
) )where
MkR x00 x01 x02 {- .. lots more .. -} x98 x99 = RFromVector $ fromList [
unsafeCoerce x00
, unsafeCoerce x01
, unsafeCoerce x02-- .. lots more ..
, unsafeCoerce x98
, unsafeCoerce x99 ]
```

The pattern synonym makes it possible to pattern match on `R`

or construct `R`

values as if it was defined in the normal way. The reason that we don’t generate it by default is an annoying one: when we declare a record pattern synonym, `ghc`

very “helpfully” generates field accessors, which was precisely what we were trying to avoid.

So the option is there for code bases that need it; when enabled, we reintroduce a quadratic component, albeit a reasonably small one, and the rest of the generated code is still linear. Fortunately, many code bases that deal with large records never actually construct such records directly. They are instead populated from databases or from REST requests; in such cases, the pattern synonym is not required. In addition, `large-records`

provides a quasi-quoter that can be used in lieu of the pattern synonym so that you can write, for example,

```
firstTwo :: R -> (T 0, T1)
= (x0, x1) firstTwo [lr| MkR { f00 = x0, f01 = x1 } |]
```

In `ghc- 9.2`

we have the new `NoFieldSelectors`

language pragma (ghc proposal, merge request) which might improve the situation here, but I haven’t experimented with that yet.

### Generics

The `large-records`

library comes with its own `Generic`

class. Although the class and its instances are defined rather differently, *usage* of the class—that is, the way you’d write generic functions—is very similar in style to `generics-sop`

. The class is defined as

```
class Generic a where
-- Translation
from :: a -> Rep I a
to :: Rep I a -> a
-- Constraints
type Constraints a :: (Type -> Constraint) -> Constraint
dict :: Constraints a c => Proxy c -> Rep (Dict c) a
-- Metadata
metadata :: Proxy a -> Metadata a
```

We will discuss the various parts of this class separately, and finish with an example of a generic function that ties all this together.

#### Translation to and from the generic representation

The generic representation used by `large-records`

is

`newtype Rep f a = Rep (Vector (f Any))`

In other words, if we pick the identity functor for `f`

, then `Rep f a`

is just `Vector Any`

; this means that `from`

and `to`

can just be `coerce`

:^{4}

```
instance Generic R where
= coerce
from = coerce
to -- .. more ..
```

#### Constraints

The `Constraints`

type family is meant to capture the concept of “a constraint applied to all fields in the record”. We encountered this idea above in the section “Type class dictionaries”, where we noticed we have to be careful. Here is how `large-records`

instantiates `Constraints`

for our running example:

```
class Constraints_R c where
dictConstraints_R :: Proxy c -> Rep (Dict c) R
instance Generic R where
type Constraints R = Constraints_R
= dictConstraints_R
dict -- .. more ..
```

`Constraints_R`

does *not* have any superclass constraints, thus avoiding the quadratic cost of all the dictionary projections. Instead, it is constructing a *vector of dictionaries*. Essentially, we are constructing *our own representation of a dictionary*, just like we constructed our own representation for a record. Unlike the GHC representation of dictionaries, we can project from our representation by just doing a vector index.

Fortunately, we *can* give an *instance* for `Constraints_R`

:

```
instance ( c (T 0)
T 1)
, c (T 2)
, c (-- .. lots more ..
T 98)
, c (T 99)
, c (=> Constraints_R c where
) = Rep $ fromList [
dictConstraints_R p Proxy @(T 0))
unsafeCoerce (dictFor p) (Proxy @(T 1))
, unsafeCoerce (dictFor p) (Proxy @(T 2))
, unsafeCoerce (dictFor p) (-- .. lots more ..
Proxy @(T 98))
, unsafeCoerce (dictFor p) (Proxy @(T 99))
, unsafeCoerce (dictFor p) ( ]
```

In core, this results in a *single* function with 100 arguments, that constructs our custom dictionary representation. This function is reasonably large (3523 terms/types/coercions), but not excessively so, and moreover, it is `O(n)`

in the number of record fields.

#### Metadata

The `Metadata`

provided by `large-records`

is currently fairly minimal; it tells us the names of the record, its constructor, and the fields, as well as the strictness of each field:

```
data Metadata a = Metadata {
recordName :: String
recordConstructor :: String
, recordSize :: Int
, recordFieldMetadata :: Rep FieldMetadata a
,
}
data FieldStrictness = FieldStrict | FieldLazy
data FieldMetadata x where
FieldMetadata ::
KnownSymbol name
=> Proxy name
-> FieldStrictness
-> FieldMetadata x
```

Metadata is generated automatically by the `largeRecord`

invocation.

Note that the name of each field is at the type level; this is primarily to help with integration with GHC generics. This is essential for the integration of `large-records`

with other libraries, but a detailed discussion of this is outside the scope of this blog post.

#### Writing generic functions

An in-depth discussion on how to write generic functions in the style of `generics-sop`

is beyond the scope of this blog post; interested readers may wish to refer to the True Sums of Products paper or watch Andres Löh’s `generics-sop`

tutorial at ZuriHac 2016 or read his SSGEP summer school lecture notes. Here we will discuss one (typical) example only.

Generic functions in `generics-sop`

style are written using a set of combinators on the generic representation. As a simple example, we can *zip* two `Rep`

s:

```
zipWith :: (forall x. f x -> g x -> h x)
-> Rep f a -> Rep g a -> Rep h a
```

It is worth spelling out what this function is doing: it is zipping together all the fields of two records, albeit in their generic representation (`Rep`

). We therefore require that the function we’re zipping with is polymorphic: we must be able to zip fields no matter their type.

As another example, we can *map* a function over all fields in the record; there is a function `map`

, as you’d expect:

`map :: (forall x. f x -> g x) -> Rep f a -> Rep g a`

but there is also a slightly more useful version of this:

```
cmap :: (Generic a, Constraints a c)
=> Proxy c
-> (forall x. c x => f x -> g x)
-> Rep f a -> Rep g a
```

This *constrained map* still requires the function we’re applying to all fields to be polymorphic, *but* it is given the guarantee that the types it is applied to all satisfy constraint `c`

. Of course, in order to be able to do that, we need constraint `c`

to hold for all fields, which is why this function requires `Constraints a c`

. There is a similar generalization of `zipWith`

called `czipWith`

, as well as applicative versions of both of these functions.

The final combinator we will discuss applies when we have a vector in which every element has the *same* type: in this case, we can *collapse* the vector to an ordinary list:

`collapse :: Rep (K a) b -> [a]`

Here is an example of a generic function that can translate any value to JSON, provided it has a `Generic`

instance and provided we have `ToJSON`

instances for all fields:

```
gtoJSON :: forall a. (Generic a, Constraints a ToJSON) => a -> Value
=
gtoJSON
Aeson.object. Rep.collapse
. Rep.zipWith (mapKKK $ \n x -> (Text.pack n, x)) recordFieldNames
. Rep.cmap (Proxy @ToJSON) (K . toJSON . unI)
. from
where
Metadata{..} = metadata (Proxy @a)
```

If you are new to `generics-sop`

style programming, this function may require careful reading, but the essence of it is simple:

- Translate the value to its generic representation using
`from`

- Apply
`toJSON`

to all fields - Zip it with the metadata to get a vector of
`(name, value)`

pairs - Collapse that vector to a list
- Use Aeson’s standard
`object`

function to create the JSON object.

Once you are used to the paradigm, `generics-sop`

style programming allows for the very succinct definition of generic functions (I’d go so far as calling it “elegant”, but then I am biased). Typically a function like `gtoJSON`

would be used to derive a `ToJSON`

instance:

```
instance ToJSON R where
= gtoJSON toJSON
```

Indeed, the `large-records`

library can generate such instances automatically for a handful of classes.

#### Transforms

The only aspect of the generics programming aspect of `large-records`

we have not covered here is what is known as “transforms” in `generics-sop`

style programming. Transforms allow you to generically modify type indices; for example, suppose we have a HKD-style definition of a table, as used in the `beam`

database library:

```
data Table f = MkTable {
tableField1 :: f Int
tableField2 :: f Maybe
, }
```

We could write a generic function that could, say, turn a `Table Identity`

into a `Table Maybe`

. Doing this without using type-level lists anywhere is surprisingly difficult, and a discussion of how this is done in `large-records`

is probably a good topic for a separate blog post. For now we just refer any interested readers to the examples in the test suite.

Talking of `beam`

, `large-records`

integrates smoothly with `beam-core`

by means of an auxiliary library `beam-large-records`

. This library is on GitHub but is not available on Hackage yet because it depends on a small patch to `beam`

. Hopefully it will be released soon.

### Benchmarks

For the purpose of these benchmarks, we are interested in *compile-time*, not run-time performance. I have not yet done any benchmarks of run-time performance. There might be some run-time benefits compared to other generics approaches because there is no conversion to and from a generic representation, but there are probably some performance penalties for normal functions because the record representation used internally in `large-records`

is much less transparent to the GHC optimizer. Measuring this is future work.

In terms of compile time performance, however, the results are pretty clear. Here is a graph of the size of the core AST of the module (the sum of the terms, types and coercions in the module) plotted against the number of record fields, after desugaring:

As you can see, for vanilla `ghc`

the size grows quadratically to nearly 450,000 nodes, but stays nice and flat for `large-records`

. The GHC simplifier reduces the size of the AST a bit in the case of vanilla `ghc`

and *increases* it a bit for `large-records`

, but `large-records`

is still the clear winner:

In terms of “real” performance measurements, here is a plot of `ghc`

memory usage as reported by the OS (in kB), against the number of record fields:

As you can see, `ghc`

climbs to roughly 500 MB for 100 fields, whereas `large-records`

stays near the 100 MB mark. There is a similar graph for compilation time (in terms of elapsed wall-clock time in seconds):

For vanilla `ghc`

compilation takes roughly 3 seconds for 100 fields, whereas `large-records`

is 3x times faster and stays under one second.

(These numbers have been normalized against compiling an empty module. The error bars here indicate standard error; each module has been compiled 100 times.)

### Conclusions

The `large-records`

library generates code that is strictly `O(n)`

in size in the number of record fields. It does so at the cost of generating basically untyped `core`

. This does not matter for code *using* `large-records`

, which is still presented with a perfectly safe, typed interface. It might however reduce the applicability of the techniques employed by `large-records`

more generally within `ghc`

: the History of Haskell paper (Section 9.1) emphasizes the importance of having a typed internal `ghc`

language to catch and help debug compiler bugs.

Perhaps aiming for linear cost however is not necessary. We saw in the section on GHC generics that if we are careful how we set things up, we can generate code that is `O(n log n)`

in size. Similar techniques can be applied in libraries such as `generics-sop`

as well, provided that we are careful at every step; for example, the use of type-level *lists* should be avoided and we should be using type-level (balanced) *trees* instead.

In order to tackle the problem at a more fundamental level, we might need to be able to introduce and control sharing at the type level; Richard Eisenberg opened ghc ticket #20264 after my HIW talk on `large-records`

with some thoughts on the topic. As a completely alternative approach, the work on Scrap your type applications removes type arguments altogether. This may well address many of the concerns in this blog post, but it’s such a radically different internal language that it’s hard to foresee quite all the consequences.

Addressing the overabundance with types is however only one aspect of the problem, albeit an important one. Specifically, it does not prevent record pattern matches and record updates (either on term level records or on type class dictionaries) in `core`

always being linear in the size of the record. Perhaps an extension to `core`

could alleviate this problem, though changes to `core`

affect to the entire compiler backend and so are rightly treated with extreme prejudice.

When such problems are addressed in `ghc`

, a library like `large-records`

may no longer be required. Until such time, however, the `large-records`

library can be used to significantly reduce compile time and memory requirements for code bases that contain many large records.

Compilation time and memory usage normalised to an empty module.↩︎

We will not discuss it in this blog post, but the library currently has support for records where

*all*fields are strict and records where*all*fields are lazy. It would not be too difficult to extend it to records where some fields are lazy and some are strict. It does not currently support`UNPACK`

pragmas; adding support for that would be more difficult, we would need to add an unboxed vector for each type of unpacked field.↩︎This is to avoid a segfault with

`ghc`

prior to version 9.0. See GHC ticket #16893.↩︎The actual code generated uses an indirection through another function to deal with strictness: if the record has strict fields, then those fields are forced in the

`to`

translation.↩︎