Implementing a minimal version of haskell-servant

Wednesday, 04 November 2015, by Andres Löh.
Filed under coding.

Recently, there was a question on Stack Overflow on how Servant actually works. Others were quick to suggest the Servant paper as a thorough explanation of the approach and the implementation.

As a co-author, I’m obviously happy if the paper is being read, but it’s also 12 pages long in two-column ACM style. And while it explains the implementation, it does not necessarily make it easier to start playing with the code yourself, because it only shows excerpts, and code snippets in a paper are not easily runnable.

At the same time, whenever I want to demonstrate a new concept in the Servant context, or play with new ideas, I find myself not impementing it in the main Servant code base, but rather to create a small library that is “like Servant”, built on the same principles, but much simpler, so that I have less work to do and can evaluate the ideas more quickly. I’ve talked to some other contributors, and at least some of them are doing the same. So I thought it might be useful to develop and present the code of “TinyServant”, which is not exactly tiny, but still small compared to the full Servant code base, strips away a lot of duplication and unessential extras, but is still complete enough so that we can observe how Servant works. Obviously, this still won’t explain everything that one might want to know about the implementation of Servant, but I hope that it will serve as a useful ingredient in that process.

This blog post is a somewhat revised and expanded version of my Stack Overflow answer.

This is not a general tutorial on Servant and using Servant. For learning how to use Servant, the official Servant tutorial or the general documentation section of the Servant home page are better starting points.

(

Haskell Hackathon, Haskell eXchange, Haskell courses in London, October 2015

Friday, 11 September 2015, by Andres Löh.
Filed under well-typed, community, training.

Haskell events in London

In the time from 5–13 October, we are (co-)organizing a number of Haskell-related events in London, with Skills Matter.

Here’s the overview:

Haskell infrastructure Hackathon

We’ll co-organize and participate in a two-day Haskell Hackathon, which takes place directly after the Haskell eXchange.

This Hackathon aims at bringing together Haskell developers – both beginners and experts – who want to help improve the Haskell infrastructure, predominantly Hackage and Cabal.

We’ll aim to arrange some introductory overview talks, to e.g. provide an overview over the code bases and the most important open issues.

Participation is free, but please register via Skills Matter.

(

Hackage Security Beta Release

Tuesday, 25 August 2015, by Edsko de Vries, Duncan Coutts.
Filed under cabal, community, industrial-haskell-group.

Well-Typed and the Industrial Haskell Group are very happy to announce the beta release of the hackage-security library, along with its integration in hackage-server and cabal-install. The new security features of hackage are now deployed on the central server and there is a beta release of cabal available. You can install it through

cabal install \ \ \ \

This will install a cabal-secure-beta binary which you can use alongside your normal installation of cabal.

For a more detailed discussion of the rationale behind this project, see the annoucement of the alpha release or the initial announcement of this project. We will also be giving a talk about the details at the upcoming Haskell Implementors Workshop. In the remainder of this blog post we will describe what’s available, right now.

What’s in it for you?

Increased security

The Hackage server now does index signing. This means that if an attacker sits between you and Hackage and tries to feed you different packages than you think you are installing, cabal will notice this and throw a security exception. Index signing provides no (or very limited) security against compromise of the central server itself, but allows clients to verify that what they are getting is indeed what is on the central server.

 (Untrusted) mirrors

A very important corollary of the previous point is that we can now have untrusted mirrors. Anyone can offer to mirror hackage and we can gratefully accept these offers without having to trust those mirror operators. Whether we are downloading from the mirror or from the primary server, the new security features make it possible to verify that what we are downloading is what is on the primary server.

In practice this mean we can have mirrors at all, and we can use them fully automatically with no client side configuration required. This should give a huge boost to the reliability of Hackage; even AWS goes down from time to time but properly decentralised mirrors should mean there’s always a recent snapshot available.

On the client-side, the very first time cabal updates from the primary server it also finds out what mirrors are available. On subsequent updates it will automatically make use of any of those mirrors: if it encounters a problem with one it will try another. Updates to the list of mirrors is also fully automatic.

For operating a mirror, we have extended the hackage-mirror client (currently bundled in the hackage-server package) so that it can be used to mirror a Hackage repository to a simple set of local files which can then be served by an ordinary HTTP server.

We already have one mirror available in time for the beta. The OSU Open Source Lab have very kindly agreed to host a Hackage mirror for the benefit of the Haskell community. This mirror is now live at, but we didn’t need to tell you that: (the beta release of) cabal will notice this automatically without any configuration on the part of the user thanks to

Getting a mirror up and running is very easy, so if you would like to host a public Hackage mirror, then please do get in touch; during the beta period get in touch with us, or later on get in touch with the Hackage admins.

Incremental updates

Hackage provides a 00-index.tar.gz resource which is a tarball containing the .cabal files for all packages available on Hackage. It is this file that cabal downloads when you call cabal update, and that it uses during dependency resolution.

However, this file is quite large, which is why cabal update can take a few seconds to complete. In fact at nearly 10Mb the index is now considerably larger than almost all package source tarballs.

As part of the security work we have had to extend this index with extra security metadata, making the file even larger. So we have also taken the opportunity to dramatically reduce download sizes by allowing clients to update this file incrementally. The index tarball is now extended in an append-only way. This means that once cabal has downloaded the tarball once, on subsequent updates it can just download the little bit it doesn’t yet have. To avoid making existing clients download the new larger index file each time, the 00-index.tar.gz is kept as it always was and repositories supporting the new features additionally provide a 01-index.tar.gz. In future we could additionally provide a .tar.xz variant and thereby keep the first-time update size to a minimum.

The append-only nature of the index has additional benefits; in effect, the index becomes a log of Hackage’s history. This log can be used for various purposes; for example, we can track how install plans for packages change over time. As another example, Herbert Valerio Riedel has been working on an “package-index wayback” feature for Cabal. This uses the index to recreate a past view of the package index for recovering now bit-rotted install-plans that were known to work in the past.

There are currently a few known issues that make cabal update slower than it needs to be, even though it’s doing an incremental update. This will be addressed before the official release.

Host your own private repository

It has always been possible to host your own Hackage repository, either for private packages or as a mirror of the public collection, but it has not always been convenient.

There is the “smart” server in the form of the hackage-server, which while relatively easy to build and run, isn’t as simple as just a bunch of files. There has also always been the option of a “dumb” server, in the form of a bunch of files in the right format hosted by an ordinary HTTP server. While the format is quite simple (reusing the standard tar format), there have not been convenient tools for creating or managing these file based repositories.

As part of the security work we have made a simple command line tool to create and manage file based Hackage repositories, including all the necessary security metadata. This tool has been released as hackage-repo-tool on Hackage.

So whether you want a private mirror of the public packages, or a repository for your own private packages, or both, we hope these new tools will make that much more convenient. Currently documentation on how to use these tools is still somewhat lacking; this is something we will address after this beta release. Getting started is not difficult; there are some brief instructions in the reddit discussion, and feel free to talk to us on #hackage on IRC or contact us directly at if you need help.

What’s next?

As mentioned, we would like to invite you to install cabal-secure-beta and start testing it; just use it as you would cabal right now, and report any problems you may find on the hackage-security issue tracker. Additionally, if you would like to host a public mirror for Hackage, please contact us.

This release is primarily intended as an in-the-wild test of the infrastructure; there are still several details to be dealt with before we call this an official release.

The most important of these is proper key management. Much like, say, HTTPS, the chain of trust starts at a set of root keys. We have asked the committee to act as the root of trust and the committee has agreed in principle. The committee members will hold a number of the root keys themselves and the committee may also invite other organisations and individuals within the community to hold root keys. There are some policy details that remain to be reviewed and agreed. For example we need to decide on how many root keys to issue, what threshold number of keys be required to re-sign the root info, and agree policies for storing the root keys to keep them safe (for instance, mandate an air gap where the root key is never on a machine that is connected to the Internet). We will use the opportunity of ICFP (and the HIW talk) in a couple weeks time to present more details and get feedback.

If you would like to help with development, please take a look at the issue list and get in touch!

Parametricity Tutorial (Part 2): Type constructors and type classes

Friday, 14 August 2015, by Edsko de Vries.
Filed under coding.

This is part 2 of a two-part series on parametricity.

In part 1 we covered the basics: constant types, functions and polymorphism (over types of kind *). In this post we will deal with more advanced material: type constructors, type classes, polymorphism over type constructors and type constructor classes.

Type constructors (types of kind * -> *)

Before considering the general case, let’s think about lists. Given a :: A ⇔ A', two lists xs :: [A] and ys :: [A'] are related iff their elements are related by a; that is,

[] ℛ([a]) []


     (x:xs') ℛ([a]) (y:ys')
iff  x ℛ(a) x'  and  xs' ℛ([a]) ys'

For the special case that a is a function a⃯ :: A -> A', this amounts to saying that map a⃯ xs ≡ ys.

You can imagine a similar relation F a exists for any type constructor F. However, we will not give a general treatment of algebraic data types in this blog post. Doing this would require giving instances for products and sums (which is fine), but also for (least) fixed points, and that would take us much too far afield.

Thankfully, we will not need to be quite so precise. Instead, it will only require the following characterization:

Characterization: Functors.

Let F be a functor. Then for all relations a :: A ⇔ A', b :: B ⇔ B' and functions f :: A -> B and g :: A' -> B', such that f ℛ(a -> b) g:

forall xs :: F A, xs' :: F A'.
  if    xs ℛ(F a) xs'
  then  F f xs ℛ(F b) F g xs'
where we overload F to also mean the “map” function associated with F. (Provided that the Functor type class instance for F is correct, F f should be the same as fmap f.)

(If we had the precise rules for algebraic data types we would be able to prove this characterization for any specific functor F.)

Intuitively, think about xs and xs' as two containers of the same shape with elements related by a, and suppose we have a pair of functions f and g which map a-related arguments to b-related results. Then the characterization states that if we apply function f to the elements of xs and g to the elements of xs', we must end up with two containers of the same shape with elements related by b.

For the special case that a and b are functions (and F is a functor), the mapping relations characterization simply says that

    if xs ℛ(F a⃯) xs' then F f xs ℛ(F b⃯) F g xs'
-- simplify
    if F a⃯ xs ≡ xs' then F f xs ℛ(F b⃯) F g xs'
-- simplify
    F b⃯ (F f xs) ≡ F g (F a⃯ xs)
-- functoriality
    F (b⃯ . f) xs ≡ F (g . a⃯) xs

which follows immediately from the premise that b⃯ . f ≡ g . a⃯ (which in turn is a consequence of f ℛ(a⃯ -> b⃯) g), so the mapping relations characterization is trivially satisfied (provided that the mapping of relations correspond to the functor map in the case for functions).

Technical note. When we use parametricity results, we often say something like: “specializing this result to functions rather than relations…”. It is important to realize however that if F is not a functor, then F a may not be a functional relation even if a is.

For example, let a⃯ :: A -> A', and take F(a) = a -> a. Then

     f ℛ(F a⃯) g
-- expand definition
iff  f ℛ(a⃯ -> a⃯) g
-- rule for functions
iff  forall x :: A, x' :: A'.
       if x ℛ(a⃯) x' then f x ℛ(a⃯) g x'
-- simplify (a⃯ is a function)
iff  forall x :: A.
       a⃯ (f x) ≡ g (a⃯ x)

Taking a :: Int -> Int ; a x = 0, this would relate two functions f, g :: Int -> Int whenever 0 ≡ g 0; it is clear that this is not a functional relation between f and g.

Given a function a⃯ :: A -> A', F a⃯ is a function F A -> F A' when F is a functor, or a function F A' -> F A if F is a contravariant functor. We will not consider contravariant functors further in this blog post, but there is an analogous Contravariant Functor Characterization that we can use for proofs involving contravariant functors.

 Example: ∀ab. (a -> b) -> [a] -> [b]

This is the type of Haskell’s map function for lists of course; the type of map doesn’t fully specify what it should do, but the elements of the result list can only be obtained from applying the function to elements of the input list. Parametricity tells us that

     f ℛ(ab. (a -> b) -> [a] -> [b]) f
-- apply rule for polymorphism, twice
iff  forall A, A', B, B', a :: AA', b :: BB'.
       f@A,B ℛ((a -> b) -> [a] -> [b]) f@A',B'
-- apply rule for functions, twice
iff  forall A, A', B, B', a :: AA', b :: BB'.
     forall g :: A -> B, g' :: A' -> B', xs :: [A], xs' :: [A'].
       if g ℛ(a -> b) g', xs ℛ([a]) xs' then f g xs ℛ([b]) f g' xs'

Specializing to functions a⃯ :: A -> A' and b⃯ :: B -> B', we get

     forall A, A', B, B', a⃯ :: A -> A', b⃯ :: B -> B'.
     forall g :: A -> B, g' :: A' -> B', xs :: [A], xs' :: [A'].
       if g ℛ(a⃯ -> b⃯) g', xs ℛ([a⃯]) xs' then f g xs ℛ([b⃯]) f g' xs'
-- simplify
iff  forall A, A', B, B', a⃯ :: A -> A', b⃯ :: B -> B'.
     forall g :: A -> B, g' :: A' -> B'.
       if b⃯ . g ≡ g' . a⃯ then map b⃯ . f g ≡ f g' . map a⃯

As an aside, Functor instances should satisfy two laws:

It turns out that the second property follows from the first by parametricity; see The free theorem for fmap.

Example: ∀a. F a -> G a

Consider a funtion f :: ∀a. F a -> G a, polymorphic in a but between fixed (constant) type constructors F and G; for example, a function of type ∀a. Maybe a -> [a] fits this pattern. What can we tell about f?

     f ℛ(a. F a -> G a) f
iff  forall A, A', a :: AA'.
       f@A ℛ(F a -> G a) f@A'
iff  forall A, A', a :: AA', x :: F A, x' :: F A'.
       if x ℛ(F a) x' then f x ℛ(G a) f x'

For the special case where we pick a function a⃯ :: A -> A' for a, this is equivalent to

forall A, A', a⃯ :: A -> A'.
  G a⃯ . f == f . F a⃯

For the categorically inclined, this means that polymorphic functions must be natural transformations.

Type classes

Now that we’ve covered the basics, it’s time to consider some more advanced language features. We will first consider qualified types, such as ∀a. Eq a => a -> a -> a.

The rule for a qualified type is

     f ℛ( a. C a => t) f'
iff  forall A, A', a :: AA'
     such that A, A' instances of C and a respects C.
       f@A ℛ(t) f'@A'

What does it mean for a relation a :: A ⇔ A' to respect a type class C? Every type class introduces a new constraint on relations defined by the members of the type class. Let’s consider an example; Haskell’s equality type class is defined by

class Eq a where
  (==) :: a -> a -> Bool

(Let’s ignore (/=) for simplicity’s sake.). Then a relation a respects Eq, written Eq(a), iff all class members are related to themselves. For the specific case of Eq this means that

     (==) ℛ(a -> a -> Bool) (==)
-- rule for functions, twice
iff  forall x :: A, x' :: A', y :: A, y' :: A'.
       if x ℛ(a) x', y ℛ(a) y' then x == y ℛ(Bool) x' == y'
-- Bool is a constant type, simplify
iff  forall x :: A, x' :: A', y :: A, y' :: A'.
       if x ℛ(a) x', y ℛ(a) y' then x == y ≡ x' == y'

For the special case where we pick a function a⃯ :: A -> A', the function respects Eq iff

forall x :: A, y :: A.
  x == y ≡ a⃯ x == a⃯ y

I.e., the function maps (==)-equal arguments to (==)-equal results.

Syntactic convention.

In the following we will write

forall A, A', a :: AA'
such that A, A' instances of C and a respects C.

more concisely as

forall C(A), C(A'), C(a) :: AA'.

Example: ∀a. Eq a => a -> a -> a

We already considered the free theorem for functions f :: ∀ a. a -> a -> a:

g (f x y) = f (g x) (g y)

Is this free theorem still valid for ∀a. Eq a => a -> a -> a? No, it’s not. Consider giving this (admittedly somewhat dubious) definition of natural numbers which considers all “invalid” natural numbers to be equal:

newtype Nat = Nat Int
  deriving (Show)

instance Eq Nat where
  Nat n == Nat n' | n <= 0, n' <= 0 = True
                  | otherwise       = n == n'

If we define

f :: forall a. Eq a => a -> a -> a
f x y = if x == y then y else x

g :: Nat -> Nat
g (Nat n) = Nat (n + 1)

then for x ≡ Nat (-1) and y ≡ Nat (-2) we have that g (f x y) ≡ Nat (-1) but f (g x) (g y) ≡ Nat 0. Dubious or not, free theorems don’t assume anything about the particular implementation of type classes. The free theorem for ∀a. Eq a => a -> a -> a however only applies to functions g which respect Eq; and this definition of g does not.

 Example: ∀ab. (Show a, Show b) => a -> b -> String

We promised to look at this type when we considered higher rank types above. If you go through the process, you will find that the free theorem for functions f of this type is

f x y = f (g x) (h y)

for any Show-respecting functions g and h. What does it mean for a function to respect Show? Intuitively it means that the function can change the value of its argument but not its string representation:

show (g x) = show x

Type constructor classes

Type constructor classes are classes over types of kind * -> *; a typical example is

class Functor f where
  fmap :: ab. (a -> b) -> f a -> f b

The final rule we will discuss is the rule for universal quantification over a qualified type constructor (universal quantification over a type constructor without a qualifier is rarely useful, so we don’t discuss it separately):

     g ℛ(f. C f => t) g'
iff  forall C(F), C(F'), C(f) :: FF'.
       g@F ℛ(t) g'@F'

If F and F' are type constructors rather than types (functions on types), f :: F ⇔ F' is a relational action rather than a relation: that is, it is a function on relations. As before, C(f) means that this function must respect the type class C, in much the same way as for type classes. Let’s consider what this means for the example of Functor:

     fmap ℛ(ab. (a -> b) -> f a -> f b) fmap
iff  forall A, A', B, B', a :: AA', b :: BB'.
       fmap@A,B ℛ((a -> b) -> f a -> f b) fmap@A,B
iff  forall A, A', B, B', a :: AA', b :: BB'.
     forall g :: A -> B, g' :: A' -> B', x :: F A, x' :: F' A'.
       if    g ℛ(a -> b) g', x ℛ(f a) x'
       then  fmap g x ℛ(f b) fmap g' x'

Example: ∀f. Functor f => f Int -> f Int

Intuitively, a function g :: ∀f. Functor f => f Int -> f Int can take advantage of the Int type, but only by applying fmap; for example, when we apply g to a list, the order of the list should not matter. Let’s derive the free theorem for functions of this type:

     g ℛ(f. Functor f => f Int -> f Int) g
iff  forall Functor(F), Functor(F'), Functor(f) :: FF'.
       g@F ℛ(f Int -> f Int) g@F'
iff  forall Functor(F), Functor(F'), Functor(f) :: FF'.
     forall x :: F Int, x' :: F' Int.
       if x ℛ(f Int) x' then g x ℛ(f Int) g x'

As before, we can specialize this to higher order functions, which are special cases of relational actions. Let’s use the notation f⃯ :: F -> F' (with F and F' type constructors) to mean f⃯ :: ∀ab. (a -> b) -> (F a -> F' b). Then we can specialize the free theorem to

iff  forall Functor(F), Functor(F'), Functor(f⃯) :: F -> F'.
     forall x :: F Int, x' :: F' Int.
       if x ℛ(f⃯ Int) x' then g x ℛ(f⃯ Int) g x'
-- `f⃯` is a function; recall that `Int` as a relation is the identity:
iff  forall Functor(F), Functor(F'), Functor(f⃯) :: F -> F'.
       f⃯ id . g ≡ g . f⃯ id

for any Functor-respecting f⃯.

Example continued: further specializing the free theorem

The free theorem we saw in the previous section has a very useful special case, which we will derive now. Recall that in order to prove that a higher order function f⃯ respects Functor we have to prove that

if g ℛ(a -> b) g', x ℛ(f⃯ a) x' then fmap g x ℛ(f⃯ b) fmap g' x'

As in the higher rank example, this is a proof obligation (as opposed to the application of a free theorem), so that we really have to consider relations a :: A ⇔ A' and b :: B ⇔ B' here; it’s not sufficient to consider functions only.

We can however derive a special case of the free theorem which is easier to use. Take some arbitrary polymorphic function k :: ∀a. F a -> F' a, and define the relational action f :: F ⇔ F' by

f(a) = k ⚬ F(a)

where we use k also as a relation. Then

     x ℛ(f a) x'
iff  i. x ℛ(k) i and i ℛ(F(a)) x'
-- k is a function
iff  k x ℛ(F(a)) x'
-- by the Functor Characterization
iff  F g (k x) ℛ(F b) F g' x'
-- naturality
iff  k (F g x) ℛ(F b) F g' x'
-- use k as a relation again
iff  F g x ℛ(k) k (F g x) ℛ(F b) F g' x'
-- pick k (F g x) as the intermediate
then F g x ℛ(f b) F g' x'
-- if we assume that fmap is the "real" functor instance
iff  fmap g x ℛ(f b) fmap g' x'

In the previous section we derived that the free theorem for g :: ∀f. Functor f => f Int -> f Int was

forall Functor(F), Functor(F'), Functor(f⃯) :: F -> F'.
  f⃯ id . g ≡ g . f⃯ id

for any higher order function which respects Functor. The f we defined above is a higher order function provided that a if a function, and we just proved that it must respect functor. The identity relation is certainly a function, so we can specialize the free theorem to

k . g ≡ g . k

for any polymorphic function k (no restrictions on k). As a special case, this means that we must have

reverse . g ≡ g . reverse

formalizing the earlier intuition that when we apply such a function to a list, the order of the list cannot matter.

 Example: ∀f. Functor f => (B -> f B) -> f A

As our last example, we will consider higher-order functions of type g :: ∀f. Functor f => (B -> f B) -> f A. The free theorem for such functions is

     g ℛ(f. Functor f => (B -> f B) -> f A) g
iff  forall Functor(F), Functor(F'), Functor(f) : FF'.
       g@F ℛ((B -> f B) -> f A) g@F'
iff  forall Functor(F), Functor(F'), Functor(f) : FF'.
     forall l :: B -> F B, l' :: B -> F' B.
       if l ℛ(B -> f B) l' then g l ℛ(f A) g l'

Specializing to higher order functions f⃯ :: ∀ab. (a -> b) -> F a -> F' b (rather than a relational action f), we get

     forall Functor(F), Functor(F'), Functor(f) : FF'.
     forall l :: B -> F B, l' :: B -> F' B.
       if l ℛ(B -> f⃯ B) l' then g l ℛ(f⃯ A) g l'
iff  forall Functor(F), Functor(F'), Functor(f) : FF'.
     forall l :: B -> F B, l' :: B -> F' B.
       if f⃯ id . l ≡ l' . id then f⃯ id (g l) ≡ g l'
-- simplify
iff  forall Functor(F), Functor(F'), Functor(f) : FF'.
     forall l :: B -> F B.
       f⃯ id (g l) ≡ g (f⃯ id . l)

for any Functor respecting f⃯; we can now apply the same reasoning as we did in the previous section, and give the following free theorem instead:

k (g l) ≡ g (k . l)

for any polymorphic function (that is, natural transformation) k :: ∀a. F a -> F' a and function l :: B -> F B. This property is essential when proving that the above representation of a lens is isomorphic to a pair of a setter and a getter; see Functor is to Lens as Applicative is to Biplate, Section 4, for details.


Parametricity allows us to formally derive what we can conclude about a function by only looking at its type. We’ve covered a lot of material in this tutorial, but there is a lot more out there still. If you want to know more, here are some additional references.


Thanks to Auke Booij on #haskell for his helpful feedback on both parts of this blog post.

Lightweight Checked Exceptions in Haskell

Friday, 31 July 2015, by Edsko de Vries, Adam Gundry.
Filed under coding.

Consider this function from the http-conduit library:

-- | Download the specified URL (..)
-- This function will 'throwIO' an 'HttpException' for (..)
simpleHttp :: MonadIO m => String -> m ByteString

Notice that part of the semantics of this function—that it may throw an HttpException—is encoded in a comment, which the compiler cannot check. This is because Haskell’s notion of exceptions offers no mechanism for advertising to the user the fact that a function might throw an exception.

Michael Snoyman discusses some solutions to this problem, as well as some common anti-patterns, in his blog post Exceptions Best Practices. However, wouldn’t it be much nicer if we could simply express in the type that simpleHttp may throw a HttpException? In this blog post I will propose a very lightweight scheme to do precisely that.

If you want to experiment with this yourself, you can download CheckedRevisited.hs (tested with ghc 7.2, 7.4, 7.6, 7.8 and 7.10).

Note. This is an improved version of this blog post; Checked.hs demonstrates the previous approach; see also the discussion on reddit on the original post and on the improved version.

Throwing checked exceptions

Let’s introduce a type class for “checked exceptions” (à la Java):

class Throws e where

Here’s the key idea: this will be a type class without any instances. If we want to record in the type that some IO action throws a checked exception, we can now just add the appropriate type class constraint. For instance, we can define

throwChecked :: (Exception e, Throws e) => e -> IO a
throwChecked = Base.throwIO

and then use that as

simpleHttp :: (MonadIO m, Throws HttpException) => String -> m ByteString
simpleHttp _ = liftIO $ throwChecked HttpException

Unless we explicitly catch this exception, this annotation will now be propagated to every use site of simpleHttp:

useSimpleHttp :: Throws HttpException => IO ByteString
useSimpleHttp = simpleHttp ""

Type annotations

There’s something a little peculiar about a type class constraint such as Throws HttpException: normally ghc will refuse to add a type class constraint for a known (constant) type. If you were to write

foo = throwChecked $ userError "Uhoh"

ghc will complain bitterly that

No instance for (Throws IOError)
    arising from a use of ‘throwChecked’
In the expression: throwChecked

until you give the type annotation explicitly (you will need to enable the FlexibleContexts language extension):

foo :: Throws IOError => IO a
foo = throwChecked $ userError "Uhoh"

I consider this a feature, not a bug of this approach: you are forced to explicitly declare the checked exceptions you throw.

Catching checked exceptions

In order to catch checked exceptions we need to somehow eliminate that Throws constraint. That is, we want a function of type

catchChecked :: Exception e => (Throws e => IO a) -> (e -> IO a) -> IO a

In the remainder of this section we explain how we can do this; it requires a bit of type level hacking, and the use of roles. Bear in mind though that you do not need to understand this section in order to be able to use checked exceptions; it suffices to know that a function catchChecked exists.

First, we define a newtype wrapper around an action that throws an exception:

newtype Wrap e a = Wrap { unWrap :: Throws e => a }

Then we define a newtype wrapper around the exception themselves:

newtype Catch e = Catch e

This Catch is used internally only and not exported; it is the only type that will get a Throws instance:

instance Throws (Catch e) where

Now we’re almost there. We are going to use coerce to pretend that instead of an exception of type e we have an exception of type Catch e:

coerceWrap :: Wrap e a -> Wrap (Catch e) a
coerceWrap = coerce

This requires the type argument e on the Throws class to be representational (this needs IncoherentInstances):

type role Throws representational

With all this in place, we can now eliminate Throws constraints:

unthrow :: proxy e -> (Throws e => a) -> a
unthrow _ = unWrap . coerceWrap . Wrap

and defining catchChecked is a simple matter:

catchChecked :: forall a e. Exception e
             => (Throws e => IO a) -> (e -> IO a) -> IO a
catchChecked act = Base.catch (unthrow (Proxy :: Proxy e) act)

Subclasses of exceptions

Suppose we had

readFile :: Throws IOException => FilePath -> IO String

then we can write a function to get a file either by reading a local file or by downloading it over HTTP:

get :: (Throws IOException, Throws HttpException)
    => String -> IO ByteString
get url = case removePrefix "file:" url of
            Just path -> readFile path
            Nothing   -> simpleHttp url

removePrefix :: [a] -> [a] -> Maybe [a]
removePrefix = ..

Alternatively we can define a bespoke exception hierarchy and combine the two exceptions:

data SomeGetException = forall e. Exception e => SomeGetException e

wrapIO :: (Throws IOException      => IO a)
       -> (Throws SomeGetException => IO a)
wrapIO = handleChecked $ throwChecked . SomeGetException

wrapHttp :: (Throws HttpException    => IO a)
         -> (Throws SomeGetException => IO a)
wrapHttp = handleChecked $ throwChecked . SomeGetException

get :: Throws SomeGetException => String -> IO ByteString
get url = case removePrefix "file:" url of
            Just path -> wrapIO   $ readFile path
            Nothing   -> wrapHttp $ simpleHttp url

This kind of custom exception hierarchy is entirely standard; I just wanted to show it fits nicely into this approach to checked exceptions.


There is one caveat to be aware of. Suppose we write

returnAction = return (simpleHttp "")

Ideally we’d give this a type such as

returnAction :: IO (Throws HttpException => IO ByteString)
returnAction = return (simpleHttp "")

But this requires impredicative types, which is still a no-go zone. Instead the type of returnAction will be

returnAction :: Throws HttpException => IO (IO ByteString)
returnAction = return (simpleHttp "")

which has the Throws annotation on returnAction itself; this means we can make the annotation disappear by adding an exception handler to returnAction even though it’s never called (because returnAction itself never throws any exception).

returnAction' :: IO (IO ByteString)
returnAction' = catchChecked returnAction neverActuallyCalled
    neverActuallyCalled :: HttpException -> IO (IO ByteString)
    neverActuallyCalled = undefined

This is somewhat unfortunate, but it occurs only infrequently and it’s not a huge problem in practice. If you do need to return actions that may throw exceptions, you can use a newtype wrapper such as Wrap that we used internally in rethrowUnchecked (for much the same reason):

returnAction :: IO (Wrap HttpException IO)
returnAction = return (Wrap $ simpleHttp "")

Of course you will probably want to define a datatype that is more meaningful for your specific application; for instance, see see the definition of HttpClient in the Repository.Remote module, which defines something like

data HttpClient = HttpClient {
    httpClientGet :: Throws SomeRecoverableException => URI -> ...


Of course, a type such as

simpleHttp :: (MonadIO m, Throws HttpException) => String -> m ByteString

does not tell you that this function can only throw HttpExceptions; it can still throw all kinds of unchecked exceptions, not least of which asynchronous exceptions. But that’s okay: it can still be incredibly useful to track some exceptions through your code.

So there you have it: checked exceptions in Haskell using

and without introducing a special new kind of monad (such as in the control-monad-exception package) and without complicated type level hacking as in the Checked Exception for Free blogpost.

Hackage Security Alpha Release

Wednesday, 08 July 2015, by Edsko de Vries.
Filed under cabal, community, industrial-haskell-group.

Well-Typed is very happy to announce the first alpha release of the Hackage Security library, along with integration into both cabal-install and the Hackage server and a tool for managing file-based secure repositories. This release is not yet ready for general use, but we would like to invite interested parties to download and experiment with the library and its integration. We expect a beta release running on the central Hackage server will soon follow.

Hackage Security and related infrastructure is a project funded by the Industrial Haskell Group to secure Hackage, the central Haskell package server. A direct consequence of this work is that we can have untrusted Hackage mirrors (mirror selection is directly supported by the library). A minor but important additional side goal is support for incremental updates of the central Hackage index (only downloading information about new packages, rather than all packages).

TL;DR: Hackage will be more secure, more reliable and faster, and cabal update should generally finish in seconds.

Security architecture

Security is notoriously difficult to get right, so rather than concocting something ad-hoc the Hackage Security framework is based on TUF, The Update Framework. TUF is a collaboration between security researches at the University of Arizona, the University of Berkeley and the Univerity of Washington, as well as various developers of the Tor project. It is a theory specifically designed for securing software update systems, and suits the needs of Hackage perfectly.

TUF covers both index signing and author signing. Index signing provides the means to verify that something we downloaded from a mirror is the same as what is available from the central server (along with some other security properties), thus making it possible to set up untrusted mirrors. It does not however deal with compromise of the central server. Author signing allows package authors to sign packages, providing a guarantee that the package you download is the one that the author uploaded. These two concerns are largely orthogonal, and the current project only adds support for index signing. Author signing will be the subject of a later project.

Very briefly, here is how it works. The bits in red refer to new features added as part of the Hackage Security work.

  1. Hackage provides a file 00-index.tar.gz (known as “the index”) which contains the .cabal files for all versions of all packages on the server. It is this file that cabal update downloads, and it is this file that cabal install uses to find out which packages are available and what their dependencies are.

    Hackage additionally provides a signed file /snapshot.json (“the snapshot”), containing a hash of the index. When cabal downloads the index it computes its hash and verifies it against the hash recorded in the snapshot. Since mirrors do not have the key necessary to sign the snapshot (“ the snapshot key”), if the hash matches we know that the index we downloaded, and hence all files within, was the same as on the central server.
  2. When you cabal install one or more packages, the index provides cabal with a list of packages and their dependencies. However, the index does not contain the packages themselves.

    The index does however contain package.json files for each version of each package, containing a hash for the .tar.gz of that package version. Since these files live in the index they are automatically protected by the snapshot key. When cabal downloads a package it computes the hash of the package and compares it against the hash recorded in the index; if it matches, we are guaranteed that the package is the same as the package on the central server, as the central server is the only one with access to the snapshot key.
  3. The client does not have built-in knowledge of the snapshot key. Instead, it can download /root.json (“the root metadata”) from the server, which contains the public snapshot key. The root metadata itself is signed by the root keys, of which the client does have built-in knowledge. The private root keys must be kept very securely (e.g. encrypted and offline).

This description leaves out lots of details, but the purpose of this blog post is not to give a full overview of TUF. See the initial announcement or the website of The Update Framework for more details on TUF; the Hackage Security project README provides a very detailed discussion on how our implemention of TUF relates to the official TUF specification.

Software architecture

Most of the functionality is provided through a new library called hackage-security, available from github, designed to be used by clients and servers alike.


Although we have integrated it in cabal-install, the hackage-security library is expressly designed to be useable by different clients as well. For example, it generalizes over the library to use for HTTP requests; cabal uses hackage-security-HTTP, a thin layer around the HTTP library. However, if a client such as stack wants to use the hackage-security library to talk to Hackage it may prefer to use hackage-security-http-client instead, a thin layer around the http-client library.

Using the library is very simple. After importing Hackage.Security.Client three functions become available, corresponding to points 1, 2 and 3 above:

checkForUpdates :: Repository -> CheckExpiry -> IO HasUpdates
downloadPackage :: Repository -> PackageId -> (TempPath -> IO a) -> IO a
bootstrap       :: Repository -> [KeyId] -> KeyThreshold -> IO ()

Some comments:

We have written an example client that demonstrates how to use this API; the example client supports both local and remote repositories and can use HTTP, http-client or curl, and yet is only just over 100 lines of code.


The server-side support provided by Hackage.Security.Server comes primarily in the form of datatypes corresponding to the TUF metadata, along with functions for constructing them.

It is important to realize that servers need not be running the Hackage software; mirrors of the central Hackage server may (and typically will) be simple HTTP file servers, and indeed company-internal package servers may choose not to use the Hackage software at all, using only file servers. We provide a hackage-security utility for managing such file-based repositories; see below for details.

Trying it out

There are various ways in which you can try out this alpha release, depending on what precisely you are interested in.

Resources at a glance

hackage-security library github tag “pre-release-alpha”
cabal-install github branch “using-hackage-security”
hackage github branch “using-hackage-security”

Secure Hackage snapshots

We provide two almost-complete secure (but mostly static) Hackage snapshots, located at

If you want to use cabal to talk to these repositories, you will need to download and build it from the using-hackage-security branch. Then change your cabal config and add a new section:

remote-repo alpha-snapshot
  secure: True
  root-keys: 89e692e45b53b575f79a02f82fe47359b0b577dec23b45d846d6e734ffcc887a
  key-threshold: 2

It suffices to point cabal to either mirror; TUF and the hackage-security library provide built-in support for providing clients with a list of mirrors. During the first check for updates cabal will download this list, and then use either mirror thereafter. Note that if you wish you can set the key-threshold to 0 and not specify any root keys; if you do this, the initial download of the root information will not be verified, but all access will be secure after that.

These mirrors are almost complete, because the first mirror has an intentional problem: the latest version of generics-sop does not match its signed hash (simulating an evil attempt from an attacker to replace the generics-sop library with DoublyEvil-0.3.142). If you attempt to cabal get this library cabal should notice something is amiss on this mirror, and automatically try again from the second mirror (which has not been “compromised”):

# cabal get generics-sop
Downloading generics-sop-
Selected mirror
Downloading package generics-sop-
Exception Invalid hash for .../generics-sop-
when using mirror
Selected mirror
Downloading package generics-sop-
Unpacking to generics-sop-

(It is also possible to use the example client to talk to these mirrors, or indeed to a secure repo of your own.)

Setting up a secure file-based repo

If you want to experiment with setting up your own secure repository, the easiest way to do this is to set up a file based repository using the hackage-security utility. A file based repository (as opposed to one running the actual Hackage software) is much easier to set up and will suffice for many purposes.

  1. Create a directory ~/my-secure-repo containing a single subdirectory ~/my-secure-repo/package. Put whatever packages you want to make available from your repo in this subdirectory. At this point your repository might look like


    (because obviously the generics-sop packages are the first things to come to mind when thinking about which packages are important to secure.) Note the flat directory structure: different packages and different versions of those packages all live in the one directory.

  2. Create public and private keys:

    # hackage-security create-keys --keys ~/my-private-keys

    This will create a directory structure such as


    containing keys for all the various TUF roles (proper key management is not part of this alpha release).

    Note that these keys are stored outside of the repository proper.

  3. Create the initial TUF metadata and construct an index using

    # hackage-security bootstrap \
                       --repo ~/my-secure-repo \
                       --keys ~/my-private-keys

    This will create a directory ~/my-secure-repo/index containing the .cabal files (extracted from the package tarballs) and TUF metadata for all packages


    and package the contents of that directory up as the index tarball ~/my-secure-repo/00-index.tar.gz; it will also create the top-level metadata files

  4. The timestamp and snapshot are valid for three days, so you will need to resign these files regularly using

    # hackage-security update \
                       --repo ~/my-secure-repo \
                       --keys ~/my-private-keys

    You can use the same command whenever you add any additional packages to your repository.

  5. If you now make this directory available (for instance, by pointing Apache at it) you should be able to use cabal to access it, in the same way as described above for accessing the secure Hackage snapshots. You can either set key-threshold to 0, or else copy in the root key IDs from the generated root.json file.

Setting up a secure Hackage server

If you are feeling adventurous you can also try to set up your own secure Hackage server. You will need to build Hackage from the using-secure-hackage branch.

You will need to create a subdirectory TUF inside Hackage’s datafiles/ directory, containing 4 files:


containing the list of mirrors, the root metadata, and the private snapshot and timestamp keys. You can create these files using the hackage-security utility:

  1. Use the create-keys as described above to create a directory with keys for all roles, and then copy over the snapshot and timestamp keys to the TUF directory.
  2. Use the create-root and create-mirrors commands to create the root and mirrors metadata. The create-mirrors accepts an arbitrary list of mirrors to be added to the mirrors metadata, should you wish to do so.

Note that the root.json and mirrors.json files are served as-is by Hackage, they are not used internally; the snapshot and timestamp keys are of course used to sign the snapshot and the timestamp.

Once you have created and added these files, everything else should Just Work(™). When you start up your server it will create TUF metadata for any existing packages you may have (if you are migrating an existing database). It will create a snapshot and a timestamp file; create metadata for any new packages you upload and update the snapshot and timestamp; and resign the snapshot and timestamp nightly. You can talk to the repository using cabal as above.

If you a have Hackage server containing a lot of packages (a full mirror of the central Hackage server, for instance) then migration will be slow; it takes approximately an hour to compute hashes for all packages on Hackage. If this would lead to unacceptable downtime you can use the precompute-fileinfo tool to precompute hashes for all packages, given a recent backup. Copy the file created by this tool to datafiles/TUF/ before doing the migration. If all hashes are precomputed migration only takes a few minutes for a full Hackage snapshot.

Integrating hackage-security

If you want to experiment with integrating the hackage-security library into your own software, the example client is probably the best starting point for integration in client software, and the hackage-security utility is probably a good starting point for integration in server software.


Please report any bugs or comments you may have as GitHub issues.


This is an alpha release, intended for testing by people with a close interest in the Hackage Security work. The issue tracker contains a list of issues to be resolved before the beta release, at which point we will make the security features available on the central Hackage server and make a patched cabal available in a more convenient manner. Note that the changes to Hackage are entirely backwards compatible, so existing clients will not be affected.

After the beta release there are various other loose ends to tie up before the official release of Phase 1 of this project. After that Phase 2 will add author signing.

Dependencies for Cabal Setup.hs files and other goodies

Monday, 06 July 2015, by Duncan Coutts.
Filed under cabal, community, industrial-haskell-group.

No untracked dependencies!

Years ago, back when Isaac Potoczny-Jones and others were defining the Cabal specification, the big idea was to make Haskell software portable to different environments. One of the mantras was “no untracked dependencies!”.

The problem at the time was that Haskell code had all kinds of implicit dependencies which meant that while it worked for you, it wouldn’t build for me. For example, I might not have some other module that it needed, or the right version of the module.

So of course that’s what the build-depends in .cabal files is all about, requiring that the author of the code declare just what the code requires of its environment. The other important part is that the build system only lets your code see the dependencies you’ve declared, so that you don’t accidentally end up with these untracked dependencies.

This mantra of no untracked dependencies is still sound. If we look at a system like nix, part of what enables it to work so well is that it is absolutely fanatical about having no untracked dependencies.

Untracked dependencies?!

One weakness in the original Cabal specification is with Setup.hs scripts. These scripts are defined in the spec to be the entry point for the system. According to the Cabal spec, to build a package you’re required to compile the Setup.hs script and then use its command line interface to get things done. Because in the original spec the Setup.hs is the first entry point, it’s vital that it be possible to compile Setup.hs without any extra fuss (the runhaskell tool was invented just to make this possible, and to make it portable across compilers).

But by having the Setup.hs as the primary entry point, it meant that it’s impossible to reliably use external code in a Setup.hs script, because you cannot guarantee that that code is pre-installed. Going back to the “no untracked dependencies” mantra, we can see of course that all dependencies of Setup.hs scripts are in fact untracked!

This isn’t just a theoretical problem. Haskell users that do have complex Setup.hs scripts often run into versioning problems, or need external tools to help them get the pre-requisite packages installed. Or as another example: Michael Snoyman noted earlier this year in a diagnosis of an annoying packaging bug that:

As an aside, this points to another problematic aspect of our toolchain: there is no way to specify constraints on dependencies used in custom Setup.hs files. That’s actually caused more difficulty than it may sound like, but I’ll skip diving into it for now.

The solution: track dependencies!

As I said, the mantra of no untracked dependencies is still sound, we just need to apply it more widely.

These days the Setup.hs is effectively no longer a human interface, it is now a machine interface used by other tools like cabal or by distro’s install scripts. So we no longer have to worry so much about Setup.hs scripts always compiling out of the box. It would be acceptable now to say that the first entry point for a tool interacting with a package is the .cabal file, which might list the dependencies of the Setup.hs. The tool would then have to ensure that those dependencies are available when compiling the Setup.hs.

So this is exactly what we have now done. Members of the Industrial Haskell Group have funded us to fix this long standing problem and we have recently merged the solution into the development version of Cabal and cabal-install.

From a package author’s point of view, the solution looks like this: in your .cabal file you can now say:

build-type: Custom

  setup-depends: base >= 4.6,
                 directory >= 1.0,
                 Cabal >= 1.18 && < 1.22,
                 acme-setup-tools == 0.2.*

So it’s a new stanza, like libraries or executables, and like these you can specify the library dependencies of the Setup.hs script.

Now tools like cabal will compile the Setup.hs script with these and only these dependencies, just like it does normally for executables. So no more untracked dependencies in Setup.hs scripts. Newer cabal versions will warn about not using this new section. Older cabal versions will ignore the new section (albeit with a warning). So over time we hope to encourage all packages with custom setup scripts to switch over to this.

In addition, the Setup.hs script gets built with CPP version macros (MIN_VERSION_{pkgname}) available so that the code can be made to work with a wider range of versions of its dependencies.

In the solver…

So on the surface this is all very simple and straightforward, a rather minor feature even. In fact it’s been remarkably hard to implement fully for reasons I’ll explain, but the good news is that it works and the hard work has also gotten us solutions to a couple other irksome problems.

Firstly, why isn’t it trivial? It’s inevitable that sooner or later you will find that your application depends on one package that has setup deps like Cabal == 1.18.* and another with setup deps like Cabal == 1.20.*. At that point we have a problem. Classically we aim to produce a build plan that uses at most one version of each package. We do that because otherwise there’s a danger of type errors from using multiple versions of the same package. Here with setup dependencies there is no such danger: it’s perfectly possible for me to build one setup script with one version of the Cabal library and another script with a different Cabal version. Because these are executables and not libraries, the use of these dependencies does not “leak”, and so we would be safe to use different versions in different places.

So we have extended the cabal solver to allow for limited controlled use of multiple versions of the same package. The constraint is that all the “normal” libraries and exes all use the same single version, just as before, but setup scripts are allowed to introduce their own little world where independent choices about package versions are allowed. To keep things sane, the solver tries as far as possible not to use multiple versions unless it really has to.

If you’re interested in the details in the solver, see Edsko’s recent blog post.

Extra goodies

This work in the solver has some extra benefits.

Improve Cabal lib API without breaking everything

In places the Cabal library is a little crufty, and the API it exposes was never really designed as an API. It has been very hard to fix this because changes in the Cabal library interface break Setup.hs scripts, and there was no way for packages to insulate themselves from this.

So now that we can have packages have proper dependencies for their custom Setup.hs, the flip side is that we have an opportunity to make breaking changes to the Cabal library API. We have an opportunity to throw out the accumulated cruft, clean up the code base and make a library API that’s not so painful to use in Setup.hs scripts.

Shim (or compat) packages for base

Another benefit is that the new solver is finally able to cope with having “base shim” packages, as we used in the base 3.x to 4.x transition. For two GHC releases, GHC came with both base-3.x and base-4.x. The base-4 was the “true” base, while the base-3 was a thin wrapper that re-exported most of base-4 (and syb), but with some changes to implement the old base-3 API. At the time we adapted cabal to cope with this situation of having two versions of a package in a single solution.

When the new solver was implemented however support for this situation was not added (and the old solver implementation was retained to work with GHC 6.12 and older).

This work for setup deps has made it relatively straightforward to add support for these base shims. So next time GHC needs to make a major bump to the version of base then we can use the same trick of using a shim package. Indeed this might also be a good solution in other cases, perhaps cleaner than all these *-compat packages we’ve been accumulating.

It has also finally allowed us to retire the old solver implementation.

Package cycles involving test suites and benchmarks

Another feature that is now easy to implement (though not actually implemented yet) is dealing with the dependency cycles in packages’ test suites and benchmarks.

Think of a core package like bytestring, or even less core like Johan’s cassava csv library. These packages have benchmarks that use the excellent criterion library. But of course criterion is a complex beast and itself depends on bytestring, cassava and a couple dozen other packages.

This introduces an apparent cycle and cabal will fail to find an install plan. I say apparent cycle because there isn’t really a cycle: it’s only the benchmark component that uses criterion, and nothing really depends on that.

Here’s another observation: when benchmarking a new bytestring or cassava, it does not matter one bit that criterion might be built against an older stable version of bytestring or cassava. Indeed it’s probably sensible that we use a stable version. It certainly involves less rebuilding: I don’t really want to rebuild criterion against each minor change in bytestring while I’m doing optimisation work.

So here’s the trick: we break the cycle by building criterion (or say QuickCheck or tasty) against another version of bytestring, typically some existing pre-installed one. So again this means that our install plan has two versions of bytestring in it: the one we mean to build, and the one we use as a dependency for criterion. And again this is ok, just as with setup dependencies, because dependencies of test suites and benchmarks do not “leak out” and cause diamond dependency style type errors.

One technical restriction is that the test suite or benchmark must not depend on the library within the same package, but must instead use the source files directly. Otherwise there would genuinely be a cycle.

Now in general when we have multiple components in a .cabal file we want them to all use the same versions of their dependencies. It would be deeply confusing if a library and an executable within the same package ended up using different versions of some dependency that might have different behaviour. Cabal has always enforced this, and we’re not relaxing it now. The rule is that if there are dependencies of a test suite or benchmark that are not shared with the library or executable components in the package, then we are free to pick different versions for those than we might pick elsewhere within the same solution.

As another example – that’s nothing to do with cycles – we might pick different versions of QuickCheck for different test suites in different packages (though only where necessary). This helps with the problem that one old package might require QuickCheck == 2.5.* while another requires QuickCheck == 2.8.*. But it’d also be a boon if we ever went through another major QC-2 vs QC-3 style of transition. We would be able to have both QC-2 and QC-3 installed and build each package’s test suite against the version it requires, rather than freaking out that they’re not the same version.

Private dependencies in general

Technically, this work opens the door to allowing private dependencies more generally. We’re not pursuing that at this stage, in part because it is not clear that it’s actually a good idea in general.

Mark Lentczner has pointed out the not-unreasonable fear that once you allow multiple versions of packages within the same solution it will in practice become impossible to re-establish the situation where there is just one version of each package, which is what distros want and what most people want in production systems.

So that’s something we should consider carefully as a community before opening those flood gates.

Summer School on Generic and Effectful Programming

Wednesday, 10 June 2015, by Andres Löh.
Filed under training, well-typed.

I’m one of the lecturers at

Summer School on Generic and Effectful Programming

St Anne’s College, Oxford, 6th to 10th July 2015

(Register here)

Datatype-generic programming was the topic of my PhD thesis many years ago, and it has continued to be a fascinating field of work and research for me since then.

At the upcoming summer school, I will give a three-lecture course on Applying Type-level and Generic Programming in Haskell. In this course, I will describe the state-of-the-art of datatype-generic programming in Haskell/GHC. This means we’ll look at the GHC extension that allows you to generically derive your own type classes, but also at the relatively recent generics-sop library. We will discuss the GHC type system features that make all of this possible, such as data kinds, kind polymorphism, GADTs, higher-rank types, constraint kinds and more, and we will look at a number of real-world applications of generic programming, taken, e.g., from the areas of web programming and databases.

But my course is only one of many. Ralf Hinze, the main organizer, has done an outstanding job and assembled a fantastic lineup of lecturers: I’m honoured to be teaching alongside Edwin Brady, Fritz Henglein, Conor McBride, Don Syme and Tarmo Uustalu. I am sure I will learn a lot from them and their lectures.

If you always wanted to learn more about generic and effectful programming, this is your chance! You can still register for the school! I’d be happy to see you there.

Cabal & Hackage hacking at ZuriHac

Monday, 01 June 2015, by Duncan Coutts.
Filed under cabal, community.

At ZuriHac this weekend we had eight people hacking on Cabal or Hackage, many of whom are new contributors. There were a number of projects started as well as a number of smaller fixes completed.

In addition, there are three Google Summer of Code students working on Cabal and Hackage projects this summer:

They’re all just getting started, so more news about them later. All in all there seems to be a decent amount of progress at the moment across a range of issues. In particular we’re getting closer to solving some of the thornier “Cabal Hell” problems.

Heroic bug squashing

Oleg Grenrus was a bit of a hero in that as a new Cabal contributor, over two days of the hackathon, he managed to send in pull requests to fix five open tickets.

Another couple chaps (whose names to my shame have slipped my mind) dived in to fix old tickets on sanity checking absolute/relative paths for library directories in .cabal files and config files, and on passing GHC env vars to sub-commands like in cabal run/exec/test.

These in addition to the flurry of pull requests in recent weeks, and others from the hackathon, has given the regular Cabal hackers quite a pile of patches to get through. We hope to review and merge them in the next week or so.

Integrating package security for Cabal/Hackage

The work on securing the package download process that we announced a while ago is nearing the integration phase. While it’s been useful to have a couple people concentrate on implementing the core hackage-security library, at this stage it makes sense to open the process up and get more people involved.

Matthias Fischmann had proposed it as a ZuriHac project and organised a group of people who were interested. We discussed some of the issues involved with using the new hackage-security code in the cabal-install tool, and got started on some of the tasks.

Bootstrapping repository security

With public key crypto systems there’s always a need to somehow bootstrap the trust chains. For example with the public web certificate system (used by TLS / HTTPS) the root of the trust chains is the certificate authorities. We must know and trust their public keys to be able to verify the chain of trust for any particular website. But how should we establish trust in the certificate authorities’ keys in the first place? With web browsers this bootstrapping problem is resolved by the browser (or OS) shipping pre-installed with all the CA public keys.

For hackage servers we face a similar bootstrapping problem. Hackage security does not use public certificate authorities but there is a similar root of trust in the form of a set of root keys. For the central community we can of course do the same thing as the web browsers and ship the server’s public root keys with cabal-install. But we need to support people making their own repositories and obviously we can’t ship all the public keys. So we need a convenient way for people to configure cabal-install to establish trust in a particular repository. The obvious thing is to specify the trusted public keys in the cabal configuration, where you specify the repository to use.

Currently in a cabal configuration file that part looks like:


This syntax is too limited to support adding extra attributes like keys. So what people were working on at ZuriHac was supporting something like this:

  keys: ed25519:9fc1007af2baff7088d082295e755102c1593cdd24b5282adbfa0613f30423f6

So someone hosting their own hackage repo can provide instructions with a sample cabal.config or a block of text like the above to copy and paste into their config file, for people to use to get started.

This more flexible syntax will also allow miscellaneous other repository settings such as specific mirrors to use, or the ability to turn off security entirely.


Another couple people got started on writing a mirror client using the hackage-security library. While mirroring does not need a dedicated tool it is a rather convenient and efficient approach. It means we can use the ordinary HTTP interface rather than providing rsync or another interface and we can still do very bandwidth-efficient synchronisation. The advantage over a generic HTTP mirroring tool is that we have an index of the available packages and we know that existing packages are immutable, so we can simply diff the source and target indexes and copy over the extra packages.

In fact there are already two hackage mirror clients that do this. One of them pulls from one repo and pushes to a “smart” hackage-server. The other pulls from a repo and pushes to a repo hosted via S3. What is missing is the ability to mirror to a simple local set of files. Mirrors don’t have to be full hackage-server instances or S3, they can be ordinary HTTP servers like Apache or nginx that point at a set of files in the right layout. The hackage-security library is a convenient tool to use to write this kind of mirror since it handles all the details of the repository layout, and it supports doing incremental updates of the repository index. In this use case the security checks are merely sanity checks, as in the end, clients downloading from this mirror do their own checks.

So the work started by taking the existing hackage-server mirror and hackage-security demo client with the goal of replacing (or extending) the guts of the mirror client to use the hackage-security lib to download and to be able to manage a target repo as a set of local files.

Once the security work is integrated it will become much more useful to have public mirrors because clients then don’t need to trust the mirrors (we’re safe from MITM attacks). And hackage will distribute a list of public mirrors that the clients will use automatically. So having a decent mirroring client will become rather important. It’s also useful for the synchronisation to be very efficient so that the public mirrors can be nearly live copies.

Solving the cabal sandbox / global packages problem

A problem that people have been complaining about recently is that the Haskell Platform ships with lots of packages in the global package database, making it hard to compile packages that require non-standard versions of the platform packages.

This isn’t really a problem with the Haskell Platform at all, it’s really a problem with how cabal-install constructs its sandboxes, and fortunately it’s one that seems relatively easy to fix. Good progress was made on this ticket over the hackathon and hopefully it will be completed within the next couple weeks.

The problem is that cabal sandbox init makes an environment where with a package database stack consisting of the global one, plus a new empty local one. This means all the global packages are implicitly inside the sandbox already. That’s not so useful when you want to start with a minimal sandbox.

Originally this was a GHC limitation, that we always had to use the global package DB, however that has been fixed now for a couple GHC releases. So the solution we went for is to use only a local empty package DB, and to copy the registration information for a certain set of core packages into the local package DB. Ultimately we would like GHC to come supplied with the list of core packages, but for now we can just hard code the list.

Improving the tagging feature on Hackage

One new contributor started work on reimplementing the Hackage website’s tagging feature to make it more flexible and useful. The key idea is to make package categories into tags and make it easier to curate tags by solving the problem that lots of tags are essentially aliases for each other. This happens because each package author picks their tags themselves. So we will have sets of tag aliases, each with a canonical representative. Then any package using any alias will be assigned the canonical tag. The data model and user interface will make it possible for trustees to decide which existing tags ought to be aliased together and then do it. Ultimately, the tags and aliases should be useful in the existing hackage search.

Supporting curated package collections in Cabal and Hackage

Curated package collections are one of the two major parts to solving Cabal Hell.

People started work on supporting these natively in Cabal and Hackage. The idea is that proper integration will make them easier to use, more flexible and easier for people to make and distribute curated collections. Examples of curated collections include stackage (LTS and nightly snapshots) and the sets of versions distributed by Linux distros. Integration will allow simpler and shorter configurations, easier switching between collections and the ability to easily define collections either to distribute on Hackage or to use locally. By teaching cabal about collections it can give better error messages (e.g. when something cannot be installed because it’s not consistent with the collection(s) in use). Making collections easier to distribute via Hackage, and easier to combine, might open up new possibilities. For example we might see collections specifically to help people work with the popular web stacks (e.g. if those cannot always fit into a large general purpose collection). Or we might see collections of things you might like to avoid such as deprecated or known-broken packages. Combining collections would then allow you to configure cabal to use a large collection intersected with the negation of the collection of deprecated packages.

Parametricity Tutorial (Part 1)

Saturday, 23 May 2015, by Edsko de Vries.
Filed under coding.

A powerful feature of Haskell’s type system is that we can deduce properties of functions by looking only at their type. For example, a function of type

f :: a. a -> a

can only be the identity function: since it must return something of type a, for any type a, the only thing it can do is return the argument of type a that it was given (or crash). Similarly, a function of type

f :: a. a -> a -> a

can only do one of two things: either return the first argument, or return the second. This kind of reasoning is becoming more and more important with the increasing use of types such as this definition of a “lens”:

type Lens s t a b = forall f. Functor f => (a -> f b) -> s -> f t

Since a lens is just a function of a particular type, the only thing we can conclude about such a function is whatever we can deduce from its type.

To reason about the properties of functions based only on their types we make use of the theory of parametricity, which tells how to derive a so-called “free theorem” from a type. This blog post is a tutorial on how to do this; it won’t explain why the theory works, only how it works. A Haskell practitioner’s guide to parametricity, if you will.

This is a two-part blog post. In part 1 (this post) we will cover the basics: constant types, functions and polymorphism (over types of kind *).

In part 2 we will deal with more advanced material: type constructor (types of kind * -> *), type classes, polymorphism over type constructors and type constructor classes.

The Basics

The main theorem of parametricity is the following:

if f :: t then f ℛ(t) f

When t is a closed type, ℛ(t) is a relation between two terms of type t (we shall see later that the type of is actually slightly more general). In words, parametricity states that any term f of type t is related to itself by ℛ(t). Don’t worry if this all looks incredibly abstract! We shall see lots and lots of examples.

Constant types (types of kind *)

For any constant type C, the relation ℛ(C) is the identity relation. In other words,

     x ℛ(C) x'
iff  x ≡ x'

(We will use ≡ throughout to mean mathematical equality, to distinguish it from Haskell’s equality function (==).)

Let’s see an example. Suppose that x :: Int. Then parametricity tells us that

     x ℛ(Int) x
iff  x ≡ x

I.e., it tells us that x is equal to itself. Not very insightful! Intuitively, this makes sense: if all we know about x is that it is an integer, we cannot tell anything about its value.

TOOLING. Many of the examples in this blog post (though sadly not all) can also be auto-derived by one of two tools. On the #haskell IRC channel we can ask lambdabot to derive free theorems for any types not involving type classes or type constructor classes. If you ask

@free x :: Int

lambdabot will reply with

x = x

(I recommend starting a private conversation with lambdabot so you avoid spamming the whole #haskell channel.)

Alternatively, you can also try the online free theorem generator. This free theorem generator is a bit more precise than lambdabot (which takes some shortcuts sometimes), and supports type classes, but cannot work with type constructors (lambdabot can work with unknown type constructors but not with quantification over type constructors, unfortunately).


For functions we map related arguments to related results:

     f ℛ(A -> B) f'
iff  forall x, x'.
       if x ℛ(A) x' then f x ℛ(B) f' x'

(The types of x and x' here depend on what precisely A is; see The type of ℛ, below.)

Example. Suppose f :: Int -> Bool. By parametricity

     f ℛ(Int -> Bool) f
iff  forall x :: Int, x' :: Int.
       if x ℛ(Int) x' then f x ℛ(Bool) f x'
-- both Int and Bool are constant types
iff  forall x :: Int, x' :: Int.
       if x ≡ x' then f x ≡ f x'
-- simplify
iff  f ≡ f

Again, not very informative. Parametricity doesn’t tell us anything about functions between constant types. Time to look at something more interesting!

Polymorphism (over types of kind *)

The definition for polymorphic values is

     f ℛ(a. t) f'
iff  forall A, A', a :: AA'.
       f@A ℛ(t) f'@A'           -- where 'a' can occur free in t

That is, whenever we pick two types A and A', and some relation a between A and A', the function f@A obtained by instantiating the type variable by A must be related to the function f'@A' obtained by instantiating the type variable by A'. In what follows we will write explicit type instantiation like this only if the type is not clear from the context; specifically, we will omit it when we supply arguments to the function.

The type of ℛ.

∀ab. a -> b -> a is an example of a closed type: all type variables are bound by a universal quantifier. An open type is a type with free type variables such as ∀b. a -> b -> a or even a -> b -> a. (Note that this distinction is harder to see in Haskell where universal quantifiers are often implicit. We will not follow that convention in this article.)

We said in the introduction that if t is a closed type, ℛ(t) relates two terms of type t. As we saw, in order to be able to give a meaning to open types we need a mapping from any free variable a to a relation a :: A ⇔ A'. In this article we somewhat informally maintain this mapping simply by using the same name for the type variable and the relation.

Given two relations a :: A ⇔ A' and b :: B ⇔ B', ℛ(a -> b -> a) relates terms of type A -> B -> A with terms of type A' -> B' -> A'. It is important to realize that can therefore relate terms of different types. (For a more precise treatment, see my Coq formalization of a part of this blog post.)

The interpretation of for a free type variable a is defined in terms of the corresponding relation:

     x ℛ(a) x'     -- the type variable
iff  (x, x') ∈ a    -- the relation

Example: ∀a. a -> a

Let’s consider a number of examples, starting with an f :: ∀a. a -> a:

     f ℛ(a. a -> a) f
-- parametricity
iff  forall A, A', a :: AA'.
       f@A ℛ(a -> a) f@A'
-- definition for function types
iff  forall A, A', a :: AA', x :: A, x' :: A'.
       if x ℛ(a) x' then f x ℛ(a) f x'

It might not be immediately evident from the last line what this actually allows us to conclude about f, so let’s look at this a little closer. A function g is a special kind of relation, relating any argument x to g x; since the property holds for any kind of relation a : A ⇔ A', it must also hold for a function a⃯ :: A -> A':

     forall x, x'.
       if x ℛ(a⃯) x' then f x ℛ(a⃯) f x'
-- x ℛ(a⃯) x' iff a⃯ x ≡ x'
iff  forall x :: A, x' :: A'.
       if a⃯ x ≡ x' then a⃯ (f x) ≡ f x'
-- simplify
iff  forall x :: A,
       a⃯ (f x) ≡ f (a⃯ x)

We can apply this result to show that any f :: ∀a. a -> a must be the identity function: picking a⃯ = const x, we get const x (f x) ≡ f (const x x), i.e. x ≡ f x, as required.

NOTE. We are doing fast and loose reasoning in this tutorial and will completely ignore any totality issues. See Automatically Generating Counterexamples to Naive Free Theorems, or the associated web interface, for some insights about what wrong conclusions we can draw by ignoring totality.

Example: ∀a. a -> a -> a

Intuitively, there are only two things a function f :: ∀a. a -> a -> a can do: it can either return its first argument, or it can return its second argument. What does parametricity tell us? Let’s see:

     f ℛ(a. a -> a -> a) f
iff  forall A, A', a :: AA'.
       f@A ℛ(a -> a -> a) f@A'
-- applying the rule for functions twice
iff  forall A, A', a :: AA', x :: A, x' :: A', y :: A, y' :: A'.
       if x ℛ(a) x', y ℛ(a) y' then f x y ℛ(a) f x' y'

Let’s again specialize the last line to pick a function a⃯ :: A -> A' for the relation a:

     forall x :: A, x' :: A', y :: A, y' :: A'.
       if x ℛ(a⃯) x', y ℛ(a⃯) y' then f x y ℛ(a⃯) f x' y'
-- a⃯ is a function
iff  forall x :: A, y :: A.
       if a⃯ x ≡ x' and a⃯ y ≡ y' then a⃯ (f x y) ≡ f x' y'
-- simplify
iff  a⃯ (f x y) = f (a⃯ x) (a⃯ y)

So parametricity allows us to “push in” or “distribute” a⃯ over f. The fact that f must either return its first or its second argument follows from parametricity, but not in a completely obvious way; see the reddit thread How to use free theorems.

 Example: ∀ab. a -> b

Other than undefined (which we are ignoring), there can be no function f :: ∀ab. a -> b. Let’s suppose that one did exist; what does parametricity tell us?

     f ℛ(ab. a -> b) f
-- applying the rule for universal quantification, twice
iff  forall A, A', B, B', a :: AA', b :: BB'.
       f@A,B ℛ(a -> b) f@A',B'
-- applying the rule for functions
iff  forall A, A', B, B', a :: AA', b :: BB', x :: A, x' :: A'.
       if x ℛ(a) x' then f x ℛ(b) f x'

Picking two functions a⃯ :: A -> A' and b⃯ :: B -> B' for a and b, we get

b⃯ . f = f . a⃯

It’s not too difficult to derive contradiction from this (remember that you can pick any two functions a⃯ and b⃯ between any types of your choosing). Hence, such a function cannot exist.

Example: ∀ab. (a -> b) -> a -> b

The only thing a function of this type can do is apply the supplied function to the supplied argument (alternatively, if you prefer, this must be the identity function). Let’s spell this example out in a bit of detail because it is our first example of a higher order function.

     f ℛ(ab. (a -> b) -> a -> b) f
-- apply rule for polymorphism, twice
iff  forall A, A', B, B', a :: AA', b :: BB'.
       f@A,B ℛ((a -> b) -> a -> b) f@A',B'
-- apply rule for functions, twice
iff  forall A, A', B, B', a :: AA', b :: BB'.
     forall g :: A -> B, g' :: A' -> B', x :: A, x' :: A'.
       if g ℛ(a -> b) g' and x ℛ(a) x' then f g x ℛ(b) f g' x'

Let’s expand what that premise g ℛ(a -> b) g' means:

     g ℛ(a -> b) g'
iff  forall y :: A, y' :: A'.
       if y ℛ(a) y' then g y ℛ(b) g' y'

For the special case that we pick functions a⃯ :: A -> A' and b⃯ :: B -> B' for a and b, that premise collapses to

     forall y :: A, y' :: A'.
       if y ℛ(a⃯) y' then g y ℛ(b⃯) g' y'
-- a⃯ and b⃯ are functions
iff  forall y :: A, y' :: A'.
       if a⃯ y ≡ y' then b⃯ (g y) ≡ g' y'
-- simplify
iff  forall y :: A.
       b⃯ (g y) ≡ g' (a⃯ y)
-- simplify (extensionality)
iff  b⃯ . g ≡ g' . a⃯

So that the free theorem for f :: ∀ab. (a -> b) -> a -> b becomes

if b⃯ . g ≡ g' . a⃯ then b⃯ . f g ≡ f g' . a⃯

Picking b⃯ = const g, g' = g, and a⃯ = id (verify that the premise holds) we get that indeed g ≡ f g, as expected.

Useful shortcut. This pattern is worth remembering:

      g ℛ(a⃯ -> b⃯) g'
iff   b⃯ . g ≡ g' . a⃯
whenever a⃯ and b⃯ are function(al relation)s.

Example: ∀ab. (∀c. c -> String) -> a -> b -> String

A function f :: ∀ab. (∀c. c -> String) -> a -> b -> String is not only higher order, but has a rank-2 type: it insists that the function it is given is itself polymorphic. This makes it possible to write, for example

f g x y = g x ++ g y

Note that since x and y have different types, it is important that g is polymorphic. What is the free theorem for f?

     f ℛ(ab. (c. c -> String) -> a -> b -> String) f
-- apply rule for polymorphism, twice
iff  forall A, A', B, B', a :: AA', b :: BB'.
       f ℛ((c. c -> String) -> a -> b -> String) f
-- apply rule for functions three times, and simplify ℛ(String) to (≡)
iff  forall A, A', B, B', a :: AA', b :: BB'.
     forall g :: c. c -> String, g' :: c. c -> String.
     forall x :: A, x' :: A', y :: B, y' :: B'.
         g ℛ(c. c -> String) g', x ℛ(a) x', y ℛ(b) y'
         f g x y ≡ f g' x' y'

Specializing this theorem to functions a⃯ :: A -> A' and b⃯ :: B -> B' we get

forall A, A', B, B', a⃯ :: A -> A', b⃯ :: B -> B'.
forall g :: c. c -> String, g' :: c. c -> String.
forall x :: A, y :: B.
    g ℛ(c. c -> String) g'
    f g x y ≡ f g' (a⃯ x) (b⃯ y)

But that is somewhat surprising, because it seems to say that the values of x and y cannot matter at all! What is going on? Expanding the first premise:

     g ℛ(c. c -> String) g'
iff  forall C, C', c :: CC',
       g ℛ(c -> String) g'
iff  forall C, C', c :: CC', z :: C, z' :: C'.
       if z ℛ(c) z' then g z = g' z'

Let’s stop for a moment to ponder what this requirement for g and g' really says: given any relation c, and any elements z and z' that are related by c—in other words, for any z and z' at all—we must have that g z and g' z' give us equal results. This means that g and g' must be constant functions, and the same constant function at that. As a consequence, for any function f of the above type, f g must itself be constant in x and y. In part two we will see a more useful variation which uses the Show type class.

Incidentally, note that this quantification over an arbitrary relation c is a premise to the free theorem, not a conclusion; hence we cannot simply choose to consider only functions c.

TOOLING. Indeed, if you enter

(forall c. c -> String) -> a -> b -> String
in the online free theorem generator you will see that it first gives the free theorem using relations only, and then says it will reduce all “permissible” relation variables to functions; in this example, that is all relations except for c; lambdabot doesn’t make this distinction and always reduces relations to functions, which is not correct.

To be continued

Part 2 covers type constructors, type classes and type constructor classes. Meanwhile, here are some links to papers on the subject if you want to read more.

Previous entries