A queue is a datastructure that provides efficient—O(1)—operations to remove an element from the *front* of the queue and to insert an element at the *rear* of the queue. In this blog post we will discuss how we can take advantage of laziness to implement such queues in Haskell, both with amortised and with worst-case O(1) bounds.

The results in this blog post are not new, and can be found in Chris Okasaki’s book “Purely Functional Data Structures”. However, the implementation and presentation here is different from Okasaki’s. In particular, the technique we use for real-time datastructures is more explicit and should scale to datastructures other than queues more easily than Okasaki’s.

To set the stage, consider this first attempt at implementing queues:

```
class Queue q where
  empty :: q a
  head  :: q a -> a
  tail  :: q a -> q a
  snoc  :: q a -> a -> q a

data Queue0 a = Q0 [a]

instance Queue Queue0 where
  empty = Q0 []
  head (Q0 (x:_ )) = x
  tail (Q0 (_:xs)) = Q0 xs
  snoc (Q0 xs    ) x = Q0 (xs ++ [x])
```

What is the complexity of `head` and `snoc` in this representation? Your first instinct might be to say that `head` has O(1) complexity (after all, it doesn’t do anything but a pattern match) and that `snoc` has O(*n*) complexity, because it needs to traverse the entire list before it can append the element.

However, Haskell is a lazy language. All that happens when we call `snoc` is that we create a thunk (a suspended computation), which happens in O(1) time. Consider adding the elements `[1..5]` into an empty queue, one at a time:

```
Q0 []
Q0 ([] ++ [1])
Q0 (([] ++ [1]) ++ [2])
Q0 ((([] ++ [1]) ++ [2]) ++ [3])
Q0 (((([] ++ [1]) ++ [2]) ++ [3]) ++ [4])
Q0 ((((([] ++ [1]) ++ [2]) ++ [3]) ++ [4]) ++ [5])
```

Now when we call `head` on the resulting queue, `(++)` needs to traverse this entire chain before it can find the first element; since that chain has O(*n*) length, the complexity of `head` is O(*n*).
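To make the cost of that forced chain concrete, here is a minimal, self-contained sketch; the `snoc0`/`head0` names are mine, chosen to avoid clashing with the Prelude:

```haskell
data Queue0 a = Q0 [a]

-- O(1): merely builds a (++) thunk
snoc0 :: Queue0 a -> a -> Queue0 a
snoc0 (Q0 xs) x = Q0 (xs ++ [x])

-- Forces the whole chain of appends before the first element appears
head0 :: Queue0 a -> a
head0 (Q0 (x:_)) = x

main :: IO ()
main = do
  let q = foldl snoc0 (Q0 []) [1..5 :: Int]  -- builds the nested thunks shown above
  print (head0 q)  -- 1, but only after walking the O(n) chain of (++) thunks
```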

Thinking about complexity in a lazy setting can be confusing, so let’s first think about a spine-strict queue. In order to define it, we will need a spine-strict list:

`data StrictList a = SNil | SCons a !(StrictList a)`

The bang annotation here means that evaluating an `SCons` node to weak-head normal form (for instance, by pattern matching on it) will also force its tail to weak-head normal form, and hence the entire spine of the list; we cannot have an `SCons` node with a pointer to an unevaluated tail.

We will also need a few operations on strict lists:

```
-- | Append two strict lists
app :: StrictList a -> StrictList a -> StrictList a
app SNil ys = ys
app (SCons x xs) ys = SCons x (app xs ys)
-- | Reverse a strict list
rev :: StrictList a -> StrictList a
rev = go SNil
where
go :: StrictList a -> StrictList a -> StrictList a
go acc SNil = acc
go acc (SCons x xs) = go (SCons x acc) xs
```
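As a quick sanity check, `app` and `rev` behave like their lazy-list counterparts; the `fromList`/`toList` conversion helpers below are ad hoc, not part of the post:

```haskell
data StrictList a = SNil | SCons a !(StrictList a)

app :: StrictList a -> StrictList a -> StrictList a
app SNil         ys = ys
app (SCons x xs) ys = SCons x (app xs ys)

rev :: StrictList a -> StrictList a
rev = go SNil
  where
    go acc SNil         = acc
    go acc (SCons x xs) = go (SCons x acc) xs

-- Ad hoc conversion helpers, for testing only
fromList :: [a] -> StrictList a
fromList = foldr SCons SNil

toList :: StrictList a -> [a]
toList SNil         = []
toList (SCons x xs) = x : toList xs

main :: IO ()
main = do
  print (toList (fromList [1,2,3] `app` fromList [4,5 :: Int]))  -- [1,2,3,4,5]
  print (toList (rev (fromList [1,2,3 :: Int])))                 -- [3,2,1]
```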

The definition of strict lists in hand, we can attempt our next queue implementation:

`data Queue1 a = Q1 !Int !(StrictList a) !Int !(StrictList a)`

Instead of using a single list, we split the queue into two parts: the *front* of the queue and the *rear* of the queue. The front of the queue will be stored in normal order, so that we can easily remove elements from the front of the queue; the rear of the queue will be stored in reverse order, so that we can also easily insert new elements at the end of the queue.

In addition, we also record the size of both lists. We will use this to enforce the following invariant:

Queue Invariant: The front of the queue cannot be shorter than the rear.

(Simpler invariants are also possible, but this invariant is the one we will need later so we will use it throughout this blogpost.)

When the invariant is violated, we restore it by moving the elements from the rear of the queue to the front; since the rear of the queue is stored in reverse order, but the front is not, the rear must be reversed:

```
inv1 :: Queue1 a -> Queue1 a
inv1 q@(Q1 f xs r ys)
  | f < r     = Q1 (f+r) (xs `app` rev ys) 0 SNil
  | otherwise = q
```

The invariant can be violated when we shrink the front or grow the rear, so we end up with this implementation of the `Queue` interface:

```
instance Queue Queue1 where
  empty = Q1 0 SNil 0 SNil
  head (Q1 _ (SCons x _ ) _ _ ) = x
  tail (Q1 f (SCons _ xs) r ys) = inv1 $ Q1 (f-1) xs r ys
  snoc (Q1 f xs r ys) y = inv1 $ Q1 f xs (r+1) (SCons y ys)
```
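Putting the pieces together, here is a self-contained sketch exercising `Queue1`; the `drain` helper is made up, purely to check that elements come out in FIFO order:

```haskell
import Prelude hiding (head, tail)

data StrictList a = SNil | SCons a !(StrictList a)

app :: StrictList a -> StrictList a -> StrictList a
app SNil         ys = ys
app (SCons x xs) ys = SCons x (app xs ys)

rev :: StrictList a -> StrictList a
rev = go SNil
  where
    go acc SNil         = acc
    go acc (SCons x xs) = go (SCons x acc) xs

data Queue1 a = Q1 !Int !(StrictList a) !Int !(StrictList a)

inv1 :: Queue1 a -> Queue1 a
inv1 q@(Q1 f xs r ys)
  | f < r     = Q1 (f+r) (xs `app` rev ys) 0 SNil
  | otherwise = q

empty :: Queue1 a
empty = Q1 0 SNil 0 SNil

head :: Queue1 a -> a
head (Q1 _ (SCons x _) _ _) = x

tail :: Queue1 a -> Queue1 a
tail (Q1 f (SCons _ xs) r ys) = inv1 $ Q1 (f-1) xs r ys

snoc :: Queue1 a -> a -> Queue1 a
snoc (Q1 f xs r ys) y = inv1 $ Q1 f xs (r+1) (SCons y ys)

-- Ad hoc helper: drain the queue front to back
drain :: Queue1 a -> [a]
drain (Q1 0 _ 0 _) = []
drain q            = head q : drain (tail q)

main :: IO ()
main = print (drain (foldl snoc empty [1..7 :: Int]))  -- [1,2,3,4,5,6,7]
```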

Since we don’t have to think about laziness, the complexity of this queue implementation is a bit easier to determine. Clearly, `head` is O(1), and both `tail` and `snoc` have worst-case O(*n*) complexity, because `rev` has O(*n*) complexity. However, consider what happens when we insert `[1..7]` into an empty queue:

```
Q1 0 [] 0 []
Q1 1 [1] 0 [] -- invariant restored
Q1 1 [1] 1 [2]
Q1 3 [1..3] 0 [] -- invariant restored
Q1 3 [1..3] 1 [4]
Q1 3 [1..3] 2 [5,4]
Q1 3 [1..3] 3 [6,5,4]
Q1 7 [1..7] 0 [] -- invariant restored
```

Notice what happens: we only need to reverse *n* elements **after having inserted n elements**; we therefore say that the *amortised* complexity of the reverse is O(1).

The analysis in the previous section conveniently overlooked one fact: since values are immutable in Haskell, nothing is stopping us from reusing a queue multiple times. For instance, if we started from

`Q1 3 [1..3] 3 [6,5,4]`

we might attempt to insert 7, then 8, then 9, and finally 10 into this (same) queue:

```
Q1 7 [1,2,3,4,5,6,7] 0 [] -- invariant restored
Q1 7 [1,2,3,4,5,6,8] 0 [] -- invariant restored
Q1 7 [1,2,3,4,5,6,9] 0 [] -- invariant restored
Q1 7 [1,2,3,4,5,6,10] 0 [] -- invariant restored
```

Notice that *each* of these single insertions incurs the full cost of a reverse. Thus, claiming an amortised O(1) complexity is only valid if we use the queue linearly (i.e., never reusing queues). If we want to lift this restriction, we need to take advantage of laziness.

In order to get amortised constant time bounds even when the queue is not used linearly, we need to take advantage of lazy evaluation. We will change the front of the queue back to be a lazy list:

`data Queue2 a = Q2 !Int [a] !Int !(StrictList a)`

The remainder of the implementation is the same as it was for `Queue1`, except that the reverse now needs to take a strict list as input and return a lazy list as result:

```
rev' :: StrictList a -> [a]
rev' = go []
  where
    go :: [a] -> StrictList a -> [a]
    go acc SNil         = acc
    go acc (SCons x xs) = go (x:acc) xs
```

All the other changes are just changing the operations on strict lists to operations on lazy lists:

```
inv2 :: Queue2 a -> Queue2 a
inv2 q@(Q2 f xs r ys)
  | f < r     = Q2 (f+r) (xs ++ rev' ys) 0 SNil
  | otherwise = q

instance Queue Queue2 where
  empty = Q2 0 [] 0 SNil
  head (Q2 _ (x:_ ) _ _ ) = x
  tail (Q2 f (_:xs) r ys) = inv2 $ Q2 (f-1) xs r ys
  snoc (Q2 f xs r ys) y = inv2 $ Q2 f xs (r+1) (SCons y ys)
```
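The same drain test works for `Queue2`, and we can also exercise persistence: both queues derived from `q` below see the correct elements (the `drain` helper is again ad hoc, for testing only):

```haskell
import Prelude hiding (head, tail)

data StrictList a = SNil | SCons a !(StrictList a)

rev' :: StrictList a -> [a]
rev' = go []
  where
    go acc SNil         = acc
    go acc (SCons x xs) = go (x:acc) xs

data Queue2 a = Q2 !Int [a] !Int !(StrictList a)

inv2 :: Queue2 a -> Queue2 a
inv2 q@(Q2 f xs r ys)
  | f < r     = Q2 (f+r) (xs ++ rev' ys) 0 SNil
  | otherwise = q

empty :: Queue2 a
empty = Q2 0 [] 0 SNil

head :: Queue2 a -> a
head (Q2 _ (x:_) _ _) = x

tail :: Queue2 a -> Queue2 a
tail (Q2 f (_:xs) r ys) = inv2 $ Q2 (f-1) xs r ys

snoc :: Queue2 a -> a -> Queue2 a
snoc (Q2 f xs r ys) y = inv2 $ Q2 f xs (r+1) (SCons y ys)

-- Ad hoc helper: drain the queue front to back
drain :: Queue2 a -> [a]
drain (Q2 0 _ 0 _) = []
drain q            = head q : drain (tail q)

main :: IO ()
main = do
  let q = foldl snoc empty [1..6 :: Int]
  -- Persistent use: two different queues share the same prefix thunks
  print (drain (snoc q 7))  -- [1,2,3,4,5,6,7]
  print (drain (snoc q 8))  -- [1,2,3,4,5,6,8]
```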

The genius of this representation lies in two facts. First, notice that when we construct the thunk `(xs ++ rev' ys)`, we know that the `rev' ys` will not be forced until we have exhausted `xs`. Since we construct this thunk only when the rear is one longer than the front, we are indeed justified in saying that the cost of the reverse is amortised O(1).

But what about reusing the same queue twice? This is where we rely crucially on laziness. Suppose we have a sequence of operations

```
Q2 4 [1,2,3,4] 4 [8,7,6,5] -- initial queue
Q2 9 ([1..4] ++ rev' [9,8,7,6,5]) 0 [] -- snoc (invariant restored)
Q2 5 (rev' [9,8,7,6,5]) 0 [] -- tail 4 times
```

While it is true that we might call `tail` on this resulting queue any number of times, those calls will *not* each incur the full cost of `rev'`: since these thunks will all be shared, laziness will make sure that once this `rev'` has been evaluated (“forced”) once, it will not be forced again.

Of course, if we started from that initial queue and inserted various elements, then each of those insertions would create a separate (not shared) thunk with a call to `rev'`; but those calls to `rev'` will only be forced if for each of those separate queues we first do `f` calls to `tail` (in this case, 4 calls).

The queues from the previous section will suffice for lots of applications. However, in some applications amortised complexity bounds are not good enough. For instance, in real-time systems, having normally-cheap operations occasionally take a long time is not acceptable; each operation should take approximately the same amount of time, even if that means that the overall efficiency of the system is slightly lower.

There are two sources of delays in the implementation from the previous section. The first is that when we come across the call to reverse, that whole reverse needs to happen in one go. The second source comes from the fact that we might still chain calls to append; consider what happens when we insert the elements `[1..7]`:

```
Q2 0 [] 0 []
Q2 1 r1 0 [] -- invariant restored, r1 = [] ++ rev' [1]
Q2 1 r1 1 [2]
Q2 3 r2 0 [] -- invariant restored, r2 = r1 ++ rev' [3,2]
Q2 3 r2 1 [4]
Q2 3 r2 2 [5,4]
Q2 3 r2 3 [6,5,4]
Q2 7 r3 0 [] -- invariant restored, r3 = r2 ++ rev' [7,6,5,4]
```

This is similar to the behaviour we saw for the queues based on a single list, except we now have a maximum of O(log *n*) chained calls rather than O(*n*), because the distance between two calls to `reverse` doubles each time.

Intuitively, we can solve both of these problems by doing a little bit of the append and a little bit of the reverse each time we call `tail` or `snoc`. We need to reestablish the invariant when *r* = *f* + 1. At this point the append will take *f* steps, and the reverse *r* steps, and we will not need to reestablish the invariant again until we have added *r + f + 2* elements to the rear of the queue (or added some to the rear and removed some from the front). This therefore gives us plenty of time to do the append and the reverse, if we take one step on each call to `tail` and `snoc`.

How might we “do one step of a reverse”? This is where we diverge from Okasaki, and give a more direct implementation of this idea. We can implement a datatype that describes the “progress” of an operation:

`data Progress = Done | NotYet Progress`

The idea is that we can execute one step of an operation by pattern matching on an appropriate value of type `Progress`:

```
step :: Progress -> Progress
step Done = Done
step (NotYet p) = p
```

For `(++)` it is easy to construct a `Progress` value which will execute the append; all we need to do is force (part of) the spine of the resulting list:

```
forceSpine :: Int -> [a] -> Progress
forceSpine 0 _ = Done
forceSpine _ [] = Done
forceSpine n (_:xs) = NotYet (forceSpine (n-1) xs)
```
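One way to see `forceSpine` at work is to count the `NotYet` constructors it produces; the `steps` helper below is mine, for illustration only:

```haskell
data Progress = Done | NotYet Progress

step :: Progress -> Progress
step Done       = Done
step (NotYet p) = p

forceSpine :: Int -> [a] -> Progress
forceSpine 0 _      = Done
forceSpine _ []     = Done
forceSpine n (_:xs) = NotYet (forceSpine (n-1) xs)

-- Ad hoc helper: how many steps remain before Done?
steps :: Progress -> Int
steps Done       = 0
steps (NotYet p) = 1 + steps p

main :: IO ()
main = do
  print (steps (forceSpine 3 "abcdefg"))        -- 3: we asked for 3 steps
  print (steps (forceSpine 10 "abc"))           -- 3: the list runs out first
  print (steps (step (forceSpine 3 "abcdefg"))) -- 2: one step already taken
```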

For other operations this is more difficult. We need some way to express a computation split into multiple steps. We can use the following datatype for this purpose:

`data Delay a = Now a | Later (Delay a)`

`Delay a` is a computation of an `a`, but we mark the various steps of the computation using the `Later` constructor (this datatype is variously known as the delay monad or the partiality monad, but we will not need the fact that it is a monad in this blog post). For example, here is reverse:

```
revDelay :: StrictList a -> Delay [a]
revDelay = go []
  where
    go :: [a] -> StrictList a -> Delay [a]
    go acc SNil         = Now acc
    go acc (SCons x xs) = Later $ go (x:acc) xs
```
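A quick check of `revDelay` (the `runAll` and `laters` helpers are ad hoc, not from the post): there is exactly one `Later` per `SCons`, and running the computation to completion yields the reversed list:

```haskell
data StrictList a = SNil | SCons a !(StrictList a)

data Delay a = Now a | Later (Delay a)

revDelay :: StrictList a -> Delay [a]
revDelay = go []
  where
    go acc SNil         = Now acc
    go acc (SCons x xs) = Later $ go (x:acc) xs

-- Ad hoc helper: run the computation to completion
runAll :: Delay a -> a
runAll (Now   a) = a
runAll (Later d) = runAll d

-- Ad hoc helper: count the marked steps
laters :: Delay a -> Int
laters (Now   _) = 0
laters (Later d) = 1 + laters d

main :: IO ()
main = do
  let d = revDelay (SCons 1 (SCons 2 (SCons 3 SNil))) :: Delay [Int]
  print (runAll d)  -- [3,2,1]
  print (laters d)  -- 3, one Later per SCons
```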

We then need to be able to execute one step of such a computation. For this purpose we can introduce

`runDelay :: Delay a -> (a, Progress)`

which returns the final value, as well as a `Progress` value which allows us to execute the computation step by step. The definition of `runDelay` is somewhat difficult (see appendix, below), but the idea hopefully is clear: evaluating the resulting `Progress` *n* steps will execute precisely *n* steps of the computation; if you look at the resulting `a` value before having stepped the entire `Progress`, the remainder of the computation will run at that point.

Finally, we can execute two operations in lockstep by pattern matching on two `Progress` values at the same time:

```
par :: Progress -> Progress -> Progress
par !p Done = p
par Done !p' = p'
par (NotYet p) (NotYet p') = NotYet (par p p')
```

We can use the `Progress` datatype to implement real-time queues: queues where both insertion and deletion have O(1) worst-case complexity. The representation is much like the one we used in the previous section, but we add a `Progress` field (`Progress` is an example implementation of what Okasaki calls a “schedule”):

`data Queue3 a = Q3 !Int [a] !Int !(StrictList a) !Progress`

Re-establishing the invariant happens much as before, except that we record the resulting `Progress` on the queue:

```
inv3 :: Queue3 a -> Queue3 a
inv3 q@(Q3 f xs r ys _)
  | f < r     = let (ys', p1) = runDelay $ revDelay ys
                    xs'       = xs ++ ys'
                    p2        = forceSpine f xs'
                in Q3 (f+r) xs' 0 SNil (par p1 p2)
  | otherwise = q
```

All that is left to do now is make sure we take a step of the background reverse and append actions on each call to `tail` and `snoc`:

```
instance Queue Queue3 where
  empty = Q3 0 [] 0 SNil Done
  head (Q3 _ (x:_ ) _ _ _) = x
  tail (Q3 f (_:xs) r ys p) = inv3 $ Q3 (f-1) xs r ys (step p)
  snoc (Q3 f xs r ys p) y = inv3 $ Q3 f xs (r+1) (SCons y ys) (step p)
```

It is difficult to develop data structures with amortised complexity bounds in strict but pure languages; laziness is essential for making sure that operations don’t unnecessarily get repeated. For applications where amortised bounds are insufficient, we can use an explicit schedule to make sure that operations get executed bit by bit; we can use this to develop a pure and persistent queue with O(1) insertion and deletion.

In his book, Okasaki does not introduce a `Progress` datatype or any of its related functionality; instead he makes very clever use of standard datatypes to get the same behaviour somehow implicitly. Although this is very elegant, it also requires a lot of ingenuity and does not immediately suggest how to apply the same techniques to other datatypes. The `Progress` datatype we use here is perhaps somewhat cruder, but it might make it easier to implement other real-time data structures.

Random access to (any of the variations on) the queue we implemented is still O(*n*); if you want a datastructure that provides O(1) insertion and deletion as well as O(log *n*) random access, you could have a look at `Data.Sequence`; be aware, however, that this datatype provides amortised, not real-time bounds. Modifying `Sequence` to provide worst-case complexity bounds is left as an exercise for the reader ;-)

## Appendix: `runDelay`

The definition of `runDelay` is tricky. The most elegant way we have found is to use the lazy ST monad:

```
runDelay :: Delay a -> (a, Progress)
runDelay = \xs -> runST $ do
    r <- newSTRef xs
    x <- unsafeInterleaveST $ readSTRef r
    p <- next r
    return (runNow x, p)
  where
    next :: STRef s (Delay a) -> ST s Progress
    next r = do
      xs <- readSTRef r
      case xs of
        Now _   -> return Done
        Later d -> do writeSTRef r d
                      p' <- next r
                      return $ NotYet p'

    runNow :: Delay a -> a
    runNow (Now   a) = a
    runNow (Later d) = runNow d
```
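As a quick check that this behaves as described, here is the definition above in runnable form (duplicated so the sketch is self-contained); the imports come from `base`’s lazy ST implementation, with `unsafeInterleaveST` living in `Control.Monad.ST.Lazy.Unsafe`, and the example `Delay` value is made up:

```haskell
import Control.Monad.ST.Lazy (ST, runST)
import Control.Monad.ST.Lazy.Unsafe (unsafeInterleaveST)
import Data.STRef.Lazy (STRef, newSTRef, readSTRef, writeSTRef)

data Progress = Done | NotYet Progress
data Delay a  = Now a | Later (Delay a)

runDelay :: Delay a -> (a, Progress)
runDelay = \xs -> runST $ do
    r <- newSTRef xs
    x <- unsafeInterleaveST $ readSTRef r
    p <- next r
    return (runNow x, p)
  where
    next :: STRef s (Delay a) -> ST s Progress
    next r = do
      xs <- readSTRef r
      case xs of
        Now _   -> return Done
        Later d -> do writeSTRef r d
                      p' <- next r
                      return $ NotYet p'

    runNow :: Delay a -> a
    runNow (Now   a) = a
    runNow (Later d) = runNow d

main :: IO ()
main = do
  let (x, p) = runDelay (Later (Later (Now (42 :: Int))))
  -- The Progress has one NotYet per Later; stepping it drives the writes.
  case p of
    NotYet (NotYet Done) -> putStrLn "two steps, as expected"
    _                    -> putStrLn "unexpected shape"
  -- The value is correct regardless of how much of p we stepped.
  print x  -- 42
```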

In the lazy ST monad effects are only executed when their results are demanded, but are always executed in the same order. We take advantage of this to make sure that the calls to `next` only happen when pattern matching on the resulting `Progress` value. However, it is crucial that for the value of `x` we read the contents of the `STRef` only when the value of `x` is demanded, so that we can take advantage of any writes that `next` will have done in the meantime.

This does leave us with a proof obligation that this code is safe; in particular, that the value of `x` that we return does not depend on *when* we execute this `readSTRef`; in other words, that invoking `next` any number of times does not change this value. Hopefully this is relatively easy to see. Indeed, it follows from parametricity: since `runDelay` is polymorphic in `a`, the only `a` it can return is the one that gets passed in.

To see that pattern matching on the resulting `Progress` has the intended effect, note that the ST ref starts with “cost *n*”, where *n* is the number of `Later` constructors, and note further that each call to `next` reduces *n* by one. Hence, by the time we reach `Done`, the computation has indeed been executed (reached the `Now` constructor).

Note that for the case of the queue implementation, by the time we demand the value of the reversed list, we are sure that we will have fully evaluated it, so the definition

`runNow (Later d) = runNow d`

could actually be replaced by

`runNow (Later _) = error "something went horribly wrong!"`

Indeed, this trick can be useful when designing and debugging these real-time data structures, to ensure that things are indeed fully evaluated by the time you expect them to be. In general, however, it makes the `runDelay` combinator somewhat less general, and strictly speaking it also breaks referential transparency, because now the value of `x` *does* depend on how much of the `Progress` value you evaluate.

For more information about the (lazy) ST monad, see *Lazy Functional State Threads*, the original paper introducing it. Section 7.2, “Interleaved and parallel operations”, discusses `unsafeInterleaveST`.

## C◦mp◦se conference

Thursday, February 4 – Sunday, February 7, 2016, New York City

This conference is focused on typed functional programming and features a keynote by Eugenia Cheng and an excellent line-up of talks including one by our own Austin Seipp on *Cryptography and verification with Cryptol*. There’s also an “unconference” with small workshops and tutorials as well as the opportunity to get your hands dirty and try things out yourself.

For several years now, we have been running successful Haskell courses in collaboration with Skills Matter. The New York courses will be taught by Duncan Coutts, co-founder and partner at Well-Typed. He’s an experienced teacher and is involved in lots of commercial Haskell development projects at Well-Typed.

You can participate in our Haskell courses *directly before* or *directly after* C◦mp◦se in February, or if that doesn’t suit we are running two of the courses in London this April:

## Fast Track to Haskell

Tuesday, February 2 – Wednesday, February 3, 2016, New York City (and Monday, April 4 – Tuesday, April 5, 2016, London)

Find out more or register here.

This course is for developers who want to learn about functional programming in general or Haskell in particular. It introduces important concepts such as algebraic datatypes, pattern matching, type inference, polymorphism, higher-order functions, explicit effects and, of course, monads and provides a compact tour with lots of hands-on exercises that provide a solid foundation for further adventures into Haskell or functional programming.

## Guide to Haskell Performance and Optimization

Monday, February 8 – Tuesday, February 9, 2016, New York City (and Wednesday, April 6 – Thursday, April 7, 2016, London)

Find out more or register here.

This brand-new course looks under the surface of Haskell at how things are implemented, including how to reason about performance and optimize your programs, so that you can write beautiful programs that scale. It covers the internal representation of data on the heap, what exactly lazy evaluation means and how it works, how the compiler translates Haskell code to a target language via several internal representations, what you can and cannot reasonably expect the compiler to do, and how you can tweak the optimizer behaviour by using compiler pragmas such as inlining annotations and rewrite rules.

## Guide to the Haskell Type System

Wednesday, February 10, 2016, New York City

Find out more or register here.

This course is a one-day introduction to various type-system extensions that GHC offers, such as GADTs, rank-n polymorphism, type families and more. It assumes some familiarity with Haskell. It does not make use of any other advanced Haskell concepts except for the ones it introduces, so it is in principle possible to follow this course directly after Fast Track. However, as this course focuses on the extreme aspects of Haskell’s type system, it is particularly recommended for participants who are enthusiastic about static types and perhaps familiar with a strong static type system from another language.

In general, our courses are very practical, but don’t shy away from theory where necessary. Our teachers are all active Haskell developers with not just training experience, but active development experience as well. In addition to the courses in New York, we regularly offer courses in London.

We also provide on-site training on request nearly anywhere in the world. If you want to know more about our training or have any feedback or questions, have a look at our dedicated training page or just drop us a mail.

```
# curl http://localhost:8081/example/reverse
elpmaxe
# curl http://localhost:8081/example/caps
EXAMPLE
# curl http://localhost:8081/1234/inc
1235
# curl http://localhost:8081/1234/neg
-1234
```

Moreover, it can echo back any value:

```
# curl http://localhost:8081/example/echo
example
# curl http://localhost:8081/1234/echo
1234
```

However, it does not make sense to try to reverse an integer or increment a string; requesting either of

```
http://localhost:8081/1234/reverse
http://localhost:8081/example/inc
```

should result in an HTTP error.

This is an example of a dependently typed server: the value that we are given as input determines the type of the rest of the server. If we get a string as input, we expect a string operation as the second argument and the response is also a string; if we get an integer as input, we expect an integer operation as the second argument and the response is also an integer.

We can encode the type of values that are either strings or integers using the GADT

```
data Value :: * -> * where
  VStr :: Text -> Value Text
  VInt :: Int  -> Value Int
```

and the type of operations on a particular kind of value as

```
data Op :: * -> * where
  OpEcho    :: Op a
  OpReverse :: Op Text
  OpCaps    :: Op Text
  OpInc     :: Op Int
  OpNeg     :: Op Int
```

The core of our server will be:

```
execOp :: Value a -> Op a -> Value a
execOp val        OpEcho    = val
execOp (VStr str) OpReverse = VStr $ Text.reverse str
execOp (VStr str) OpCaps    = VStr $ Text.toUpper str
execOp (VInt i)   OpInc     = VInt $ i + 1
execOp (VInt i)   OpNeg     = VInt $ negate i
```
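It may help to see `execOp` run before we move on. The sketch below substitutes `String` for `Text` (and `reverse`/`map toUpper` for the `Data.Text` functions) so that it only depends on `base`; the `render` helper is ad hoc, not part of the post:

```haskell
{-# LANGUAGE GADTs, KindSignatures #-}
import Data.Char (toUpper)

-- String stands in for Text, to keep the sketch dependency-free
data Value :: * -> * where
  VStr :: String -> Value String
  VInt :: Int    -> Value Int

data Op :: * -> * where
  OpEcho    :: Op a
  OpReverse :: Op String
  OpCaps    :: Op String
  OpInc     :: Op Int
  OpNeg     :: Op Int

execOp :: Value a -> Op a -> Value a
execOp val        OpEcho    = val
execOp (VStr str) OpReverse = VStr $ reverse str
execOp (VStr str) OpCaps    = VStr $ map toUpper str
execOp (VInt i)   OpInc     = VInt $ i + 1
execOp (VInt i)   OpNeg     = VInt $ negate i

-- Ad hoc helper: render a value like the server responses above
render :: Value a -> String
render (VStr s) = s
render (VInt i) = show i

main :: IO ()
main = do
  putStrLn (render (execOp (VStr "example") OpReverse))  -- elpmaxe
  putStrLn (render (execOp (VStr "example") OpCaps))     -- EXAMPLE
  putStrLn (render (execOp (VInt 1234) OpInc))           -- 1235
  putStrLn (render (execOp (VInt 1234) OpNeg))           -- -1234
```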

Note that the type of `execOp` reflects precisely that the type of the input matches both the type of the operation we’re applying and the type of the result.

In the remainder of this blog post we will explain which combinators we need to add to Servant to be able to define such a dependently typed webserver. We will assume familiarity with the implementation of servant; see Implementing a minimal version of haskell-servant for an introduction. We will also assume familiarity with the following language extensions:

```
{-# LANGUAGE ConstraintKinds #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
```

all of which we will need in this development.

We will need some reasonably standard preliminary definitions.

Consider parsing a `Value`; what would be the type of

`fromText :: Text -> Value ???`

Clearly, the type argument to `Value` here depends on the actual run-time value of the `Text`. Since we do not know this type argument before parsing, we need to hide it in an existential. We can define a general existential wrapper as

```
data Some :: (* -> *) -> * where
  Some :: f a -> Some f
```

We can then define a `FromText` instance as follows:

```
instance FromText (Some Value) where
  fromText t = Just $ case readMaybe (Text.unpack t) of
                        Just n  -> Some $ VInt n
                        Nothing -> Some $ VStr t
```

Consider trying to convince the type checker that all values (i.e., `Text`s and `Int`s in our running example) admit an equality test. One way to do this is to use reified type class dictionaries:

```
data Dict (c :: Constraint) where
  Dict :: c => Dict c
```

Using `Dict` we can define
```
valueEq :: Value a -> Dict (Eq a)
valueEq (VStr _) = Dict
valueEq (VInt _) = Dict
```

The first use of `Dict` in `valueEq` captures the in-scope `Eq Text` constraint, and the second use captures the in-scope `Eq Int` constraint. We can bring these constraints back into scope by pattern matching:

```
useEq :: Dict (Eq a) -> a -> a -> Bool
useEq Dict = (==)
```
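Here is a self-contained sketch of `Dict` at work, again with `String` standing in for `Text`; the `sameValue` helper is invented for illustration — note that its signature carries no `Eq` constraint, yet it can compare payloads by reifying the dictionary:

```haskell
{-# LANGUAGE ConstraintKinds, GADTs, KindSignatures #-}
import Data.Kind (Constraint)

data Dict (c :: Constraint) where
  Dict :: c => Dict c

-- String stands in for Text, as in the sketch above
data Value :: * -> * where
  VStr :: String -> Value String
  VInt :: Int    -> Value Int

valueEq :: Value a -> Dict (Eq a)
valueEq (VStr _) = Dict
valueEq (VInt _) = Dict

useEq :: Dict (Eq a) -> a -> a -> Bool
useEq Dict = (==)

-- Made-up helper: compare a value's payload against another payload,
-- with no Eq constraint in the signature
sameValue :: Value a -> a -> Bool
sameValue v x = case v of
  VStr s -> useEq (valueEq v) s x
  VInt n -> useEq (valueEq v) n x

main :: IO ()
main = do
  print (sameValue (VInt 5) 5)        -- True
  print (sameValue (VStr "hi") "ho")  -- False
```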

In order to define the type dependency in the webserver we will need a type level function *f* that says “if the input has type *a*, then the remainder of the server has type *f a*”. We might try to use a type synonym for this:

```
type ExecOp a = Capture "op" (Op a)
             :> Get '[PlainText] (Value a)
```

Unfortunately, we will need to be able to refer to `ExecOp` without an argument, which is not possible in Haskell: type synonyms must always be fully applied. We could use a `newtype`, but that will lead to other headaches (`HasServer` cannot be derived).

Instead, we will apply defunctionalization and use a datatype to symbolically denote a particular type-level function, and then use a type family to implement application of the symbolic function to an argument:

```
data ExecOp

type family Apply f a :: *

type instance Apply ExecOp a = Capture "op" (Op a)
                            :> Get '[PlainText] (Value a)
```
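The defunctionalization trick can be seen in miniature, independent of Servant; the symbols `ListOf` and `PairOf` below are invented for illustration. Each empty datatype names a type-level function, and `Apply` interprets the name:

```haskell
{-# LANGUAGE TypeFamilies, KindSignatures #-}

-- Symbols denoting type-level functions (made-up names)
data ListOf
data PairOf

-- Interpreting symbolic application
type family Apply f a :: *
type instance Apply ListOf a = [a]
type instance Apply PairOf a = (a, a)

-- We can mention ListOf/PairOf unapplied, which a type synonym
-- like "type ListOf a = [a]" would not allow
mkList :: a -> Apply ListOf a
mkList x = [x]

mkPair :: a -> Apply PairOf a
mkPair x = (x, x)

main :: IO ()
main = do
  print (mkList (5 :: Int))  -- [5]
  print (mkPair 'x')         -- ('x','x')
```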

As an example of using `Apply`, given the definition of `execOp` above, we can define the following wrapper:

```
serveExecOp :: Value a -> Server (Apply ExecOp a)
serveExecOp val op = return $ execOp val op
```

This use of defunctionalization to simulate type-level lambdas was pioneered in the work on the singletons package (see Promoting Functions to Type Families in Haskell).

A dependently typed server is a server with some argument, such that the value of that argument determines the type of the server. We can encode this as

```
newtype DepServer (ix :: * -> *) (f :: *) (m :: * -> *) =
    DepServer (forall a. ix a -> ServerT (Apply f a) m)
```

Here `ix` is the type of the index (`Value` in our running example), and `f` is the (dependent) type of the server (in our example, this will be `ExecOp`). The `m` parameter is Servant’s standard server monad argument.

We now introduce a new type class, alongside the standard Servant `HasServer`, that corresponds to dependent servers. *The key idea is that we will need a HasServer instance for all valid instantiations of the type argument:*

```
class HasDepServer ix f where
  hasDepServer :: Proxy f -> ix a -> Dict (HasServer (Apply f a))
```

Let’s consider an example:

```
instance HasDepServer Value ExecOp where
  hasDepServer _ (VStr _) = Dict
  hasDepServer _ (VInt _) = Dict
```

In order to show we can construct servers for `ExecOp`, we need to show that we have a `HasServer` instance for all valid indices; in our case, those are `Text` and `Int`. In other words, we need to show we have `HasServer` instances for both of

```
Capture "op" (Op Text) :> Get '[PlainText] (Value Text)
Capture "op" (Op Int)  :> Get '[PlainText] (Value Int)
```

both of which we get (almost) for free from Servant. Hence, we can simply pattern match on the index and trivially construct the dictionary. This follows precisely the same strategy as in the `valueEq` example.

We’re almost there now. We can now introduce a dependent version of capturing part of a URL:

`data DepCapture (ix :: * -> *) (f :: *)`

(For simplicity’s sake this combines the functionality of `(:>)` and `Capture` in a single combinator. It would be possible to separate this out.)

With this combinator we can define the API we want:

`type API = DepCapture Value ExecOp`

The (somewhat simplified) `HasServer` instance for the standard, non-dependent, `Capture` looks like this:

```
instance (FromText a, HasServer f) => HasServer (Capture a :> f) where
  type ServerT (Capture a :> f) m = a -> ServerT f m
  route = ...
```

The corresponding `HasServer` instance for `DepCapture` follows this closely. First, we note that the server for a dependent capture must be a dependent server:

```
instance (FromText (Some ix), HasDepServer ix f)
      => HasServer (DepCapture ix f) where
  type ServerT (DepCapture ix f) m = DepServer ix f m
```

(We need the `DepServer` newtype wrapper because we are not allowed to have a polymorphic type as the right-hand side of a type family.)

In the router we attempt to parse the index; if this succeeds, we unwrap the existential, discovering the type index we need for the rest of the server, use the `HasDepServer` type class to get a `HasServer` instance for this particular type index, and continue as usual:

```
  route Proxy (DepServer subserver) request respond =
    case processedPathInfo request of
      (p:ps) ->
        case fromText p :: Maybe (Some ix) of
          Nothing ->
            respond $ failWith NotFound
          Just (Some (p' :: ix a)) ->
            case hasDepServer (Proxy :: Proxy f) p' of
              Dict -> route (Proxy :: Proxy (Apply f a))
                            (subserver p')
                            request{ pathInfo = ps }
                            respond
      _ ->
        respond $ failWith NotFound
```

To define the full server, we just need to wrap `serveExecOp` in the `DepServer` constructor and we’re done:

```
server :: Server API
server = DepServer serveExecOp
```

The full development is available from GitHub if you want to play with it. Note that the way we set things up means that defining dependent servers does not incur a lot of overhead; the use of the `Apply` type family avoids the need for newtype wrappers, and providing `HasDepServer` instances will typically be very simple.

As a co-author, I’m obviously happy if the paper is being read, but it’s also 12 pages long in two-column ACM style. And while it explains the implementation, it does not necessarily make it easier to start playing with the code yourself, because it only shows excerpts, and code snippets in a paper are not easily runnable.

At the same time, whenever I want to demonstrate a new concept in the Servant context, or play with new ideas, I find myself not implementing it in the main Servant code base, but rather creating a small library that is “like Servant”, built on the same principles, but much simpler, so that I have less work to do and can evaluate the ideas more quickly. I’ve talked to some other contributors, and at least some of them are doing the same. So I thought it might be useful to develop and present the code of “TinyServant”, which is not exactly tiny, but still small compared to the full Servant code base, strips away a lot of duplication and unessential extras, but is still complete enough so that we can observe how Servant works. Obviously, this still won’t explain everything that one might want to know about the implementation of Servant, but I hope that it will serve as a useful ingredient in that process.

This blog post is a somewhat revised and expanded version of my Stack Overflow answer.

This is not a general tutorial on Servant and using Servant. For learning how to use Servant, the official Servant tutorial or the general documentation section of the Servant home page are better starting points.

The full code that is discussed in this post is 81 lines of Haskell and available separately.

I’m going to show the following things:

How to define the web API specification language that Servant offers. We are going to define as few constructs as possible: we are not going to worry about content types (just plain text), we are not going to worry about different HTTP methods (just GET), and the only special thing we can do in routes will be that we can capture components of the path. Still, this is enough to show all relevant ideas of the Servant implementation.

How to define an interpretation of the specification language. The point of Servant is that we can define many of these: an API can be interpreted as a web server (for various web backends), a web client (for various frontend languages, such as Haskell or JavaScript), a mock server, as documentation (in various formats) and more. Here, I’m going to implement an interpretation as a simplified Haskell function that can be seen as simulating a primitive web server, but without incurring any actual web dependencies.

How to use TinyServant on an example. We are going to take the very first example of the Servant homepage and adapt it for our examples.

To start, here are the language extensions we’ll need:

```
{-# LANGUAGE DataKinds, PolyKinds, TypeOperators #-}
{-# LANGUAGE TypeFamilies, FlexibleInstances, ScopedTypeVariables #-}
{-# LANGUAGE InstanceSigs #-}
```

The first three are needed for the definition of the type-level DSL itself. The DSL makes use of type-level strings (`DataKinds`) and also uses kind polymorphism (`PolyKinds`). The use of type-level infix operators such as `:<|>` and `:>` requires the `TypeOperators` extension.

The second three are needed for the definition of the interpretation. For this, we need type-level functions (`TypeFamilies`), some type class programming which will require `FlexibleInstances`, and some type annotations to guide the type checker which require `ScopedTypeVariables`.

Purely for documentation purposes, we also use `InstanceSigs`.

Here’s our module header:

```
module TinyServant where
import Control.Applicative
import GHC.TypeLits
import Text.Read
import Data.Time
```

The import of `Data.Time` is just for our example.

The first ingredient is to define the datatypes that are being used for the API specifications.

```
data Get (a :: *)
data a :<|> b = a :<|> b
infixr 8 :<|>
data (a :: k) :> (b :: *)
infixr 9 :>
data Capture (a :: *)
```

As I’ve said before, we define only four constructs in our simplified language:

- A `Get a` represents an endpoint of type `a` (of kind `*`). In comparison with full Servant, we ignore content types here. We need the datatype only for the API specifications. There are no directly corresponding values, and hence there is no constructor for `Get`.
- With `a :<|> b`, we represent the choice between two routes. Again, we wouldn’t need a constructor, but it will turn out to be useful later when we define handlers.
- With `item :> rest`, we represent nested routes, where `item` is the first path component and `rest` are the remaining components. In our simplified DSL, there are just two possibilities for `item`: a type-level string, or a `Capture`. Because type-level strings are of kind `Symbol`, but a `Capture`, defined below, is of kind `*`, we make the first argument of `:>` *kind-polymorphic*, so that both options are accepted by the Haskell kind system. So in particular, we will be able to write both `"person" :> Get Person` and `Capture Currency :> Get Amount` and it will be well-kinded.
- A `Capture a` represents a route component that is captured, parsed and then exposed to the handler as a parameter of type `a`. In full Servant, `Capture` has an additional string as a parameter that is used for documentation generation. We omit the string here.

We can now write down a version of the API specification from the Servant home page, adapted to our simplified DSL, and replacing the datatypes used there by actual datatypes that occur in the `Data.Time` library:

```
type MyAPI = "date" :> Get Day
        :<|> "time" :> Capture TimeZone :> Get ZonedTime
```

The most interesting aspect is of course what we can do with the API. Servant defines several interpretations, but they all follow a similar pattern. We’ll define only one here, which is inspired by the interpretation as a web server.

In Servant, the `serve` function has the following type:

```
serve :: HasServer layout
      => Proxy layout -> Server layout -> Application
```

It takes a proxy for the API type (we’ll get back to that in a moment) and a handler matching the API type (of type `Server layout`), and produces an `Application`. The `Application` type comes from the excellent WAI library that Servant uses as its default backend.

Even though WAI is very simple, it is too complicated for the purposes of this post, so we’ll assume a “simulated server” of type

`[String] -> IO String`

This server is supposed to receive a request that is just a sequence of path components (`[String]`). We do not care about request methods or the request body or anything like that. And the response is just a message of type `String`. We ignore status codes, headers and anything else. The underlying idea is still the same as that of the `Application` type used in the actual Servant implementation.

So our `serve` function has type

```
serve :: HasServer layout
      => Proxy layout -> Server layout -> [String] -> IO String
```

The `HasServer` class, which we’ll define below, has instances for all the different constructs of the type-level DSL and therefore encodes what it means for a Haskell type `layout` to be interpretable as an API type of a server.

The `Proxy` type is defined as

`data Proxy a = Proxy`

Its only purpose is to help the GHC type checker. By passing an explicitly typed proxy such as `Proxy :: Proxy MyAPI` to `serve`, we can explicitly instantiate the `serve` function to a particular API type. Without the `Proxy`, the only occurrences of the `layout` parameter would be in the `HasServer` class constraint and as an argument of `Server`, which is a type family. GHC is not clever enough to infer the desired value of `layout` from these occurrences.

The `Server` argument is the handler for the API. As just stated, `Server` itself is a type family (i.e., a type-level function), and computes from the API type the type that the handler(s) must have. This is one core ingredient of what makes Servant work correctly.

From these inputs, we then compute the output function of type `[String] -> IO String` as explained above.

**The `Server` type family**

We define `Server` as a type family first. (Again, this is somewhat simplified compared to Servant, which defines a monad transformer type family called `ServerT` as part of the `HasServer` class and then a top-level type synonym `Server` in terms of `ServerT`.)

`type family Server layout :: *`

The handler for a `Get a` endpoint is simply an `IO` action producing an `a`. (Once again, in the full Servant code, we have slightly more options, such as producing an error with a choice of status codes.)

`type instance Server (Get a) = IO a`

The handler for `a :<|> b` is a pair of handlers, so we could just define

`type instance Server (a :<|> b) = (Server a, Server b) -- preliminary`

But with this definition, nested occurrences of `:<|>` in the API would lead to nested pairs of handlers, so we’d have to write code like

`(handler1, (handler2, handler3))`

which looks a bit ugly. Instead, we’re going to make `:<|>` equivalent to Haskell’s pair type, but with an infix constructor called `:<|>`, so that we can write

`handler1 :<|> handler2 :<|> handler3`

for a nested pair. The actual definition of `Server` for `:<|>` is then

`type instance Server (a :<|> b) = Server a :<|> Server b`

It remains to explain how each of the path components is handled.

Literal strings in the routes do not affect the type of the handler:

`type instance Server ((s :: Symbol) :> r) = Server r`

A capture, however, means that the handler expects an additional argument of the type being captured:

`type instance Server (Capture a :> r) = a -> Server r`

If we expand `Server MyAPI`, we obtain

```
  Server MyAPI
~ Server (     "date" :> Get Day
          :<|> "time" :> Capture TimeZone :> Get ZonedTime)
~      Server ("date" :> Get Day)
  :<|> Server ("time" :> Capture TimeZone :> Get ZonedTime)
~      Server (Get Day)
  :<|> Server ("time" :> Capture TimeZone :> Get ZonedTime)
~      IO Day
  :<|> Server ("time" :> Capture TimeZone :> Get ZonedTime)
~      IO Day
  :<|> Server (Capture TimeZone :> Get ZonedTime)
~      IO Day
  :<|> (TimeZone -> Server (Get ZonedTime))
~      IO Day
  :<|> (TimeZone -> IO ZonedTime)
```

where `~` is GHC’s syntax for type equality.

Recall that `:<|>` as defined is equivalent to a pair. So as intended, the server for our API requires a pair of handlers: one that provides a date of type `Day`, and one that, given a time zone, provides a time (of type `ZonedTime`). We can define the handler(s) right now:

```
handleDate :: IO Day
handleDate = utctDay <$> getCurrentTime
handleTime :: TimeZone -> IO ZonedTime
handleTime tz = utcToZonedTime tz <$> getCurrentTime
handleMyAPI :: Server MyAPI
handleMyAPI = handleDate :<|> handleTime
```
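To see concretely how the right-associative `:<|>` constructor nests, here is a small self-contained sketch (the `Int` “handlers” are made up purely for illustration):

```haskell
{-# LANGUAGE TypeOperators #-}

-- The pair type with an infix constructor, as defined above.
data a :<|> b = a :<|> b
infixr 8 :<|>

-- Since :<|> is right-associative, h1 :<|> h2 :<|> h3 parses as
-- h1 :<|> (h2 :<|> h3), matching the nesting of Server MyAPI.
handlers :: Int :<|> (Int :<|> Int)
handlers = 1 :<|> 2 :<|> 3

-- Selecting the middle handler by pattern matching on the nested pairs:
middle :: Int
middle = case handlers of _ :<|> (m :<|> _) -> m
```

This is exactly why `handleMyAPI` above can be written without any parentheses.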

**The `HasServer` class**

We still have to implement the `HasServer` class, which looks as follows:

```
class HasServer layout where
  route :: Proxy layout -> Server layout -> [String] -> Maybe (IO String)
```

The task of the function `route` is almost like that of `serve`. Internally, we have to dispatch an incoming request to the right handler. In the case of `:<|>`, this means we have to make a choice between two handlers. How do we make this choice? A simple option is to allow `route` to fail, by returning a `Maybe`. Then in a choice we can just try the first option, and if it returns `Nothing`, try the second. (Again, full Servant is somewhat more sophisticated here, and version 0.5 will have a much improved routing strategy, which probably at some point in the future deserves to be the topic of its own blog post.)

Once we have `route` defined, we can easily define `serve` in terms of `route`:

```
serve :: HasServer layout
      => Proxy layout -> Server layout -> [String] -> IO String
serve p h xs = case route p h xs of
  Nothing -> ioError (userError "404")
  Just m  -> m
```

If none of the routes match, we fail with a (simulated) 404. Otherwise, we return the result.

**The `HasServer` instances**

For a `Get` endpoint, we defined

`type instance Server (Get a) = IO a`

so the handler is an IO action producing an `a`, which we have to turn into a `String`. We use `show` for this purpose. In the actual Servant implementation, this conversion is handled by the content types machinery, and will typically involve encoding to JSON or HTML.

```
instance Show a => HasServer (Get a) where
  route :: Proxy (Get a)
        -> IO a -> [String] -> Maybe (IO String)
  route _ handler [] = Just (show <$> handler)
  route _ _       _  = Nothing
```

Since we’re matching an endpoint only, we require the request to be empty at this point. If it isn’t, this route does not match and we return `Nothing`.

Let’s look at choice next:

```
instance (HasServer a, HasServer b) => HasServer (a :<|> b) where
  route :: Proxy (a :<|> b)
        -> (Server a :<|> Server b) -> [String] -> Maybe (IO String)
  route _ (handlera :<|> handlerb) xs =
        route (Proxy :: Proxy a) handlera xs
    <|> route (Proxy :: Proxy b) handlerb xs
```

Here, we get a pair of handlers, and we use `<|>` for `Maybe` to try both, preferring the first if it matches.

What happens for a literal string?

```
instance (KnownSymbol s, HasServer r) => HasServer ((s :: Symbol) :> r) where
  route :: Proxy (s :> r)
        -> Server r -> [String] -> Maybe (IO String)
  route _ handler (x : xs)
    | symbolVal (Proxy :: Proxy s) == x = route (Proxy :: Proxy r) handler xs
  route _ _ _ = Nothing
```

The handler for `s :> r` is of the same type as the handler for `r`. We require the request to be non-empty and the first component to match the value-level counterpart of the type-level string. We obtain the value-level string corresponding to the type-level string literal by applying `symbolVal`. For this, we need a `KnownSymbol` constraint on the type-level string literal, but all concrete literals in GHC are automatically an instance of `KnownSymbol`.
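The behaviour of `symbolVal` is easy to try in isolation. This small sketch uses `Data.Proxy` from base instead of the `Proxy` we defined ourselves:

```haskell
{-# LANGUAGE DataKinds #-}

import Data.Proxy (Proxy (..))
import GHC.TypeLits (symbolVal)

-- symbolVal turns the type-level string "date" into the value-level
-- string "date", which route compares against a path component.
datePath :: String
datePath = symbolVal (Proxy :: Proxy "date")

matchesDate :: String -> Bool
matchesDate x = datePath == x
```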

The final case is for captures:

```
instance (Read a, HasServer r) => HasServer (Capture a :> r) where
  route :: Proxy (Capture a :> r)
        -> (a -> Server r) -> [String] -> Maybe (IO String)
  route _ handler (x : xs) = do
    a <- readMaybe x
    route (Proxy :: Proxy r) (handler a) xs
  route _ _ _ = Nothing
```

In this case, we can assume that our handler is actually a function that expects an `a`. We require the first component of the request to be parseable as an `a`. Here, we use the `Read` class, whereas in Servant, we use a special-purpose class called `FromText` (or `FromHttpApiData` in version 0.5). If reading fails, we consider the request not to match. Otherwise, we can feed it to the handler and continue.
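In isolation, the failure behaviour of `readMaybe` looks like this (sketched with `Int` rather than `TimeZone`, to keep the example free of the `time` dependency):

```haskell
import Text.Read (readMaybe)

-- A successful parse yields Just; a failed parse yields Nothing.
-- In route, the Nothing case aborts the Maybe computation, i.e.
-- "this route does not match".
parsedOk, parsedBad :: Maybe Int
parsedOk  = readMaybe "42"
parsedBad = readMaybe "CET"
```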

Now we’re done.

We can confirm that everything works in GHCi:

```
GHCi> serve (Proxy :: Proxy MyAPI) handleMyAPI ["time", "CET"]
"2015-11-01 20:25:04.594003 CET"
GHCi> serve (Proxy :: Proxy MyAPI) handleMyAPI ["time", "12"]
*** Exception: user error (404)
GHCi> serve (Proxy :: Proxy MyAPI) handleMyAPI ["date"]
"2015-11-01"
GHCi> serve (Proxy :: Proxy MyAPI) handleMyAPI []
*** Exception: user error (404)
```

We now have a system that we can play with, extend, and modify easily. We can, for example, extend the specification language with a new construct and see what we have to change. We can also make the simulation more faithful (e.g. include request bodies or query parameters). Or we can define a completely different interpretation (e.g. as a client) by following the same scheme.

In the time from 5–13 October, we are (co-)organizing a number of Haskell-related events in London, with Skills Matter.

Here’s the overview:

- 5–6 October 2015: 2-day intro course Fast Track to Haskell
- 7 October 2015: 1-day course Guide to the Haskell Type System
- 8–9 October 2015: 2-day conference Haskell eXchange
- 10–11 October 2015: 2-day free Haskell infrastructure hackathon
- 12–13 October 2015: 2-day course Advanced Haskell

We’ll co-organize and participate in a two-day Haskell Hackathon, which takes place directly after the Haskell eXchange.

This Hackathon aims at bringing together Haskell developers – both beginners and experts – who want to help improve the Haskell infrastructure, predominantly Hackage and Cabal.

We’ll aim to arrange some introductory overview talks, e.g. to provide an overview of the code bases and the most important open issues.

Participation is **free**, but please register via Skills Matter.

The Haskell eXchange 2015 will be bigger than ever before. Expanded to two days and two tracks, it features four keynotes, two tutorials, and more than twenty speakers in total.

Here’s the preliminary list of speakers and topics:

- Keynote by *Simon Peyton Jones* on “Into the Core: understanding GHC’s intermediate language”
- Keynote by *Lennart Augustsson* on “Giving Types to Relations”
- Keynote by *Simon Marlow* on “Fun with Haxl”
- Keynote by *Luite Stegeman* on “Solving the JavaScript Problem”
- Workshop by *Tom Ellis* on “Opaleye”
- Workshop by *Ivan Perez* on “Game programming for fun and profit”
- Talk by *Jasper van der Jeugt* on “The Ludwig DSL”
- Talk by *Gershom Bazerman* on “Programming with Universal Properties”
- Talk by *Neil Mitchell* on “Shake”
- Talk by *Johan Tibell* on “High-performance Haskell”
- Talk by *Francesco Mazzoli* on “inline-c”
- Talk by *Matthew Pickering* on “ghc-exactprint”
- Talk by *Alp Mestanogullari* on “Servant”
- Talk by *Alfredo di Napoli* on “Using Haskell at Iris Connect”
- Talk by *Miëtek Bak* on “Building your own proof assistant”
- Talk by *Lars Hupel* and *Miles Sabin* on “What Haskell can learn from Scala”
- Talk by *Vladimir Kirillov* on “Haskell goes Devops”
- Talk by *Nicolas Wu* on “Transformers: handlers in disguise”
- Short talk by *Martijn van Steenbergen* on “JsonGrammar”
- Short talk by *Bodil Stokke* on “PureScript”
- Short talk by *Blair Archibald* on “HdpH”
- Short talk by *Philipp Kant* on “Data avoidance in Haskell”
- Short talk by *Andraž Bajt* on “Using Haskell as a thinking tool”
- Short talk by *San Gillis* on “Haskell development with Docker”

Registration is possible via Skills Matter. **The promo code HASKELL-EXCHANGE-25 (has been extended to be valid until September 19!) can be used to get a 25% reduction.**

In connection with the Haskell eXchange and the Haskell infrastructure hackathon, Well-Typed are offering courses with Skills Matter. If you cannot come to London in October, but are interested in our course offerings, see Training for more information.

The Fast Track course is a two-day compact introduction to Haskell, assuming previous programming experience, but no familiarity with Haskell or functional programming. It covers topics such as defining datatypes and functions, higher-order functions, explicit side effects and monads.

The Guide to the Haskell Type System course is a one-day introduction to various type-system extensions that GHC offers, such as GADTs, rank-n polymorphism, type families and more. It assumes familiarity with Haskell. It does not make use of any other advanced Haskell concepts except for the ones it introduces, so it is in principle possible to follow this course directly after Fast Track. However, as this course focuses very much on the extreme aspects of Haskell’s type system, it should probably only be taken by participants who are enthusiastic about static types and perhaps familiar with a strong static type system from another language.

The Advanced Haskell course is targeted at Haskellers who are comfortable with the Haskell basics, and want to learn more about how to write larger Haskell programs. The course covers topics such as data structures, their complexity, pitfalls of lazy evaluation, profiling, GHC’s internal core language, and some more advanced design patterns such as monad transformers and how to use them effectively. Once again, strictly speaking this course can be followed when just having completed Fast Track. But the nature of this course’s contents also means that several of the topics can be more appreciated if one has written more Haskell code in practice already.

We are pleased to announce the beta release of the `hackage-security` library, along with its integration in `hackage-server` and `cabal-install`. The new security features of `hackage` are now deployed on the central server hackage.haskell.org and there is a beta release of `cabal` available. You can install it through
```
cabal install \
http://www.well-typed.com/hackage-security/Cabal-1.23.0.0.tar.gz \
http://www.well-typed.com/hackage-security/cabal-secure-beta.tar.gz \
http://www.well-typed.com/hackage-security/hackage-security-0.4.0.0.tar.gz \
http://www.well-typed.com/hackage-security/tar-0.4.2.2.tar.gz
```

This will install a `cabal-secure-beta` binary which you can use alongside your normal installation of `cabal`.

For a more detailed discussion of the rationale behind this project, see the announcement of the alpha release or the initial announcement of this project. We will also be giving a talk about the details at the upcoming Haskell Implementors Workshop. In the remainder of this blog post we will describe what’s available right now.

The Hackage server now does index signing. This means that if an attacker sits between you and Hackage and tries to feed you different packages than you think you are installing, `cabal` will notice this and throw a security exception. Index signing provides no (or very limited) security against compromise of the central server itself, but allows clients to verify that what they are getting is indeed what is on the central server.

A very important corollary of the previous point is that we can now have untrusted mirrors. Anyone can offer to mirror `hackage` and we can gratefully accept these offers without having to trust those mirror operators. Whether we are downloading from the mirror or from the primary server, the new security features make it possible to verify that what we are downloading *is* what is on the primary server.

In practice this means we can have mirrors at all, and that we can use them fully automatically with no client-side configuration required. This should give a huge boost to the reliability of Hackage; even AWS goes down from time to time, but properly decentralised mirrors should mean there’s always a recent snapshot available.

On the client side, the very first time `cabal` updates from the primary server it also finds out what mirrors are available. On subsequent updates it will automatically make use of any of those mirrors: if it encounters a problem with one, it will try another. Updates to the list of mirrors are also fully automatic.

For operating a mirror, we have extended the `hackage-mirror` client (currently bundled in the `hackage-server` package) so that it can be used to mirror a Hackage repository to a simple set of local files which can then be served by an ordinary HTTP server.

We already have one mirror available in time for the beta. The OSU Open Source Lab have very kindly agreed to host a Hackage mirror for the benefit of the Haskell community. This mirror is now live at http://ftp.osuosl.org/pub/hackage/, but we didn’t need to tell you that: (the beta release of) cabal will notice this automatically without any configuration on the part of the user thanks to hackage.haskell.org/mirrors.json.

Getting a mirror up and running is very easy, so if you would like to host a public Hackage mirror, then please do get in touch; during the beta period get in touch with us, or later on get in touch with the Hackage admins.

Hackage provides a `00-index.tar.gz` resource, which is a tarball containing the `.cabal` files for all packages available on Hackage. It is this file that `cabal` downloads when you call `cabal update`, and that it uses during dependency resolution.

However, this file is quite large, which is why `cabal update` can take a few seconds to complete. In fact, at nearly 10Mb, the index is now considerably larger than almost all package source tarballs.

As part of the security work we have had to extend this index with extra security metadata, making the file even larger. So we have also taken the opportunity to dramatically reduce download sizes by allowing clients to update this file incrementally. The index tarball is now extended in an append-only way. This means that once `cabal` has downloaded the tarball once, on subsequent updates it can just download the little bit it doesn’t yet have. To avoid making existing clients download the new larger index file each time, the `00-index.tar.gz` is kept as it always was, and repositories supporting the new features additionally provide a `01-index.tar.gz`. In future we could additionally provide a `.tar.xz` variant and thereby keep the first-time update size to a minimum.
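The append-only property is what makes incremental updates safe: a client’s local copy of the index is always a prefix of the server’s, so an update only needs to fetch and append the missing suffix. Here is a minimal sketch of that idea, using plain `String`s (the real client of course fetches the suffix of the actual tarball over HTTP):

```haskell
-- Because the index only ever grows at the end, the locally cached
-- copy is a prefix of the remote index, and appending the missing
-- suffix reconstructs the remote index exactly.
incrementalUpdate :: String -> String -> String
incrementalUpdate local remote = local ++ drop (length local) remote
```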

The append-only nature of the index has additional benefits; in effect, the index becomes a log of Hackage’s history. This log can be used for various purposes; for example, we can track how install plans for packages change over time. As another example, Herbert Valerio Riedel has been working on a “package-index wayback” feature for Cabal. This uses the index to recreate a past view of the package index, for recovering now bit-rotted install plans that were known to work in the past.

There are currently a few known issues that make `cabal update` slower than it needs to be, even though it’s doing an incremental update. This will be addressed before the official release.

It has always been possible to host your own Hackage repository, either for private packages or as a mirror of the public collection, but it has not always been convenient.

There is the “smart” server in the form of `hackage-server`, which, while relatively easy to build and run, isn’t as simple as just a bunch of files. There has also always been the option of a “dumb” server, in the form of a bunch of files in the right format hosted by an ordinary HTTP server. While the format is quite simple (reusing the standard `tar` format), there have not been convenient tools for creating or managing these file-based repositories.

As part of the security work we have made a simple command line tool to create and manage file based Hackage repositories, including all the necessary security metadata. This tool has been released as hackage-repo-tool on Hackage.

So whether you want a private mirror of the public packages, or a repository for your own private packages, or both, we hope these new tools will make that much more convenient. Currently documentation on how to use these tools is still somewhat lacking; this is something we will address after this beta release. Getting started is not difficult; there are some brief instructions in the reddit discussion, and feel free to talk to us on `#hackage` on IRC or contact us directly at info@well-typed.com if you need help.

As mentioned, we would like to invite you to install `cabal-secure-beta` and start testing it; just use it as you would `cabal` right now, and report any problems you may find on the `hackage-security` issue tracker. Additionally, if you would like to host a public mirror for Hackage, please contact us.

This release is primarily intended as an in-the-wild test of the infrastructure; there are still several details to be dealt with before we call this an official release.

The most important of these is proper key management. Much like, say, HTTPS, the chain of trust starts at a set of root keys. We have asked the Haskell.org committee to act as the root of trust and the committee has agreed in principle. The committee members will hold a number of the root keys themselves and the committee may also invite other organisations and individuals within the community to hold root keys. There are some policy details that remain to be reviewed and agreed. For example we need to decide on how many root keys to issue, what threshold number of keys be required to re-sign the root info, and agree policies for storing the root keys to keep them safe (for instance, mandate an air gap where the root key is never on a machine that is connected to the Internet). We will use the opportunity of ICFP (and the HIW talk) in a couple weeks time to present more details and get feedback.

If you would like to help with development, please take a look at the issue list and get in touch!

In part 1 we covered the basics: constant types, functions and polymorphism (over types of kind `*`). In this post we will deal with more advanced material: type constructors, type classes, polymorphism over type constructors, and type constructor classes.

**Type constructors (kind `* -> *`)**

Before considering the general case, let’s think about lists. Given `a :: A ⇔ A'`, two lists `xs :: [A]` and `ys :: [A']` are related iff their elements are related by `a`; that is,

`[] ℛ([a]) []`

and

```
(x:xs') ℛ([a]) (y:ys')
  iff x ℛ(a) y and xs' ℛ([a]) ys'
```

For the special case that `a` is a function `a⃯ :: A -> A'`, this amounts to saying that `map a⃯ xs ≡ ys`.

You can imagine that a similar relation `F a` exists for any type constructor `F`. However, we will not give a general treatment of algebraic data types in this blog post. Doing so would require giving instances for products and sums (which is fine), but also for (least) fixed points, and that would take us much too far afield.

Thankfully, we will not need to be quite so precise. Instead, we will only require the following characterization:

**Characterization: Functors.** Let `F` be a functor. Then for all relations `a :: A ⇔ A'`, `b :: B ⇔ B'` and functions `f :: A -> B` and `g :: A' -> B'` such that `f ℛ(a -> b) g`:

```
forall xs :: F A, xs' :: F A'.
  if xs ℛ(F a) xs' then F f xs ℛ(F b) F g xs'
```

where we overload `F` to also mean the “map” function associated with `F`. (Provided that the `Functor` type class instance for `F` is correct, `F f` should be the same as `fmap f`.)

(If we had the precise rules for algebraic data types we would be able to prove this characterization for any specific functor `F`.)

Intuitively, think about `xs` and `xs'` as two containers of the same shape with elements related by `a`, and suppose we have a pair of functions `f` and `g` which map `a`-related arguments to `b`-related results. Then the characterization states that if we apply function `f` to the elements of `xs` and `g` to the elements of `xs'`, we must end up with two containers of the same shape with elements related by `b`.

For the special case that `a` and `b` are functions (and `F` is a functor), the characterization simply says that

```
   if xs ℛ(F a⃯) xs' then F f xs ℛ(F b⃯) F g xs'
-- simplify
   if F a⃯ xs ≡ xs' then F f xs ℛ(F b⃯) F g xs'
-- simplify
   F b⃯ (F f xs) ≡ F g (F a⃯ xs)
-- functoriality
   F (b⃯ . f) xs ≡ F (g . a⃯) xs
```

which follows immediately from the premise that `b⃯ . f ≡ g . a⃯` (which in turn is a consequence of `f ℛ(a⃯ -> b⃯) g`), so the characterization is trivially satisfied (provided that the mapping of relations corresponds to the functor map in the case for functions).

**Technical note.** When we use parametricity results, we often say something like: “specializing this result to *functions* rather than *relations*…”. It is important to realize, however, that if `F` is not a functor, then `F a` may not be a functional relation even if `a` is.

For example, let `a⃯ :: A -> A'`, and take `F(a) = a -> a`. Then

```
    f ℛ(F a⃯) g
-- expand definition
iff f ℛ(a⃯ -> a⃯) g
-- rule for functions
iff forall x :: A, x' :: A'. if x ℛ(a⃯) x' then f x ℛ(a⃯) g x'
-- simplify (a⃯ is a function)
iff forall x :: A. a⃯ (f x) ≡ g (a⃯ x)
```

Taking `a :: Int -> Int; a x = 0`, this would relate two functions `f, g :: Int -> Int` whenever `0 ≡ g 0`; it is clear that this is not a functional relation between `f` and `g`.

Given a function `a⃯ :: A -> A'`, `F a⃯` is a function `F A -> F A'` when `F` is a functor, or a function `F A' -> F A` if `F` is a contravariant functor. We will not consider contravariant functors further in this blog post, but there is an analogous contravariant functor characterization that we can use for proofs involving contravariant functors.

**Example: `∀ab. (a -> b) -> [a] -> [b]`**

This is the type of Haskell’s `map` function for lists, of course; the type of `map` doesn’t fully specify what it should do, but the elements of the result list can only be obtained from applying the function to elements of the input list. Parametricity tells us that

```
f ℛ(∀ab. (a -> b) -> [a] -> [b]) f
-- apply rule for polymorphism, twice
iff forall A, A', B, B', a :: A ⇔ A', b :: B ⇔ B'.
f@A,B ℛ((a -> b) -> [a] -> [b]) f@A',B'
-- apply rule for functions, twice
iff forall A, A', B, B', a :: A ⇔ A', b :: B ⇔ B'.
forall g :: A -> B, g' :: A' -> B', xs :: [A], xs' :: [A'].
if g ℛ(a -> b) g', xs ℛ([a]) xs' then f g xs ℛ([b]) f g' xs'
```

Specializing to *functions* `a⃯ :: A -> A'` and `b⃯ :: B -> B'`, we get

```
forall A, A', B, B', a⃯ :: A -> A', b⃯ :: B -> B'.
forall g :: A -> B, g' :: A' -> B', xs :: [A], xs' :: [A'].
if g ℛ(a⃯ -> b⃯) g', xs ℛ([a⃯]) xs' then f g xs ℛ([b⃯]) f g' xs'
-- simplify
iff forall A, A', B, B', a⃯ :: A -> A', b⃯ :: B -> B'.
forall g :: A -> B, g' :: A' -> B'.
if b⃯ . g ≡ g' . a⃯ then map b⃯ . f g ≡ f g' . map a⃯
```
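We can check this free theorem on a concrete instance. The functions below are chosen arbitrarily for illustration, such that the premise `b⃯ . g ≡ g' . a⃯` holds, and `f` is instantiated to `map` itself:

```haskell
-- Pick a⃯ = show, b⃯ = (*2), g = (+1); the premise b⃯ . g ≡ g' . a⃯
-- then forces g' s = 2 * (read s + 1).
aFun :: Int -> String
aFun = show

bFun :: Int -> Int
bFun = (* 2)

g :: Int -> Int
g = (+ 1)

g' :: String -> Int
g' s = 2 * (read s + 1)

-- The conclusion of the free theorem, with f = map:
lhs, rhs :: [Int] -> [Int]
lhs = map bFun . map g   -- map b⃯ . f g
rhs = map g' . map aFun  -- f g' . map a⃯
```

Both sides agree on every input list, as the theorem predicts.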

As an aside, `Functor` instances should satisfy two laws:

- `map id ≡ id`
- `map f . map g ≡ map (f . g)`

It turns out that the second property follows from the first by parametricity; see The free theorem for `fmap`.

**Example: `∀a. F a -> G a`**

Consider a function `f :: ∀a. F a -> G a`, polymorphic in `a` but between fixed (constant) type constructors `F` and `G`; for example, a function of type `∀a. Maybe a -> [a]` fits this pattern. What can we tell about `f`?

```
f ℛ(∀a. F a -> G a) f
iff forall A, A', a :: A ⇔ A'.
f@A ℛ(F a -> G a) f@A'
iff forall A, A', a :: A ⇔ A', x :: F A, x' :: F A'.
if x ℛ(F a) x' then f x ℛ(G a) f x'
```

For the special case where we pick a function `a⃯ :: A -> A'` for `a`, this is equivalent to

```
forall A, A', a⃯ :: A -> A'.
  G a⃯ . f ≡ f . F a⃯
```

For the categorically inclined, this means that polymorphic functions must be natural transformations.
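A familiar concrete instance: `maybeToList :: ∀a. Maybe a -> [a]` has exactly this shape, so parametricity guarantees `map a⃯ . maybeToList ≡ maybeToList . fmap a⃯` for every function `a⃯`, tried here with `a⃯ = show`:

```haskell
import Data.Maybe (maybeToList)

-- Naturality of maybeToList: converting to a list commutes with mapping.
lhs, rhs :: Maybe Int -> [String]
lhs = map show . maybeToList
rhs = maybeToList . fmap show
```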

Now that we’ve covered the basics, it’s time to consider some more advanced language features. We will first consider qualified types, such as `∀a. Eq a => a -> a -> a`.

The rule for a qualified type is

```
f ℛ(∀ a. C a => t) f'
iff forall A, A', a :: A ⇔ A'
such that A, A' instances of C and a respects C.
f@A ℛ(t) f'@A'
```

What does it mean for a relation `a :: A ⇔ A'` to respect a type class `C`? Every type class introduces a new constraint on relations, defined by the members of the type class. Let’s consider an example; Haskell’s equality type class is defined by

```
class Eq a where
  (==) :: a -> a -> Bool
```

(Let’s ignore `/=` for simplicity’s sake.) Then a relation `a` *respects* `Eq`, written `Eq(a)`, iff all class members are related to themselves. For the specific case of `Eq` this means that

```
(==) ℛ(a -> a -> Bool) (==)
-- rule for functions, twice
iff forall x :: A, x' :: A', y :: A, y' :: A'.
if x ℛ(a) x', y ℛ(a) y' then x == y ℛ(Bool) x' == y'
-- Bool is a constant type, simplify
iff forall x :: A, x' :: A', y :: A, y' :: A'.
if x ℛ(a) x', y ℛ(a) y' then x == y ≡ x' == y'
```

For the special case where we pick a *function* `a⃯ :: A -> A'`, the function respects `Eq` iff

```
forall x :: A, y :: A.
x == y ≡ a⃯ x == a⃯ y
```

I.e., the function maps `(==)`-equal arguments to `(==)`-equal results.

**Syntactic convention.** In the following we will write

`forall A, A', a :: A ⇔ A' such that A, A' instances of C and a respects C.`

more concisely as

`forall C(A), C(A'), C(a) :: A ⇔ A'.`

`∀a. Eq a => a -> a -> a`

We already considered the free theorem for functions `f :: ∀a. a -> a -> a`:

`g (f x y) = f (g x) (g y)`

Is this free theorem still valid for `∀a. Eq a => a -> a -> a`? No, it’s not. Consider giving this (admittedly somewhat dubious) definition of natural numbers, which considers all “invalid” natural numbers to be equal:

```
newtype Nat = Nat Int
  deriving (Show)

instance Eq Nat where
  Nat n == Nat n' | n < 0, n' < 0 = True
                  | otherwise     = n == n'
```

If we define

```
f :: forall a. Eq a => a -> a -> a
f x y = if x == y then y else x

g :: Nat -> Nat
g (Nat n) = Nat (n + 1)
```

then for `x ≡ Nat (-1)` and `y ≡ Nat (-2)` we have that `g (f x y) ≡ Nat (-1)` but `f (g x) (g y) ≡ Nat 0`. Dubious or not, free theorems don’t assume anything about the particular implementation of type classes. The free theorem for `∀a. Eq a => a -> a -> a` however *only* applies to functions `g` which respect `Eq`; and this definition of `g` does not.
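We can run the counterexample to confirm (a self-contained sketch; note that the `Eq` instance must identify exactly the negative, i.e. invalid, numbers for the claimed results to come out):

```haskell
newtype Nat = Nat Int
  deriving (Show)

-- All invalid (negative) naturals are considered equal.
instance Eq Nat where
  Nat n == Nat n' | n < 0, n' < 0 = True
                  | otherwise     = n == n'

f :: Eq a => a -> a -> a
f x y = if x == y then y else x

g :: Nat -> Nat
g (Nat n) = Nat (n + 1)

main :: IO ()
main = do
  let x = Nat (-1)
      y = Nat (-2)
  print (g (f x y))      -- Nat (-1): x == y, so f returns y, and g y is Nat (-1)
  print (f (g x) (g y))  -- Nat 0: Nat 0 /= Nat (-1), so f returns its first argument
```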

`∀ab. (Show a, Show b) => a -> b -> String`

We promised to look at this type when we considered higher rank types above. If you go through the process, you will find that the free theorem for functions `f` of this type is

`f x y = f (g x) (h y)`

for any `Show`-respecting functions `g` and `h`. What does it mean for a function to respect `Show`? Intuitively, it means that the function can change the value of its argument but not its string representation:

`show (g x) = show x`
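For example (a small sketch, checking at a single sample point only): at type `Int`, the identity trivially respects `Show`, while `(+ 1)` does not, since it changes the string representation:

```haskell
-- A function respects Show iff it preserves string representations;
-- here we check the condition at one sample point.
respectsShowAt :: Show a => (a -> a) -> a -> Bool
respectsShowAt g x = show (g x) == show x

main :: IO ()
main = do
  print (respectsShowAt id (42 :: Int))     -- True
  print (respectsShowAt (+ 1) (42 :: Int))  -- False
```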

Type constructor classes are classes over types of kind `* -> *`; a typical example is

```
class Functor f where
  fmap :: ∀ab. (a -> b) -> f a -> f b
```

The final rule we will discuss is the rule for universal quantification over a qualified type constructor (universal quantification over a type constructor without a qualifier is rarely useful, so we don’t discuss it separately):

```
g ℛ(∀f. C f => t) g'
iff forall C(F), C(F'), C(f) :: F ⇔ F'.
g@F ℛ(t) g'@F'
```

If `F` and `F'` are type constructors (functions on types) rather than types, `f :: F ⇔ F'` is a relational action rather than a relation: that is, it is a function on relations. As before, `C(f)` means that this relational action must respect the type class `C`, in much the same way as for relations. Let’s consider what this means for the example of `Functor`:

```
fmap ℛ(∀ab. (a -> b) -> f a -> f b) fmap
iff forall A, A', B, B', a :: A ⇔ A', b :: B ⇔ B'.
fmap@A,B ℛ((a -> b) -> f a -> f b) fmap@A',B'
iff forall A, A', B, B', a :: A ⇔ A', b :: B ⇔ B'.
forall g :: A -> B, g' :: A' -> B', x :: F A, x' :: F' A'.
if g ℛ(a -> b) g', x ℛ(f a) x'
then fmap g x ℛ(f b) fmap g' x'
```

`∀f. Functor f => f Int -> f Int`

Intuitively, a function `g :: ∀f. Functor f => f Int -> f Int` can take advantage of the `Int` type, but only by applying `fmap`; for example, when we apply `g` to a list, the order of the list should not matter. Let’s derive the free theorem for functions of this type:

```
g ℛ(∀f. Functor f => f Int -> f Int) g
iff forall Functor(F), Functor(F'), Functor(f) :: F ⇔ F'.
g@F ℛ(f Int -> f Int) g@F'
iff forall Functor(F), Functor(F'), Functor(f) :: F ⇔ F'.
forall x :: F Int, x' :: F' Int.
if x ℛ(f Int) x' then g x ℛ(f Int) g x'
```

As before, we can specialize this to higher order functions, which are special cases of relational actions. Let’s use the notation `f⃯ :: F -> F'` (with `F` and `F'` type constructors) to mean `f⃯ :: ∀ab. (a -> b) -> (F a -> F' b)`. Then we can specialize the free theorem to

```
iff forall Functor(F), Functor(F'), Functor(f⃯) :: F -> F'.
forall x :: F Int, x' :: F' Int.
if x ℛ(f⃯ Int) x' then g x ℛ(f⃯ Int) g x'
-- `f⃯` is a function; recall that `Int` as a relation is the identity:
iff forall Functor(F), Functor(F'), Functor(f⃯) :: F -> F'.
f⃯ id . g ≡ g . f⃯ id
```

for any `Functor`-respecting `f⃯`.

The free theorem we saw in the previous section has a very useful special case, which we will derive now. Recall that in order to prove that a higher order function `f⃯` respects `Functor` we have to prove that

`if g ℛ(a -> b) g', x ℛ(f⃯ a) x' then fmap g x ℛ(f⃯ b) fmap g' x'`

As in the higher rank example, this is a proof obligation (as opposed to the *application* of a free theorem), so we really have to consider *relations* `a :: A ⇔ A'` and `b :: B ⇔ B'` here; it’s not sufficient to consider functions only.

We can however derive a special case of the free theorem which is easier to use. Take some arbitrary polymorphic function `k :: ∀a. F a -> F' a`, and define the relational action `f :: F ⇔ F'` by

`f(a) = k ⚬ F(a)`

where we use `k` also as a relation. Then

```
x ℛ(f a) x'
iff ∃i. x ℛ(k) i and i ℛ(F(a)) x'
-- k is a function
iff k x ℛ(F(a)) x'
-- by the Functor Characterization
iff F g (k x) ℛ(F b) F g' x'
-- naturality
iff k (F g x) ℛ(F b) F g' x'
-- use k as a relation again
iff F g x ℛ(k) k (F g x) ℛ(F b) F g' x'
-- pick k (F g x) as the intermediate
then F g x ℛ(f b) F g' x'
-- if we assume that fmap is the "real" functor instance
iff fmap g x ℛ(f b) fmap g' x'
```

In the previous section we derived that the free theorem for `g :: ∀f. Functor f => f Int -> f Int` was

```
forall Functor(F), Functor(F'), Functor(f⃯) :: F -> F'.
f⃯ id . g ≡ g . f⃯ id
```

for any higher order function which respects `Functor`. The `f` we defined above *is* a higher order function provided that `a` is a function, and we just proved that it must respect `Functor`. The identity relation is certainly a function, so we can specialize the free theorem to

`k . g ≡ g . k`

for *any* polymorphic function `k` (no restrictions on `k`). As a special case, this means that we must have

`reverse . g ≡ g . reverse`

formalizing the earlier intuition that when we apply such a function to a list, the order of the list cannot matter.
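We can observe the law for a concrete `g` (a sketch; `fmap (+ 1)` is one of the few things a function of this type can do):

```haskell
-- A concrete inhabitant of ∀f. Functor f => f Int -> f Int.
g :: Functor f => f Int -> f Int
g = fmap (+ 1)

main :: IO ()
main = print (reverse (g [1, 2, 3]) == g (reverse [1, 2, 3]))  -- True
```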

`∀f. Functor f => (B -> f B) -> f A`

As our last example, we will consider higher-order functions of type `g :: ∀f. Functor f => (B -> f B) -> f A`. The free theorem for such functions is

```
g ℛ(∀f. Functor f => (B -> f B) -> f A) g
iff forall Functor(F), Functor(F'), Functor(f) :: F ⇔ F'.
g@F ℛ((B -> f B) -> f A) g@F'
iff forall Functor(F), Functor(F'), Functor(f) :: F ⇔ F'.
forall l :: B -> F B, l' :: B -> F' B.
if l ℛ(B -> f B) l' then g l ℛ(f A) g l'
```

Specializing to *higher order functions* `f⃯ :: ∀ab. (a -> b) -> F a -> F' b` (rather than a relational action `f`), we get

```
forall Functor(F), Functor(F'), Functor(f⃯) :: F -> F'.
forall l :: B -> F B, l' :: B -> F' B.
if l ℛ(B -> f⃯ B) l' then g l ℛ(f⃯ A) g l'
iff forall Functor(F), Functor(F'), Functor(f⃯) :: F -> F'.
forall l :: B -> F B, l' :: B -> F' B.
if f⃯ id . l ≡ l' . id then f⃯ id (g l) ≡ g l'
-- simplify
iff forall Functor(F), Functor(F'), Functor(f⃯) :: F -> F'.
forall l :: B -> F B.
f⃯ id (g l) ≡ g (f⃯ id . l)
```

for any `Functor`-respecting `f⃯`; we can now apply the same reasoning as we did in the previous section, and give the following free theorem instead:

`k (g l) ≡ g (k . l)`

for any polymorphic function (that is, natural transformation) `k :: ∀a. F a -> F' a` and function `l :: B -> F B`. This property is essential when proving that the above representation of a lens is isomorphic to a pair of a setter and a getter; see Functor is to Lens as Applicative is to Biplate, Section 4, for details.

Parametricity allows us to formally derive what we can conclude about a function by only looking at its type. We’ve covered a lot of material in this tutorial, but there is a lot more out there still. If you want to know more, here are some additional references.

- Free Theorems Involving Type Constructor Classes is a very accessible paper by Janis Voigtländer that describes how to deal with type constructor classes in free theorems, and gives lots of examples involving monads. If you prefer, you can also watch the video of the presentation. Janis also has a slide set with an introduction to free theorems and some of the underlying theory.
- Proofs for Free: Parametricity for dependent types by Jean-Philippe Bernardy, Patrik Jansson and Ross Paterson extends the theory of parametricity to dependent types.
- Although it’s not about parametricity per se, A Representation Theorem for Second-Order Functionals by Mauro Jaskelioff and Russell O’Connor gives a general (categorical) result that can be used to derive properties of types such as van Laarhoven lenses (sadly, the theorem does not seem to cover prisms).

Thanks to Auke Booij on `#haskell` for his helpful feedback on both parts of this blog post.

```
-- | Download the specified URL (..)
--
-- This function will 'throwIO' an 'HttpException' for (..)
simpleHttp :: MonadIO m => String -> m ByteString
```

Notice that part of the semantics of this function—that it may throw an `HttpException`—is encoded in a comment, which the compiler cannot check. This is because Haskell’s notion of exceptions offers no mechanism for advertising to the user the fact that a function might throw an exception.

Michael Snoyman discusses some solutions to this problem, as well as some common anti-patterns, in his blog post Exceptions Best Practices. However, wouldn’t it be much nicer if we could simply express in the type that `simpleHttp` may throw an `HttpException`? In this blog post I will propose a very lightweight scheme to do precisely that.

If you want to experiment with this yourself, you can download CheckedRevisited.hs (tested with ghc 7.2, 7.4, 7.6, 7.8 and 7.10).

**Note.** This is an improved version of this blog post; Checked.hs demonstrates the previous approach; see also the discussion on reddit on the original post and on the improved version.

Let’s introduce a type class for “checked exceptions” (à la Java):

`class Throws e where`

Here’s the key idea: *this will be a type class without any instances*. If we want to record in the type that some IO action throws a checked exception, we can now just add the appropriate type class constraint. For instance, we can define

```
throwChecked :: (Exception e, Throws e) => e -> IO a
throwChecked = Base.throwIO
```

and then use that as

```
simpleHttp :: (MonadIO m, Throws HttpException) => String -> m ByteString
simpleHttp _ = liftIO $ throwChecked HttpException
```

Unless we explicitly catch this exception, this annotation will now be propagated to every use site of `simpleHttp`:

```
useSimpleHttp :: Throws HttpException => IO ByteString
useSimpleHttp = simpleHttp "http://www.example.com"
```

There’s something a little peculiar about a type class constraint such as `Throws HttpException`: normally `ghc` will refuse to add a type class constraint for a known (constant) type. If you were to write

`foo = throwChecked $ userError "Uhoh"`

`ghc` will complain bitterly that

```
No instance for (Throws IOError)
arising from a use of ‘throwChecked’
In the expression: throwChecked
```

until you give the type annotation explicitly (you will need to enable the `FlexibleContexts` language extension):

```
foo :: Throws IOError => IO a
foo = throwChecked $ userError "Uhoh"
```

I consider this a feature, not a bug of this approach: you are forced to explicitly declare the checked exceptions you throw.

In order to *catch* checked exceptions we need to somehow eliminate that `Throws` constraint. That is, we want a function of type

`catchChecked :: Exception e => (Throws e => IO a) -> (e -> IO a) -> IO a`

In the remainder of this section we explain how we can do this; it requires a bit of type level hacking, and the use of roles. Bear in mind though that you do not need to understand this section in order to be able to use checked exceptions; it suffices to know that a function `catchChecked` exists.

First, we define a newtype wrapper around an action that throws an exception:

`newtype Wrap e a = Wrap { unWrap :: Throws e => a }`

Then we define a newtype wrapper around the exceptions themselves:

`newtype Catch e = Catch e`

This `Catch` is used internally only and not exported; it is the *only* type that will get a `Throws` instance:

`instance Throws (Catch e) where`

Now we’re almost there. We are going to use `coerce` to pretend that instead of an exception of type `e` we have an exception of type `Catch e`:

```
coerceWrap :: Wrap e a -> Wrap (Catch e) a
coerceWrap = coerce
```

This requires the type argument `e` of the `Throws` class to be representational (this needs `IncoherentInstances`):

`type role Throws representational`

With all this in place, we can now eliminate `Throws` constraints:

```
unthrow :: proxy e -> (Throws e => a) -> a
unthrow _ = unWrap . coerceWrap . Wrap
```

and defining `catchChecked` is a simple matter:

```
catchChecked :: forall a e. Exception e
             => (Throws e => IO a) -> (e -> IO a) -> IO a
catchChecked act = Base.catch (unthrow (Proxy :: Proxy e) act)
```

Suppose we had

`readFile :: Throws IOException => FilePath -> IO String`

then we can write a function to get a file either by reading a local file or by downloading it over HTTP:

```
get :: (Throws IOException, Throws HttpException)
    => String -> IO ByteString
get url = case removePrefix "file:" url of
            Just path -> readFile path
            Nothing   -> simpleHttp url

removePrefix :: [a] -> [a] -> Maybe [a]
removePrefix = ..
```

Alternatively we can define a bespoke exception hierarchy and combine the two exceptions:

```
data SomeGetException = forall e. Exception e => SomeGetException e

wrapIO :: (Throws IOException => IO a)
       -> (Throws SomeGetException => IO a)
wrapIO = handleChecked $ throwChecked . SomeGetException

wrapHttp :: (Throws HttpException => IO a)
         -> (Throws SomeGetException => IO a)
wrapHttp = handleChecked $ throwChecked . SomeGetException

get :: Throws SomeGetException => String -> IO ByteString
get url = case removePrefix "file:" url of
            Just path -> wrapIO   $ readFile path
            Nothing   -> wrapHttp $ simpleHttp url
```
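The wrappers above rely on `handleChecked`, which the post does not define explicitly; it is simply `catchChecked` with its arguments flipped. A minimal sketch, assuming the definitions from the previous section:

```haskell
-- Sketch: catchChecked with the handler first, which reads more
-- naturally when the handler rethrows as a different checked exception.
handleChecked :: Exception e => (e -> IO a) -> (Throws e => IO a) -> IO a
handleChecked handler act = catchChecked act handler
```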

This kind of custom exception hierarchy is entirely standard; I just wanted to show it fits nicely into this approach to checked exceptions.

There is one caveat to be aware of. Suppose we write

`returnAction = return (simpleHttp "http://www.example.com")`

Ideally we’d give this a type such as

```
returnAction :: IO (Throws HttpException => IO ByteString)
returnAction = return (simpleHttp "http://www.example.com")
```

But this requires impredicative types, which is still a no-go zone. Instead the type of `returnAction` will be

```
returnAction :: Throws HttpException => IO (IO ByteString)
returnAction = return (simpleHttp "http://www.example.com")
```

which has the `Throws` annotation on `returnAction` itself; this means we can make the annotation disappear by adding an exception handler to `returnAction`, even though that handler is never called (because `returnAction` *itself* never throws any exception).

```
returnAction' :: IO (IO ByteString)
returnAction' = catchChecked returnAction neverActuallyCalled
where
neverActuallyCalled :: HttpException -> IO (IO ByteString)
neverActuallyCalled = undefined
```

This is somewhat unfortunate, but it occurs only infrequently and it’s not a huge problem in practice. If you do need to return actions that may throw exceptions, you can use a newtype wrapper such as the `Wrap` type we used internally in the definition of `unthrow` (for much the same reason):

```
returnAction :: IO (Wrap HttpException (IO ByteString))
returnAction = return (Wrap $ simpleHttp "http://www.example.com")
```

Of course you will probably want to define a datatype that is more meaningful for your specific application; for instance, see the definition of `HttpClient` in the `Repository.Remote` module, which defines something like

```
data HttpClient = HttpClient {
    httpClientGet :: Throws SomeRecoverableException => URI -> ...
  }
```

Of course, a type such as

`simpleHttp :: (MonadIO m, Throws HttpException) => String -> m ByteString`

does not tell you that this function can *only* throw `HttpException`s; it can still throw all kinds of unchecked exceptions, not least of which are asynchronous exceptions. But that’s okay: it can still be incredibly useful to track some exceptions through your code.

So there you have it: checked exceptions in Haskell using

- one—singleton—type class `Throws`, with no instances
- just two functions `throwChecked` and `catchChecked`
- requiring only a handful of non-controversial language extensions
- without the use of `unsafeCoerce`, and without introducing a special new kind of monad (such as in the control-monad-exception package) or complicated type level hacking as in the Checked Exception for Free blogpost.

`cabal-install`

and the Hackage server and a tool for managing file-based secure repositories. This release is not yet ready for general use, but we would like to invite interested parties to download and experiment with the library and its integration. We expect a beta release running on the central Hackage server will soon follow.
Hackage Security and related infrastructure is a project funded by the Industrial Haskell Group to secure Hackage, the central Haskell package server. A direct consequence of this work is that we can have untrusted Hackage mirrors (mirror selection is directly supported by the library). A minor but important additional side goal is support for incremental updates of the central Hackage index (only downloading information about *new* packages, rather than all packages).

**TL;DR:** Hackage will be more secure, more reliable and faster, and `cabal update` should generally finish in seconds.

Security is notoriously difficult to get right, so rather than concocting something ad-hoc the Hackage Security framework is based on TUF, The Update Framework. TUF is a collaboration between security researchers at the University of Arizona, the University of California, Berkeley and the University of Washington, as well as various developers of the Tor project. It is a theory specifically designed for securing software update systems, and suits the needs of Hackage perfectly.

TUF covers both *index signing* and *author signing*. Index signing provides the means to verify that something we downloaded from a mirror is the same as what is available from the central server (along with some other security properties), thus making it possible to set up untrusted mirrors. It does not however deal with compromise of the central server. Author signing allows package authors to sign packages, providing a guarantee that the package you download is the one that the author uploaded. These two concerns are largely orthogonal, and the current project only adds support for index signing. Author signing will be the subject of a later project.

Very briefly, here is how it works. The bits in red refer to new features added as part of the Hackage Security work.

1. Hackage provides a file `00-index.tar.gz` (known as “the index”) which contains the `.cabal` files for all versions of all packages on the server. It is this file that `cabal update` downloads, and it is this file that `cabal install` uses to find out which packages are available and what their dependencies are. Hackage additionally provides a signed file `/snapshot.json` (“the snapshot”), containing a hash of the index. When `cabal` downloads the index it computes its hash and verifies it against the hash recorded in the snapshot. Since mirrors do not have the key necessary to sign the snapshot (“the snapshot key”), if the hash matches we know that the index we downloaded, and hence all files within, was the same as on the central server.
2. When you `cabal install` one or more packages, the index provides `cabal` with a list of packages and their dependencies. However, the index does not contain the packages themselves. The index does however contain `package.json` files for each version of each package, containing a hash for the `.tar.gz` of that package version. Since these files live in the index they are automatically protected by the snapshot key. When `cabal` downloads a package it computes the hash of the package and compares it against the hash recorded in the index; if it matches, we are guaranteed that the package is the same as the package on the central server, as the central server is the only one with access to the snapshot key.
3. The client does not have built-in knowledge of the snapshot key. Instead, it can download `/root.json` (“the root metadata”) from the server, which contains the public snapshot key. The root metadata itself is signed by the root keys, of which the client *does* have built-in knowledge. The private root keys must be kept very securely (e.g. encrypted and offline).

This description leaves out lots of details, but the purpose of this blog post is not to give a full overview of TUF. See the initial announcement or the website of The Update Framework for more details on TUF; the Hackage Security project README provides a very detailed discussion on how our implemention of TUF relates to the official TUF specification.

Most of the functionality is provided through a new library called `hackage-security`, available from github, designed to be used by clients and servers alike.

Although we have integrated it in `cabal-install`, the `hackage-security` library is expressly designed to be usable by different clients as well. For example, it generalizes over the library to use for HTTP requests; `cabal` uses hackage-security-HTTP, a thin layer around the HTTP library. However, if a client such as stack wants to use the `hackage-security` library to talk to Hackage it may prefer to use hackage-security-http-client instead, a thin layer around the http-client library.

Using the library is very simple. After importing Hackage.Security.Client three functions become available, corresponding to points 1, 2 and 3 above:

```
checkForUpdates :: Repository -> CheckExpiry -> IO HasUpdates
downloadPackage :: Repository -> PackageId -> (TempPath -> IO a) -> IO a
bootstrap :: Repository -> [KeyId] -> KeyThreshold -> IO ()
```

Some comments:

- A `Repository` is an object describing a (local or remote) repository.
- The `CheckExpiry` argument describes whether we should check expiry dates on metadata. Expiry dates are important to prevent attacks where a malicious mirror provides outdated data (see A Look In the Mirror: Attacks on Package Managers, Section 3, Threat Model), but we may occasionally want to accept expired data (for instance, when the central server is down for an extended period of time).
- The `[KeyId]` and `KeyThreshold` arguments to `bootstrap` represent the client’s “built-in” knowledge of the root keys we alluded to above. In the case of `cabal-install` these come from the cabal `config` file, which may contain a section such as

  ```
  remote-repo hackage.haskell.org
    url: http://hackage.haskell.org/
    secure: True
    root-keys: 2ae741f4c4a5f70ed6e6c48762e0d7a493d8dd265e9cbc6c4037dfc7ceaec70e
               32d3db5b4403935c0baf52a2bcb05031784a971ee2d43587288776f2e90609db
               eed36d2bb15f94628221cde558e99c4e1ad36fd243fe3748e1ee7ad00eb9d628
    key-threshold: 2
  ```

  (this syntax for specifying repositories in the cabal `config` file is new.)

We have written an example client that demonstrates how to use this API; the example client supports both local and remote repositories and can use `HTTP`, `http-client` or `curl`, and yet is only just over 100 lines of code.

The server-side support provided by Hackage.Security.Server comes primarily in the form of datatypes corresponding to the TUF metadata, along with functions for constructing them.

It is important to realize that servers need not be running the Hackage software; mirrors of the central Hackage server may (and typically will) be simple HTTP file servers, and indeed company-internal package servers may choose not to use the Hackage software at all, using *only* file servers. We provide a hackage-security utility for managing such file-based repositories; see below for details.

There are various ways in which you can try out this alpha release, depending on what precisely you are interested in.

Resources at a glance:

- `hackage-security` library: github tag “pre-release-alpha”
- `cabal-install`: github branch “using-hackage-security”
- hackage: github branch “using-hackage-security”

We provide two almost-complete secure (but mostly static) Hackage snapshots, located at

```
http://hackage.haskell.org/security-alpha/mirror1
http://hackage.haskell.org/security-alpha/mirror2
```

If you want to use `cabal` to talk to these repositories, you will need to download and build it from the using-hackage-security branch. Then change your `cabal` `config` file and add a new section:

```
remote-repo alpha-snapshot
  url: http://hackage.haskell.org/security-alpha/mirror1
  secure: True
  root-keys: 89e692e45b53b575f79a02f82fe47359b0b577dec23b45d846d6e734ffcc887a
             dc4b6619e8ea2a0b72cad89a3803382f6acc8beda758be51660b5ce7c15e535b
             1035a452fd3ada87956f9e77595cfd5e41446781d7ba9ff9e58b94488ac0bad7
  key-threshold: 2
```

It suffices to point `cabal` to *either* mirror; TUF and the `hackage-security` library provide built-in support for providing clients with a list of mirrors. During the first check for updates `cabal` will download this list, and then use either mirror thereafter. Note that if you wish you can set the `key-threshold` to 0 and not specify any root keys; if you do this, the initial download of the `root` information will not be verified, but all access will be secure after that.

These mirrors are *almost* complete, because the first mirror has an intentional problem: the latest version of `generics-sop` does not match its signed hash (simulating an attacker’s attempt to replace the `generics-sop` library with `DoublyEvil-0.3.142`). If you attempt to `cabal get` this library, `cabal` should notice something is amiss on this mirror, and automatically try again from the second mirror (which has not been “compromised”):

```
# cabal get generics-sop
Downloading generics-sop-0.1.1.2...
Selected mirror http://hackage.haskell.org/security-alpha/mirror1
Downloading package generics-sop-0.1.1.2
Exception Invalid hash for .../generics-sop-0.1.1.2.tar45887.gz
when using mirror http://hackage.haskell.org/security-alpha/mirror1
Selected mirror http://hackage.haskell.org/security-alpha/mirror2
Downloading package generics-sop-0.1.1.2
Unpacking to generics-sop-0.1.1.2/
```

(It is also possible to use the example client to talk to these mirrors, or indeed to a secure repo of your own.)

If you want to experiment with setting up your own secure repository, the easiest way to do this is to set up a file based repository using the hackage-security utility. A file based repository (as opposed to one running the actual Hackage software) is much easier to set up and will suffice for many purposes.

1. Create a directory `~/my-secure-repo` containing a single subdirectory `~/my-secure-repo/package`. Put whatever packages you want to make available from your repo in this subdirectory. At this point your repository might look like

   ```
   ~/my-secure-repo/package/basic-sop-0.1.0.5.tar.gz
   ~/my-secure-repo/package/generics-sop-0.1.1.1.tar.gz
   ~/my-secure-repo/package/generics-sop-0.1.1.2.tar.gz
   ~/my-secure-repo/package/json-sop-0.1.0.4.tar.gz
   ~/my-secure-repo/package/lens-sop-0.1.0.2.tar.gz
   ~/my-secure-repo/package/pretty-sop-0.1.0.1.tar.gz
   ```

   (because obviously the generics-sop packages are the first things to come to mind when thinking about which packages are important to secure.) Note the flat directory structure: different packages and different versions of those packages all live in the one directory.

2. Create public and private keys:

   ```
   # hackage-security create-keys --keys ~/my-private-keys
   ```

   This will create a directory structure such as

   ```
   ~/my-private-keys/mirrors/id01.private
   ~/my-private-keys/mirrors/..
   ~/my-private-keys/root/id04.private
   ~/my-private-keys/root/..
   ~/my-private-keys/snapshot/id07.private
   ~/my-private-keys/target/id08.private
   ~/my-private-keys/target/..
   ~/my-private-keys/timestamp/id11.private
   ```

   containing keys for all the various TUF roles (proper key management is not part of this alpha release). Note that these keys are stored outside of the repository proper.

3. Create the initial TUF metadata and construct an index using

   ```
   # hackage-security bootstrap \
       --repo ~/my-secure-repo \
       --keys ~/my-private-keys
   ```

   This will create a directory `~/my-secure-repo/index` containing the `.cabal` files (extracted from the package tarballs) and TUF metadata for all packages

   ```
   ~/my-secure-repo/index/basic-sop/0.1.0.5/basic-sop.cabal
   ~/my-secure-repo/index/basic-sop/0.1.0.5/package.json
   ~/my-secure-repo/index/generics-sop/0.1.1.1/generics-sop.cabal
   ~/my-secure-repo/index/generics-sop/0.1.1.1/package.json
   ...
   ```

   and package the contents of that directory up as the index tarball `~/my-secure-repo/00-index.tar.gz`; it will also create the top-level metadata files

   ```
   ~/my-secure-repo/mirrors.json
   ~/my-secure-repo/root.json
   ~/my-secure-repo/snapshot.json
   ~/my-secure-repo/timestamp.json
   ```

4. The timestamp and snapshot are valid for three days, so you will need to resign these files regularly using

   ```
   # hackage-security update \
       --repo ~/my-secure-repo \
       --keys ~/my-private-keys
   ```

   You can use the same command whenever you add any additional packages to your repository.

5. If you now make this directory available (for instance, by pointing Apache at it) you should be able to use `cabal` to access it, in the same way as described above for accessing the secure Hackage snapshots. You can either set `key-threshold` to 0, or else copy in the root key IDs from the generated `root.json` file.

If you are feeling adventurous you can also try to set up your own secure Hackage server. You will need to build Hackage from the using-secure-hackage branch.

You will need to create a subdirectory `TUF` inside Hackage’s `datafiles/` directory, containing four files:

```
datafiles/TUF/mirrors.json
datafiles/TUF/root.json
datafiles/TUF/snapshot.private
datafiles/TUF/timestamp.private
```

containing the list of mirrors, the root metadata, and the private snapshot and timestamp keys. You can create these files using the `hackage-security` utility:

- Use the `create-keys` command as described above to create a directory with keys for all roles, and then copy over the snapshot and timestamp keys to the `TUF` directory.
- Use the `create-root` and `create-mirrors` commands to create the root and mirrors metadata. The `create-mirrors` command accepts an arbitrary list of mirrors to be added to the mirrors metadata, should you wish to do so.

Note that the `root.json` and `mirrors.json` files are served as-is by Hackage; they are not used internally. The snapshot and timestamp keys are of course used to sign the snapshot and the timestamp.

Once you have created and added these files, everything else should Just Work(™). When you start up your server it will create TUF metadata for any existing packages you may have (if you are migrating an existing database). It will create a snapshot and a timestamp file; create metadata for any new packages you upload and update the snapshot and timestamp; and resign the snapshot and timestamp nightly. You can talk to the repository using `cabal` as above.

If you have a Hackage server containing a lot of packages (a full mirror of the central Hackage server, for instance) then migration will be slow; it takes approximately an hour to compute hashes for all packages on Hackage. If this would lead to unacceptable downtime you can use the precompute-fileinfo tool to precompute hashes for all packages, given a recent backup. Copy the file created by this tool to `datafiles/TUF/md5-to-sha256.map` before doing the migration. If all hashes are precomputed, migration only takes a few minutes for a full Hackage snapshot.

## The `hackage-security` library

If you want to experiment with integrating the `hackage-security` library into your own software, the example client is probably the best starting point for integration in client software, and the hackage-security utility is probably a good starting point for integration in server software.

Please report any bugs or comments you may have as GitHub issues.

This is an alpha release, intended for testing by people with a close interest in the Hackage Security work. The issue tracker contains a list of issues to be resolved before the beta release, at which point we will make the security features available on the central Hackage server and make a patched `cabal` available in a more convenient manner. Note that the changes to Hackage are entirely backwards compatible, so existing clients will not be affected.

After the beta release there are various other loose ends to tie up before the official release of Phase 1 of this project. After that Phase 2 will add author signing.

Years ago, back when Isaac Potoczny-Jones and others were defining the Cabal specification, the big idea was to make Haskell software portable to different environments. One of the mantras was “no untracked dependencies!”.

The problem at the time was that Haskell code had all kinds of implicit dependencies which meant that while it worked for you, it wouldn’t build for me. For example, I might not have some other module that it needed, or the right version of the module.

So of course that’s what the `build-depends` field in `.cabal` files is all about, requiring that the author of the code declare just what the code requires of its environment. The other important part is that the build system only lets your code see the dependencies you’ve declared, so that you don’t accidentally end up with these untracked dependencies.

This mantra of no untracked dependencies is still sound. If we look at a system like nix, part of what enables it to work so well is that it is absolutely fanatical about having no untracked dependencies.

One weakness in the original Cabal specification is with `Setup.hs` scripts. These scripts are defined in the spec to be *the* entry point for the system. According to the Cabal spec, to build a package you’re required to compile the `Setup.hs` script and then use its command line interface to get things done. Because in the original spec the `Setup.hs` is the first entry point, it’s vital that it be possible to compile `Setup.hs` without any extra fuss (the `runhaskell` tool was invented just to make this possible, and to make it portable across compilers).

But because the `Setup.hs` is the primary entry point, it’s impossible to reliably use external code in a `Setup.hs` script: you cannot guarantee that that code is pre-installed. Going back to the “no untracked dependencies” mantra, we can see of course that all dependencies of `Setup.hs` scripts are in fact untracked!

This isn’t just a theoretical problem. Haskell users that do have complex `Setup.hs` scripts often run into versioning problems, or need external tools to help them get the pre-requisite packages installed. Or as another example: Michael Snoyman noted earlier this year in a diagnosis of an annoying packaging bug that:

> As an aside, this points to another problematic aspect of our toolchain: there is no way to specify constraints on dependencies used in custom `Setup.hs` files. That’s actually caused more difficulty than it may sound like, but I’ll skip diving into it for now.

As I said, the mantra of no untracked dependencies is still sound; we just need to apply it more widely.

These days the `Setup.hs` is effectively no longer a human interface: it is now a machine interface used by other tools like `cabal` or by distros’ install scripts. So we no longer have to worry so much about `Setup.hs` scripts always compiling out of the box. It would be acceptable now to say that the first entry point for a tool interacting with a package is the `.cabal` file, which might list the dependencies of the `Setup.hs`. The tool would then have to ensure that those dependencies are available when compiling the `Setup.hs`.

So this is exactly what we have now done. Members of the Industrial Haskell Group have funded us to fix this long-standing problem and we have recently merged the solution into the development versions of `Cabal` and `cabal-install`.

From a package author’s point of view, the solution looks like this: in your `.cabal` file you can now say:

```
build-type: Custom

custom-setup
  setup-depends: base >= 4.6,
                 directory >= 1.0,
                 Cabal >= 1.18 && < 1.22,
                 acme-setup-tools == 0.2.*
```

So it’s a new stanza, like libraries or executables, and like these you can specify the library dependencies of the `Setup.hs` script.

Now tools like `cabal` will compile the `Setup.hs` script with these and only these dependencies, just like it does normally for executables. So no more untracked dependencies in `Setup.hs` scripts. Newer `cabal` versions will warn about not using this new section. Older `cabal` versions will ignore the new section (albeit with a warning). So over time we hope to encourage all packages with custom setup scripts to switch over to this.

In addition, the `Setup.hs` script gets built with CPP version macros (`MIN_VERSION_{pkgname}`) available so that the code can be made to work with a wider range of versions of its dependencies.
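
For example, a custom `Setup.hs` could use such a macro to adapt to an API difference between `Cabal` versions. This is only a sketch: the macro mechanism is from the text above, but the particular version split is illustrative, and both branches here happen to call plain `defaultMain` just to keep the skeleton short:

```
{-# LANGUAGE CPP #-}
import Distribution.Simple (defaultMain)

main :: IO ()
main =
#if MIN_VERSION_Cabal(1,22,0)
  -- code written against the newer Cabal API would go here
  defaultMain
#else
  -- fallback path for older Cabal versions
  defaultMain
#endif
```

Note that the `MIN_VERSION_Cabal` macro is only defined when the script is built by a tool that supplies these macros, as described above; compiling the file standalone with `ghc` will not define it.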

So on the surface this is all very simple and straightforward, a rather minor feature even. In fact it’s been remarkably hard to implement fully for reasons I’ll explain, but the good news is that it works and the hard work has also gotten us solutions to a couple other irksome problems.

Firstly, why isn’t it trivial? It’s inevitable that sooner or later you will find that your application depends on one package that has setup deps like `Cabal == 1.18.*` and another with setup deps like `Cabal == 1.20.*`. At that point we have a problem. Classically we aim to produce a build plan that uses at most one version of each package. We do that because otherwise there’s a danger of type errors from using multiple versions of the same package. Here with setup dependencies there is no such danger: it’s perfectly possible for me to build one setup script with one version of the `Cabal` library and another script with a different `Cabal` version. Because these are executables and not libraries, the use of these dependencies does not “leak”, and so we would be safe to use different versions in different places.

So we have extended the `cabal` solver to allow for limited controlled use of multiple versions of the same package. The constraint is that all the “normal” libraries and exes all use the same single version, just as before, but setup scripts are allowed to introduce their own little world where independent choices about package versions are allowed. To keep things sane, the solver tries as far as possible not to use multiple versions unless it really has to.

If you’re interested in the details of the solver, see Edsko’s recent blog post.

This work in the solver has some extra benefits.

In places the Cabal library is a little crufty, and the API it exposes was never really designed as an API. It has been very hard to fix this because changes in the Cabal library interface break `Setup.hs` scripts, and there was no way for packages to insulate themselves from this.

So now that packages can have proper dependencies for their custom `Setup.hs`, the flip side is that we have an opportunity to make breaking changes to the Cabal library API. We have an opportunity to throw out the accumulated cruft, clean up the code base and make a library API that’s not so painful to use in `Setup.hs` scripts.

## `base` shims

Another benefit is that the new solver is finally able to cope with having “base shim” packages, as we used in the base 3.x to 4.x transition. For two GHC releases, GHC came with both base-3.x and base-4.x. The base-4 was the “true” base, while the base-3 was a thin wrapper that re-exported most of base-4 (and syb), but with some changes to implement the old base-3 API. At the time we adapted cabal to cope with this situation of having two versions of a package in a single solution.

When the new solver was implemented, however, support for this situation was not added (and the old solver implementation was retained to work with GHC 6.12 and older).

This work for setup deps has made it relatively straightforward to add support for these base shims. So next time GHC needs to make a major bump to the version of base we can use the same trick of using a shim package. Indeed this might also be a good solution in other cases, perhaps cleaner than all these `*-compat` packages we’ve been accumulating.

It has also finally allowed us to retire the old solver implementation.

Another feature that is now easy to implement (though not actually implemented yet) is dealing with the dependency cycles in packages’ test suites and benchmarks.

Think of a core package like `bytestring`, or even less core like Johan’s `cassava` csv library. These packages have benchmarks that use the excellent `criterion` library. But of course `criterion` is a complex beast and itself depends on `bytestring`, `cassava` and a couple dozen other packages.

This introduces an apparent cycle and `cabal` will fail to find an install plan. I say apparent cycle because there isn’t really a cycle: it’s only the benchmark component that uses `criterion`, and nothing really depends on that.

Here’s another observation: when benchmarking a new `bytestring` or `cassava`, it does not matter one bit that `criterion` might be built against an older stable version of `bytestring` or `cassava`. Indeed it’s probably sensible that we use a stable version. It certainly involves less rebuilding: I don’t really want to rebuild `criterion` against each minor change in `bytestring` while I’m doing optimisation work.

So here’s the trick: we break the cycle by building `criterion` (or say `QuickCheck` or `tasty`) against another version of `bytestring`, typically some existing pre-installed one. So again this means that our install plan has two versions of `bytestring` in it: the one we mean to build, and the one we use as a dependency for `criterion`. And again this is ok, just as with setup dependencies, because dependencies of test suites and benchmarks do not “leak out” and cause diamond dependency style type errors.

One technical restriction is that the test suite or benchmark must not depend on the library within the same package, but must instead use the source files directly. Otherwise there would genuinely be a cycle.
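
Concretely, such a cycle-breaking test suite might be declared along these lines. This is a hypothetical stanza — the suite name, file name and directory layout are made up:

```
test-suite tests
  type:           exitcode-stdio-1.0
  main-is:        Tests.hs
  -- include the library's source files directly (src), rather than
  -- depending on the library itself, so there is no genuine cycle
  hs-source-dirs: tests, src
  build-depends:  base, QuickCheck
```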

Now in general when we have multiple components in a `.cabal` file we want them all to use the same versions of their dependencies. It would be deeply confusing if a library and an executable within the same package ended up using different versions of some dependency that might have different behaviour. Cabal has always enforced this, and we’re not relaxing it now. The rule is that if there are dependencies of a test suite or benchmark that are *not shared* with the library or executable components in the package, then we are free to pick different versions for those than we might pick elsewhere within the same solution.

As another example – one that’s nothing to do with cycles – we might pick different versions of `QuickCheck` for different test suites in different packages (though only where necessary). This helps with the problem that one old package might require `QuickCheck == 2.5.*` while another requires `QuickCheck == 2.8.*`. But it’d also be a boon if we ever went through another major QC-2 vs QC-3 style of transition. We would be able to have both QC-2 and QC-3 installed and build each package’s test suite against the version it requires, rather than freaking out that they’re not the same version.

Technically, this work opens the door to allowing private dependencies more generally. We’re not pursuing that at this stage, in part because it is not clear that it’s actually a good idea in general.

Mark Lentczner has pointed out the not-unreasonable fear that once you allow multiple versions of packages within the same solution it will in practice become impossible to re-establish the situation where there is just one version of each package, which is what distros want and what most people want in production systems.

So that’s something we should consider carefully as a community before opening those flood gates.
