How we might abolish Cabal Hell, part 2

Wed, 28 Jan 2015 13:29:57 GMT, by duncan.
Filed under cabal, community.

In part 1 we looked at an overview of all the various problems that make up “Cabal hell”. We also looked at an overview of a few solutions and how they overlap.

In part 2 and part 3 we’ll look in more detail at the two major solutions to Cabal hell. In this post we’ll look at Nix-style package management, and in the next post we’ll look at curated package collections.

A reminder about what the problems are

You might recall from part 1 this slightly weird diagram of the symptoms of Cabal Hell:

Cabal Hell: the symptoms 

In this part we’re going to pick out a couple of the problems and look them in a bit more detail:

Breaking re-installations

There are situations where Cabal’s chosen solution would involve reinstalling an existing version of a package but built with different dependencies.

For example, suppose I have an application app and it depends on zlib and bytestring. The zlib package also depends on bytestring. Initially the versions I am using for my app-1.0 are zlib-0.5 and bytestring-0.9 (Versions simplified for presentation.) Now I decide in the next version of my app, app-1.1, that I want to use a nice new feature in bytestring version 0.10. So I ask cabal to build my application using a more recent bytestring. The cabal tool will come up with a solution using the new bytestring-0.10 and the same old version of zlib-0.5. Building this solution will involve installing bytestring-0.10 and rebuilding zlib-0.5 against it.

Example breakage 1 

What is the problem here?

The problem is the rebuilding of zlib-0.5.

Why is this a problem?

It is a problem because when we install the instance “zlib-0.5 built against bytestring-0.10” we have to delete the pre-existing instance “zlib-0.5 built against bytestring-0.9”. Anything that depended on that previous instance now has a dangling reference and so is effectively broken.

Why do we have to delete the previous instance?

The way installed packages are managed is such that each GHC package database can only have one instance of each package version. Note that having two different versions would be allowed, e.g. zlib-0.4 and zlib-0.5, but having two instances of zlib-0.5 is not allowed. Not deleting the previous instance would involve having two instances of zlib-0.5 (each instance built against different versions of bytestring).

So the root of the problem is having to delete – or if you like, mutate – package instances when installing new packages. Mutable state strikes again! And this is due to the limitation of only being able to have one instance of a package version installed at once.

Type errors when using packages together

The second, orthogonal, problem is that it is possible to install two packages and then load them both in GHCi and find that you get type errors when composing things defined in the two different packages. Effectively you cannot use these two particular installed packages together.

The reason is that the two packages have been built using different versions of some common dependency. For example, I might have zlib built against bytestring-0.9 and binary built against bytestring-0.10. Now if I try to use them together in GHCi (e.g. compressing the result of binary serialisation) then I will get a type error from GHC saying that bytestring-0.9:Data.ByteString.ByteString is not the same type as bytestring-0.10:Data.ByteString.ByteString. And GHC is not wrong here, we really cannot pretend that these are the same types (at least not in general).

Example breakage 2 

This is a variant on the classic “diamond dependency problem” that cabal solved years ago. So why do we still get it? In fact we never hit this problem when building a package with cabal, because cabal’s solver looks for “consistent” solutions. (Recall from part 1, that when we say a “consistent” solution we mean one that uses only one version of any package.)

We can still hit this problem when we install things separately and then use the packages directly with ghc or ghci. This is because cabal does not enforce consistency in the developer’s environment. It only enforces consistency within any set of packages it installs simultaneously.

The fundamental problem is that developers expect to be able to use combinations of their installed packages together, but the package tools do not enforce consistency of the developer’s environment.

In practice this class of problem is currently relatively rare. In part that is because one would often hit the problem above involving re-installing and breaking packages. If however we lifted the limitation on installing multiple instances of the same version of a package, then we would always be able to install new versions and we would instead hit this problem much more frequently.

Nix-style package management

The ideas behind Nix are now over 10 years old. When first reading the published papers on Nix some years ago I was struck by how simple and elegant they are. They also appear to work well in practice, with a full Linux distribution based on them, plus a number of other well-regarded tools.

The key ideas are:

Contrast this to a traditional approach, e.g. what we have now with Cabal, or Linux distros based on .rpm or .deb. In the traditional approach there is a single environment which is exactly the same as the full set of installed packages. So compared to a traditional approach, we have an extra level of indirection. Having views lets us have a much larger collection of packages installed than we have available in the working environment, and allows multiple environments.

A good illustration of what these ideas give us is to see what happens when we want to add a new package:

  1. We start with our initial environment which points to a bunch of packages from the store.

  2. We compile and install a new package into the store. So far this changes very little, which is a big feature! In particular it cannot break anything. The new installed package can co-exist with all the existing ones. There can be no conflicts (the install paths are arranged to guarantee this). No existing packages have been modified or broken. No environments have yet been changed.

  3. Now we have choices about what to do with environments. Our existing environment is unchanged and does not contain the new package. We can create a new environment that consists of the old environment with the extra package added. In principle both the old and the new environments exist. This is very much like a persistent functional data structure:

    let env' = Map.insert pkgname pkg env

    Both exist, the old one is not changed, and we can decide if we are only interested in the new one or if we want to keep both.

  4. Finally we can, if we want, switch our “current” environment to be the new environment with the new package. So while multiple environments can exist, only one of them is active in our current shell session.

So what have we gained from the added indirection of a persistent store + views?

The multiple environments effectively gives us sandboxes for free. In fact it’s better because we can easily share artefacts between sandboxes when we want to. That means far fewer rebuilds, and easy global installation of things built in an isolated environment.

Nix has a few other good ideas, like its functional package description language and some clever tricks for dealing with the messy details of system packages. The key ideas however are the ones I’ve just outlined, and they are the ones that we should steal for GHC/Cabal.

Mechanisms, policies and workflows

The package store and the multiple environments are just a mechanism, not a user interface. The mechanisms are also mostly policy free.

Generally we should start from user requirements, use cases, workflows and the like, and work out a user interface and then decide on mechanisms that will support that user interface. That said, I think it is clear that the Nix mechanisms are sound and sufficiently general and flexible that they will cover pretty much any user interface we decide we want.

So I think our design process should be:

  1. Look at the packaging tool requirements, use cases, workflows etc, and work out a user interface design.

  2. Then figure out how the actions in that user interface translate into operations on the Nix package store and environments.

Addressing the Cabal Hell problems

The Nix approach deals nicely with the problem of breaking re-installations. The Nix mechanisms guarantee that installed packages are never mutated, so no existing installed packages ever break.

The Nix mechanisms do not deal directly with the issue of type errors when using packages together. As we noted before, that requires enforcing the consistency of the developer’s environment. In Nix terms this is a policy about how we manage the Nix environment(s). The policy would be that each environment contains only one version of each package and it would guarantee that all packages in an environment can be used together.

Without wishing to prejudge the future user interface for cabal, I think this is a policy that we should adopt.

Enforcing consistency does have implications for the user interface. There will be situations where one wants to install a new package, but it is impossible to add it to the current environment while keeping all of the existing packages. For example, suppose we have two different web stacks that have many packages in common but that require different versions of some common package. In that case we could not have a consistent environment that contains both. Thus the user interface will have to do something when the user asks to add the second web stack to an environment that already contains the first. The user interface could minimise the problem by encouraging a style of use where most environments are quite small, but it cannot be avoided in general.

While what I am suggesting for consistency is relatively strong, we cannot get away without enforcing some restrictions on the environment. For example if our environment did contain two instances of the same version of a package then which one would we get when we launch GHCi? So my view is that given that we cannot avoid the user interface issues with environment consistency, it is better to go for the stronger and more useful form.

In fact we’ve already been experimenting in this direction. The current cabal sandbox feature does enforce consistency within each sandbox. This seems to work ok in practice because each sandbox environment is relatively small and focused on one project. Interestingly we have had some pressure to relax this constraint due to the cost of creating new sandboxes in terms of compiling. (Allowing some inconsistencies in the environment allows the common packages to be shared and thus only compiled once.) Fortunately this issue is one that is nicely solved by Nix environments which are extremely cheap to create because they allow sharing of installed packages.

Implementation progress

We’ve been making small steps towards the Nix design for many years now. Several years ago we changed GHC’s package database to use the long opaque package identifiers that are necessary to distinguish package instances.

More recently Philipp Schuster did a GSoC project looking into the details of what we need to do to incorporate the Nix ideas within GHC and Cabal. You can see the slides and video of his HiW presentation. We learned a lot, including that there’s quite a lot of work left to do.

Last year Edward Yang and Simon PJ (with advice from Simon Marlow and myself) started working on implementing the “Backpack” package system idea within GHC and Cabal. Backpack also needs to be able to manage multiple instances of packages with the same version (but different deps) and so it overlaps quite a lot with what we need for Nix-style package management in GHC. So Edward’s work has dealt with many of the issues in GHC that will be needed for Nix-style package management.

Another small step is that in GHC 7.10 we finally have the ability to register multiple instances of the same version of a package, and we have the basic mechanism in GHC to support multiple cheap environments (using environment files). Both of these new GHC features are opt-in so that they do not break existing tools.

The remaining work is primarily in the cabal tool. In particular we have to think carefully about the new user interface and how it maps into the Nix mechanisms.

So there has been a lot of progress and if we can keep making small useful steps then I think we can get there. Of course it would help to focus development efforts on it, perhaps with a hackathon or two.

Conclusion

Implementing the Nix approach in GHC/Cabal would cover a major part of the Cabal Hell problems.

In the next post we’ll look at curated package collections, which solves a different (but slightly overlapping) set of Cabal Hell problems. Nix-style package management and curated package collections are mostly complementary and we want both.


Simple SMT solver for use in an optimizing compiler

Wed, 31 Dec 2014 13:36:52 GMT, by edsko.
Filed under coding.

This is a second blog post in a series about engineering optimizing compilers; the previous was Quasi-quoting DSLs for free. In this blog we will see how we might write a very simple SMT solver that will allow us to write an optimization pass that can turn something like

if a == 0 then
  if !(a == 0) && b == 1 then
    write 1
  else
    write 2
else
  write 3

into

if a == 0 then
  write 2
else
  write 3

without much effort at all.

Preliminaries

For the sake of this blog post we will consider a very simple imperative object language, defined as

data Expr =
    -- Arithmetic expressions
    ExprInt Integer           -- ^ Integer constants
  | ExprAdd Expr Expr         -- ^ Addition
    -- Boolean expressions
  | ExprBool Bool             -- ^ Boolean constants
  | ExprEq Expr Expr          -- ^ Equality
  | ExprNot Expr              -- ^ Negation
  | ExprAnd Expr Expr         -- ^ Logical conjunction
  | ExprOr Expr Expr          -- ^ Logical disjunction
    -- Variables
  | ExprVar VarName           -- ^ Read from a variable
  | ExprAssign VarName Expr   -- ^ Write to a variable
    -- Control flow
  | ExprSeq Expr Expr         -- ^ Sequential composition
  | ExprIf Expr Expr Expr     -- ^ Conditional
    -- Input/output
  | ExprRead                  -- ^ Read an integer from the console
  | ExprWrite Expr            -- ^ Write an integer to the console

We will assume the existence of a quasi-quoter for this language so that we can write Haskell fragments such as

[expr| if a == 0 then read else write b |]

instead of

ExprIf (ExprEq (ExprVar a) (ExprInt 0)) 
       ExprRead 
       (ExprWrite (ExprVar b))

How you can write such a quasi-quoter was the topic of the previous blog post. You should however be able to read this blog post without having read the previous post; hopefully the mapping from the concrete syntax to the constructors of Expr is pretty obvious.

Simplifying assumption

Our goal will be to write a function

provable :: Expr -> Bool

Let’s consider some examples:

Note that this means that it’s perfectly possible for both an expression and the negation of that expression to be unprovable.

What about an expression !(n == 0 && n == 1)? Morally speaking, this expression should be provable. However, we will be making the following very important simplifying assumption:

provable is allowed to give up: when provable returns False, this should be interpreted as “failed to prove”, rather than “there exist a counterexample”.

From a compiler perspective, if something is not statically provable, that simply means that an optimization may not be applied even though it could: that is, we miss an opportunity to make the program go faster, but we will not break the code.

An evaluator

We don’t want (or need) to embed a full blown theorem prover into our compiler: ideally we write something simple that will still catch a lot of the common cases. Moreover, we would prefer to reuse as much of our existing compiler infrastructure as we can. Our optimizer is likely to contain an interpreter for the object language, so that we can attempt to statically evaluate expressions. We are going to adapt this interpreter so that we can also use it in our solver. In fact, as we shall see, the solver will be a one-liner.

But we’re getting ahead of ourselves. Let’s consider how to write the interpreter first. The interpreter will be running in an Eval monad; for our first attempt, let’s define this monad as a a simple wrapper around the list monad:

newtype Eval a = Eval { unEval :: [a] }
  deriving ( Functor
           , Applicative
           , Alternative
           , Monad
           , MonadPlus
           )

runEval :: Eval a -> [a]
runEval act = unEval act

We will use the monad for failure:

throwError :: Eval a
throwError = Eval []

We can provide error recovery through

onError :: Eval a -> Eval a -> Eval a
onError (Eval act) (Eval handler) = Eval $
    case act of
      [] -> handler
      rs -> rs

We will see why we need the list monad when we discuss the evaluator for boolean expressions; but let’s consider the evaluator for integer expressions first:

evalInt :: Expr -> Eval Integer
evalInt = go
  where
    go (ExprInt i)    = return i
    go (ExprAdd a b)  = (+) <$> evalInt a <*> evalInt b
    go (ExprIf c a b) = do cond <- evalBool c
                           evalInt (if cond then a else b)
    go _              = throwError 

Hopefully this should be pretty self explanatory; our toy language only has a few integer-valued expressions, so there isn’t much to do. The interpreter for boolean expressions is more interesting:

evalBool :: Expr -> Eval Bool
evalBool = \e -> go e `onError` guess e
  where
    go (ExprBool a)   = return a
    go (ExprEq   a b) = (==) <$> evalInt a <*> evalInt b
    go (ExprNot  a)   = not  <$> evalBool a
    go (ExprAnd  a b) = (&&) <$> evalBool a <*> evalBool b
    go (ExprOr   a b) = (||) <$> evalBool a <*> evalBool b
    go (ExprIf c a b) = do cond <- evalBool c
                           evalBool (if cond then a else b)
    go _              = throwError 

    guess _e = return True <|> return False

The definition of go contains no surprises, and follows the definition of go in evalInt very closely. However, the top-level definition

evalBool = \e -> eval e `onError` guess e

is more interesting. If for some reason we fail to evaluate a boolean expression (for example, because it contains a variable) then guess returns both True and False. Let’s consider some examples:

runEval $ evalBool [expr| True |]

evalutes to [True];

runEval $ evalBool [expr| a |]

evaluates to [True, False] because we don’t know what value a has, but

runEval $ evalBool [expr| a || True |]

evaluates to [True, True]: we still have two guesses for a, but no matter what we guess a || True always evaluates to True.

Satisfiability

We can now write our SMT solver; as promised, it will be a single line:

satisfiable :: Expr -> Bool
satisfiable = or . runEval . evalBool

Function satisfiable (the “S” in SMT) checks if the expression is true for some values of the variables in the expression. How do we check this? Well, we just run our evaluator which, when it encounters a variable, will return both values for the variable. Hence, if any of the values returned by the evaluator is True, then the expression is true at least for one value for each variable.

Once we have an implementation of satisfiability, we can implement provable very easily: an expression is provable if its negation is not satisfiable:

provable :: Expr -> Bool
provable = not . satisfiable . ExprNot

If we consider the three examples from the previous section, we will find that both True and a || True are provable, but a by itself is not, as expected.

Inconsistent guesses

The careful reader might at this point find his brow furrowed, because something is amiss:

runEval $ evalBool [expr| a || !a |]

evaluates to

[True, True, False, True]

This happens because the evaluator will make two separate independent guesses about the value of a. As a consequence, a || !a will be considered not provable.

Is this a bug? No, it’s not. Recall that we made the simplifying assumption that provable is allowed to give up: it’s allowed to say that an expression is not provable even when it morally is. The corresponding (dual) simplifying assumption for satisfability, and hence the interpreter, is:

The interpreter, and hence satisfiability, is allowed to make inconsistent guesses.

Making an inconsistent guess is equivalent to assuming False: anything follows from False and hence any expression will be considered satisfiable once we make an inconsistent guess. As a consequence, this means that once we make inconsistent guesses, we will consider the expression as not provable.

More precision

We can however do better: whenever we make a guess that a particular expression evaluates to True or False, then if we see that same expression again we can safely make the same guess, rather than making an independent guess. To implement this, we extend our Eval monad with some state to remember the guesses we made so far:

newtype Eval a = Eval { unEval :: StateT [(Expr, Bool)] [] a }
  deriving ( Functor
           , Applicative
           , Alternative
           , Monad
           , MonadPlus
           , MonadState [(Expr, Bool)]
           )
           
runEval :: Eval a -> [a]
runEval act = evalStateT (unEval act) []

throwError :: Eval a
throwError = Eval $ StateT $ \_st -> []

onError :: Eval a -> Eval a -> Eval a
onError (Eval act) (Eval handler) = Eval $ StateT $ \st ->
    case runStateT act st of
      [] -> runStateT handler st
      rs -> rs

The implementation of evalInt does not change at all. The implementation of evalBool also stays almost the same; the only change is the definition of guess:

guess e = do
  st <- get
  case lookup e st of
    Just b  -> return b
    Nothing -> (do put ((e, True)  : st) ; return True)
           <|> (do put ((e, False) : st) ; return False)

This implements the logic we just described: when we guess the value for an expression e, we first look up if we already made a guess for this expression. If so, we return the previous guess; otherwise, we make a guess and record that guess.

Once we make this change

runEval $ evalBool [expr| a || !a |]

will evaluate to [True,True] and consequently a || !a will be considered provable.

Example: folding nested conditionals

As an example of how one might use this infrastructure, we will consider a simple pass that simplifies nested conditionals (if-statements). We can use provable to check if one expression implies the other:

(~>) :: Expr -> Expr -> Bool
(~>) a b = provable [expr| ! $a || $b |]

The simplifier is now very easy to write:

simplifyNestedIf :: Expr -> Expr
simplifyNestedIf = everywhere (mkT go)
  where
    go [expr| if $c1 then
                 if $c2 then
                   $e1
                 else
                   $e2
               else
                 $e3 |]
      | c1 ~> c2              = [expr| if $c1 then $e1 else $e3 |]
      | c1 ~> [expr| ! $c2 |] = [expr| if $c1 then $e2 else $e3 |]
    go e = e

The auxiliary function go pattern matches against any if-statement that has another if-statement as its “then” branch. If we can prove that the condition of the outer if-statement implies the condition of the inner if-statement, we can collapse the inner if-statement to just its “then” branch. Similarly, if we can prove that the condition of the outer if-statement implies the negation of the condition of the inner if-statement, we can collapse the inner if-statement to just its “else” branch. In all other cases, we return the expression unchanged. Finally, we use the Scrap Your Boilerplate operators everywhere and mkT to apply this transformation everywhere in an expression (rather than just applying it top-level). For example,

simplifyNestedIf [expr| 
    if a == 0 then 
      if !(a == 0) && b == 1 then 
        write 1 
      else 
        write 2 
    else 
      write 3 
  |]

evaluates to

if a == 0 then
  write 2
else
  write 3

as promised. Incidentally, perhaps you are wondering why such an optimization would be useful; after all, which self respecting programmer writes such code? However, code like the above may result from other compiler optimizations such as inlining: imagine that the inner if-statement came from a inlined function. A lot of compiler optimizations are designed to clean up other compiler optimizations.

Conclusion

We can implement a very simple but useful SMT solver for use in an optimizing compiler by making a small change to the interpreter, which we need anyway.

Note that the increase in precision gained from recording previous guesses is a gradual increase only; satisfiability may still make some inconsistent guesses. For example

runEval $ evalBool [expr| !(n == 0 && n == 1) |]

will evaluate to

[False,True,True,True]

because it is making independent guesses about n == 0 and n == 1; consequently !(n == 0 && n == 1) will not be considered provable. We can increase the precision of our solver by making guess smarter (the “MT” or “modulo theories” part of SMT). For example, we could record some information about the guesses about integer valued variables.

At the extreme end of the scale we would be implementing a full decision procedure for first order arithmetic, which is probably optimization gone too far. However, the approach we took above where we start from the basis that we are allowed to make inconsistent guesses gives us a way to implement a simple solver that addresses the most important cases, even if it misses a few—without requiring a lot of complicated infrastructure in the compiler. And as long as we make sure that our evaluator never returns too few values (even if it may return too many) we don’t have to worry that we might generate invalid code: the worst that can happen is that we miss an optimization.


Compose conference and New York City Haskell courses

Fri, 05 Dec 2014 11:56:58 GMT, by andres.
Filed under community, training, well-typed.

Well-Typed is happy to announce that we are sponsoring

C◦mp◦se conference

Friday, January 30 – Sunday, February 1, 2015, New York City

This conference is focused on functional programming and features a keynote by Stephanie Weirich on dependent types as well as invited talks by Anthony Cowley, Maxime Ransan and Don Syme, plus a whole lot of additional contributed talks. There’s also an “unconference” with small workshops and tutorials as well as the opportunity to get your hands dirty and try things out yourself.

For several years now, we have been running successful Haskell courses in collaboration in Skills Matter. For the C◦mp◦se conference, we have decided to bring theses courses to New York! You can participate in our Haskell courses directly before or directly after the conference (or both):

Fast Track to Haskell

Thursday, January 29 – Friday, January 30, 2015, New York City

(Don’t worry, there’s no overlap with Stephanie Weirich’s keynote on Friday evening that marks the start of C◦mp◦se.)

You can register here.

This course is for developers who want to learn about functional programming in general or Haskell in particular. It introduces important concepts such as algebraic datatypes, pattern matching, type inference, polymorphism, higher-order functions, explicit effects and, of course, monads and provides a compact tour with lots of hands-on exercises that provide a solid foundation for further adventures into Haskell or functional programming.

Advanced Haskell

Monday, February 2 – Tuesday, February 3, 2015, New York City

You can register here.

This course is for developers who have some experience in Haskell and want to know how to work on larger projects and how things scale. The course covers important topics such as selecting the right data structures for a task at hand, taking the functional perspective into account, and takes a thorough look at Haskell’s “lazy evaluation” and how to reason about time and space performance of Haskell programs. There’s also some focus on how to use Haskell’s powerful abstraction mechanisms and concepts such as Applicative Functors, Monads and Monad Transformers to help you organize larger code bases. Finally, depending on time and demand, there’s the opportunity to look at Parallelism and Concurrency, or at type-level programming. Once again, the course comes with several carefully designed hands-on exercises and provides room for specific questions that the participants might have.

Both courses will be taught by Duncan Coutts, co-founder and partner at Well-Typed. He’s both an experienced teacher and is involved in lots of commercial Haskell development projects at Well-Typed. He’s seen a lot of Haskell code, and knows perfectly which techniques and approaches work and which do not.

Well-Typed training courses

In general, our courses are very practical, but don’t shy away from theory where necessary. Our teachers are all active Haskell developers with not just training experience, but active development experience as well. In addition to these two courses in New York City, we regularly offer courses in London, and plan to offer courses in other European locations, too.

We also provide on-site training on requests nearly anywhere in the world. If you want to know more about our training or have any feedback or questions, have a look at our dedicated training page or just drop us a mail.


Haskell development job with Well-Typed

Thu, 27 Nov 2014 14:53:17 GMT, by duncan, andres.
Filed under well-typed.

tl;dr If you’d like a job with us, send your application before Christmas.

We are looking for a Haskell expert to join our team at Well-Typed. This is a great opportunity for someone who is passionate about Haskell and who is keen to improve and promote Haskell in a professional context.

About Well-Typed

We are a team of top notch Haskell experts. Founded in 2008, we were the first company dedicated to promoting the mainstream commercial use of Haskell. To achieve this aim, we help companies that are using or moving to Haskell by providing a range of services including consulting, development, training, and support and improvement of the Haskell development tools. We work with a wide range of clients, from tiny startups to well known multinationals. We have established a track record of technical excellence and satisfied customers.

Our company has a strong engineering culture. All our mangers and decision makers are themselves Haskell developers. Most of us have an academic background and we are not afraid to apply proper computer science to customers’ problems, particularly the fruits of FP and PL research.

We are a self-funded company so we are not beholden to external investors and can concentrate on the interests of our clients, our staff and the Haskell community.

About the job

We are looking for someone to join as a general member of our team, not for a single specific project or task. The work could cover any of the projects and activities that we are involved in as a company. The work may involve:

We try wherever possible to arrange tasks within our team to suit peoples’ preferences and to rotate to provide variety and interest.

Well-Typed has a variety of clients. For some we do proprietary Haskell development and consulting. For others, much of the work involves open-source development and cooperating with the rest of the Haskell community: the commercial, open-source and academic users.

Our ideal candidate has excellent knowledge of Haskell, whether from industry, academia or personal interest. Familiarity with other languages, low-level programming and good software engineering practices are also useful. Good organisation and ability to manage your own time and reliably meet deadlines is important. You should also have good communication skills. Being interested or having experience in teaching Haskell (or other technical topics) is a bonus. Experience of consulting or running a business is also a bonus. You are likely to have a bachelor’s degree or higher in computer science or a related field, although this isn’t a requirement.

Offer details

The offer is initially for a one-year full time contract, with the intention of a long term arrangement. We are also happy to consider applications for part-time work. We are a distributed company so everyone works remotely. Living in England is not required. We offer flexible hours. We may be able to offer either employment or sub-contracting, depending on the jurisdiction in which you live. We operate a fair profit share scheme so the salary is variable but with a minimum guarantee. For full time annual equivalent (with the statutory minimum holidays) the guaranteed minimum is GBP 34.8k and based on past performance the expected level is approximately 40% higher.

If you are interested, please apply via info@well-typed.com. Tell us why you are interested and why you would be a good fit for the job, and attach your CV. Please also indicate how soon you might be able to start. We are more than happy to answer informal enquiries. Contact Duncan Coutts (duncan@well-typed.com, dcoutts on IRC) or Andres Löh (andres@well-typed.com, kosmikus on IRC) for further information.

To ensure we can properly consider your application, please get it to us by December 22nd, 2014. We expect to do interviews at the beginning of January.


Quasi-quoting DSLs for free

Sun, 19 Oct 2014 14:01:29 GMT, by edsko.
Filed under coding.

Suppose you are writing a compiler for some programming language or DSL. If you are doing source to source transformations in your compiler, perhaps as part of an optimization pass, you will need to construct and deconstruct bits of abstract syntax. It would be very convenient if we could write that abstract syntax using the syntax of your language. In this blog post we show how you can reuse your existing compiler infrastructure to make this possible by writing a quasi-quoter with support for metavariables. As we will see, a key insight is that we can reuse object variables as meta variables.

Toy Language “Imp

For the sake of this blog post we will be working with a toy language called Imp. The abstract syntax for Imp is defined by

type VarName = String

data Expr =
    Var VarName
  | Add Expr Expr
  | Sub Expr Expr
  | Int Integer
  | Read
  deriving (Data, Typeable, Show, Eq)

data Cmd =
    Write Expr
  | Assign VarName Expr
  | Decl VarName
  deriving (Data, Typeable, Show)

data Prog = Prog [Cmd]
  deriving (Data, Typeable, Show)

and we will assume that we have some parsec parsers

parseExpr :: Parser Expr
parseProg :: Parser Prog

We will also make use of

topLevel :: Parser a -> Parser a
topLevel p = whiteSpace *> p <* eof

and the following useful combinator for running a parser:

parseIO :: Parser a -> String -> IO a

The details of these parsers are beyond the scope of this post. There are plenty of parsec tutorials online; for instance, you could start with the parsec chapter in Real World Haskell. Moreover, the full code for this blog post, including a simple interpreter for the language, is available on github if you want to play with it. Here is a simple example of an Imp program:

var x ;
x := read ;
write (x + x + 1)

A simple quasi-quoter

We want to be able to write something like

prog1 :: Prog
prog1 = [prog|
    var x ;
    x := read ;
    write (x + x + 1)
  |]

where the intention is that the [prog| ... |] quasi-quote will expand to something like

prog1 = Prog [ 
      Decl "x"
    , Assign "x" Read
    , Write (Add (Add (Var "x") (Var "x")) (Int 1))
    ]

To achieve this, we have to write a quasi-quoter. A quasi-quoter is an instance of the following data type:

data QuasiQuoter = QuasiQuoter { 
    quoteExp  :: String -> Q Exp
  , quotePat  :: String -> Q Pat
  , quoteType :: String -> Q Type
  , quoteDec  :: String -> Q [Dec] 
  }

The different fields are used when using the quasi-quoter in different places in your Haskell program: at a position where we expect a (Haskell) expression, a pattern (we will see an example of that later), a type or a declaration; we will not consider the latter two at all in this blog post.

In order to make the above example (prog1) work, we need to implement quoteExp but we can leave the other fields undefined:

prog :: QuasiQuoter
prog = QuasiQuoter {
      quoteExp = \str -> do
        l <- location'
        c <- runIO $ parseIO (setPosition l *> topLevel parseProg) str
        dataToExpQ (const Nothing) c
    , quotePat  = undefined
    , quoteType = undefined
    , quoteDec  = undefined
    }

Let’s see what’s going on here. The quasi-quoter gets as argument the string in the quasi-quote brackets, and must return a Haskell expression in the Template-Haskell Q monad. This monad supports, amongst other things, getting the current location in the Haskell file. It also supports IO.

Location

The first thing that we do is find the current location in the Haskell source file and convert it to parsec format:

location' :: Q SourcePos
location' = aux <$> location
  where
    aux :: Loc -> SourcePos
    aux loc = uncurry (newPos (loc_filename loc)) (loc_start loc)

Running the parser

Once we have the location we then parse the input string to a term in our abstract syntax (something of type Prog). We use parsec’s setPosition to tell parsec where we are in the Haskell source file, so that if we make a mistake such as

prog1 :: Prog
prog1 = [prog|
    var x ;
    x := read ;
    write (x + x + )
  |]

we get an error that points to the correct location in our Haskell file:

TestQQAST.hs:6:9:
    Exception when trying to run compile-time code:
      user error ("TestQQAST.hs" (line 9, column 20):
unexpected ")"
expecting "(", "read", identifier or integer)

Converting to Haskell abstract syntax

The parser returns something of type Prog, but we want something of type Exp; Exp is defined in Template Haskell and reifies the abstract syntax of Haskell. For example, we would have to translate the Imp abstract syntax term

Var "x" :: Prog

to its reflection as a piece of abstract Haskell syntax as

AppE (ConE 'Var) (LitE (StringL "x")) :: TH.Exp

which, when spliced into the Haskell source, yields the original Prog value. Fortunately, we don’t have to write this translation by hand, but we can make use of the following Template Haskell function:

dataToExpQ :: Data a 
           => (forall b. Data b => b -> Maybe (Q Exp)) 
           -> a -> Q Exp

This function can translate any term to a reified Haskell expression, as long as the type of the term derives Data (Data instances can be auto-derived by ghc if you enable the DeriveDataTypeable language extension). The first argument allows you to override the behaviour of the function for specific cases; we will see an example use case in the next section. In our quasi-quoter so far we don’t want to override anything, so we pass a function that always returns Nothing.

Once we have defined this quasi-quoter we can write

prog1 :: Prog
prog1 = [prog|
    var x ;
    x := read ;
    write (x + x + 1)
  |]

and ghc will run our quasi-quoter and splice in the Haskell expresion corresponding to the abstract syntax tree of this program (provided that we enable the QuasiQuotes language extension).

Meta-variables

Consider this function:

prog2 :: VarName -> Integer -> Prog
prog2 y n = [prog|
    var x ;
    x := read ;
    write (x + y + n)
  |]

As mentioned, in the source code for this blog post we also have an interpreter for the language. What happens if we try to run (prog2 "x" 1)?

*Main> intIO $ intProg (prog2 "x" 2)
5
*** Exception: user error (Unbound variable "y")

Indeed, when we look at the syntax tree that got spliced in for prog2 we see

Prog [ Decl "x"
     , Assign "x" Read
     , Write (Add (Add (Var "x") (Var "y")) (Var "n"))
     ]

What happened? Didn’t we pass in "x" as the argument y? Actually, on second thought, this makes perfect sense: this is after all what our string parses to. The fact that y and n also happen to Haskell variables, and happen to be in scope at the point of the quasi-quote, is really irrelevant. But we would still like prog2 to do what we expected it to do.

Meta-variables in Template Haskell

To do that, we have to support meta variables: variables from the “meta” language (Haskell) instead of the object language (Imp). Template Haskell supports this out of the box. For example, we can define

ex :: Lift a => a -> Q Exp
ex x = [| id x |]

Given any argument that supports Lift, ex constructs a piece of abstract Haskell syntax which corresponds to the application of the identity function to x. (Don’t confuse this with anti-quotation; see Brief Intro to Quasi-Quotation.) Lift is a type class with a single method

class Lift t where
  lift :: t -> Q Exp

For example, here is the instance for Integer:

instance Lift Integer where
  lift x = return (LitE (IntegerL x))

Meta-variables in quasi-quotes

Quasi-quotes don’t have automatic support for meta-variables. This makes sense: Template Haskell is for quoting Haskell so it has a specific concrete syntax to work with, where as quasi-quotes are for arbitrary custom syntaxes and so we have to decide what the syntax and behaviour of meta-variables is going to be.

For Imp we want to translate any unbound Imp (object-level) variable in the quasi-quote to a reference to a Haskell (meta-level) variable. To do that, we will introduce a similar type class to Lift:

class ToExpr a where
  toExpr :: a -> Expr

and provide instances for variables and integers:

instance ToExpr VarName where
  toExpr = Var

instance ToExpr Integer where
  toExpr = Int

We will also need to know which variables in an Imp program are bound and unbound; in the source code you will find a function which returns the set of free variables in an Imp program:

fvProg :: Prog -> Set VarName

Overriding the behaviour of dataToExpQ

In the previous section we mentioned that rather than doing the Prog -> Q Exp transformation by hand we use the generic function dataToExpQ to do it for us. However, now we want to override the behaviour of this function for the specific case of unbound Imp variables, which we want to translate to Haskell variables.

Recall that dataToExpQ has type

dataToExpQ :: Data a 
           => (forall b. Data b => b -> Maybe (Q Exp)) 
           -> a -> Q Exp

This is a rank-2 type: the first argument to dataToExpQ must itself be polymorphic in b: it must work on any type b that derives Data. So far we have been passing in

const Nothing

which is obviously polymorphic in b since it completely ignores its argument. But how do we do something more interesting? Data and its associated combinators come from a generic programming library called Scrap Your Boilerplate (Data.Generics). A full discussion of of SYB is beyond the scope of this blog post; the SYB papers are a good starting point if you would like to know more (I would recommend reading them in chronological order, the first published paper first). For the sake of what we are trying to do it suffices to know about the existence of the following combinator:

extQ :: (Typeable a, Typeable b) => (a -> q) -> (b -> q) -> a -> q

Given a polymorphic query (forall a)—in our case this is const NothingextQ allows to extend the query with a type specific case (for a specific type b). We will use this to give a specific case for Expr: when we see a free variable in an expression we translate it to an application of toExpr to a Haskell variable with the same name:

metaExp :: Set VarName -> Expr -> Maybe ExpQ
metaExp fvs (Var x) | x `Set.member` fvs = 
  Just [| toExpr $(varE (mkName x)) |]
metaExp _ _ = 
  Nothing

The improved quasi-quoter

With this in hand we can define our improved quasi-quoter:

prog :: QuasiQuoter
prog = QuasiQuoter {
      quoteExp = \str -> do
        l <- location'
        c <- runIO $ parseIO (setPosition l *> topLevel parseProg) str
        dataToExpQ (const Nothing `extQ` metaExp (fvProg c)) c
    , quotePat  = undefined
    , quoteType = undefined
    , quoteDec  = undefined
    }

Note that we are extending the query for Expr, not for Prog; dataToExpQ (or, more accurately, SYB) makes sure that this extension is applied at all the right places. Running (prog2 "x" 2) now has the expected behaviour:

*Main> intIO $ intProg (prog2 "x" 2)
6
14

Indeed, when we have a variable in our code that is unbound both in Imp and in Haskell, we now get a Haskell type error:

prog2 :: VarName -> Integer -> Prog
prog2 y n = [prog|
    var x ;
    x := read ;
    write (x + z + n)
  |]

gives

TestQQAST.hs:15:19: Not in scope: ‘z’

Parenthetical remark: it is a design decision whether or not we want to allow local binding sites in a splice to “capture” meta-variables. Put another way, when we pass in "x" to prog2, do we mean the x that is bound locally in prog2, or do we mean a different x? Certainly a case can be made that we should not be able to refer to the locally bound x at all—after all, it’s not bound outside of the snippet! This is an orthogonal concern however and we will not discuss it any further in this blog post.

Quasi-quoting patterns

We can also use quasi-quoting to support patterns. This enables us to write something like

optimize :: Expr -> Expr
optimize [expr| a + n - m |] | n == m = optimize a
optimize other = other

As before, the occurrence of a in this pattern is free, and we intend it to correspond to a Haskell variable, not an Imp variable; the above code should correspond to

optimize (Sub (Add a n) m) | n == m = optimize a

(note that this is comparing Exprs for equality, hence the need for Expr to derive Eq). We did not mean the pattern

optimize (Sub (Add (Var "a") (Var "n")) (Var "m"))

To achieve this, we can define a quasi-quoter for Expr that supports patterns (as well as expressions):

expr :: QuasiQuoter
expr = QuasiQuoter {
      quoteExp  = \str -> do
        l <- location'
        e <- runIO $ parseIO (setPosition l *> topLevel parseExpr) str
        dataToExpQ (const Nothing `extQ` metaExp (fvExpr e)) e
    , quotePat  = \str -> do
        l <- location'
        e <- runIO $ parseIO (setPosition l *> topLevel parseExpr) str
        dataToPatQ (const Nothing `extQ` metaPat (fvExpr e)) e
    , quoteType = undefined
    , quoteDec  = undefined
    }

The implementation of quotePat is very similar to the definition of quoteExp. The only difference is that we use dataToPatQ instead of dataToExpQ to generate a Haskell pattern rather than a Haskell expression, and we use metaPat to give a type specific case which translates free Imp variables to Haskell pattern variables:

metaPat :: Set VarName -> Expr -> Maybe PatQ
metaPat fvs (Var x) | x `Set.member` fvs = Just (varP (mkName x))
metaPat _ _ = Nothing

Note that there is no need to lift now; the Haskell variable will be bound to whatever matches in the expression.

Limitations

We might be tempted to also add support for Prog patterns. While that is certainly possible, it’s of limited use if we follow the same strategy that we followed for expressions. For instance, we would not be able to write something like

opt [prog| var x ; c |] | x `Set.notMember` fvProg c = opt c

The intention here is that we can remove unused variables. Unfortunately, this will not work because this will cause a parse error: the syntax for Imp does not allow for variables for commands, and hence we also don’t allow for meta-variables at this point. This is important to remember:

By using object-level variables as stand-ins for meta-level variables, we only allow for meta-level variables where the syntax for the object-level language allows variables.

If this is too restrictive, we need to add special support in the ADT and in the corresponding parsers for meta-variables. This is a trade-off in increased expressiveness of the quasi-quotes against additional complexity in their implementation (new parsers, new ADTs).

Conclusions

By reusing object-level variables as stand-ins for meta variables you can reuse existing parsers and ADTs to define quasi-quoters. Using the approach described in this blog we were able to add support for quasi-quoting to a real compiler for a domain specific programming language with a minimum of effort. The implementation is very similar to what we have shown above, except that we also dealt with renaming (so that meta variables cannot be captured by binding sites in the quasi quotes) and type checking (reusing the existing renamer and type checker, of course).


How we might abolish Cabal Hell, part 1

Tue, 30 Sep 2014 10:19:33 GMT, by duncan.
Filed under cabal, community.

At ICFP a few weeks ago a hot topic in the corridors and in a couple talks was the issues surrounding packaging and “Cabal Hell”.

Fortunately we were not just discussing problems but solutions. Indeed I think we have a pretty good understanding now of where we want to be, and several solutions are in development or have reasonably clear designs in peoples’ heads.

I want to explain what’s going on for those not already deeply involved in the conversation. So this is the first of a series of blog posts on the problems and solutions around Cabal hell.

There are multiple problems and multiple solutions. The solutions overlap in slightly complicated ways. Since it is a bit complicated, I’m going to start with the big picture of the problems and solutions and how they relate to each other. In subsequent posts I’ll go into more detail on particular problems and solutions.

“Cabal hell”: the problems

So what is “Cabal hell”? Let’s consult the dictionary…

Cabal Hell

The feeling of powerlessness one has when Cabal does not do what one wanted and one does not know how to fix it.

I’m joking obviously, but my point is that Cabal hell is not a precise technical term. There are a few different technical problems (and misunderstandings and UI problems) that can cause Cabal hell.

A useful concept when talking about this topic is that of the packaging “wild wild west”. What we mean is whether we are in a context where we reasonably expect packages to work together (because there has been some deliberate effort to make them work together), or if we are in the “wild wild west”. In the “wild wild west” we have to do things like deal with packages that were uploaded yesterday by multiple different authors. The point is that nobody has yet had time to try and make things consistent. It is a useful concept because we have developers who need to deal with the “wild wild west” and those who would really rather not, and the solutions tend to look a bit different.

Another term we often use when talking about packages is “consistency”. What we mean is that in a collection of packages there is at most one version of each package. For example when you ask cabal-install to install package A and B, we say that it will try to find a “consistent” set of dependencies – meaning a set including A, B and their dependencies that has only one version of any package.

“Cabal hell”: the symptoms

So lets consider a breakdown of the technical problems. To start with lets look at a breakdown based on the symptoms that a developer in Cabal Hell experiences

Cabal Hell: the symptoms 

We can first break things down by whether there is a solution or not. That is, whether a perfect dependency resolver could find a plan to install the package(s) and their dependencies consistently. We want such a solution because it’s a prerequisite for installing working packages. (We’re ignoring the possibility that there is a solution but the solver fails to find one. That is possible but it’s a relatively rare problem.)

Given the situation where the solver tells us that there is no solution, there are a few different cases to distinguish:

No solution expected

The failure was actually expected. For example a developer updating their package to work with the latest version of GHC is not going to be surprised if their initial install attempt fails. Then based on what the solver reports they can work out what changes they need to make to get things working.

Solution had been expected

The more common case is that the developer was not expecting to be working in the wild west. The developer had an expectation that the package or packages they were asking for could just be installed. In this case the answer “no that’s impossible” from the solver is very unhelpful, even though it’s perfectly correct.

Unnecessary solver failure

The symptoms here are exactly the same, namely the solver cannot find a solution, but the reason is different. More on reasons in a moment.

Even when there is a solution we can hit a few problems:

Compile error

Compilation can fail because some interface does not match. Typically this will manifest as a naming error or type error.

Breaking re-installations

Cabal’s chosen solution would involve reinstalling an existing version of a package but built with different dependencies. This re-installation would break any packages that depend on the pre-existing instance of the installed package. By default cabal-install will not go ahead with such re-installation, but you can ask it to do so.

Type errors when using packages together

It is possible to install two package and then load them both in GHCi and find that you cannot use them together because you get type errors when composing things defined in the two different packages.

“Cabal hell”: the reasons

So those are the major problems. Lets look at some reasons for those problems.

Cabal Hell: the reasons 

Inconsistent versions of dependencies required

There are two sub-cases worth distinguishing here. One is where the developer is asking for two or more packages that could be installed individually, but cannot be installed and used together simultaneously because they have clashing requirements on their common dependencies. The other is that a package straightforwardly has no solution (at least with the given compiler & core library versions), because of conflicting constraints of its dependencies.

Constraints wrong

With under-constrained dependencies we get build failures, and with over-constrained dependencies we get unnecessary solver failures. That is, a build failure is (almost always) due to dependency constraints saying some package version combination should work, when actually it does not. And the dual problem: an unnecessary solver failure is the case where there would have been a solution that would actually compile, if only the constraints had been more relaxed.

Single instance restriction

Existing versions of GHC and Cabal let you install multiple versions of a package, but not multiple instances of the same version of a package. This is the reason why Cabal has to reinstall packages, rather than just add packages.

Inconsistent environment

These errors occur because cabal-install does not enforce consistency in the developer’s environment, just within any set of packages it installs simultaneously.

We’ll go into more detail on all of these issues in subsequent posts, so don’t worry if these things don’t fully make sense yet.

“Cabal hell”: the solutions

There are several problems and there isn’t one solution that covers them all. Rather there are several solutions. Some of those solutions overlap with each other, meaning that for some cases either solution will work. The way the solutions overlap with the problems and each other is unfortunately a bit complicated.

Here’s the overview:

Cabal Hell: the solutions 

So what does it all mean?

We’ll look at the details of the solutions in subsequent posts. At this stage the thing to understand is which solutions cover which problems, and where those solutions overlap.

We’ll start with the two most important solutions. They’re the most important in the sense that they cover the most cases.

Nix-style persistent store with multiple consistent environments

This solves all the cases of breaking re-installations, and all cases of inconsistent environments. It doesn’t help with wrong constraints.

You’ll note that it covers some cases where there is no solution and you might wonder what this can mean. Some cases where there is no solution are due to two (or more) sets of packages that could be installed independently but cannot be installed together consistently. In a nix-style setting it would be possible to offer developers the option to install the packages into separate environments when the solver determines that this is possible.

Curated consistent package collections

These are things like the Debian Haskell packages or Stackage. This solves some cases of each of the different problems: breaking re-installations, inconsistent environments, wrong constraints and lack of consistent solutions. It solves those cases to the extent that the package collection covers all the packages that the developer is interested in. For many developers this will be enough. Almost by definition however it cannot help with the “wild west” of packages because the curation takes time and effort. Unless used in combination with a isolated environment solution (e.g. nix-style, but also less sophisticated systems like hsevn or cabal sandboxes) it does not allow using multiple versions of the collection (e.g. different projects using different Stackage versions).

It is worth noting that these two solutions should work well together. Neither one subsumes the other. We don’t need to pick between the two. We should pick both. The combination would get us a long way to abolishing Cabal hell.

There are also a number of smaller solutions:

Automatic build reporting

This helps with detecting compile errors arising from constraints that are too lax. It doesn’t help with constraints that are too tight. This solution requires a combination of automation and manual oversight to fix package constraints and to push those fixes upstream.

Upper-bound build bots

This is similar to gathering build reports from users, but instead of looking at cases of compile failure (constraints too lax), it explicitly tries relaxing upper bounds and checks if things still compile and testsuites work. Again, this requires automation to act on the information gleaned to minimise manual effort.

Package interface compatibility tools

This is to help package authors get their dependency constraints right in the first place. It can help them follow a version policy correctly, and tell them what minimum and maximum version bounds of their dependencies to use. It does not completely eliminate the need to test, because type compatibility does not guarantee semantic compatibility. Solutions in this area could eliminate a large number of cases of wrong constraints, both too lax and too tight.

Private dependencies

This allows solutions to exist where they do not currently exist, by relaxing the consistency requirement in a safe way. It means global consistency of dependencies is not always required, which allows many more solutions. This solution would cover a lot of cases in the “wild wild west” of packaging, and generally in the large set of packages that are not so popular or well maintained as to be included in a curated collection.

Next time…

So that’s the big picture of the problems and solutions and how they relate to each other. In subsequent posts we’ll look in more detail at the problems and solutions, particularly the solutions people are thinking about or actively working on.


Haskell courses and Haskell eXchange

Tue, 23 Sep 2014 10:10:42 GMT, by andres.
Filed under training, well-typed, community.

In the beginning of October, my colleage Adam Gundry and I will spend a full week in London again for Haskell-related activities: on Monday and Tuesday (October 6–7), we will teach Fast Track to Haskell, a two-day introduction course to Haskell, targeted at people who want to get started with Haskell or learn more about functional programming in general. On Wednesday (October 8), there’s the Haskell eXchange, a one-day conference full of interesting talks on Haskell-related topics. On Thursday and Friday (October 9–10), we will look at more advanced Haskell concepts and programming patterns in Advanced Haskell.

All three events are still open for registration.

Haskell eXchange

The Haskell eXchange will take place in London for the third time now, and I’m happy to report that there’s going to be a really fantastic program again:

As is almost traditional by now, Simon Peyton Jones (Microsoft Research) himself will open the conference, this time with a talk on “Safe, zero-cost coercions in Haskell”.

This is followed by Jeremy Gibbons (University of Oxford) with “Categories for the Working Haskeller”, explaining to Haskell developers and people who are interested in Haskell what role category theory plays in Haskell and if/how categories can be helpful.

I’m very pleased that Bryan O’Sullivan (Facebook) has agreed to give a presentation at the Haskell eXchange this year. As the author of countless Haskell libraries (many of which are among the most used in the entire Haskell ecosystem), and being a co-author of the widely known O’Reilly book “Real World Haskell”, he’s certainly learned how to squeeze a bit of extra performance out of Haskell code when needed. He’s going to share his experiences and provide valuable advice in his talk.

After lunch, we’ll continue with a pair of talks looking at using Haskell for the development of RESTful web services from slightly different angles. Erik Hesselink (Silk) is going to present the rest framework, which makes it easy to develop and maintain REST APIs, independent of the underlying web framework you use. After that, Chris Dornan (Iris Connect) and Adam Gundry (Well-Typed) will talk about api-tools and in particular address the question of how you can solve the problem of schema migrations nicely.

Tim Williams and Peter Marks (both Barclays) will give a joint talk on Lucid, their in-house non-embedded DSL that is written in Haskell and has a mostly structural type system with interesting features such as row polymorphism and extensible records as well as extensible sum-types.

The talks will be concluded by Oliver Charles (Fynder), well-known for his tireless efforts in the “24 days of Hackage” series, who is going to show us how the use of GHC’s most advanced type system extensions helps him write better real-world code at his company.

After the talks, there’s going to be pizza and beer and an informal “Park Bench Discussion” where we can all discuss the questions that have been raised throughout the day in more detail.

I hope you’re as excited about this program as I am: I think there’s a fantastic range of topics covered, from language features and theoretical aspects, practical advice for programmers to interesting case studies of real-world code. Also, it’s an excellent opportunity to meet fellow developers interested in Haskell. If you’re working for a company using Haskell and are looking for new developers, this may be an excellent opportunity to recruit. On the other hand, if you’d like nothing more than a Haskell job, this is an opportunity to meet people who are actually using it in practice, and may have a job for you or at least be able to give you advice on how to find one.

If you haven’t registered yet, please consider doing so! We’re looking forward to meeting you there.

Fast Track to Haskell and Advanced Haskell

These two successful courses have been offered on a regular basis since October 2012. They’re continuously updated to reflect the latest changes in the Haskell world, such as updates to the infrastructure, new features of the main Haskell compiler GHC, or exciting new libraries.

Both courses are hands-on, comprising a combination of lectures, interactive coding and exercises that the participants are supposed to work on alone or in teams, with the help of the teacher(s).

The Fast Track course is for people who know how to program, but have little or no experience in Functional Programming or Haskell. It teaches Haskell in from scratch in just two days, covering important concepts such as datatypes, polymorphism, higher-order functions, type classes, how IO works in Haskell, and ending with an introduction to monads. It’s also interesting for people who are interested in learning about functional programming in general, because Haskell is a prime example of a functional programming language, and the course focuses on the important programming concepts more than on language peculiarities.

The Advanced Haskell course is for people who have some experience with Haskell, but want to learn more. We’re going to discuss (functional) data structures and their complexity, have a detailed look at how lazy evaluation works and how it is implemented, how to reason about performance and how to use various debugging tools. Somewhat depending on demand, there’s also the option to learn more about advanced programming patterns, such as monads, applicative functors and monad transformers, about concurrency and parallelism, or about the more advanced features of the Haskell type system such as GADTs and type families.

Being in the same week as the Haskell eXchange makes it possible to combine one of the courses (or even both) with the eXchange, where you can hear several other viewpoints and get to know more Haskellers!

Other courses

We’re offering additional training courses on-demand and on-site for companies in Europe, the US, or anywhere in the world on request. See our training page for more information.

HacBerlin

By the way, I’m also going to be at the Haskell Hackathon in Berlin this upcoming weekend. On Sunday, I’m going to give a talk on parser combinators. You can still register for the Hackathon, too. It’s free.


Dealing with Asynchronous Exceptions during Resource Acquisition

Thu, 28 Aug 2014 15:48:49 GMT, by edsko, duncan.
Filed under coding.

Introduction

Consider the following code: we open a socket, compute with it, and finally close the socket again. The computation happens inside an exception handler (try), so even when an exception happens we still close the socket:

example1 :: (Socket -> IO a) -> IO a 
example1 compute = do -- WRONG
  s <- openSocket 
  r <- try $ compute s
  closeSocket s
  case r of
    Left ex -> throwIO (ex :: SomeException)
    Right a -> return a

Although this code correctly deals with synchronous exceptions–exceptions that are the direct result of the execution of the program–it does not deal correctly with asynchronous exceptions–exceptions that are raised as the result of an external event, such as a signal from another thread. For example, in

example2 :: (Socket -> IO a) -> IO (Maybe a)
example2 compute = timeout someTimeout $ example1 compute

it is possible that the timeout signal arrives after we have opened the socket but before we have installed the exception handler (or indeed, after we leave the scope of the exception handler but before we close the socket). In order to address this we have to control precisely where asynchronous exceptions can and cannot be delivered:

example3 :: (Socket -> IO a) -> IO a
example3 compute =
  mask $ \restore -> do
    s <- openSocket 
    r <- try $ restore $ compute s
    closeSocket s
    case r of
      Left ex -> throwIO (ex :: SomeException)
      Right a -> return a

We mask asynchronous exceptions, and then restore them only inside the scope of the exception handler. This very common pattern is captured by the higher level combinator bracket, and we might rewrite the example as

example4 :: (Socket -> IO a) -> IO a
example4 = bracket openSocket closeSocket

Allowing asynchronous exceptions during resource acquisition

Suppose that we wanted to define a derived operation that opens a socket and performs some kind of handshake with the server on the other end:

openHandshake :: IO Socket
openHandshake = do
  mask $ \restore -> do
    s <- openSocket
    r <- try $ restore $ handshake s
    case r of
      Left  ex -> closeSocket s >> throwIO (ex :: SomeException)
      Right () -> return s 

(These and the other examples can be defined in terms of bracket and similar, but we use mask directly so that it’s easier to see what is happening.) We might use openHandshake as follows:

example5 :: (Socket -> IO a) -> IO a
example5 compute = do
  mask $ \restore -> do
    s <- openHandshake 
    r <- try $ restore $ compute s
    closeSocket s
    case r of
      Left ex -> throwIO (ex :: SomeException)
      Right a -> return a

There are no resource leaks in this code, but there is a different problem: we call openHandshake with asynchronous exceptions masked. Although openHandshake calls restore before doing the handshake, restore restores the masking state to that of the enclosing context. Hence the handshake with the server cannot be timed out. This may not be what we want–we may want to be able to interrupt example5 with a timeout either during the handshake or during the argument computation.

Note that this is not a solution:

example6 :: (Socket -> IO a) -> IO a 
example6 compute = do
  mask $ \restore -> do
    s <- restore openHandshake -- WRONG
    r <- try $ restore $ compute s
    closeSocket s
    case r of
      Left ex -> throwIO (ex :: SomeException)
      Right a -> return a

Consider what might happen: if an asynchronous exception is raised after openHandshake returns the socket, but before we leave the scope of restore, the asynchronous exception will be raised and the socket will be leaked. Installing an exception handler does not help: since we don’t have a handle on the socket, we cannot release it.

Interruptible operations

Consider this definition from the standard libraries:

withMVar :: MVar a -> (a -> IO b) -> IO b
withMVar m io =
  mask $ \restore -> do
    a <- takeMVar m
    b <- restore (io a) `onException` putMVar m a
    putMVar m a
    return b

This follows almost exactly the same pattern as the examples we have seen so far; we mask asynchronous exceptions, take the contents of the MVar, and then execute some operation io with the contents of the MVar, finally putting the contents of the MVar back when the computation completes or when an exception is raised.

An MVar acts as a lock, with takeMVar taking the role of acquiring the lock. This may, of course, take a long time if the lock is currently held by another thread. But we call takeMVar with asynchronous exceptions masked. Does this mean that the takeMVar cannot be timed out? No, it does not: takeMVar is a so-called interruptible operation. From the Asynchronous Exceptions in Haskell paper:

Any operation which may need to wait indefinitely for a resource (e.g., takeMVar) may receive asynchronous exceptions even within an enclosing block, but only while the resource is unavailable. Such operations are termed interruptible operations. (..) takeMVar behaves atomatically when enclosed in block. The takeMVar may receive asynchronous exceptions right up until the point when it acquires the MVar, but not after.

(block has been replaced by mask since the publication of the paper, but the principle is the same.) Although the existence of interruptible operations makes understanding the semantics of mask harder, they are necessary: like in the previous section, wrapping takeMVar in restore is not safe. If we really want to mask asynchronous exceptions, even across interruptible operations, Control.Exception offers uninterruptibleMask.

Custom interruptible operations

So an interruptible operation is one that can be interrupted by an asynchronous exception even when asynchronous exceptions are masked. Can we define our own interruptible operations? Yes, we can:

-- | Open a socket and perform handshake with the server
--
-- Note: this is an interruptible operation.
openHandshake' :: IO Socket
openHandshake' = 
  mask_ $ do 
    s <- openSocket
    r <- try $ unsafeUnmask $ handshake s
    case r of
      Left  ex -> closeSocket s >> throwIO (ex :: SomeException)
      Right () -> return s 

unsafeUnmask is defined in GHC.IO, and unmasks asynchronous exceptions, no matter what the enclosing context is. This is of course somewhat dangerous, because now calling openHandshake' inside a mask suddenly opens up the possibility of an asynchronous exception being raised; and the only way to know is to look at the implementation of openHandshake', or its Haddock documentation. This is somewhat unsatisfactory, but exactly the same goes for takeMVar and any other interruptible operation, or any combinator that uses an interruptible operation under the hood. A sad state of affairs, perhaps, but one that we don’t currently have a better solution for.

Actually, using unsafeUnmask is a bit too crude. Control.Exception does not export it, but does export

allowInterrupt :: IO ()
allowInterrupt = unsafeUnmask $ return ()

with documentation

When invoked inside mask, this function allows a blocked asynchronous exception to be raised, if one exists. It is equivalent to performing an interruptible operation, but does not involve any actual blocking.

When called outside mask, or inside uninterruptibleMask, this function has no effect.

(emphasis mine.) Sadly, this documentation does not reflect the actual semantics: unsafeUnmask, and as a consequence allowInterrupt, unmasks asynchronous exceptions no matter what the enclosing context is: even inside uninterruptibleMask. We can however define our own operator to do this:

interruptible :: IO a -> IO a
interruptible act = do
  st <- getMaskingState
  case st of
    Unmasked              -> act
    MaskedInterruptible   -> unsafeUnmask act
    MaskedUninterruptible -> act 

where we call unsafeUnmask only if the enclosing context is mask, but not if it is uninterruptibleMask (TODO: What is the semantics when we nest these two?). We can use it as follows to define a better version of openHandshake:

-- | Open a socket and perform handshake with the server
--
-- Note: this is an interruptible operation.
openHandshake' :: IO Socket
openHandshake' = 
  mask_ $ do 
    s <- openSocket
    r <- try $ interruptible $ handshake s
    case r of
      Left  ex -> closeSocket s >> throwIO (ex :: SomeException)
      Right () -> return s 

Resource allocation timeout

If we wanted to timeout the allocation of the resource only, we might do

example7 :: (Socket -> IO a) -> IO a
example7 compute = do
  mask $ \restore -> do
    ms <- timeout someTimeout $ openHandshake'
    case ms of
      Nothing -> throwIO (userError "Server busy")
      Just s  -> do r <- try $ restore $ compute s
                    closeSocket s
                    case r of
                      Left ex -> throwIO (ex :: SomeException)
                      Right a -> return a

Exceptions are masked when we enter the scope of the timeout, and are unmasked only once we are inside the exception handler in openHandshake'–in other words, if a timeout happens, we are guaranteed to clean up the socket. The surrounding mask is however necessary. For example, suppose we are writing some unit tests and we are testing openHandshake'. This is wrong:

example8 :: IO ()
example8 = do 
  ms <- timeout someTimeout $ openHandshake'
  case ms of
    Just s  -> closeSocket s
    Nothing -> return ()

Even if we are sure that the example8 will not be interrupted by asynchronous exceptions, there is still a potential resource leak here: the timeout exception might be raised just after we leave the mask_ scope from openHandshake but just before we leave the timeout scope. If we are sure we don’t need to worry about other asynchronous exceptions we can write

example8 :: IO ()
example8 = do
  s <- mask_ $ timeout someTimeout $ openHandshake'
  case ms of
    Just s  -> closeSocket s
    Nothing -> return ()

although of course it might be better to simply write

example8 :: IO ()
example8 = 
  bracket (timeout someTimeout $ openHandshake')
          (\ms -> case ms of Just s  -> closeSocket s
                             Nothing -> return ())
          (\_ -> return ())

Conclusions

Making sure that resources are properly deallocated in the presence of asynchronous exceptions is difficult. It is very important to make sure that asynchronous exceptions are masked at crucial points; unmasking them at the point of calling a resource allocation function is not safe. If you nevertheless want to be able to timeout resource allocation, you need to make your resource allocation function interruptible.

For completeness’ sake, there are some other solutions that avoid the use of unsafeUnmask. One option is to thread the restore argument through (and compose multiple restore arguments if there are multiple nested calls to mask). This requires resource allocations to have a different signature, however, and it is very error prone: a single mask somewhere along the call chain where we forget to thread through the restore argument will mean the code is no longer interruptible. The other option is to run the code that we want to be interruptible in a separate thread, and wait for the thread to finish with, for example, a takeMVar. Getting this right is however no easy task, and it doesn’t change anything fundamentally anyway: rather than using unsafeUnmask we are now using a primitive interruptible operation; either way we introduce the possibility of exceptions even in the scope of mask_.

Finally, when your application does not fit the bracket pattern we have been using (implicitly or explicitly), you may want to have a look at resourcet and pipes or conduit, or my talk Lazy I/O and Alternatives in Haskell.


Debugging Haskell at assembly level
by scripting lldb in Python

Fri, 01 Aug 2014 09:17:34 GMT, by edsko.
Filed under coding.

Haskell programmers tend to spend far less time with debuggers than programmers in other languages. Partly this is because for pure code debuggers are of limited value anyway, and Haskell’s type system is so strong that a lot of bugs are caught at compile time rather than at runtime. Moreover, Haskell is a managed language – like Java, say – and errors are turned into exceptions. Unlike in unmanaged languages such as C “true” runtime errors such as segmentation faults almost never happen.

I say “almost” because they can happen: either because of bugs in ghc or the Haskell runtime, or because we are doing low level stuff in our own Haskell code. When they do happen we have to drop down to a system debugger such as lldb or gdb, but debugging Haskell at that level can be difficult because Haskell’s execution model is so different from the execution model of imperative languages. In particular, compiled Haskell code barely makes any use of the system stack or function calls, and uses a continuation passing style instead (see my previous blog posts Understanding the Stack and Understanding the RealWorld). In this blog post I will explain a technique I sometimes use to help diagnose low-level problems.

Since I work on OSX I will be using lldb as my debugger. if you are using gdb you can probably use similar techniques; The LLDB Debugger shows how gdb and lldb commands correlate, and the ghc wiki also lists some tips. However, I have no experience with scripting gdb so your mileage may vary.

Description of the problem

As our running example I will use a bug that I was tracking down in a client project. The details of the project don’t matter so much, except that this project happens to use the GHC API to compile Haskell code—at runtime—into bytecode and then run it; moreover, it also—dynamically, at runtime—loads C object files into memory.

In one example run it loads the (compiled) C code

#include <stdio.h>

void hello(void) { 
  printf("hello\n"); 
}

and then compiles and runs this Haskell code:

{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

foreign import ccall "hello" hello :: IO ()

main = hello

Sadly, however, this resulted in total system crash.

Starting point

By attaching lldb to the running process we got a tiny bit more information about the crash:

* thread #4: tid = 0x3550aa, 0x000000010b3b8226, stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: 0x000000010b3b8226
-> 0x10b3b8226:  addb   %al, (%rax)
   0x10b3b8228:  addb   %al, (%rax)
   0x10b3b822a:  addb   %al, (%rax)
   0x10b3b822c:  addb   %al, (%rax)

It turns out we have a null-pointer dereferencing here. Anybody who has spent any time debugging Intel assembly code however will realize that this particular instruction

addb   %al, (%rax)

is in fact the decoding of zero:

(lldb) memory read -c 8 0x10b3b8226
0x10b3b8226: 00 00 00 00 00 00 00 00                          ........

In other words, chances are good we were never meant to execute this instruction at all. Unfortunately, asking lldb for a backtrace tells us absolutely nothing new:

(lldb) bt
* thread #4: tid = 0x3550aa, 0x000000010b3b8226, stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
  * frame #0: 0x000000010b3b8226

Finding a call chain

The lack of a suitable backtrace in lldb is not surprising, since compiled Haskell code barely makes use of the system stack. Instead, the runtime maintains its own stack, and code is compiled into a continuation passing style. For example, if we have the Haskell code

functionA :: IO ()
functionA = do .. ; functionB ; ..

functionB :: ()
functionB = do .. ; functionC ; ..

functionC :: IO ()
functionC = .. crash ..

main :: IO ()
main = functionA

and we step through the execution of this program in lldb, and we ask for a backtrace when we start executing function A all we get is

(lldb) bt
* thread #1: tid = 0x379731, 0x0000000100000a20 Main`A_functionA_info, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x0000000100000a20 Main`A_functionA_info

with no mention of main. Similarly, the backtraces on entry to functions B and C are

* thread #1: tid = 0x379731, 0x0000000100000b90 Main`B_functionB_info, queue = 'com.apple.main-thread', stop reason = breakpoint 2.1
  * frame #0: 0x0000000100000b90 Main`B_functionB_info

and

* thread #1: tid = 0x379731, 0x0000000100000c88 Main`C_functionC_info, queue = 'com.apple.main-thread', stop reason = breakpoint 3.1
  * frame #0: 0x0000000100000c88 Main`C_functionC_info

none of which is particularly informative. However, stepping manually through the program we do first see function A on the (singleton) call stack, then function B, and finally function C. Thus, by the time we reach function C, we have discovered a call chain A, B, C—it’s just that it involves quite a bit of manual work.

Scripting lldb

Fortunately, lldb can be scripted (see Using Scripting and Python to Debug in LLDB and the LLDB Python Reference). What we want to do is keep stepping through the code, showing the top-level (and quite possibly only) function at the top of the call stack at each step, until we crash.

We can use the following Python script to do this:

import lldb

def step_func(debugger, command, result, internal_dict):
  thread = debugger.GetSelectedTarget().GetProcess().GetSelectedThread()
  
  while True:
    thread.StepOver()

    stream = lldb.SBStream()
    thread.GetStatus(stream)
    description = stream.GetData()
    print description

    if thread.GetStopReason() == lldb.eStopReasonException:
      break

def __lldb_init_module(debugger, dict):
  debugger.HandleCommand('command script add -f %s.step_func sf' % __name__)

For the above example, we might use this as follows: we load our application into lldb

# lldb Main
Current executable set to 'Main' (x86_64).

register our new command sf

(lldb) command script import mystep.py

set a breakpoint where we want to start stepping

(lldb) breakpoint set -n A_functionA_info
Breakpoint 1: where = Main`A_functionA_info, address = 0x0000000100000b90

run to the breakpoint:

(lldb) run
Process 54082 launched: 'Main' (x86_64)
Process 54082 stopped
* thread #1: tid = 0x384510, 0x0000000100000b90 Main`A_functionA_info, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
    frame #0: 0x0000000100000b90 Main`A_functionA_info

and then use sf to find a call-trace until we crash:

(lldb) sf
...

* thread #1: tid = 0x384510, 0x0000000100000bf0 Main`B_functionB_info, queue = 'com.apple.main-thread', stop reason = instruction step over
    frame #0: 0x0000000100000bf0 Main`B_functionB_info
Main`B_functionB_info:

...

* thread #1: tid = 0x384510, 0x0000000100000c78 Main`C_functionC_info, queue = 'com.apple.main-thread', stop reason = instruction step over
    frame #0: 0x0000000100000c78 Main`C_functionC_info
Main`C_functionC_info:

...

* thread #1: tid = 0x384510, 0x0000000100000d20 Main`crash + 16 at crash.c:3, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: 0x0000000100000d20 Main`crash + 16 at crash.c:3

Note that if you are using the threaded runtime, you may have to select which thread you want to step through before calling sf:

(lldb) thread select 4
(lldb) sf

Tweaking the script

You will probably want to tweak the above script in various ways. For instance, in the application I was debugging, I wanted to step into each assembly language instruction but over each C function call, mostly because lldb was getting confused with the call stack. I also added a maximum step count:

def step_func(debugger, command, result, internal_dict):
  args = shlex.split(command)
 
  if len(args) > 0:
    maxCount = int(args[0])
  else:
    maxCount = 100

  thread = debugger.GetSelectedTarget().GetProcess().GetSelectedThread()
  i      = 0;

  while True:
    frame = thread.GetFrameAtIndex(0)
    file  = frame.GetLineEntry().GetFileSpec().GetFilename()
    inC   = type(file) is str and file.endswith(".c")
   
    if inC:
      thread.StepOver()
    else:
      thread.StepInstruction(False)

    stream = lldb.SBStream()
    thread.GetStatus(stream)
    description = stream.GetData()

    print i
    print description

    i += 1;
    if thread.GetStopReason() == lldb.eStopReasonException or i > maxCount:
      break

You may want to tweak this step into/step over behaviour to suit your application; certainly you don’t want to have a call trace involving every step taken in the Haskell RTS or worse, in the libraries it depends on.

Back to the example

Rather than printing every step along the way, it may also be useful to simply remember the step-before-last and show that on a crash; often it is sufficient to know what happened just before the crash. Indeed, in the application I was debugging the call stack just before the crash was:

2583
  thread #3: tid = 0x35e743, 0x00000001099da56d libHSrts_thr_debug-ghc7.8.3.20140729.dylib`schedule(initialCapability=0x0000000109a2f840, task=0x00007fadda404550) + 1533 at Schedule.c:470, stop reason = step over
    frame #0: 0x00000001099da56d libHSrts_thr_debug-ghc7.8.3.20140729.dylib`schedule(initialCapability=0x0000000109a2f840, task=0x00007fadda404550) + 1533 at Schedule.c:470
   467 	    }
   468 	    
   469 	    case ThreadInterpret:
-> 470 		cap = interpretBCO(cap);
   471 		ret = cap->r.rRet;
   472 		break;
   473 		

2584
  thread #3: tid = 0x35e743, 0x0000000103106226, stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: 0x0000000103106226
-> 0x103106226:  addb   %al, (%rax)
   0x103106228:  addb   %al, (%rax)
   0x10310622a:  addb   %al, (%rax)
   0x10310622c:  addb   %al, (%rax)

Which is a lot more helpful than the backtrace, as we now have a starting point: something went wrong when running the bytecode interpreter (remember that the application was compiling and running some Haskell code at runtime).

To pinpoint the problem further, we can set a breakpoint in interpretBCO and run sf again (the way we defined sf it steps over any C function calls by default). This time we get to:

4272
  thread #4: tid = 0x35f43a, 0x000000010e77c548 libHSrts_thr_debug-ghc7.8.3.20140729.dylib`interpretBCO(cap=0x000000010e7e7840) + 18584 at Interpreter.c:1463, stop reason = step over
    frame #0: 0x000000010e77c548 libHSrts_thr_debug-ghc7.8.3.20140729.dylib`interpretBCO(cap=0x000000010e7e7840) + 18584 at Interpreter.c:1463
   1460		    tok = suspendThread(&cap->r, interruptible ? rtsTrue : rtsFalse);
   1461	
   1462		    // We already made a copy of the arguments above.
-> 1463	            ffi_call(cif, fn, ret, argptrs);
   1464	
   1465	            // And restart the thread again, popping the stg_ret_p frame.
   1466		    cap = (Capability *)((void *)((unsigned char*)resumeThread(tok) - STG_FIELD_OFFSET(Capability,r)));

4273
  thread #4: tid = 0x35f43a, 0x0000000107eba226, stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
    frame #0: 0x0000000107eba226
-> 0x107eba226:  addb   %al, (%rax)
   0x107eba228:  addb   %al, (%rax)
   0x107eba22a:  addb   %al, (%rax)
   0x107eba22c:  addb   %al, (%rax)

Ok, now we are really getting somewhere. Something is going wrong when are doing a foreign function call. Let’s re-run the application once more, setting a breakpoint at ffi_call:

(lldb) breakpoint set -n ffi_call
Breakpoint 1: where = libffi.dylib`ffi_call + 29 at ffi64.c:421, address = 0x0000000108e098dd
(lldb) cont
Process 51476 resuming
Process 51476 stopped
* thread #4: tid = 0x360fd3, 0x0000000108e098dd libffi.dylib`ffi_call(cif=0x00007fb262f00000, fn=0x00000001024a4210, rvalue=0x000000010b786ac0, avalue=0x000000010b7868a0) + 29 at ffi64.c:421, stop reason = breakpoint 1.1
    frame #0: 0x0000000108e098dd libffi.dylib`ffi_call(cif=0x00007fb262f00000, fn=0x00000001024a4210, rvalue=0x000000010b786ac0, avalue=0x000000010b7868a0) + 29 at ffi64.c:421
   418 	  /* If the return value is a struct and we don't have a return value
   419 	     address then we need to make one.  Note the setting of flags to
   420 	     VOID above in ffi_prep_cif_machdep.  */
-> 421 	  ret_in_memory = (cif->rtype->type == FFI_TYPE_STRUCT
   422 			   && (cif->flags & 0xff) == FFI_TYPE_VOID);
   423 	  if (rvalue == NULL && ret_in_memory)
   424 	    rvalue = alloca (cif->rtype->size);

and let’s take a look at the function we’re about to execute:

(lldb) disassemble -s fn
   0x1024a4210:  pushq  %rbp
   0x1024a4211:  movq   %rsp, %rbp
   0x1024a4214:  leaq   (%rip), %rdi
   0x1024a421b:  popq   %rbp
   0x1024a421c:  jmpq   0x1024a4221
   0x1024a4221:  pushq  $0x6f6c6c65
   0x1024a4226:  addb   %al, (%rax)
   0x1024a4228:  addb   %al, (%rax)
   0x1024a422a:  addb   %al, (%rax)
   0x1024a422c:  addb   %al, (%rax)
   0x1024a422e:  addb   %al, (%rax)

We were expecting to execute hello:

# otool -tV hello.o 
hello.o:
(__TEXT,__text) section
_hello:
0000000000000000	pushq	%rbp
0000000000000001	movq	%rsp, %rbp
0000000000000004	leaq	L_str(%rip), %rdi ## literal pool for: "hello"
000000000000000b	popq	%rbp
000000000000000c	jmpq	_puts

and if you compare this with the code loaded into memory it all becomes clear. The jump instruction in the object file

jmpq _puts

contains a symbolic reference to puts; but the jump in the code that we are about to execute in fact jumps to the next instruction in memory:

0x1024a421c:  jmpq   0x1024a4221
0x1024a4221:  ... 

In other words, the loaded object file has not been properly relocated, and when we try to call puts we end up jumping into nowhere. At this point the bug was easily resolved.

Further Reading

We have barely scratched the surface here of what we can do with lldb or gdb. In particular, ghc maintains quite a bit of runtime information that we can inspect with the debugger. Tim Schröder has an excellent blog post about inspecting ghc’s runtime data structures with gdb, and Nathan Howell has written some extensive lldb scripts to do the same, although they may now be somewhat outdated. See also the reddit discussion about this blog post.


Understanding the RealWorld

Thu, 12 Jun 2014 11:31:19 GMT, by edsko.
Filed under coding.

In my previous blog post Understanding the Stack I mentioned that “RealWorld tokens disappear” in the generated code. This is not entirely accurate. Simon Marlow gives a very brief explanation of this on the old glasgow-haskell-users mailing list. In this blog post we will explore this in more depth and hopefully come to a better understanding of the RealWorld, though perhaps not of the real world.

In order to make some points a bit easier to illustrate, we will be using a 32-bit version of ghc; and since there are no 32-bit builds of ghc for OSX later than version 7.4.2, that’s what we’ll use for most of this blog post.

Warmup exercise

Consider

constant_Int# :: () -> Int#
constant_Int# _ = 1234#

This compiles to the following C-- code:

R1 = 1234;
Sp = Sp + 4;
jump (I32[Sp + 0]) ();

On x86 architectures all Haskell function arguments are passed through the stack (on x86-64 the first few arguments are passed through registers). That means that on entry to constant_Int# the top of the stack contains the () argument, which we discard and pop off the stack by incrementing the stack pointer Sp. (I will explain why I am introducing this unused argument towards the end of this blog post.)

Since 1234# is already in weak head normal form we simply call the continuation on the top of the stack with #1234 as the first argument. If you followed along in the previous blog post this will all be old hat to you.

Heap allocation

What happens if we use a boxed integer instead?

constant_Int :: () -> Int
constant_Int _ = 1234

If we compare the STG code for constant_Int# and constant_Int we see that the only difference is indeed that constant_Int must box its argument:

constant_Int  = \r [ds_sf9] I# [1234];
constant_Int# = \r [ds_sfb] 1234;

This means however that we must create a heap object, and hence the C-- is significantly more complicated:

chJ:
    Hp = Hp + 8;
    if (Hp > I32[BaseReg + 92]) goto chO;
    I32[Hp - 4] = I#_con_info;
    I32[Hp + 0] = 1234;
    R1 = Hp - 3;
    Sp = Sp + 4;
    jump (I32[Sp + 0]) ();
chM:
    R1 = constant_Int_closure;
    jump (I32[BaseReg - 4]) ();
chO:
    I32[BaseReg + 116] = 8;
    goto chM;

The function starts with a heap overflow check. This works in much the same way as the stack overflow check that we discussed previously, so we will ignore it for the sake of this blog post, and omit heap and stack overflow checks from further code samples.

The interesting code happens after we conclude that the heap is okay. Every heap object consists of a code pointer followed by a payload. In this case, the code pointer is the one for the I# constructor, and the payload is 1234.

Since I# 1234 is already in weak head normal form, we don’t need to evaluate it any further, and we can call the continuation on the stack with the address of the heap object we just constructed as its argument. Incidentally, if you are confused by the Hp - 3, think of it as Hp - 4 + 1 instead; Hp - 4 is the address of the code pointer (the start of the heap object); the + 1 is pointer tagging at work again.

(Incidentally, ghc will “preallocate” small integers, so that when you write, say, 1 instead of 1234 it doesn’t actually create a new heap object but gives you a reference to a cached Int instead.)

Larger types

What happens if we move from Int to Double?

constant_Double# :: () -> Double#
constant_Double# _ = 1234.0##

constant_Double :: () -> Double
constant_Double _ = 1234.0

The C-- code for constant_Double is

F64[BaseReg + 56] = 1234.0 :: W64;
Sp = Sp + 4;
jump (I32[Sp + 0]) ();

Since a Double is 64-bits it cannot be passed through the 32-bit register R1/esi. On 32-bit machines ghc passes Double values through a “virtual register”. Virtual registers are just memory locations; BaseReg (ebx on x86) is the address of the first of them. (On x86-64 architectures ghc will use the SSE registers for Doubles.)

The code for constant_Double is similar to the code for constant_Int, except that it must allocate 12 bytes, rather than 8, for the heap object:

Hp = Hp + 12;
I32[Hp - 8] = D#_con_info;
F64[Hp - 4] = 1234.0 :: W64;
R1 = Hp - 7;
Sp = Sp + 4;
jump (I32[Sp + 0]) ();

Functions

So far we considered only constants (modulo that extra argument of type ()). Let’s consider some simple functions next:

want_Int# :: Int# -> Int#
want_Int# x = x

translates to

R1 = I32[Sp + 0];
Sp = Sp + 4;
jump (I32[Sp + 0]) ();

Since any Int# is by definition already in weak head normal form, the argument to want_Int# must already be in weak head normal form and hence we can just pop that Int# off the stack and pass it to the continuation through register R1. The situation for Double# is comparable, except that as before we need to use a different register, and pop two words off the stack rather than one. That is,

want_Double# :: Double# -> Double#
want_Double# x = x

translates to

F64[BaseReg + 56] = F64[Sp + 0];
Sp = Sp + 8;
jump (I32[Sp + 0]) ();

The translation for Int is different.

want_Int :: Int -> Int
want_Int x = x

Since we don’t know if x is in weak head normal form yet, we cannot call the continuation; instead, we must enter x itself:

R1 = I32[Sp + 0];
Sp = Sp + 4;
R1 = R1 & (-4);
jump I32[R1] ();

Remember from the previous blog post that when we enter a closure the address of the closure must be in register R1, in case the closure needs to lookup any free variables. So, we pop off the address of the closure from the stack, load it into R1, mask out any pointer tagging bits, and enter the closure. It will be the responsibility of the closure to eventually call the continuation on the stack once it reaches weak head normal form.

The translation of

want_Double :: Double -> Double
want_Double x = x

is precisely identical; indeed, both are identical to the translation of

id :: a -> a
id x = x

That’s the point of boxing, of course: every argument has the same shape, and hence we can universally quantify over any non-primitive type. Incidentally, this is precisely why Int# and Double# have kind # rather than kind *: it means you cannot apply a polymorphic function such as id to an argument of a primitive type.

Calling known functions

We saw how we can define functions; how do we call them? Consider

call_the_want_Int# :: () -> Int#
call_the_want_Int# _ = want_Int# 1234#

This translates to

I32[Sp + 0] = 1234;
jump want_Int#_info ();

This couldn’t be simpler. We overwrite the top of the stack (which contains the () argument that we are ignoring) with the argument for want_Int#, and then jump to the code for want_Int#.

For want_Double# the situation is slightly more complicated.

call_the_want_Double# :: () -> Double#
call_the_want_Double# _ = want_Double# 1234.0##

Since the Double is two words, we have to grow the stack by one word (and deal with a potential stack overflow, which we omit again):

F64[Sp - 4] = 1234.0 :: W64;
Sp = Sp - 4;
jump want_Double#_info ();

To call a function with a boxed Int or Double we have to create the heap object. For instance,

call_the_want_Int :: () -> Int
call_the_want_Int _ = want_Int 1234

translates to

Hp = Hp + 8;
I32[Hp - 4] = I#_con_info;
I32[Hp + 0] = 1234;
I32[Sp + 0] = Hp - 3;
jump want_Int_info ();

No real surprises here. We create the heap object, push the address of the newly created heap object onto the stack, and call want_Int_info. The same happens when we call want_Double, except of course that we need to create a larger heap object (12 bytes rather than 8).

Multiple arguments

Conceptually speaking Haskell does not have functions of multiple arguments. Instead, a function such as

want_ID## :: Int# -> Double# -> Int#
want_ID## x y = x +# double2Int# y

with two arguments is thought of as a function of a single argument of type Int#, which returns a new function, that takes an argument of type Double# and returns an Int#. The compiler does not implement that literally, however, as that would be much too slow. Instead, a function such as want_ID## really does take two arguments, both of which are on the stack (or in registers on architectures where that is possible); want_ID## gets compiled to

R1 = I32[Sp + 0] + %MO_FS_Conv_W64_W32(F64[Sp + 4]);
Sp = Sp + 12;
jump (I32[Sp + 0]) ();

We add together the Int# and Double# on the stack, store the result in R1, pop them off the stack and call the continuation. Similarly, when we call a function of multiple arguments, we push all arguments to the stack before jumping to the function entry code:

call_the_want_ID## :: () -> Int#
call_the_want_ID## _ = want_ID## 1234# 1234.0##

translates to

F64[Sp - 4] = 1234.0 :: W64;
I32[Sp - 8] = 1234;
Sp = Sp - 8;
jump want_ID##_info ();

Calling unknown functions

In the previous section we knew precisely what function we were calling. Things are more difficult if we don’t:

call_some_want_Int# :: (Int# -> a) -> a
call_some_want_Int# f = f 1234#

The reason that this is more difficult is that we do not know how many arguments f takes. If it takes just one, then we can call f in precisely the same way that we were calling want_Int# previously. But what if f takes more than one argument? In that case we must construct a PAP heap object to record the partial application of f. Clearly, the shape of this heap object must depend on the arguments that we have supplied: for an Int# the size of the payload must be one word, for a Double# it must be two, and so on.

We must also deal with the case that f is already such a PAP object. In that case, we must check if we now have all the arguments necessary to call f; if so, we can call f as we did before; if not, we must construct a new PAP object collecting the previously supplied arguments and the arguments we are supplying now.

Finally, if we are providing multiple arguments, we must deal with the case that f actually takes fewer arguments than we are providing and returns a new PAP object. In that case, we must provide all the arguments that f needs, call f, and then continue with the PAP object that f returns.

Rather than generating code to deal with all these possibilities at every call-to-unknown-function site, ghc delegates this to a bunch of specialized functions which are part of the RTS. The compilation of call_some_want_Int# therefore looks deceptively simple:

R1 = I32[Sp + 0];
I32[Sp + 0] = 1234;
jump stg_ap_n_fast ();

stg_ap_n_fast deals with the application of an unknown function to a single Int#; hence the _n in the name. It finds out how many arguments f has (by looking at the pointer tags, or failing that at the info table for f), and deals with all the cases that mentioned above (as well as a bunch of others; we haven’t shown the full picture here). To call stg_ap_n_fast we pop the function argument (f) off the stack and store it in R1 and then push the argument that we want to provide to f onto the stack.

The case for Double#

call_some_want_Double# :: (Double# -> a) -> a
call_some_want_Double# f = f 1234.0##

is very similar, except that we need to grow the stack by one word, and use stg_ap_d_fast instead (d for Double#):

R1 = I32[Sp + 0];
F64[Sp - 4] = 1234.0 :: W64;
Sp = Sp - 4;
jump stg_ap_d_fast ();

Finally, for non-primitive arguments there is a generic stg_ap_p_fast (p for pointer);

call_some_want_Int :: (Int -> a) -> a
call_some_want_Int f = f 1234

translates to

Hp = Hp + 8;
I32[Hp - 4] = I#_con_info;
I32[Hp + 0] = 1234;
R1 = I32[Sp + 0];
I32[Sp + 0] = Hp - 3;
jump stg_ap_p_fast ();

No real surprises here; we construct the heap object and call stg_ap_p_fast.

Multiple arguments, again

What happens if we call an unknown function with multiple arguments?

call_some_want_II :: (Int -> Int -> a) -> a
call_some_want_II f =1234 5678

This is really no different from supplying just one argument; we still have to deal with the same set of possibilities; the only difference is that we now need a payload for two pointers. This is created by stg_ap_pp_fast:

Hp = Hp + 16;
I32[Hp - 12] = I#_con_info;
I32[Hp - 8] = 5678;
I32[Hp - 4] = I#_con_info;
I32[Hp + 0] = 1234;
R1 = I32[Sp + 0];
I32[Sp + 0] = Hp - 11;
I32[Sp - 4] = Hp - 3;
Sp = Sp - 4;
jump stg_ap_pp_fast ();

We construct two heap objects for the two boxed integers, and then call stg_ap_pp_fast.

If at this point you are wondering “how many variations on stg_ap_XYZ_fast are there?” congratulate yourself, you are paying attention :) Clearly, there cannot be a version of this function for every possible number and type of argument, as there are infinitely many such combinations. Instead, the RTS only contains versions for the most common combinations. For example, there is no version for calling a function with two Int# arguments. So what does

call_some_want_II## :: (Int# -> Int# -> a) -> a
call_some_want_II## f = f 1234# 5678#

compile to?

R1 = I32[Sp + 0];
I32[Sp + 0] = 5678;
I32[Sp - 4] = I32[Lstg_ap_n_info$non_lazy_ptr];
I32[Sp - 8] = 1234;
Sp = Sp - 8;
jump stg_ap_n_fast ();

Let’s consider carefully what’s going on here:

  1. We call stg_ap_n_fast (with a single n) with 1234 on the top of the stack. stg_ap_n_fast will notice that f has (at least) two arguments, and can therefore not yet be called. Instead, it creates a PAP object containing just 1234 (and the address of f).

  2. After it has created the PAP object, it then calls the continuation on the top of the stack (after the argument). This continuation happens to be stg_ap_n_info, which is the “continuation wrapper” of stg_ap_n.

  3. This in turn will pop the next argument off the stack (5678) and the process repeats.

In this way any non-standard version of stg_ap_XYZ can be simulated with a chain of standard stg_ap_XYZ functions.

RealWorld tokens

The main points to take away from all of the above are

So what does all this have to do with the RealWorld tokens that I promised at the start we would look at? Well, the RealWorld tokens have type State# RealWorld, which is yet another primitive type … of size 0. So let us retrace our steps and consider the same examples that we considered for Int# and Double#, but now look at the corresponding translation for State# RealWorld.

We first considered the construction of a constant:

constant_State# :: () -> State# RealWorld
constant_State# _ = realWorld#

This translates to

Sp = Sp + 4;
jump (I32[Sp + 0]) ();

All we need to do is pop the () argument off the stack and call the continuation; since a State# RealWorld type has size zero, we don’t need any register at all to store it!

The translation of

call_the_want_State# :: () -> State# RealWorld
call_the_want_State# _ = want_State# realWorld#

is

Sp = Sp + 4;
jump want_State#_info ();

and finally the translation of

call_some_want_State# :: (State# RealWorld -> a) -> a
call_some_want_State# f = f realWorld#

is

R1 = I32[Sp + 0];
Sp = Sp + 4;
jump stg_ap_v_fast ();

The v here stands for “void”, and here we finally see a trace of a RealWorld token in the compiled code: we are applying a function to a primitive type of type void; admittedly, this is something of size zero, but it’s still somehow here. Note that stg_ap_v_fast must still deal with all the possible cases (exact number of arguments, too many arguments, too few arguments) that we mentioned above, despite the fact that we are providing no arguments at all.

Proxy#

In the case of RealWorld and the IO monad this doesn’t really matter most of the time. Usually when we repeatedly call an unknown IO function we are providing an argument (think mapM, for instance), and the STG runtime provides specialized versions of stg_ap_XYZ_fast for a function that takes one, two, or three pointers and a single void argument, in which case the additional void parameter does not introduce any additional overhead of the indirection through stg_ap. But it is good to be aware that the runtime cost is not quite zero when writing highly performance critical code.

However, as we start to do more and more type level programming in Haskell, we increasingly need to explicitly pass type variables around. For this reason ghc 7.8 introduces a new primitive type

Proxy# :: * -> #

with a single “constructor”

proxy# :: Proxy# a

The sole purpose of a Proxy# argument is to instantiate a type variable. It is used heavily for instance in the new ghc extension for overloaded records. This isn’t a blog post about type level programming—quite the opposite :)—so we will just consider a completely contrived example, along the same lines of the other examples that we considered so far:

call_some_want_Proxies# :: (Proxy# a -> Proxy# b -> Proxy# c 
                              -> (a, b, c) -> (c, b, a)) 
                        -> (a, b, c) -> (c, b, a) 
call_some_want_Proxies# f tup = f proxy# proxy# proxy# tup

Although a Proxy# argument takes up no memory, they do play a role when calling an unknown function, as we have seen. The above function call translates to

R1 = P32[Sp];
I32[Sp - 12] = stg_ap_v_info;
I32[Sp - 8]  = stg_ap_v_info;
I32[Sp - 4]  = stg_ap_v_info;
I32[Sp]      = stg_ap_p_info;
Sp = Sp - 12;
call stg_ap_v_fast(R1) args: 24, res: 0, upd: 4;

Note the stack that we set up here: we apply the function to a single “void” argument; then when that is done, we apply it to the next, and again to the third, and only then we apply it to a “pointer” argument (the actual tuple). The use of Proxy# thus does incur a runtime cost. Of course, one could envision various ways in which ghc could be improved to alleviate this cost; the simplest of which would be to introduce some more stg_ap_XYZ variations targeted at void arguments.

For now, if this really matters it is faster to use a single Proxy# argument, and make sure that it is the last argument so that we can take advantage of the specialized stg_ap functions which were introduced for the IO monad:

fast_call_some_want_Proxies# :: ((a, b, c) -> Proxy# (a, b, c) -> (c, b, a))
                             -> (a, b, c) -> (c, b, a)
fast_call_some_want_Proxies# f tup = f tup proxy#

compiles to just

R1 = P32[Sp];
Sp = Sp + 4;
call stg_ap_pv_fast(R1) args: 8, res: 0, upd: 4;

Footnotes

What is I#_con_info?

We saw that the heap object header for boxed integers is I#_con_info. You might be wondering what exactly is at that code pointer address. We can fire up lldb (or gdb) and find out:

(lldb) disassemble -n ghczmprim_GHCziTypes_Izh_con_info
Main`ghczmprim_GHCziTypes_Izh_con_info:
Main[0x68398]:  incl   %esi
Main[0x68399]:  jmpl   *(%ebp)

ghczmprim_GHCziTypes_Izh is the Z-encoding for ghc-prim_GHC.Types.I#; zm for minus, zh for hash, and zi for dot (“the dot on the i”). So what does the code do? Well, the code pointer associated with every heap object is the code that reduces that heap object to normal form (this what we mean by “entering a closure”). Since a constructor application already is in normal form, there is almost nothing to do, so we just call the continuation on the stack (jmpl).

The only complication is due to pointer tagging, once again. Remember that when we evaluate a closure, R1 (mapped to esi on x86 architectures) points to the address of the closure. If we entered the closure that means the pointer wasn’t tagged yet, or we ignored the tag bits (for whatever reason); we make sure to tag it by calling incl before calling the continuation so that we don’t unnecessarily enter the closure again.

 Info tables

The discussion of constant_Int above assumed that the “tables next to code” optimization is enabled (which is most of the time). With this optimization enabled the code pointer for a closure also doubles as a pointer to the info table for the closure. The info table for a closure contains information such as the constructor tag (which constructor of a data type is this?), the size of the payload and the types of its elements (int, float, pointer etc) for garbage collection purposes, the static reference table (used for garbage collection of CAFs), profiling information, etc. The info table is located just before the closure entry code in memory.

When the optimization is not enabled the heap object header is a pointer to the info table, which in turn contains a field for the entry code. This means that in order to enter a closure an additional indirection is necessary.

Allocation in STG

I have often seen claims that the only place where heap allocation occurs in STG code is in let expressions (for instance, on StackOverflow and on cvs-ghc). This is not entirely accurate, as we saw when we looked at constant_Int. See cgExpr in compiler/codeGen/StgCmmExpr in the ghc sources, and in particular function cgConApp under “tail calls”.

Boxing state

We don’t normally box State# objects, but we could:

data State a = State# (State# a)

constant_State :: () -> State RealWorld
constant_State _ = State# realWorld#

The translation of constant_State is however a bit surprising:

Hp = Hp + 8;
if (Hp > I32[BaseReg + 92]) goto cm8;
I32[Hp - 4] = State#_con_info;
R1 = Hp - 3;
Sp = Sp + 4;
jump (I32[Sp + 0]) ();

This creates a heap object with a one-word payload, which is subseqeuently left unused. The reason is that every heap object must be at least two words, so that there is enough room to overwrite it with a forwarding pointer during garbage collection. Thanks to rwbarton on reddit for pointing this out!

That strange () argument

Many of the examples we looked at had an additional () argument:

call_the_want_Int# :: () -> Int#
call_the_want_Int# _ = want_Int# 1234#

What is the purpose of this argument? Well, first of all, a top-level binding to a primitive type (something of kind #) is not allowed; this is illegal Haskell:

caf_the_want_Int# :: Int# -- Not legal Haskell
caf_the_want_Int# = want_Int# 1234#

But even for examples where it would be legal we want to avoid it for the sake of this blog post. Consider

caf_the_want_Int :: Int
caf_the_want_Int = want_Int 1234

The difference between call_the_want_Int and caf_the_want_Int is that the latter is a CAF or “constant applicative form”; ghc must ensure that caf_the_want_Int is reduced to weak head normal form only once in the execution of the program. CAFs are a large topic in their own right; another blog post, perhaps (for now, some interesting links are this Stack overflow question and ghc ticket #917, answer from Simon Peyton-Jones about how CAFs are garbage collected, and the entry on the ghc wiki). If you are feeling brave, take a look at the generated code for caf_the_want_Int.

In order to avoid CAFs being generated, we introduce the additional argument, and use the compiler flag -fno-full-laziness to make sure that ghc doesn’t attempt to optimize that additional argument away again.


Previous entries