How we might abolish Cabal Hell, part 1

Tuesday, 30 September 2014, by Duncan Coutts.
Filed under cabal, community.

At ICFP a few weeks ago, a hot topic in the corridors and in a couple of talks was the set of issues surrounding packaging and “Cabal Hell”.

Fortunately we were not just discussing problems but solutions. Indeed I think we have a pretty good understanding now of where we want to be, and several solutions are in development or have reasonably clear designs in people’s heads.

I want to explain what’s going on for those not already deeply involved in the conversation. So this is the first of a series of blog posts on the problems and solutions around Cabal hell.

There are multiple problems and multiple solutions. The solutions overlap in slightly complicated ways. Since it is a bit complicated, I’m going to start with the big picture of the problems and solutions and how they relate to each other. In subsequent posts I’ll go into more detail on particular problems and solutions.

“Cabal hell”: the problems

So what is “Cabal hell”? Let’s consult the dictionary…

Cabal Hell

The feeling of powerlessness one has when Cabal does not do what one wanted and one does not know how to fix it.

I’m joking obviously, but my point is that Cabal hell is not a precise technical term. There are a few different technical problems (and misunderstandings and UI problems) that can cause Cabal hell.

A useful concept when talking about this topic is that of the packaging “wild wild west”. What we mean is whether we are in a context where we reasonably expect packages to work together (because there has been some deliberate effort to make them work together), or if we are in the “wild wild west”. In the “wild wild west” we have to do things like deal with packages that were uploaded yesterday by multiple different authors. The point is that nobody has yet had time to try and make things consistent. It is a useful concept because we have developers who need to deal with the “wild wild west” and those who would really rather not, and the solutions tend to look a bit different.

Another term we often use when talking about packages is “consistency”. What we mean is that in a collection of packages there is at most one version of each package. For example when you ask cabal-install to install package A and B, we say that it will try to find a “consistent” set of dependencies – meaning a set including A, B and their dependencies that has only one version of any package.
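
To make the definition concrete, here is a minimal Python sketch of a consistency check. It is only an illustration: the plan format (a list of name and version pairs), the function names and the package names are assumptions made for this example, not anything cabal-install exposes.

```python
from collections import defaultdict

def inconsistent_packages(plan):
    """Return {name: versions} for each package that appears in the plan
    with more than one version. An empty result means the plan is consistent."""
    versions = defaultdict(set)
    for name, version in plan:
        versions[name].add(version)
    return {name: sorted(vs) for name, vs in versions.items() if len(vs) > 1}

def is_consistent(plan):
    """A plan is consistent if it has at most one version of each package."""
    return not inconsistent_packages(plan)

# Hypothetical plans: A, B and a shared dependency called "common".
consistent_plan = [("A", "1.0"), ("B", "2.1"), ("common", "1.1")]
clashing_plan = consistent_plan + [("common", "1.2")]

print(is_consistent(consistent_plan))        # True
print(inconsistent_packages(clashing_plan))  # {'common': ['1.1', '1.2']}
```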

“Cabal hell”: the symptoms

So let’s consider a breakdown of the technical problems. To start with, let’s look at a breakdown based on the symptoms that a developer in Cabal Hell experiences.

[Figure: Cabal Hell: the symptoms]

We can first break things down by whether there is a solution or not. That is, whether a perfect dependency resolver could find a plan to install the package(s) and their dependencies consistently. We want such a solution because it’s a prerequisite for installing working packages. (We’re ignoring the possibility that there is a solution but the solver fails to find one. That is possible but it’s a relatively rare problem.)

Given the situation where the solver tells us that there is no solution, there are a few different cases to distinguish:

No solution expected

The failure was actually expected. For example a developer updating their package to work with the latest version of GHC is not going to be surprised if their initial install attempt fails. Then based on what the solver reports they can work out what changes they need to make to get things working.

Solution had been expected

The more common case is that the developer was not expecting to be working in the wild west. The developer had an expectation that the package or packages they were asking for could just be installed. In this case the answer “no that’s impossible” from the solver is very unhelpful, even though it’s perfectly correct.

Unnecessary solver failure

The symptoms here are exactly the same, namely the solver cannot find a solution, but the reason is different. More on reasons in a moment.

Even when there is a solution we can hit a few problems:

Compile error

Compilation can fail because some interface does not match. Typically this will manifest as a naming error or type error.

Breaking re-installations

Cabal’s chosen solution would involve reinstalling an existing version of a package but built with different dependencies. This re-installation would break any packages that depend on the pre-existing instance of the installed package. By default cabal-install will not go ahead with such re-installation, but you can ask it to do so.

Type errors when using packages together

It is possible to install two packages and then load them both in GHCi and find that you cannot use them together because you get type errors when composing things defined in the two different packages.

“Cabal hell”: the reasons

So those are the major problems. Let’s look at some reasons for those problems.

[Figure: Cabal Hell: the reasons]

Inconsistent versions of dependencies required

There are two sub-cases worth distinguishing here. One is where the developer is asking for two or more packages that could be installed individually, but cannot be installed and used together simultaneously because they have clashing requirements on their common dependencies. The other is that a package straightforwardly has no solution (at least with the given compiler & core library versions), because of conflicting constraints of its dependencies.
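
The first sub-case can be sketched with a toy brute-force solver in Python. Everything here is hypothetical: the three-package index, the representation of constraints as sets of allowed versions, and the `solve` function. A real solver is far more sophisticated; the point is only that A and B each have a solution, while A and B together do not.

```python
from itertools import product

# Hypothetical index: package -> version -> {dependency: allowed versions}
INDEX = {
    "A": {"1.0": {"common": {"1.1"}}},
    "B": {"1.0": {"common": {"1.2"}}},
    "common": {"1.1": {}, "1.2": {}},
}

def solve(targets, index=INDEX):
    """Return one plan {package: version} that contains the targets and
    satisfies every dependency, or None if no such plan exists.
    A dict holds one version per package, so any plan is consistent."""
    names = sorted(index)
    # Each package is either absent (None) or present at one of its versions.
    for choice in product(*[[None] + sorted(index[n]) for n in names]):
        plan = {n: v for n, v in zip(names, choice) if v is not None}
        if not all(t in plan for t in targets):
            continue
        satisfied = all(
            dep in plan and plan[dep] in allowed
            for n, v in plan.items()
            for dep, allowed in index[n][v].items()
        )
        if satisfied:
            return plan
    return None

print(solve(["A"]))       # {'A': '1.0', 'common': '1.1'}
print(solve(["B"]))       # {'B': '1.0', 'common': '1.2'}
print(solve(["A", "B"]))  # None: they clash on the version of 'common'
```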

Constraints wrong

With under-constrained dependencies we get build failures, and with over-constrained dependencies we get unnecessary solver failures. That is, a build failure is (almost always) due to dependency constraints saying some package version combination should work, when actually it does not. And the dual problem: an unnecessary solver failure is the case where there would have been a solution that would actually compile, if only the constraints had been more relaxed.
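
The two failure modes can be illustrated by comparing what a version range allows with what actually works. This Python sketch assumes we somehow know the set of versions that really compile (in practice that knowledge is exactly what is missing); the version numbers and the `classify` helper are made up for the example.

```python
def parse(version):
    """Turn '1.10' into (1, 10) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))

def in_range(version, lower, upper):
    """True if lower <= version < upper; upper=None means no upper bound."""
    v = parse(version)
    return parse(lower) <= v and (upper is None or v < parse(upper))

def classify(available, actually_works, lower, upper):
    """Compare the versions a constraint allows with those that really work."""
    allowed = {v for v in available if in_range(v, lower, upper)}
    return {
        # Allowed but broken: constraints too lax, leading to build failures.
        "build_failures": sorted(allowed - actually_works, key=parse),
        # Working but excluded: constraints too tight, leading to
        # unnecessary solver failures.
        "needlessly_excluded": sorted(actually_works - allowed, key=parse),
    }

available = ["1.0", "1.1", "1.2", "2.0"]   # hypothetical released versions
actually_works = {"1.1", "1.2"}            # assumed ground truth

print(classify(available, actually_works, "1.0", None))   # too lax
print(classify(available, actually_works, "1.1", "1.2"))  # too tight
```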

Single instance restriction

Existing versions of GHC and Cabal let you install multiple versions of a package, but not multiple instances of the same version of a package. This is the reason why Cabal has to reinstall packages, rather than just add packages.
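
A toy model may help show why this restriction forces destructive re-installation. The Python sketch below is not how GHC's package database is implemented; the `PackageDb` class and the package names are assumptions. It only captures the idea that instances are keyed by name and version, so a second build of the same version replaces the first.

```python
class PackageDb:
    """Toy package database keyed by (name, version): several versions of a
    package can coexist, but only one instance of any given version."""

    def __init__(self):
        # (name, version) -> frozenset of (name, version) dependencies
        self.instances = {}

    def register(self, name, version, deps):
        """Register a build. Return the installed packages that were built
        against an instance this registration replaces with a different one."""
        key = (name, version)
        deps = frozenset(deps)
        old = self.instances.get(key)
        broken = []
        if old is not None and old != deps:
            broken = sorted(k for k, d in self.instances.items() if key in d)
        self.instances[key] = deps
        return broken

db = PackageDb()
db.register("base-lib", "1.1", [])
db.register("base-lib", "1.2", [])                   # a second version is fine
db.register("parser", "3.1", [("base-lib", "1.1")])
db.register("app", "0.1", [("parser", "3.1")])

# Rebuilding parser-3.1 against base-lib-1.2 replaces the existing instance.
broken = db.register("parser", "3.1", [("base-lib", "1.2")])
print(broken)  # [('app', '0.1')]
```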

Inconsistent environment

Type errors when using packages together arise because cabal-install does not enforce consistency across the developer’s whole environment, only within each set of packages it installs simultaneously.

We’ll go into more detail on all of these issues in subsequent posts, so don’t worry if these things don’t fully make sense yet.

“Cabal hell”: the solutions

There are several problems and there isn’t one solution that covers them all. Rather there are several solutions. Some of those solutions overlap with each other, meaning that for some cases either solution will work. The way the solutions overlap with the problems and each other is unfortunately a bit complicated.

Here’s the overview:

[Figure: Cabal Hell: the solutions]

So what does it all mean?

We’ll look at the details of the solutions in subsequent posts. At this stage the thing to understand is which solutions cover which problems, and where those solutions overlap.

We’ll start with the two most important solutions. They’re the most important in the sense that they cover the most cases.

Nix-style persistent store with multiple consistent environments

This solves all the cases of breaking re-installations, and all cases of inconsistent environments. It doesn’t help with wrong constraints.

You’ll note that it covers some cases where there is no solution and you might wonder what this can mean. Some cases where there is no solution are due to two (or more) sets of packages that could be installed independently but cannot be installed together consistently. In a nix-style setting it would be possible to offer developers the option to install the packages into separate environments when the solver determines that this is possible.
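
Here is a rough Python sketch of the idea, under loud assumptions: the `Store` class, the way instance identifiers are hashed, and the notion of an environment as a list of root instances are all invented for illustration and do not describe Nix or any planned Cabal design. The sketch shows that adding an instance never replaces another, and that consistency becomes a property of each environment rather than of the whole store.

```python
import hashlib

class Store:
    """Toy persistent store: an instance is identified by its name, version
    and the exact dependency instances it was built against."""

    def __init__(self):
        self.instances = {}  # instance id -> (name, version, dependency ids)

    def add(self, name, version, dep_ids=()):
        dep_ids = tuple(sorted(dep_ids))
        digest = hashlib.sha256(repr((name, version, dep_ids)).encode()).hexdigest()[:8]
        instance_id = f"{name}-{version}-{digest}"
        self.instances.setdefault(instance_id, (name, version, dep_ids))
        return instance_id

    def closure(self, roots):
        """All instance ids reachable from the given roots."""
        seen, todo = set(), list(roots)
        while todo:
            current = todo.pop()
            if current not in seen:
                seen.add(current)
                todo.extend(self.instances[current][2])
        return seen

def environment_is_consistent(store, roots):
    """An environment is consistent if its closure has one instance per name."""
    names = [store.instances[i][0] for i in store.closure(roots)]
    return len(names) == len(set(names))

store = Store()
old_lib = store.add("base-lib", "1.1")
new_lib = store.add("base-lib", "1.2")
parser_a = store.add("parser", "3.1", [old_lib])
parser_b = store.add("parser", "3.1", [new_lib])  # same version, second instance

print(parser_a != parser_b)                                      # True
print(environment_is_consistent(store, [parser_a]))              # True
print(environment_is_consistent(store, [parser_b]))              # True
print(environment_is_consistent(store, [parser_a, parser_b]))    # False
```

The last line corresponds to the case described above: the two sets of packages cannot share one environment, but each can live in its own.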

Curated consistent package collections

These are things like the Debian Haskell packages or Stackage. This solves some cases of each of the different problems: breaking re-installations, inconsistent environments, wrong constraints and lack of consistent solutions. It solves those cases to the extent that the package collection covers all the packages that the developer is interested in. For many developers this will be enough. Almost by definition however it cannot help with the “wild west” of packages because the curation takes time and effort. Unless used in combination with an isolated environment solution (e.g. nix-style, but also less sophisticated systems like hsenv or cabal sandboxes) it does not allow using multiple versions of the collection (e.g. different projects using different Stackage versions).
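
In the simplest terms, a curated collection can be thought of as a fixed mapping from package name to a single version. The Python sketch below uses a made-up snapshot and a hypothetical `plan_from_snapshot` helper; it is not the format used by Stackage or Debian. It shows both the benefit (no solving, consistency for free) and the limit (nothing outside the collection can be planned).

```python
# Hypothetical snapshot: exactly one version of each curated package.
SNAPSHOT = {"base-lib": "1.2", "parser": "3.1", "web": "0.9"}

def plan_from_snapshot(wanted, snapshot):
    """Pick versions straight from the snapshot. The result is consistent by
    construction, but only packages inside the collection can be planned."""
    missing = sorted(p for p in wanted if p not in snapshot)
    if missing:
        raise LookupError("not in the curated collection: " + ", ".join(missing))
    return {p: snapshot[p] for p in wanted}

print(plan_from_snapshot(["parser", "web"], SNAPSHOT))
# {'parser': '3.1', 'web': '0.9'}

try:
    plan_from_snapshot(["parser", "uploaded-yesterday"], SNAPSHOT)
except LookupError as err:
    print(err)  # not in the curated collection: uploaded-yesterday
```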

It is worth noting that these two solutions should work well together. Neither one subsumes the other. We don’t need to pick between the two. We should pick both. The combination would get us a long way to abolishing Cabal hell.

There are also a number of smaller solutions:

Automatic build reporting

This helps with detecting compile errors arising from constraints that are too lax. It doesn’t help with constraints that are too tight. This solution requires a combination of automation and manual oversight to fix package constraints and to push those fixes upstream.

Upper-bound build bots

This is similar to gathering build reports from users, but instead of looking at cases of compile failure (constraints too lax), it explicitly tries relaxing upper bounds and checks if things still compile and test suites work. Again, this requires automation to act on the information gleaned to minimise manual effort.
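
The core loop of such a bot might look something like the following Python sketch. The `builds_ok` callback stands in for "build the package against this dependency version and run its test suite"; it, the function name and the version numbers are all assumptions made for the example.

```python
def parse(version):
    """Turn '1.10' into (1, 10) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))

def probe_upper_bound(released, current_upper, builds_ok):
    """Try dependency versions at or above the current (exclusive) upper
    bound, oldest first, and return the newest one reached before the first
    failure, or None if even the first excluded version fails."""
    newest_ok = None
    for version in sorted(released, key=parse):
        if parse(version) < parse(current_upper):
            continue  # already allowed by the existing constraint
        if not builds_ok(version):
            break     # stop at the first version that breaks the build
        newest_ok = version
    return newest_ok

released = ["1.0", "1.1", "1.2", "2.0"]  # hypothetical dependency releases

def fake_build(version):
    """Stand-in for a real build and test run: pretend only 2.0 breaks."""
    return version != "2.0"

print(probe_upper_bound(released, "1.1", fake_build))  # 1.2
```

In this made-up case the bot would report that the bound `< 1.1` could be relaxed to admit 1.1 and 1.2, but not 2.0.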

Package interface compatibility tools

This is to help package authors get their dependency constraints right in the first place. It can help them follow a version policy correctly, and tell them what minimum and maximum version bounds of their dependencies to use. It does not completely eliminate the need to test, because type compatibility does not guarantee semantic compatibility. Solutions in this area could eliminate a large number of cases of wrong constraints, both too lax and too tight.

Private dependencies

This allows solutions to exist where they do not currently exist, by relaxing the consistency requirement in a safe way. It means global consistency of dependencies is not always required, which allows many more solutions. This solution would cover a lot of cases in the “wild wild west” of packaging, and generally in the large set of packages that are not so popular or well maintained as to be included in a curated collection.
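
One way to picture the relaxed requirement is sketched below in Python. This is a simplified model invented for illustration, not the actual design: it assumes each dependency edge is marked private or public, that a package only "sees" what its direct dependencies expose through public edges, and that consistency need only hold within what each package can see.

```python
# Hypothetical plan: instance -> list of (dependency instance, is_private)
PLAN = {
    ("app", "1.0"): [(("A", "1.0"), False), (("B", "1.0"), False)],
    ("A", "1.0"): [(("common", "1.1"), True)],
    ("B", "1.0"): [(("common", "1.2"), True)],
    ("common", "1.1"): [],
    ("common", "1.2"): [],
}

def exposed(plan, pkg):
    """pkg plus everything reachable through public dependency edges only."""
    seen, todo = set(), [pkg]
    while todo:
        current = todo.pop()
        if current not in seen:
            seen.add(current)
            todo.extend(dep for dep, private in plan[current] if not private)
    return seen

def scope(plan, pkg):
    """What pkg can see: itself plus whatever each direct dependency exposes."""
    visible = {pkg}
    for dep, _private in plan[pkg]:
        visible |= exposed(plan, dep)
    return visible

def one_version_per_name(instances):
    names = [name for name, _version in instances]
    return len(names) == len(set(names))

def globally_consistent(plan):
    """The current requirement: one version per package in the whole plan."""
    return one_version_per_name(plan.keys())

def consistent_with_private_deps(plan):
    """The relaxed requirement: one version per package within each scope."""
    return all(one_version_per_name(scope(plan, pkg)) for pkg in plan)

print(globally_consistent(PLAN))           # False: two versions of 'common'
print(consistent_with_private_deps(PLAN))  # True: neither version leaks out
```

If A or B exposed types from `common` in its own interface, the edge could not safely be private, and in this model marking it public would make the plan inconsistent again.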

Next time…

So that’s the big picture of the problems and solutions and how they relate to each other. In subsequent posts we’ll look in more detail at the problems and solutions, particularly the solutions people are thinking about or actively working on.