Community Server Stats

Thursday, 17 June 2010, by Ian Lynagh.
Filed under community.

It's been just over 3 years since the Haskell community server was started, and since then 372 projects have been created. While work goes on behind the scenes to move community to a more powerful server, with a recently enlarged sysadmin team running it, here are some stats for what it's being used for.

Almost all projects use community for hosting source code, but many also use it for a trac bug tracker, a web page, and a mailing list:

Working out what licence projects use, without a lot of manual effort, is tricky, but hopefully this is a good approximation:

This data is based on licence files, and licence fields in Cabal files, within projects' /srv/code directories. The total is more than 100% as some projects contain files under multiple licences, and it is likely that there are further licences that have been missed. In particular, licences for documents such as RFCs and academic papers will likely have been missed.
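As a rough illustration of the approach described above (not the script actually used for these stats), here is a minimal Python sketch. It assumes the licence is read from the top-level `license:` field of a Cabal file, and that each project has already been reduced to a set of licence names. The function names `cabal_licence` and `licence_shares` are made up for this example.

```python
import re
from collections import Counter

# Matches a top-level "license:" field in a .cabal file.
# Field names are case-insensitive; "license-file:" does not match.
LICENSE_FIELD = re.compile(r"^license[ \t]*:[ \t]*(\S+)", re.IGNORECASE | re.MULTILINE)


def cabal_licence(cabal_text):
    """Return the value of the license field, or None if it is absent."""
    match = LICENSE_FIELD.search(cabal_text)
    return match.group(1) if match else None


def licence_shares(project_licences):
    """Map each licence to the percentage of projects that contain it.

    project_licences maps a project name to the licences found in it
    (hypothetical input shape, chosen for this sketch).
    """
    counts = Counter()
    for licences in project_licences.values():
        counts.update(set(licences))  # count each licence once per project
    total = len(project_licences)
    return {lic: 100.0 * n / total for lic, n in counts.items()}
```

Because a project with two licences is counted once under each, the percentages can sum to more than 100%, which is the effect noted above.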

Unsurprisingly, BSD3 is the most common licence, with GPL and LGPL next. The "Haskell" licence is the licence used for the Haskell98 report.

Interestingly, the picture is a little different if we look at the number of bytes under each licence:

The "Haskell" licence is much higher now, as it includes any project containing a GHC tree (which is large). GHC trees also get marked as using the GPL (due to a Cabal file in Cabal's testsuite, although some of the Windows tarballs contain GPLed programs anyway) and there are also some large projects using the GPL; the combination puts GPL ahead of BSD3.

Looking at the number of users per project, most have a single developer, and the number tails off as one would expect:

Most projects have seen activity (defined as "a file in the /srv/code directory has been modified") within the last 6 months, and more than two thirds within the last year. Inevitably, there are some that appear dormant:
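The activity definition above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the script used for these stats: "6 months" is approximated as 183 days, activity is taken from file modification times, and the names `latest_mtime` and `is_active` are invented for the example.

```python
import os
import time

SIX_MONTHS = 183 * 24 * 60 * 60  # seconds; an approximation of "6 months"


def latest_mtime(root):
    """Return the newest modification time of any file under root, or None."""
    newest = None
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.lstat(path).st_mtime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if newest is None or mtime > newest:
                newest = mtime
    return newest


def is_active(root, now=None, window=SIX_MONTHS):
    """True if some file under root was modified within the window."""
    now = time.time() if now is None else now
    newest = latest_mtime(root)
    return newest is not None and now - newest <= window
```

A project directory with no files at all is treated as inactive here, which is a choice made for this sketch.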

Finally, here's a breakdown of the disk usage on community:

Interestingly, home directories account for more disk usage than projects. This is largely due to bindists, builds and darcs checkouts of large projects - mostly GHC. There's now a more recent GHC in /srv/local/ghc/ghc-6.12.2/bin, but when community moves to its new home we'll try to keep an up-to-date GHC in the default path.