Community Server Stats

Thursday, 17 June 2010, by Ian Lynagh.
Filed under community.

It's been just over 3 years since the Haskell community server was started, and since then 372 projects have been created. While work goes on behind the scenes to move community to a more powerful server, with a recently enlarged sysadmin team running it, here are some stats for what it's being used for.

Almost all projects use community for hosting source code, but many also use it for a trac bug tracker, a web page, and a mailing list:

Working out what licence projects use, without a lot of manual effort, is tricky, but hopefully this is a good approximation:

This data is based on licence files, and licence fields in Cabal files, within projects' /srv/code directories. The total is more than 100% as some projects contain files under multiple licences, and it is likely that there are further licences that have been missed. In particular, licences for documents such as RFCs and academic papers will likely have been missed.

Unsurprisingly, BSD3 is the most common licence, with GPL and LGPL next. The "Haskell" licence is the licence used for the Haskell98 report.

Interestingly, the picture is a little different if we look at the number of bytes under each licence:

The "Haskell" licence is much higher now, as it includes any project containing a GHC tree (which is large). GHC trees also get marked as using the GPL (due to a Cabal file in Cabal's testsuite, although some of the Windows tarballs contain GPLed programs anyway) and there are also some large projects using the GPL; the combination puts GPL ahead of BSD3.

Looking at the number of users per project, most have a single developer, and the number tails off as one would expect:

Most projects have seen activity (defined as "a file in the /srv/code directory has been modified") within the last 6 months, and more than two thirds within the last year. Inevitably, there are some that appear dormant:

Finally, here's a breakdown of the disk usage on community:

Interestingly, home directories account for more disk usage than projects. This is largely due to bindists, builds and darcs checkouts of large projects - mostly GHC. There's now a more recent GHC in /srv/local/ghc/ghc-6.12.2/bin, but when community moves to its new home we'll try to keep an up-to-date GHC in the default path.


Well-Typed are hiring

Friday, 07 May 2010, by Duncan Coutts.
Filed under well-typed.

We are looking to hire a Haskell expert to work with us at Well-Typed as a Haskell consultant. We are seeing an increasing demand for our services, and are thus seeking to expand our capacity.

This is an exciting opportunity for someone who is passionate about Haskell and who is keen to improve and promote Haskell in a professional context.

The role is quite general and could cover any of the projects and activities that we are involved in as a company. The tasks may involve:

Well-Typed has a variety of clients. For some we do proprietary Haskell development and consulting. For others, much of the work involves open-source development and cooperating with the rest of the Haskell community: the commercial, open-source and academic users.

The position is initially as a contractor for one year with a salary from 150 GBP per day, plus a profit-dependent bonus. We offer flexible hours and work from home. Living in England is not required.

In the longer term there is the opportunity to become a member of the partnership with a full stake in the business: being involved in business decisions, and fully sharing the risks and rewards.

If you are interested, please apply via info@well-typed.com. Send us your CV, and tell us why you are interested and why you would be a good fit for the job. We are more than happy to answer informal enquiries. Contact Duncan Coutts or Ian Lynagh for further information, either by email or IRC.

The deadline for applications is Friday 11 June 2010.


Well-Typed delivers air traffic analysis tool for NATS

Wednesday, 05 May 2010, by Duncan Coutts.
Filed under well-typed.

Well-Typed have successfully delivered an application in Haskell to meet NATS requirements in order to analyse air traffic in UK airspace. Since the beginning of the year UK airline operators have been using the software to fine-tune aspects of their future schedules.

Sid Mohangee, project manager at NATS, said:

"At the beginning of the project I set our suppliers a challenging schedule. Well-Typed's use of Haskell and a small expert team worked out well. They worked closely with me and my team and together we were able to adjust quickly to changing customer requirements. We delivered the project on schedule."

The use of Haskell allowed the software to be developed rapidly, while at the same time giving confidence in its correctness. Well-Typed used the standard suite of Haskell techniques, libraries and tools, including testing with QuickCheck and code coverage with HPC.

In light of the success in this project, NATS is considering further use of Haskell as part of its technology strategy.

About Well-Typed

Well-Typed LLP is a Haskell services company, providing Haskell consultancy services and writing bespoke Haskell applications.

http://www.well-typed.com/

About NATS

NATS is the UK's leading air traffic services provider. It provides air traffic control to all en-route aircraft in UK airspace, and to aircraft at 15 of the UK's biggest airports.

http://www.nats.co.uk/


Parallel Haskell: 2-year project to push real world use

Thursday, 29 April 2010, by Duncan Coutts.
Filed under well-typed.

GHC HQ and Well-Typed are very pleased to announce a 2-year project funded by Microsoft Research to push the real-world adoption and practical development of parallel Haskell with GHC. We are seeking organisations to take part: read on for details.

In the last few years GHC has gained impressive support for parallel programming on commodity multi-core systems. In addition to traditional threads and shared variables, it supports pure parallelism, software transactional memory (STM), and data parallelism. With much of this research and development complete, and more on the way, the next stage is to get the technology into more widespread use.

This project aims to do the engineering work to solve whatever remaining practical problems are blocking organisations from making serious use of parallelism with GHC. The driving force will be the applications rather than the technology.

We will work in partnership with a small number of commercial or scientific users who are keen to make use of parallel Haskell. We will work with these partners to identify the issues, major or minor, that are hindering progress. The project is prepared to handle system issues, covering everything from compiler and runtime system through to more mundane platform and tool problems. Meanwhile our partners will contribute their domain-specific expertise to use parallel Haskell to address their application.

We are now seeking organisations to take part in this project. Organisations do not need to contribute financially but should be prepared to make a significant commitment of their own time. We expect to get final confirmation of the project funding in June and to start work shortly thereafter.

Well-Typed will coordinate the project, working directly with both the participating organisations and the Simons at GHC HQ. If you think your organisation may be interested then get in touch with me, Duncan Coutts, via info@well-typed.com.


To the Hackathon!

Wednesday, 17 March 2010, by Duncan Coutts.
Filed under community.

I'm off tomorrow morning to the Haskell Hackathon in Zurich. I'm looking forward to seeing people.

I don't have particular plans yet for what I will be working on, though undoubtedly it'll be related to the Cabal/Hackage/Platform projects. Of course the point of a hackathon is to get people together to share ideas and enthusiasm, so it's usually best not to have too fixed ideas about how to spend one's time.

By the way, in case anyone has been wondering why I've been so quiet since the beginning of the year: I've been in hiding (i.e. mostly offline) doing the final push on my DPhil thesis. It's getting pretty close now; I sent my long-suffering supervisor a 180-page draft the other day. I'm really looking forward now to getting it over with and moving on to new commercial and academic projects.


Talk at the Functional Programming eXchange

Wednesday, 09 December 2009, by Duncan Coutts.
Filed under community.

I gave a talk a couple of days ago at the Functional Programming eXchange event organised by Skills Matter in London.

Strong Types and Pure Functions

Slides Video

It was an interesting event. We had about 50 people attend, mostly professional programmers with day jobs doing development on the JVM and .NET platforms. Sadek Drobi gave the opening talk about computational abstraction, by which he meant monadic glue code. He explained why we want it and showed examples in C#, Scala and F#. I hadn't realised that the C# LINQ stuff is actually general monad syntax. It was fun to see Sadek telling all these professional programmers that what they really want are functors, applicative functors and monads.

I gave a talk in the afternoon about some of the things you can do if you go all the way with FP and use a purely functional language (rather than a hybrid). I explained a technique for making interfaces where you use types to enforce which side effects are allowed. Of course this means making custom monads so it was good that Sadek had introduced that topic in the morning. The feedback I got was that this was one of the more technically advanced talks, but I think that was ok since it wasn't essential to understand every last detail to get the point.
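As a rough illustration of using types to limit side effects, here is a minimal sketch. It is not taken from the talk: the class MonadLog, the type Collect and the function sumWithLog are hypothetical names invented for this example. The idea is that code written against the MonadLog interface can emit log lines and nothing else, and the same code can be run either in IO or in a pure monad that just collects the lines.

```haskell
module Main where

-- Hypothetical interface: the only effect on offer is emitting a log line.
class Monad m => MonadLog m where
  logMsg :: String -> m ()

-- Real interpretation: log lines go to stdout.
instance MonadLog IO where
  logMsg = putStrLn

-- Pure interpretation: log lines are collected in a list.
newtype Collect a = Collect { runCollect :: ([String], a) }

instance Functor Collect where
  fmap f (Collect (w, a)) = Collect (w, f a)

instance Applicative Collect where
  pure a = Collect ([], a)
  Collect (w1, f) <*> Collect (w2, a) = Collect (w1 ++ w2, f a)

instance Monad Collect where
  Collect (w1, a) >>= k =
    let Collect (w2, b) = k a
    in Collect (w1 ++ w2, b)

instance MonadLog Collect where
  logMsg s = Collect ([s], ())

-- The type says this code may log and do nothing else:
-- it cannot touch files or the network, whichever monad runs it.
sumWithLog :: MonadLog m => [Int] -> m Int
sumWithLog xs = do
  logMsg ("summing " ++ show (length xs) ++ " numbers")
  let total = sum xs
  logMsg ("total is " ++ show total)
  return total

main :: IO ()
main = do
  n <- sumWithLog [1, 2, 3]                  -- runs in IO, prints the log lines
  print n
  print (runCollect (sumWithLog [1, 2, 3]))  -- runs purely, returns the log lines
```

Because sumWithLog is polymorphic in the monad, the caller chooses the interpretation, which also makes the logging behaviour easy to test without doing any real I/O.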

Overall I think it went quite well, especially given it was the first time the event has been put on. I expect that other members of the Haskell community might like to take part in future (Ganesh Sittampalam and Matthew Sackman also spoke this time). Also, Skills Matter are planning to do an FP journal/magazine aimed at mainstream professional programmers. Hopefully that'll be another good way to get our message out. Get in touch with me or Robert Pickering if you're interested in either the conference or the journal.


Video and slides from the IHG talk at CUFP

Tuesday, 08 September 2009, by Duncan Coutts.
Filed under industrial-haskell-group.

I gave the closing talk at the Commercial Users of Functional Programming (CUFP) conference last week about the birth of the Industrial Haskell Group.

Birth of the Industrial Haskell Group

Video Slides

I talked about how we share these language implementations and what opportunities there are to share future development costs. I talked about how we went about setting up the IHG. Finally, I tried to persuade people that improving shared development infrastructure such as Hackage is a modest investment with potentially large benefits.

Here's the full abstract:

It has long been thought that commercial users of Haskell could benefit from an organisation to support their needs, and that as a side-effect the wider Haskell community could benefit from the actions of such an organisation. The stronger community would in turn benefit the commercial users, in a positive feedback cycle.

At last year's CUFP, users of several FP languages raised the issue that there was no organisation that they could pay to do the important but boring work of maintaining and improving common infrastructure. Shortly after CUFP, in partnership with major commercial users of Haskell such as Galois and Amgen, we started to set wheels in motion, and in March 2009 we announced the birth of the Industrial Haskell Group (IHG).

The IHG is starting off with a limited set of activities, but already it is having an impact on the state of the Haskell development platform. We expect that as it expands, it will become a significant force driving Haskell forwards.

In this presentation we will talk about the motivation leading to the formation of the IHG, how it has worked thus far and what lessons we can learn that might benefit other FP communities. We will also look at how we can encourage the positive feedback cycle between commercial users and the wider community.


Industrial Haskell Group meeting at CUFP

Wednesday, 19 August 2009, by Ian Lynagh.
Filed under industrial-haskell-group, well-typed.

Following on from the "Birth of the Industrial Haskell Group" talk at CUFP in Edinburgh, we will be having a short meeting to discuss our future plans before everyone heads off to dinner. That's on the 4th September, initially gathering in a suitable corner of the CUFP location.

If you're thinking of joining the IHG, or even just interested in what is going on, then please join us.


GHC and Windows DLLs

Friday, 03 July 2009, by Ben Lippmeier.
Filed under coding, industrial-haskell-group.

Following on from Duncan's work on Building plugins as Haskell shared libs, I've been working on supporting the same functionality on Windows. The end goal is to have a rts.dll, libHsBase.dll and myPlugin.dll and be able to write things like Excel plugins in Haskell without needing to statically link the whole runtime system and set of libraries into each one.

Windows uses the Portable Executable (PE) format, so the hoops that must be jumped through are different from those for Linux and Mac OS X. Linux uses ELF for its object format, and Mac OS X uses Mach-O. Tool chain programs such as linkers and object file viewers are also different.

One of the immediate issues is to deal with mutually recursive imports between Haskell libraries and the GHC Run Time System (RTS). Clearly, the code for a Haskell library will call the RTS to perform tasks such as allocating memory, throwing exceptions, forking threads and so on. However, the runtime system also calls back on the base library. For example, here is a function from the RTS which helps to create parallel threads:

void createSparkThread (Capability *cap) {
    StgTSO *tso;
    tso = createIOThread (cap, RtsFlags.GcFlags.initialStkSize, 
                                  &base_GHCziConc_runSparks_closure);
    postEvent(cap, EVENT_CREATE_SPARK_THREAD, 0, tso->id);
    appendToRunQueue(cap,tso);
}

The variable base_GHCziConc_runSparks_closure is the name of a function closure in the GHC.Conc library which we won't have code for when we're linking the RTS.

One of the quirks of Windows is the need to generate so-called "import libraries". These contain stub code that is used to call a function in a DLL. For example, if code in module main.o wants to call a function fun in a library base.dll, the picture looks something like this:

## in main.o ##################### (linked into main.exe)
main:
    call fun                        ; first way: call the stub in base.lib
    ....
    call dword ptr [__imp_fun]      ; second way: call through the IAT entry
    ....


## in base.lib ###################### (linked into main.exe)
fun:
    jmp dword ptr [__imp_fun]       ; stub: jump through the IAT entry

.data
__imp_fun:                          ; IAT entry: holds the address of fun in base.dll
    .dword fun


## in base.dll ######################
fun:
    .. actual code for fun   

In Windows, all calls to a function in a DLL go via the Import Address Table (IAT). This is a table of pointers, and in the example above there is one entry named __imp_fun. There are two ways to use this table. The first way is illustrated by the first call to fun in main.o. This call targets stub code that looks up the pointer from the table and then jumps to it. The second way is to look up the pointer and jump to it directly, but to do this we need to know that the function is in an external DLL at code generation time. A call fun instruction uses a PC-relative offset, and is physically shorter than a call dword ptr [] instruction, so it's not practical to change one to the other at link time.

The file base.lib is the "import library", which contains the call stub and the IAT. Import libraries need to be generated independently from the main compiling and linking process, using Windows specific tools. The import library for a particular dll is then linked into every executable (or other dll) that uses it.

Anyway, I've spent the last few days wading through MSDN and the GHC build system, and I think I've cataloged at least all the major hoops. I'll let you know how the jumping goes next post.


GHC, primops and exorcising GMP

Tuesday, 09 June 2009, by Duncan Coutts.
Filed under coding, industrial-haskell-group.

GHC uses GMP to implement the Haskell arbitrary-precision Integer type. It's been this way for ages.

For various reasons using GMP is a slight problem for some users. Some users don't really make use of Integer and don't like to have to link to GMP. Since GMP uses the LGPL, if you want to ship closed source programs then you have to link to it dynamically. On Windows static linking is the default so you have to jump through hoops to link it dynamically. Then there are also users who make heavy use of GMP and find that the Integer library is far too limited an interface to GMP. However, binding extra GMP functions is complicated by the way that the GHC RTS uses it already (especially the memory management).

So what these people want is a way to build GHC such that the RTS does not directly link to GMP. Then the implementation of Integer should be in a library that is replaceable so that one can use a simple slow implementation, a super-duper binding to GMP or some other "big num" library.

Daniel Peebles, Ian Lynagh and I have been working on this problem recently. Ian's and my contributions to this are supported by the IHG.

Getting GMP out of the RTS

Before we can think about replacements however we need to disentangle GMP from the RTS and at least move the existing GMP-based Integer implementation into a library. This Integer implementation would remain the default so it still has to be fast. Daniel has managed to rip GMP out of the RTS and we're now focusing on how to move the GMP binding into its own library.

The difficulty of moving it out of the RTS is that currently almost all the GMP operations are bound as GHC "primops", as opposed to using the FFI. This is partly historical accident (FFI arrived on the scene relatively late) and partly because, due to certain FFI restrictions, the primop route is simpler and faster. The issue is that the wrapper code (around the actual GMP calls) needs to return several results to Haskell land, in particular things like (# Int, ByteArray# #). Using the FFI it is possible to return several results but one has to do it in the time-honoured tradition of C and emulate "out" parameters by passing pointers. The problem with doing that is we would need to do a lot of marshaling: temporarily allocate some memory, pass pointers and read back the results. All this just to return a few integers and pointers. It's actually more tricky because at the level in the library stack where we have to implement Integer we do not actually have access to the FFI libraries (in fact currently we do not even have access to the IO type).
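To make the marshaling cost concrete, here is a minimal sketch of the "out" parameter pattern at the ordinary user level, where the FFI libraries are available. It does not use GMP: it binds the standard C function frexp, which returns one result directly and a second one through a pointer. The wrapper name frexpIO is made up for this example.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

import Foreign.C.Types (CDouble (..), CInt (..))
import Foreign.Marshal.Alloc (alloca)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peek)

-- C prototype: double frexp(double x, int *exp);
-- The exponent comes back through the pointer argument.
foreign import ccall unsafe "math.h frexp"
  c_frexp :: CDouble -> Ptr CInt -> IO CDouble

-- Hypothetical wrapper: allocate temporary space for the "out"
-- parameter, pass its address to C, then read the result back.
frexpIO :: Double -> IO (Double, Int)
frexpIO x =
  alloca $ \expPtr -> do
    mantissa <- c_frexp (realToFrac x) expPtr
    e <- peek expPtr
    return (realToFrac mantissa, fromIntegral e)

main :: IO ()
main = frexpIO 8.0 >>= print   -- 8.0 = 0.5 * 2^4
```

Even in this tiny case the wrapper needs a temporary allocation, a pointer argument, a read back from memory and the IO type, all to return two numbers.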

GHC primops

Primops bypass the single-result restrictions inherited from the C calling convention. We can write primops that directly return unboxed tuples, like (# Int, ByteArray# #). Primops (at least out-of-line primops) are implemented in Cmm, which is GHC's low level intermediate language based on the C-- language. These Cmm functions have to know exactly the internal calling convention that GHC uses, but there is no excess marshaling.
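For comparison, here is a small sketch of what returning several results without marshaling looks like from the Haskell side. It assumes a reasonably recent GHC in which GHC.Exts exports the primop quotRemInt#; the functions quotRem10 and sumAndProduct# are hypothetical names for illustration and have nothing to do with the Integer code.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module Main where

import GHC.Exts (Int (I#), Int#, quotRemInt#, (+#), (*#))

-- quotRemInt# is a GHC primop that returns two results as an
-- unboxed tuple: no heap-allocated pair and no pointers involved.
quotRem10 :: Int -> (Int, Int)
quotRem10 (I# n) =
  case quotRemInt# n 10# of
    (# q, r #) -> (I# q, I# r)

-- Ordinary Haskell functions can return unboxed tuples too.
sumAndProduct# :: Int# -> Int# -> (# Int#, Int# #)
sumAndProduct# a b = (# a +# b, a *# b #)

main :: IO ()
main = do
  print (quotRem10 1234)
  case sumAndProduct# 3# 4# of
    (# s, p #) -> print (I# s, I# p)
```

The caller simply pattern matches on the unboxed tuple, so nothing has to be written to temporary memory and read back.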

Unfortunately knowledge of the primops has to be baked into the compiler and the Cmm code has to be compiled into the RTS. So that's no good for implementing Integer in a separate library from the RTS.

What if we could use the FFI to import Cmm functions...

foreign import prim

That would make it possible to have out-of-line primops in a library. The library would contain the compiled .cmm files and the .hs code in the same library would "foreign import" the cmm function. In particular we could then just move the .cmm code we use for wrapping the GMP library calls from the RTS into the integer-gmp package. Then instead of getting primops like plusInteger# from the GHC.Prim module, we would just foreign import them, eg:

foreign import prim "plusInteger" plusInteger#
  :: Int# -> ByteArray#
  -> Int# -> ByteArray#
  -> (# Int#, ByteArray# #)

So that's what I started implementing today, "foreign import prim". It needs a slight extension in the lexer, parser, type checker, desugarer, core->stg, and stg->cmm phases. That sounds like a lot but the changes in each bit are pretty small. As a feature it is very similar to foreign C calls and also to primops, so fortunately it can share most code with those existing features. So far it's going ok: I've got it producing convincing-looking core, stg and cmm code. Tomorrow I'll test it and review the design and changes with Simon Marlow.

If this works out ok then it should mean we're still using the same well-tested gmp binding code and without any extra marshaling overhead. Correctness testing is mostly covered by the existing GHC testsuite. We still want to check the performance of course. To that end, Daniel has been working on an Integer performance benchmark. He's tried it already using the simple pure-Haskell implementation of Integer. Apparently it does respectably but takes ages to calculate 10000 factorial.
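A benchmark of this kind can be sketched in a few lines. The following is not Daniel's benchmark, just a minimal illustration of timing a large factorial with whichever Integer implementation GHC was built with; the names factorial and digitCount are made up for this example.

```haskell
module Main where

import Control.Exception (evaluate)
import System.CPUTime (getCPUTime)

-- Exercises Integer multiplication heavily for large n.
factorial :: Integer -> Integer
factorial n = product [1 .. n]

-- Counting decimal digits forces the whole number to be computed
-- (and also exercises conversion of Integer to decimal).
digitCount :: Integer -> Int
digitCount = length . show

main :: IO ()
main = do
  start <- getCPUTime                              -- picoseconds
  digits <- evaluate (digitCount (factorial 10000))
  end <- getCPUTime
  putStrLn ("10000! has " ++ show digits ++ " digits")
  putStrLn ("CPU time: " ++ show ((end - start) `div` 1000000000) ++ " ms")
```

Note that the measured time includes converting the result to decimal, so a more careful benchmark would time the multiplication and the conversion separately.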

