Dealing with Asynchronous Exceptions during Resource Acquisition

Edsko de Vries, Duncan Coutts – Thursday, 28 August 2014

Introduction

Consider the following code: we open a socket, compute with it, and finally close the socket again. The computation happens inside an exception handler (try), so even when an exception happens we still close the socket:

example1 :: (Socket -> IO a) -> IO a 
example1 compute = do -- WRONG
  s <- openSocket 
  r <- try $ compute s
  closeSocket s
  case r of
    Left ex -> throwIO (ex :: SomeException)
    Right a -> return a

Although this code correctly deals with synchronous exceptions–exceptions that are the direct result of the execution of the program–it does not deal correctly with asynchronous exceptions–exceptions that are raised as the result of an external event, such as a signal from another thread. For example, in

example2 :: (Socket -> IO a) -> IO (Maybe a)
example2 compute = timeout someTimeout $ example1 compute

it is possible that the timeout signal arrives after we have opened the socket but before we have installed the exception handler (or indeed, after we leave the scope of the exception handler but before we close the socket). In order to address this we have to control precisely where asynchronous exceptions can and cannot be delivered:

example3 :: (Socket -> IO a) -> IO a
example3 compute =
  mask $ \restore -> do
    s <- openSocket 
    r <- try $ restore $ compute s
    closeSocket s
    case r of
      Left ex -> throwIO (ex :: SomeException)
      Right a -> return a

We mask asynchronous exceptions, and then restore them only inside the scope of the exception handler. This very common pattern is captured by the higher level combinator bracket, and we might rewrite the example as

example4 :: (Socket -> IO a) -> IO a
example4 = bracket openSocket closeSocket

Allowing asynchronous exceptions during resource acquisition

Suppose that we wanted to define a derived operation that opens a socket and performs some kind of handshake with the server on the other end:

openHandshake :: IO Socket
openHandshake = do
  mask $ \restore -> do
    s <- openSocket
    r <- try $ restore $ handshake s
    case r of
      Left  ex -> closeSocket s >> throwIO (ex :: SomeException)
      Right () -> return s

(These and the other examples can be defined in terms of bracket and similar, but we use mask directly so that it’s easier to see what is happening.) We might use openHandshake as follows:

example5 :: (Socket -> IO a) -> IO a
example5 compute = do
  mask $ \restore -> do
    s <- openHandshake 
    r <- try $ restore $ compute s
    closeSocket s
    case r of
      Left ex -> throwIO (ex :: SomeException)
      Right a -> return a

There are no resource leaks in this code, but there is a different problem: we call openHandshake with asynchronous exceptions masked. Although openHandshake calls restore before doing the handshake, restore restores the masking state to that of the enclosing context. Hence the handshake with the server cannot be timed out. This may not be what we want–we may want to be able to interrupt example5 with a timeout either during the handshake or during the argument computation.

Note that this is not a solution:

example6 :: (Socket -> IO a) -> IO a 
example6 compute = do
  mask $ \restore -> do
    s <- restore openHandshake -- WRONG
    r <- try $ restore $ compute s
    closeSocket s
    case r of
      Left ex -> throwIO (ex :: SomeException)
      Right a -> return a

Consider what might happen: if an asynchronous exception is raised after openHandshake returns the socket, but before we leave the scope of restore, the asynchronous exception will be raised and the socket will be leaked. Installing an exception handler does not help: since we don’t have a handle on the socket, we cannot release it.

Interruptible operations

Consider this definition from the standard libraries:

withMVar :: MVar a -> (a -> IO b) -> IO b
withMVar m io =
  mask $ \restore -> do
    a <- takeMVar m
    b <- restore (io a) `onException` putMVar m a
    putMVar m a
    return b

This follows almost exactly the same pattern as the examples we have seen so far; we mask asynchronous exceptions, take the contents of the MVar, and then execute some operation io with the contents of the MVar, finally putting the contents of the MVar back when the computation completes or when an exception is raised.

An MVar acts as a lock, with takeMVar taking the role of acquiring the lock. This may, of course, take a long time if the lock is currently held by another thread. But we call takeMVar with asynchronous exceptions masked. Does this mean that the takeMVar cannot be timed out? No, it does not: takeMVar is a so-called interruptible operation. From the Asynchronous Exceptions in Haskell paper:

Any operation which may need to wait indefinitely for a resource (e.g., takeMVar) may receive asynchronous exceptions even within an enclosing block, but only while the resource is unavailable. Such operations are termed interruptible operations. (..) takeMVar behaves atomatically when enclosed in block. The takeMVar may receive asynchronous exceptions right up until the point when it acquires the MVar, but not after.

(block has been replaced by mask since the publication of the paper, but the principle is the same.) Although the existence of interruptible operations makes understanding the semantics of mask harder, they are necessary: like in the previous section, wrapping takeMVar in restore is not safe. If we really want to mask asynchronous exceptions, even across interruptible operations, Control.Exception offers uninterruptibleMask.

Custom interruptible operations

So an interruptible operation is one that can be interrupted by an asynchronous exception even when asynchronous exceptions are masked. Can we define our own interruptible operations? Yes, we can:

-- | Open a socket and perform handshake with the server
--
-- Note: this is an interruptible operation.
openHandshake' :: IO Socket
openHandshake' = 
  mask_ $ do 
    s <- openSocket
    r <- try $ unsafeUnmask $ handshake s
    case r of
      Left  ex -> closeSocket s >> throwIO (ex :: SomeException)
      Right () -> return s

unsafeUnmask is defined in GHC.IO, and unmasks asynchronous exceptions, no matter what the enclosing context is. This is of course somewhat dangerous, because now calling openHandshake' inside a mask suddenly opens up the possibility of an asynchronous exception being raised; and the only way to know is to look at the implementation of openHandshake', or its Haddock documentation. This is somewhat unsatisfactory, but exactly the same goes for takeMVar and any other interruptible operation, or any combinator that uses an interruptible operation under the hood. A sad state of affairs, perhaps, but one that we don’t currently have a better solution for.

Actually, using unsafeUnmask is a bit too crude. Control.Exception does not export it, but does export

allowInterrupt :: IO ()
allowInterrupt = unsafeUnmask $ return ()

with documentation

When invoked inside mask, this function allows a blocked asynchronous exception to be raised, if one exists. It is equivalent to performing an interruptible operation, but does not involve any actual blocking.
When called outside mask, or inside uninterruptibleMask, this function has no effect.

(emphasis mine.) Sadly, this documentation does not reflect the actual semantics: unsafeUnmask, and as a consequence allowInterrupt, unmasks asynchronous exceptions no matter what the enclosing context is: even inside uninterruptibleMask. We can however define our own operator to do this:

interruptible :: IO a -> IO a
interruptible act = do
  st <- getMaskingState
  case st of
    Unmasked              -> act
    MaskedInterruptible   -> unsafeUnmask act
    MaskedUninterruptible -> act

where we call unsafeUnmask only if the enclosing context is mask, but not if it is uninterruptibleMask (TODO: What is the semantics when we nest these two?). We can use it as follows to define a better version of openHandshake:

-- | Open a socket and perform handshake with the server
--
-- Note: this is an interruptible operation.
openHandshake' :: IO Socket
openHandshake' = 
  mask_ $ do 
    s <- openSocket
    r <- try $ interruptible $ handshake s
    case r of
      Left  ex -> closeSocket s >> throwIO (ex :: SomeException)
      Right () -> return s

Resource allocation timeout

If we wanted to timeout the allocation of the resource only, we might do

example7 :: (Socket -> IO a) -> IO a
example7 compute = do
  mask $ \restore -> do
    ms <- timeout someTimeout $ openHandshake'
    case ms of
      Nothing -> throwIO (userError "Server busy")
      Just s  -> do r <- try $ restore $ compute s
                    closeSocket s
                    case r of
                      Left ex -> throwIO (ex :: SomeException)
                      Right a -> return a

Exceptions are masked when we enter the scope of the timeout, and are unmasked only once we are inside the exception handler in openHandshake'–in other words, if a timeout happens, we are guaranteed to clean up the socket. The surrounding mask is however necessary. For example, suppose we are writing some unit tests and we are testing openHandshake'. This is wrong:

example8 :: IO ()
example8 = do 
  ms <- timeout someTimeout $ openHandshake'
  case ms of
    Just s  -> closeSocket s
    Nothing -> return ()

Even if we are sure that the example8 will not be interrupted by asynchronous exceptions, there is still a potential resource leak here: the timeout exception might be raised just after we leave the mask_ scope from openHandshake but just before we leave the timeout scope. If we are sure we don’t need to worry about other asynchronous exceptions we can write

example8 :: IO ()
example8 = do
  s <- mask_ $ timeout someTimeout $ openHandshake'
  case ms of
    Just s  -> closeSocket s
    Nothing -> return ()

although of course it might be better to simply write

example8 :: IO ()
example8 = 
  bracket (timeout someTimeout $ openHandshake')
          (\ms -> case ms of Just s  -> closeSocket s
                             Nothing -> return ())
          (\_ -> return ())

Conclusions

Making sure that resources are properly deallocated in the presence of asynchronous exceptions is difficult. It is very important to make sure that asynchronous exceptions are masked at crucial points; unmasking them at the point of calling a resource allocation function is not safe. If you nevertheless want to be able to timeout resource allocation, you need to make your resource allocation function interruptible.

For completeness’ sake, there are some other solutions that avoid the use of unsafeUnmask. One option is to thread the restore argument through (and compose multiple restore arguments if there are multiple nested calls to mask). This requires resource allocations to have a different signature, however, and it is very error prone: a single mask somewhere along the call chain where we forget to thread through the restore argument will mean the code is no longer interruptible. The other option is to run the code that we want to be interruptible in a separate thread, and wait for the thread to finish with, for example, a takeMVar. Getting this right is however no easy task, and it doesn’t change anything fundamentally anyway: rather than using unsafeUnmask we are now using a primitive interruptible operation; either way we introduce the possibility of exceptions even in the scope of mask_.

Finally, when your application does not fit the bracket pattern we have been using (implicitly or explicitly), you may want to have a look at resourcet and pipes or conduit, or my talk Lazy I/O and Alternatives in Haskell.