Haskell Applicative hacks

Safely discarding Applicative results

6 thoughts
last posted April 1, 2016, 1:19 a.m.
get stream as: markdown or atom

The problem

erikd on Freenode #haskell.au shares an interesting bug.

The relevant code is:

buildTable :: IO EvenCache
buildTable = do
  ht <- HT.new
  forM_ pairs $ \ (k,v) ->
    maybe (HT.insert ht k v) (const $ abort k) <$>
        HT.lookup ht k
  return ht

Can you spot the error?


The problem is in the loop body: <$> applies HT.insert and abort to the result of the HT.lookup action, but the resulting IO actions are then simply discarded by forM_ as values, without being used or executed.

This is similar to saying:

putStrLn <$> getLine  -- has type IO (IO ())

This action yields a separate IO action as its result. Executing it and discarding its result will end up executing only the getLine, and not the putStrLn.

Correcting this requires using join, or equivalently using =<< as the application operator instead of <$>:

putStrLn =<< getLine  -- has type IO ()

This combines the two actions into a single action, as expected.


This error is easy to introduce, especially in more complex code.

It's insidious when it happens: there will often be no warning sign or hard failure, only strange results or inexplicable behaviour down the line due to the actions and effects that have silently gone "missing".


Checks and warnings?

GHC has a warning for this. Enabling -Wall (or -fwarn-unused-do-bind) will complain whenever a do block discards a value non-explicitly:

ghci> do putStrLn <$> getLine; return ()


A do-notation statement discarded a result of type ‘IO ()’

Suppress this warning by saying ‘_ <- (<$>) putStrLn getLine’ or by using the flag -fno-warn-unused-do-bind

However, forM_ defeats this check by discarding all the loop's result values regardless of type. One may intend to discard only (), but when a bug like the above slips in, forM_ will just as happily discard IO () or any other type too, and the checker will be none the wiser.


Safer types

One solution to this is to have variants of the _ functions that are type-specialised to only accept () as the loop body's result:

traverse_' :: (Applicative f, Foldable t)
           => (a -> f ()) -> t a -> f ()
traverse_' = traverse_

for_' :: (Applicative f, Foldable t)
      => t a -> (a -> f ()) -> f ()
for_' = for_

mapM_' :: (Monad m, Foldable t)
       => (a -> m ()) -> t a -> m ()
mapM_' = mapM_

forM_' :: (Monad m, Foldable t)
        => t a -> (a -> m ()) -> m ()
forM_' = forM_

These make it explicit that the loop should have no result, and makes it a type error to accidentally introduce a non-() result.



Erik proposed this change to libraries@haskell.org:

"A better type signature for forM_" (continued here)