Monads, part ten

Last time on FAIC I gave some examples of some simple “associate extra data with a value” monads. Though those are useful, the real power of the monad pattern is when it is used to produce objects that represent workflows on data. The best example of that I can give you in C# is the sequence monad, IEnumerable<T>. The whole point of LINQ, in fact, is to make it easier to construct the monadic workflows that are better known as “queries”.

I’ve often said that if I could tell all LINQ newbs only one thing about LINQ it is this: a query object represents a query, not the results of executing the query. Let’s take a look at the most basic example of that, the “select many” query.

Select many? You might think that select queries are the more basic query, but as we’ll see, select is just a special case of select many. I said before that ApplyFunction was actually a special case of ApplySpecialFunction. The correlation is not coincidental! Select is the ApplyFunction of the sequence monad.

Here’s an implementation of one of the overloads of SelectMany in LINQ-to-Objects: (With error handling removed for the sake of a more clear exposition.)

static IEnumerable<R> SelectMany<A, R>(
  this IEnumerable<A> sequence, 
  Func<A, IEnumerable<R>> function) 
{ 
  foreach(A outerItem in sequence) 
    foreach(R innerItem in function(outerItem)) 
      yield return innerItem; 
}

You should of course immediately recognize that signature, having seen it for the last half dozen or more episodes: SelectMany is the Bind aka ApplySpecialFunction of the sequence monad. And what does Bind do? It takes an existing workflow — the existing sequence — and a function, and it produces a new workflow that is the logical “binding” of the function onto the output of that workflow. Let’s consider an example:

static IEnumerable<int> Odd(int i) 
{ 
  if (i % 2 != 0) 
    yield return i; 
}

And now suppose we have an existing workflow that produces a sequence of ints:

IEnumerable<int> original = whatever;
IEnumerable<int> query = original.SelectMany(Odd);

What have we got? We’ve just bound the “where the integer is odd” operation to the end of an existing workflow, building a new query out of an old one. We could generalize this:

static IEnumerable<T> WhereHelper<T>(
  T item, 
  Func<T, bool> predicate) 
{ 
  if (predicate(item)) 
    yield return item; 
}
...
Func<int, IEnumerable<int>> odd = num => WhereHelper<int>(
  num,
  item => item % 2 != 0); 
IEnumerable<int> query = original.SelectMany(odd);

But that seems really clunky to use; we should abstract the whole thing away into a helper method:

static IEnumerable<T> Where<T>(
  this IEnumerable<T> items, 
  Func<T, bool> predicate) 
{ 
  return items.SelectMany(item => WhereHelper(item, predicate)); 
}
...
IEnumerable<int> query = original.Where(num => num % 2 != 0);

You see what we did there? We just implemented Where out of nothing but a call to SelectMany and a few helper methods. (And of course some implicit uses of CreateSimpleSequence; did you spot them?)

This is of course an insanely roundabout way to implement Where; you can take advantage of the features of C# to make a much shorter and more efficient implementation. But my point is that logically Where is just a fancy way to call SelectMany, and it does the same thing: binds a new operation onto the end of an existing workflow.

Of course the same is true of Select, which could be written entirely in terms of SelectMany:

static IEnumerable<R> SelectHelper<A, R>(
  A item, 
  Func<A, R> projection) 
{ 
    yield return projection(item); 
}
static IEnumerable<R> Select<A, R>(
  this IEnumerable<A> items, 
  Func<A, R> projection) 
{ 
  return items.SelectMany(item => SelectHelper(item, projection)); 
}

...
IEnumerable<int> query = original.Select(num => num + 100);

Again, no sensible person would actually implement Select like that because there are shorter and more efficient ways to do it in C#. (And now that we have Where, Select and SelectMany we can implement Join; a very inefficient Join to be sure, but we can do it. Doing so is left as an exercise.) But logically, Select is just a special way of calling SelectMany; if we didn’t have those more efficient mechanisms in C# then we could use this one. The same is true of all the query comprehensions that take a sequence and produce a new sequence; SelectMany is what enables you to bind any operation to the end of a monadic workflow and produce a new workflow.

Next time on FAIC we’ll explore the relationship between the bind operation on monads and the “query comprehension” syntax.

11 thoughts on “Monads, part ten

  1. First off: I really like this series, thanks for explaining monads in this down-to-earth style.

    I feel that the implementation of Where in terms of SelectMany needs to cheat, though – Apart from bind and return, it also needs the empty sequence. So it looks as if we need special monad-dependend extras – in this case, additional ways to create values in the monad – in order to actually get the full power of LINQ.

    The same seems to be true for the Nullable monad – bind and return are not particularly useful unless you can refer to the null value, because otherwise no function passed to bind could return it.

    Are there some general ideas about the kinds of functions/constants a monad needs to provide in order to get its “full potential”, or is this too specific for each monad to generalize?

    • Well, there are also additive monads. They have special “mzero” value which doesn’t change when binded to any function: bind(mzero, f) == mzero. And also, bind(m, (x => mzero) == mzero.

      Lists and Nullables are examples of additive monads: for lists, mzero is the empty list. For Nullable, it’s null.

      See, people have already thought about this, researched thoroughly and wrote many papers of various practical usefulness!

    • To be a monad, you only need unit and bind. I imagine there are many sub-types of monads, but monad is just the most general form that has useful properties in its own right. Both IEnumerable and Nullable have some commonality in that they have the notion of an “empty” instance, but not all monads would have that (OnDemand, for instance).

  2. So, looking at this part along with part 7, would it be fair to say that the whole point of monads is to be able to write a program using this style?

    // Take a bunch of Func<a, M<r>>
    //
    Func<A, M<B>> f1 = …;
    Func<B, M<C>> f2 = …;
    Func<C, M<D>> f3 = …;

    // Compose them to make a workflow
    //
    Func<A, M<D>> workflow1 = Compose(Compose(f1, f2), f3);

    //
    // And since workflow1 is just another Func<a, M<r>>
    //

    // …I can use workflow1 to create other workflows
    //
    Func<D, M<A>> feedback = …;
    Func<A, M<D>> workflow2 = Compose(Compose(workflow1, feedback), workflow1);

    // …or I can use workflow1 just like any other function
    //
    M<D> md = workflow1(a);

    // and if M happens to be IEnumerable
    //
    foreach (D d in md) {…}

    If so, then would it be okay to make my own lenient monad laws that goes like this?

    1) There must be a way to create a monad (CreateSimpleM from previous examples).

    2) There must be a way to create a Func<a, M<r>> by chaining a bunch of other Func<a, M<r>> (Compose from part 7). Which mechanisms are used (bind, fmap, whatever), or whether or not these mechanisms are standardized, is up to the makers and users of the monads. The only thing that matters is there’s a way to chain a bunch of Func<a, M<r>> to create a new Func<a, M<r>>.

    3) There should be a way to unwrap a monad (foreach for IEnumerable), if it makes sense for the user of the monad to be able to unwrap the monad.

    • so 1) says you need unit 2) says you need bind or something equivalent (e.g. fmap and join) so that 2 of the three elements, you also need a type constructor (but this is kind of implied by M being a generic class)

      Then you need to match 3 monad laws which is the ‘how they compose bit’, however you are kind of glossing over exactly how they compose, which is kind of the point of having the monad laws.

      3) is not a requirement of monads, it seems to be about comonads

      • About 3). For example, in Haskell there is no way to “unwrap” I/O monad, there is no function with “IO a -> a” signature… okay, there is, but it’s marked as unsafe. The runtime itself unwraps IO monad by evaluating it.

        However, you can unwrap Maybe or List, either via pattern-matching or, say, fromMaybe/fromJust and head/(!!) .

  3. Pingback: Range comprehensions with C++ lazy generators | a totally unnecessary blog

  4. Pingback: Monads, part eleven | Fabulous adventures in coding

Leave a comment