Graph traversal, part one

There are lots of topics in both practical and theoretical computer science that deal with graphs, lots of different kinds of graphs, lots of ways to represent a graph as a data structure, and lots of operations you’d want to perform on such a data structure. Like, “is this node in this graph?” or “is there any path from this node to that node in this graph?” and so on. Programming with graphs is a huge topic. In this series I want to explore a tiny subset of those topics. Specifically:

  • How can we create an immutable data structure which represents a directed graph with labels on the edges?
  • Given a directed acyclic graph (DAG), what are all the possible edge traversals through that graph starting from a particular node?

Continue reading

Producing combinations, part four

Last time I showed how to use an immutable stack of Booleans to act as flags indicating whether a particular element of a sequence should be chosen as one of the combinations of exactly k elements. A reasonable criticism of this approach is that when you have a stack containing, say, thirty flags, each of which is a managed object and contains a Boolean and a reference, then we’ve overspent by a large factor; thirty bits could fit into a single int.

Continue reading

Producing combinations, part three

All right, we have an immutable stack of Booleans and we wish to produce all such stacks of size n that have exactly k true elements. As always, a recursive algorithm has the following parts:

  • Is the problem trivial? If so, solve it.
  • Otherwise, break the problem down into one or more smaller problems, solve them recursively, and aggregate those solutions into the solution of the larger problem

Let’s start with a signature. What do we have? integers n and k. What do we want? A sequence of Boolean stacks such that the size is n, and there are k bits turned on. We therefore require that n and k both be non-negative, and that n be greater than or equal to k. Continue reading

Producing combinations, part two

Last time we saw that producing all the subsequences of a given size k from a given sequence of size n is essentially the same problem as computing all the sequences of n Booleans where exactly k of them are true. How do we do that?

An approach that I often see is “enumerate all sequences of n Booleans, count the number of on bits, and discard those which have too many or too few”. Though that works, it’s not ideal. Suppose we are seeking to enumerate the combinations of 16 items chosen from a set of 32. There are over four billion possible combinations of 32 bits, and of those over three billion of them have more or fewer than 16 true bits, so that’s a lot of counting and discarding to do. We can do better! To do so, we’ll use a combination of all my favourite techniques:

  • Immutable data structures
  • Abstract classes with derived nested classes
  • Recursion

Long-time readers of this blog will have seen me use these techniques before, but for new readers, a quick introduction is in order. Continue reading

Confusing errors for a confusing feature, part three

One of the nice things about a project as large as the Roslyn project is that you have an opportunity to really think hard about your past mistakes and hopefully fix them. When I was working on getting these error messages reported in Roslyn I realized that trying to match exactly the behavior of the original compiler would be actively making the world a worse place, so I took a big step back and started sending a lot of emails to Mads, Neal, Anthony and the rest of the Roslyn gang to try and get a better design worked out. All the godawful nonsense I told you about in the previous two episodes will be fixed in Roslyn.

Continue reading