Producing combinations, part five

In part three we enumerated all the combinations of a particular size from a sequence by observing that the sequence {50, 60, 70, 80, 90} had combinations of exactly three elements as follows:

{
                      // 50, 60, 70, 80, 90
    {50, 60, 70},     // T   T   T   F   F
    {50, 60, 80},     // T   T   F   T   F
    {50, 60, 90},     // T   T   F   F   T
    {50, 70, 80},     // T   F   T   T   F
    {50, 70, 90},     // T   F   T   F   T
    {50, 80, 90},     // T   F   F   T   T
    {60, 70, 80},     // F   T   T   T   F
    {60, 70, 90},     // F   T   T   F   T
    {60, 80, 90},     // F   T   F   T   T
    {70, 80, 90}      // F   F   T   T   T
}

The table of bits on the right hand side is true if the element is included in the combination and false if it is not. We recursively divided finding all combinations of a particular size up into two sub-problems: first we enumerated all the cases where the left bit was true, and then all the cases where the left bit was false. We used that insight to recursively construct every sequence of n Booleans where there were exactly k true values.

If you treat the chart above as bits in an integer with the least-significant bit on the right, we have the values 28, 26, 25, 22, 21, 19, 14, 13, 11, 7 — that is, we have produced the sequence in order of descending value when interpreted as bits this way.

I have in mind a slightly different order. What if we treated those Booleans as bits in an integer with the least-significant bit on the left, and then re-ordered them into ascending order by integer value? Here I’ve reversed the bits left-to-right in the chart, and reordered the rows into ascending order by bit value:

{                     // 90, 80, 70, 60, 50
    {50, 60, 70},     // F   F   T   T   T 
    {50, 60, 80},     // F   T   F   T   T
    {50, 70, 80},     // F   T   T   F   T
    {60, 70, 80},     // F   T   T   T   F
    {50, 60, 90},     // T   F   F   T   T
    {50, 70, 90},     // T   F   T   F   T
    {60, 70, 90},     // T   F   T   T   F
    {50, 80, 90},     // T   T   F   F   T
    {60, 80, 90},     // T   T   F   T   F
    {70, 80, 90}      // T   T   T   F   F
}

You’ll note that this order is very similar but not identical to our previous ordering.

Suppose we treat these not as sequences of bits, but as sets of bit numbers, where the low bit is bit zero, the next is bit one, and so on up to bit four:

{ 
    {50, 60, 70},     //     2 1 0
    {50, 60, 80},     //   3   1 0
    {50, 70, 80},     //   3 2   0
    {60, 70, 80},     //   3 2 1
    {50, 60, 90},     // 4     1 0
    {50, 70, 90},     // 4   2   0
    {60, 70, 90},     // 4   2 1
    {50, 80, 90},     // 4 3     0
    {60, 80, 90},     // 4 3   1
    {70, 80, 90}      // 4 3 2
}

Previously we noted that the job of producing all combinations is equivalent to producing all bit sequences with a given number of true bits; well, that is in turn equivalent to producing all sets of integers with a given number of elements and a certain bound. In this case, all the sets of integers that have exactly three elements, and are between zero and four inclusive. Moreover, the integers are indexes into the original sequence!

Let’s make a recursive method to produce these bit sets. The recursion is a pretty straightforward modification of our previous algorithm:

// Takes integers n and k, both non-negative.
// Produces all sets of exactly k elements consisting only of 
// integers from 0 through n - 1.
private static IEnumerable<TinySet> Combinations(int n, int k)
{
  // Base case: if k is zero then there can be only one set
  // regardless of the value of n: the empty set is the only set
  // with zero elements.
  if (k == 0)
  { 
    yield return TinySet.Empty;
    yield break;
  }

  // Base case: if n < k then there can be no set of exactly
  // k elements containing values from 0 to n - 1, because sets
  // do not contain repeated elements.

  if (n < k)
    yield break;

  // A set containing k elements where each is an integer from
  // 0 to n - 2 is also a set of k elements where each is an
  // integer from 0 to n - 1, so yield all of those.

  foreach(var r in Combinations(n-1, k))
    yield return r;

  // If we add n - 1 to all the sets of k - 1 elements where each
  // is an integer from 0 to n - 2, then we get a set of k elements
  // where each is an integer from 0 to n - 1.

  foreach(var r in Combinations(n-1, k-1))
    yield return r.Add(n-1);
}

Previously we produced a sequence of bools indicating whether a particular item has been chosen or not. Now we have a sequence of indicies, so it seems reasonable to require an indexed collection. Writing the code to deal efficiently with a non-indexed sequence is left as an exercise.

public static IEnumerable<IEnumerable<T>> Combinations<T>(
  this IList<T> items, int k)
{
  if (k < 0) 
    throw new InvalidOperationException();
  if (items == null)
    throw new ArgumentNullException("items");
  int n = items.Count;
  if (n > TinySet.Maximum) 
    throw new InvalidOperationException();
  return 
    from combination in Combinations(n, k)
    select (from index in combination select items[index]);
}

We can then run this just as we did before:

var sequence = new[] { 50, 60, 70, 80, 90 };
foreach(var combination in sequence.Combinations(3))
  Console.WriteLine(string.Join(",", combination));

And get, as we expect:

50,60,70
50,60,80
50,70,80
60,70,80
50,60,90
50,70,90
60,70,90
50,80,90
60,80,90
70,80,90

Notice how this is in a slightly different order than our original algorithm. The original algorithm produced everything that contained 50 first; this algorithm produces everything that contains 90 last.

Next time on FAIC: We’ll keep exploring recursive algorithms that use stacks, but this time we’ll solve a problem on graphs.

8 thoughts on “Producing combinations, part five

  1. Transforming [50,60,70,80,90] into [0,1,2,3,4] is just a “free theorem”.

    The code uses “positive power ordering” (i.e. with key(S) = $\sum_{k \in S} 2^k$) instead of “negative power ordering” (key(S) = $\sum_{k \in S} 2^{-k}$) because it inserts at the “top” (at index $n$) while stacks insert at the “bottom” (at index $0$).

  2. “that is in turn equivalent to producing all sets of integers with a given number of elements and a certain bound”
    That sounds a lot like taking all of the k-sized combinations oh the set of the first n natural numbers.

    “Moreover, the integers are indexes into the original sequence!”
    And this sounds like a bijection from the first n naturals to our original set.

    This all sounds like something about which a mathematician might (have) prove(n) something. Maybe that a bijection and taking subsets can be composed either way.

    • Indeed! I write this blog for an audience of computer programmers. Though I often discuss mathematical concepts, I try to avoid jargon — like “bijection” — unless it is necessary to get the concept across.

  3. Pingback: The Morning Brew - Chris Alcock » The Morning Brew #1724

  4. What’s wrong with doing the selection on the original sequence instead of first picking indices?

    IEnumerable<IEnumerable<T>> CombinationsHelper<T>(
      IList<T> items, 
      int k, 
      int currentIndex = 0) 
    {
      if (k == 0 || items.Length < k) 
      {
        yield return new List<T>();
        yield break;
      }
      for (var i = currentIndex; i < items.Length; ++i) 
      {
        yield return new List<T>() { items[currentIndex] }
          .Append(Combinations(items, k - 1, currentIndex + 1));
      }
    }
    

    I didn't test this but why would the approach be wrong?

  5. Pingback: Graph traversal, part three | Fabulous adventures in coding

  6. Pingback: Producing combinations, part four | Fabulous adventures in coding

Leave a reply to ericlippert Cancel reply