Fixing Random, part 6

[Code is here.]


Last time on FAIC I showed how we could implement more of the standard, straightforward probability distributions, like the Bernoulli distribution; the episode before I showed the standard discrete uniform distribution, also known as “roll an n-sided fair die”. But typically when we are working on a line-of-business program, we’re working on objects that are meaningful in that line of business, and not necessarily on integers. Just to pick a silly example, suppose I want to uniformly choose between a cat, a dog and a goldfish?

using SDU = StandardDiscreteUniform;

var
 animals = new List<Animal>()
  { new Cat(), new Dog(), new Goldfish() };

Traditionally, as I noted a while back, we’d write some mechanism-heavy code like:

var choice = animals[random.Next(0, animals.Length)];

Now, I sincerely hope you’re not satisfied with:

var choice = animals[
  SDU.Distribution(0, animals.Length-1).Sample()];

If I tried to push that nonsense on you, you’d be quite justified in concluding that this whole project of mine to motivate a better “Random” is a cure that is worse than the disease. As is often the case, the problem here is that we’ve failed to push the “mechanism” code into a helper class, so let’s do that:

// This is wrong!
public
 sealed class DiscreteUniform<T> :
  IDiscreteDistribution<T>
{
  private readonly SDU sdu;
  private readonly List<T> support;
  public static DiscreteUniform<T> Distribution(
      IEnumerable<T> items) =>
    new DiscreteUniform<T>(items);
  private DiscreteUniform(IEnumerable<T> items)
  {
    this.support = items.ToList();
    this.sdu SDU.Distribution(0support.Count  1);
  }
  public IEnumerable<T> Support() => support;
  public T Sample() => support[sdu.Sample()];
  public int Weight(T t) => this.support.Contains(t) ? 1 : 0;

  public override string ToString() =>
    $”DiscreteUniform[{string.Join(“,”, support)}]”;
}


Exercise: That code is wrong in a number of ways; what are they? Give it some thought and then scroll down.

.

.

.

.

.

.

  1. We do not handle the case where the sequence is empty, which should be an error, or a single item, which would be better represented as a Singleton<T>.
  2. A buggy or hostile caller could call Support(), cast it to List<T>, and then mutate the list. Defense in depth!
  3. Suppose the array contains two references to the same dog, and a cat and a fish. If we ask what the weight is for the cat, we get 1; if we ask what the weight of the dog is, we get 1, but the dog is twice as likely to be chosen.
  4. Also in that scenario: the support contains the same dog twice. We expect the support to contain exactly one of each possible value.

We’ll fix all these problems in our final implementation at the bottom of this post.


Suppose we’ve managed to fix the problems identified above. We now create a helper method:

public static IDiscreteDistribution<T> ToUniform<T>(
    this IEnumerable<T> items) =>
  DiscreteUniform<T>.Distribution(items);

And now our proposed code is much nicer:

var animals = new List<Animal>()
  { new Cat(), new Dog(), new Goldfish() };
var choice = animals.ToUniform().Sample();

That looks much nicer, but… I still don’t like it.

Oh, sure, I like the call site, with the beautiful, readable fluent extension methods and whatnot. What I don’t like is the fact that I just built yet another single-purpose wrapper class. This wrapper class is only good for making a uniform distribution of an arbitrary sequence. Suppose I have a non-uniform distribution of integers from 0 to n — say, a binomial distribution — and I want to map a list of animals to that distribution; am I going to have to write a “binomial distribution on sequence” class that duplicates almost all the logic of the wrapper class above?


Exercise: Implement the binomial distribution; this is the extension of the Bernoulli distribution where, again, we have weights for a possibly-unfair coin, but we also have a parameter n which is the number of coin flips to make. The result is the total number of heads. Sampling is straightforward, but can you see how to compute the weights?


This seems like an opportunity for a more sophisticated wrapper class, parameterized in the underlying distribution on integers. Let’s tweak the class we just wrote:

public sealed class IntegerDistributionWrapper<T> :
  IDiscreteDistribution<T>
{
  private readonly IDiscreteDistribution<int> d;
  private readonly List<T> items;
  public static IntegerDistributionWrapper<T> Distribution(
    IDiscreteDistribution<int> d,
    IEnumerable<T> items)
{

    // blah blah blah
  }
  private IntegerDistributionWrapper(
    IDiscreteDistribution<int> d,
    IEnumerable
<T> items)
  {
// you get the idea
  }
  public T Sample() => items[d.Sample()];
// and so on

I waved my hands a lot there because, though this is an improvement, I still don’t like it. We’re getting better, but I want more generality for less cost. What are we doing here really?

  • Sampling an integer from a distribution.
  • Projecting that integer onto a value taken from a list.

Why does the projection have to be “value taken from a list”? And for that matter, why does the value sampled from the underlying distribution have to be an integer to begin with? We can write almost the same code, but make it far more general:

public sealed class Projected<A, R> :
  IDiscreteDistribution<R>
{
  private readonly IDiscreteDistribution<A> underlying;
  private readonly Func<A, R> projection;
  private readonly Dictionary<R, int> weights;
  public static IDiscreteDistribution<R> Distribution(
    IDiscreteDistribution<A> underlying,
    Func<A, R> projection)
  {
    var result = new Projected<A, R>(underlying, projection);
    if (result.Support().Count() == 1)
      return Singleton<R>.Distribution(
        result.Support().First());
    return result;
  }
  private Projected(
    IDiscreteDistribution<A> underlying,
    Func<A, R> projection)
  {
    this.underlying = underlying;
    this.projection = projection;
    this.weights = underlying.Support().
      GroupBy(
        projection, 
        a => underlying.Weight(a)).
      ToDictionary(g => g.Key, g => g.Sum());
  }
  public R Sample() => projection(underlying.Sample());
  public IEnumerable<R> Support() => this.weights.Keys;
  public int Weight(R r) =>
    this.weights.GetValueOrDefault(r, 0);
}

Study that implementation and make sure it makes sense to you; the crux of the whole thing is in the constructor, where we compute the weights of the resulting distribution.

You’ve probably noticed that I’ve fixed all the problems identified above:

  • We return a singleton when possible
  • We don’t need to worry about the distribution being empty because presumably the underlying distribution is not empty
  • The support set is an immutable deduplicated keyset
  • We compute the weights up front.

Unfortunately this means O(support) extra storage, but I can live with that.


Exercise: there are a number of optimizations that could be made here for the common case where the number of “colliding” elements in the support is small or zero, to reduce the amount of extra storage. Try implementing some of them, see if you get a win.


Now that we have a projection wrapper, we can delete our discrete uniform wrapper and integer wrapper classes entirely. That means we have to rewrite our helper method:

public static IDiscreteDistribution<T> ToUniform<T>(
  this IEnumerable<T> items)
{
  var list = items.ToList();
  return Projected<int, T>.Distribution(
    SDU.Distribution(0, list.Count  1),
    i => list[i]);
}

But again, this seems like I’m missing some generality.

Hmm.

Maybe I should also write an extension method on distributions that makes a projected distribution. It’s an easy helper method after all:

public static IDiscreteDistribution<R> MakeProjection<A, R>(
    this IDiscreteDistribution<A> d,
    Func<A, R> projection) =>
  Projected<A, R>.Distribution(d, projection);

Wait a minute… that looks very familiar.

Scroll down when you see it.

.

.

.

.

.

.

.

 

Good heavens, we’ve re-invented Select. Again. 

public static IDiscreteDistribution<R> Select<A, R>(
    this IDiscreteDistribution<A> d,
    Func<A, R> projection) =>
  Projected<A, R>.Distribution(d, projection);

And now this helper can be rewritten into a call to Select:

public static IDiscreteDistribution<T> ToUniform<T>(
  this IEnumerable<T> items)
{
  var list = items.ToList();
  return SDU.Distribution(0, list.Count  1)
    .Select(i => list[i]);
}

But you know what this means: I have a Select method with the right signature, which means I can use my beloved query comprehension syntax! That last line could be

return
  from i in SDU.Distribution(0, list.Count  1)
  select list[i];

And now you know why, a few episodes back I said that I was not going to make IDistribution<T> an extension of IEnumerable<T>; doing so just causes confusion between the Select operation on distributions and the Select operation on sequences.


Aside: If you’ve been following along you’ve noticed that this implementation takes a list, then turns it into a list, and then turns it into a list again, and then turns it into a dictionary. This seems “not cheap”. And indeed it is not, but remember, this is pedagogical code, not industrial code. Were I writing this library for industry, I’d be optimizing it much more to avoid collection pressure.

The idea here is to sketch out what could be done, not exactly how we’d do it. When I think about performance in this series, I’m going to be concentrating on things like eliminating arbitrarily expensive loops, and not on pragmatic but mundane practicalities like eliminating sources of collection pressure.


Finally, let’s just make sure that everything is working as expected:

var cat = new Cat();
var dog = new Dog();
var fish = new Goldfish();
var animals = new List<Animal>() { cat, dog, dog, fish };
Console.WriteLine(animals.ToUniform().Histogram());
Console.WriteLine(animals.ToUniform().ShowWeights());

Sure enough, we get the correct distribution:

     Cat|*******************
     Dog|****************************************
Goldfish|*******************
     Cat:1
     Dog:2
Goldfish:1

Interesting! We started by trying to make a uniform distribution and ended up with a correct non-uniformly-weighted distribution. Can we do a better job of building such a distribution without having to make an array that has duplicates?


Next time on FAIC: we’ve already seen how to represent a “fair die roll” distribution and a Bernoulli distribution where we give the “odds” that either zero or one happens as a ratio of integers; we’ll extend those concepts to any number of weighted outcomes, and discuss what data structure we can use to efficiently sample that distribution.

 

Advertisements

4 thoughts on “Fixing Random, part 6

  1. “And now you know why, a few episodes back I said that I was not going to make IDistribution an extension of IEnumerable; doing so just causes confusion between the Select operation on distributions and the Select operation on sequences.”

    I don’t really understand this point. If we had made IDistribution extend IEnumerable (or just be IEnumerable and not bother with an explicit interface), then wouldn’t Select on IEnumerable and select on IDistribution be one and the same?

    • Select on a sequence returns a new sequence. But I’ve defined Select on a discrete distribution to return a new discrete distribution. A discrete distribution has weights and a support; a sequence does not.

      If all we cared about was the output of `Sample` then sure, a distribution is just an infinite sequence, and Select is just Select. The key point that I’m going to demonstrate in this series is that we can actually do math on distributions and get facts out. Look at what we just did: we computed that putting a projection on top of a uniform distribution produced a distribution where we get exactly twice as many dogs as cats.

      We could not have done that with just a sequence. With a sequence, we could enumerate a million items and estimate that there are probably twice as many dogs as cats. We just took a distribution, slapped a projection on it, and got out the other end a correct and exact description of the resulting distribution.

      As we’ll see in this series, we can do a lot more complicated operations than just a selection and get an exact answer out the back end, and it would be very difficult indeed to do so efficiently with sequences. Stick around; it’s going to be an interesting ride.

  2. Pingback: Dew Drop – February 20, 2019 (#2903) | Morning Dew

Leave a Reply to ericlippert Cancel reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s