Fixing Random, part 5

[Code is here.]


Last time on FAIC I implemented our first discrete distribution, the “choose an integer between min and max” distribution. I thought I might knock out a couple more of the easy discrete distributions. The easiest discrete distribution of all is of course the trivial distribution:

public sealed class Singleton<T> : IDiscreteDistribution<T>
{
  private readonly T t;
  public static Singleton<T> Distribution(T t) =>
    new Singleton<T>(t);
  private Singleton(T t) => this.t = t;
  public T Sample() => t;
  public IEnumerable<T> Support()
  {
    yield return t;
  }
  public int Weight(T t) =>
EqualityComparer<T>.Default
.Equals(this.t, t) ? 1 : 0;

  public override string ToString() =>
    $”Singleton[{t}]”;
}

That is, the probability distribution where 100% of the time it returns the same value. You’ll also sometimes see distributions like this called the “Dirac delta” or the “Kronecker delta”, but we’re not doing quantum physics here, so I’m going to just call this the singleton distribution and be done with it.


Aside: I wish the Enumerable class had the corresponding method: take an item, return the sequence containing only that item. As we just saw, it’s easy to implement; because it is not in the framework, I’ve implemented it literally dozens of times! It is weird to me that the “extract the value of a singleton sequence” is a member of Enumerable, but the inverse operation is not.


Aside: This is the first distribution we’ve seen that does not necessarily have some sort of number as its output. You might be thinking that maybe our distribution type should never have been generic in the first place, if the only distributions I’m going to come up with are either numeric-valued or trivial. But worry not! In the next episodes we’ll see how we can naturally use more complex types in probability distributions.


Let’s look at another integer-valued distribution. The Bernoulli distribution considers the question “what if we flipped a possibly-unfair coin, and scored 1 for a head and 0 for a tail?” The parameter to the distribution is traditionally the probability of heads in our  coin, between 0.0 and 1.0. However, as I noted last time, at least for now I’m going to try to avoid double weights. Instead, we’ll think of this distribution as “odds”. Instead of saying “0.7 chance of getting a 1”, we’ll say “we get three zeros for every seven ones”:

public sealed class Bernoulli :
  IDiscreteDistribution<int>
{
  public static IDiscreteDistribution<int> Distribution(
    int zero, int one)
  {
    if (zero < 0 || one < 0 || zero == 0 && one == 0)
      throw new ArgumentException();
    if (zero == 0) return Singleton<int>.Distribution(1);
    if (one == 0) return Singleton<int>.Distribution(0);
    return new Bernoulli(zero, one);
  }
  public int Zero { get; }
  public int One { get; }
  private Bernoulli(int zero, int one)
  {
    this.Zero = zero;
    this.One = one;
  }
  public int Sample() =>
    (SCU.Distribution.Sample() <=
      ((double)Zero) / (Zero + One)) ? 0 : 1;
  public IEnumerable<int> Support() =>
    Enumerable.Range(0, 2);
  public int Weight(int x) => x == 0 ? Zero : x == 1 ? One : 0;

  public override string ToString() =>
    $”Bernoulli[{this.Zero}{this.One}]”;
}

And sure enough, if we graph it out:

Bernoulli.Distribution(1, 3).Histogram()

We get what you’d expect: three times as many heads as tails:

0|*************
1|****************************************

Next time on FAIC: We’ll stop making new implementations of classic probability distributions and look at how to extend the ones we’ve got.

21 thoughts on “Fixing Random, part 5

  1. Is there a reason why `Singleton.Weight(T)` is implemented as `Support().Contains(t)`, instead of simply `this.t == t`?

    • Try writing it the way you suggest and see what happens!

      That said, your point is germane. I could have written EqualityComparer<T>.Default.Equals(this.t, t).

      The real answer is: I copy-pasted the code from another class I had already written because I’m lazy.

      • about == vs EqualityComparer, I had integers in mind, even though I registered the class was generic…

        Being lazy is a good reason though. 🙂

  2. I guess it’s technically true that System.Linq.Enumerable doesn’t have a method dedicated to the singleton enumeration. But, perhaps Microsoft decided that with the Repeat() method, that sufficed, and that it would clutter the API to add another method that was basically the equivalent to Repeat(x, 1).

    Of course, that doesn’t explain why Empty() wound up being included, since one could produce the same result by calling Repeat(default(X), 0). Seems like a similar enough argument would apply.

    But maybe that’s just where they decided to draw the line. After all, if a method to return a 1-element sequence was justified, maybe a method to return a 2-element sequence would be too. And a 3-element? Tuples as sequences anyone? It’s a slippery slope. 🙂

  3. I’m trying to copy the code out of the blog and run it in LINQPad, and apart from all the smart quotes, etc, causing grief, I can’t get it to compile. Is there a online repo that I could pull the code from to get it all in a compilable form?

    • 1. Convert EN DASH (U+2013) to HYPHEN-MINUS (HYPHEN-MINUS)
      2. Convert LEFT DOUBLE QUOTATION MARK (U+201C) and RIGHT DOUBLE QUOTATION MARK (U+201D) to QUOTATION MARK (U+0022)
      3. Convert LEFT SINGLE QUOTATION MARK (U+2018) and RIGHT SINGLE QUOTATION MARK (U+2019) to APOSTROPHE (U+0027)

      4. Move using alias directives (using XXX = YYY) to ‘Additional Namespace Imports’ with the ‘using’ keyword removed (XXX = YYY)

      Now it should compile with no issues

  4. Pingback: Dew Drop – February 15, 2019 (#2900) | Morning Dew

  5. Pingback: Fixing Random, part 6 | Fabulous adventures in coding

  6. > I wish the Enumerable class had the corresponding method: take an item, return the sequence containing only that item.

    Eric, isn’t that what the Take extension method does? (Specifically, enumerable.Take(1))?

    • That turns a sequence into a shorter sequence; I want to turn an item not in a sequence into a sequence of one item. Enumerable.Repeat(item, 1) is the closest we’ve got.

  7. It seems to me that `new[] { singleItem }` is about as good as you can get converting a single item to an enumerable object. It’s only one line of code to make an explicit enumerator, but then you’d still be allocating an object of about the same size; the state machine isn’t free, and also lives on the heap. Are there reasons why constructing a single-element array is _not_ a good approach?

    Specific to this code, are there any significant differences in performance or memory usage between this?:

    public IEnumerable Support()
    {
    yield return t;
    }

    …and this?:

    public IEnumerable Support() => new[] { t };

    I made a naive performance comparison and the array appears to be marginally faster in practice. ¯\_(ツ)_/¯

    • Actually I ran the test further and it looks like it’s more than marginally better, I’m seeing about a 33% difference (that is to say, the state machine takes 133% the time of the array).

Leave a comment