Fixing Random, part 3

[Code is here.]


Last time on FAIC I described a first attempt of how I’d like to fix System.Random:

  • Make every method static and threadsafe
  • Draw a clear distinction between crypto-strength and pseudo-random methods

There were lots of good comments on ways to improve the API and implementation details further. However, even with those improvements, we’re still really just implementing slightly better versions of rand() from the 1970s. We can do better than that.

Let’s start by taking a step back and asking what we really want when we’re calling NextInt or NextDouble or whatever on a source of randomness. What we’re saying is: we have a source of random Ts for some type T, and the values produced will correspond to some probability distribution. We can express that very clearly in the C# type system:

public interface IDistribution<T>
{
  T Sample();
}

For this episode I’m just going to look at continuous probability distributions, which I’ll represent as IDistribution<double>. We’ll look at integer distributions and other possible values for type parameter T in the upcoming episodes.

Now that we have an interface, what’s the simplest possible implementation? We have a library that produces a uniform distribution of doubles over the interval [0.0, 1.0), which is called the standard continuous uniform distribution. Let’s make a little singleton class for that:

using SCU StandardContinuousUniform;
public sealed class StandardContinuousUniform :
  IDistribution<double>
{
  public static readonly StandardContinuousUniform
    Distribution new StandardContinuousUniform();
  private StandardContinuousUniform() { }
  public double Sample() => Pseudorandom.NextDouble();
}


Aside: For the purposes of this series I’m going to use pseudo-random numbers; feel free to sub in crypto-strength random numbers if you like it better.

Exercise: make this a non-singleton that takes a source of randomness as a parameter, and feel all fancy because now you’re doing “dependency injection”.


This implementation might seem trivial — and it is — but upon this simple foundation we can build a powerful abstraction.

Now that we have a type that represents distributions, I’d like to build a bridge between distributions and LINQ. The connection between probability distributions and sequences should be pretty obvious; a distribution is a thing that can produce an infinite sequence of values drawn from that distribution.

I’ll make a static class for my helper methods:

public static class Distribution
{
  public static IEnumerable<T> Samples<T>(
    this IDistribution<T> d)
  {
    while (true)
      yield return d.Sample();
  }

What does this give us? We instantly have access to all the representational power of the sequence operation library. Want a sum of twelve random doubles drawn from a uniform distribution? Then say that:

SCU.Distribution.Samples().Take(12).Sum()


Aside: You might be thinking “if there is such a strong connection between distributions and infinite sequences then why not cut out the middleman by making IDistribution extend the IEnumerable interface directly, rather than requiring a Samples() helper method?” That’s an excellent question; the answer will become clear in a few episodes.


For testing and pedagogic purposes I’d like to be able to very quickly visualize a distribution from the console or in the debugger, so here’s another couple extension methods:

public static string Histogram(
  this IDistribution<double> d, double low, double high) =>
d.Samples().Histogram(low, high); 

public static string Histogram(
  this IEnumerable<double> d, double low, double high)
{
  const int width = 40;
  const int height = 20;
  const int sampleCount = 100000;
  int[] buckets = new int[width];
  foreach(double c in d.Take(sampleCount))
  {
    int bucket = (int)
(
buckets.Length * (c  low) /
      (high  low));
    if (0 <= bucket && bucket < buckets.Length)
      buckets[bucket] += 1;
  }
  int max = buckets.Max();
  double scale =
    max < height ? 1.0 : ((double)height) / max;
  return string.Join(“”,
    Enumerable.Range(0, height).Select(
      r => string.Join(“”, buckets.Select(
        b => b * scale > (height  r) ? ‘*’ : ‘ ‘)) + “\n”))
    + new string(‘-‘, width) + “\n”;
}


Exercise: Rescue the above princess in a single LINQ expression without getting Jon to do it for you. (I haven’t actually tried this; I’d be interested to know if it is possible. 🙂 )


We can now do this:

SCU.Distribution.Histogram(0, 1)

and get the string:

**** ********** *    *   * * *******   *
****************************************
****************************************
****************************************
****************************************
****************************************
****************************************
****************************************
****************************************
****************************************
****************************************
****************************************
****************************************
****************************************
****************************************
****************************************
****************************************
****************************************
****************************************
----------------------------------------

Which obviously is the desired probability distribution histogram, so I’m pretty pleased.

Extension methods and fluent programming are nice and all, but what else does this get us? Well, we can very concisely produce derived probability distributions. For example, suppose we want the normal distribution:

using static System.Math;
using SCU = StandardContinuousUniform;

public sealed class Normal : IDistribution<double>
{
  public double Mean { get; }
  public double Sigma { get; }
  public double μ => Mean;
  public double σ => Sigma;
  public readonly static Normal Standard = Distribution(0, 1);
  public static Normal Distribution(
      double mean, double sigma) =>
    new Normal(mean, sigma);
  private Normal(double mean, double sigma)
  {
    this.Mean = mean;
    this.Sigma = sigma;
  }
  // Box-Muller method
  private double StandardSample() =>
    Sqrt(2.0 * Log(SCU.Distribution.Sample())) *
      Cos(2.0 * PI * SCU.Distribution.Sample());
  public double Sample() => μ + σ * StandardSample();
}

And of course we can graph it:

Normal.Distribution(1.0, 1.5).Histogram(4, 4)

                        ***             
                      ******            
                     ********           
                    **********          
                    ***********         
                   ************         
                  **************        
                  **************        
                 ****************       
                ******************      
                ******************      
               ********************     
              **********************    
             ************************   
             ************************   
           **************************** 
          ******************************
         *******************************
       *********************************
----------------------------------------

If you want a list with a hundred thousand normally distributed values with a particular mean and standard deviation, just ask for it:

Normal.Distribution(1.0, 1.5).Samples().Take(100000).ToList()

Hopefully you can see how much more useful, readable and powerful this approach is, compared to messing around with System.Random directly. We want to move the level of abstraction of the code away of the mechanistic details of loops and lists and whatnot; the code should be at the level of the business domain: distributions and population samples.


Exercise: Try implementing the Irwin-Hall distribution (hint: it’s already written in this episode!)

Exercise: Try implementing the Gamma distribution using the Ahrens-Dieter algorithm.

Exercise: Try implementing the Poisson distribution as an IDistribution<int>.


In the next few episodes of FAIC: we’ll stop looking at continuous distributions and take a deeper look at the use cases for our new interface when the type argument represents a small set of discrete value, like Booleans, small ints or enums.

Fixing Random, part 2

[Code is here.]


Last time on FAIC I complained bitterly about the shortcomings of System.Random. In this episode, I’ll take a stab at the sort of thing I’d like to see in a redesigned random library. This is the “fix the trivial problems” episode, and then we’ll go deeper from there.

The first thing I want is two static methods, one that gives me an actually-random integer from 0 to Int32.MaxValue, and one that gives me a random double from 0.0 (inclusive) to 1.0 (exclusive).  Not pseudo-random, but actually indistinguishable from a true random uniform distribution. Here’s an attempt:

using CRNG = System.Security.Cryptography.RandomNumberGenerator;
public static class BetterRandom
{
  private static readonly ThreadLocal<CRNG> crng =
    new ThreadLocal<CRNG>(CRNG.Create);
  private static readonly ThreadLocal<byte[]> bytes =
    new ThreadLocal<byte[]>(()=>new byte[sizeof(int)]);
  public static int NextInt()
  {
    crng.Value.GetBytes(bytes.Value);
    return BitConverter.ToInt32(bytes.Value, 0) & int.MaxValue;
  }
  public static double NextDouble()
  {
    while (true)
    {
      long x = NextInt()  0x001FFFFF;
      x <<= 31;
      x |= (long)NextInt();
      double n = x;
      const double d = 1L << 52;
      double q = n / d;
      if (q != 1.0)
        return q;
    }
  }
}

It’s pretty straightforward, but let’s take a quick look. First, all the state is thread-local, so we trade a small amount of per-thread overhead and per-call indirection for thread safety, which is probably a good tradeoff. We’re going to be using the same four-byte buffer over and over again, so I figured it is probably worthwhile to cache it and avoid the hit to collection pressure; however this is just a guess and I would want to verify that with empirical tests.

There are other possible performance problems here; for example, is it worthwhile to generate a few thousand random bytes at once, cache them, and then gradually use them up? Or is the per-call cost of GetBytes sufficiently low that this is not an issue? I’d also want to empirically check that.

I want only positive integers; clearing the top bit by and-ing with 0x7FFFFFF does that nicely without changing the distribution.

For a random double, we know that doubles have 52 bits of precision, so we generate a random 52 bit integer and make that the numerator of a fraction. We want to guarantee that 1.0 is never a possible output, so we reject that possibility; one in every few billion calls we’ll do some additional calls to NextInt. I can live with that.


Exercise: An alternative, and possibly better solution, would be to build the double from bits directly, rather than doing an expensive division. I have not actually written the code or done the math here; as an exercise, does doing so produce a better distribution in any way?


Now that we have a crypto-strength threadsafe random number library, we can build a cheap pseudo-random library out of it:

public static class Pseudorandom
{
  private readonly static ThreadLocal<Random> prng =
    new ThreadLocal<Random>(() =>
      new Random(BetterRandom.NextInt()));
  public static int NextInt() =>  prng.Value.Next();
  public static double NextDouble() =>  prng.Value.NextDouble();
}

We’ve solved two of the main problems with Random: historically it gave highly correlated values when called repeatedly from one thread, and it is not thread safe. We now have a single prng per thread, and it is seeded with a crypto-strength random seed, not the current time.

The problem of the candy-machine interface has gone away because I’ve eliminated that method entirely! We’ll come back to this point in a later episode.

I think this is already a major improvement, in that 99% of the questions about mis-uses of Random on StackOverflow would not exist if this had been the received randomness library in the first place, and I am very happy to learn that at long last, these problems have been fixed in the core implementation.

What about performance? The crypto RNG is a lot slower than the pseudo-RNG, and there is definitely overhead for the thread local infrastructure. But: who cares? Most programs would  not be observably slowed down by these improvements, and for the ones that are slowed down, you go right ahead and make a cheap, thread-unsafe version if you need it for performance.

We should always make good tradeoffs; if the cost of eliminating a whole class of bugs is a tiny performance hit, we should make that tradeoff. The entire premise of managed languages is that we trade a small amount of performance for improved developer productivity. That’s why we have things like a GC, immutable strings, checked array access, and so on; these all trade a small amount of performance for more safety, robustness and productivity. And that’s why we have unsafe pointers; if you need to abandon some safety to get performance, C# should let you do that. But we should default to safety!

A commenter noted last time that recent implementations of System.Random solve this problem using some similar techniques; it’s instructive to compare my four-line solution here to a solution actually engineered to be performant at scale. As always, remember that the code on my blog is for pedagogy and not intended to be an example of industrial-quality code. You’ll notice that for instance they make only the rarely-used parts of the algorithm thread-safe. That seems like a reasonable tradeoff between performance and safety.

Already I think we’re off to a good start, but we can do a lot better still. Like I said last time, these are the trivial problems. There are a lot more problems we could be solving. Two that come to mind:

  • The received PRNG is simply not very good in terms of its randomness; there are far better algorithms that produce harder-to-predict outputs with similar performance and similar amount of state.
  • A common use for PRNGs is to generate a random-seeming but actually-deterministic sequence for the purposes of randomizing a game. But the received implementation has no way to say “save the current state of the PRNG to disk and restore it later”, or any such thing.

I’m not going to address either of these problems in this series, though both are fascinating problems that deserve posts of their own. I’m going to take a different approach to improving how we deal with randomness; I want to look at the problem at a higher level of abstraction than the low-level details of how the random numbers are generated and what the state of the generator is. To really improve the state of the art of dealing with probability in C#, we’ll need some more powerful types.


Next time on FAIC: I’ll make some useful classes for common continuous probability distributions.