Fixing Random, part 5

[Code is here.]

Last time on FAIC I implemented our first discrete distribution, the “choose an integer between min and max” distribution. I thought I might knock out a couple more of the easy discrete distributions. The easiest discrete distribution of all is of course the trivial distribution:

public sealed class Singleton<T> : IDiscreteDistribution<T>
{
  private readonly T t;
  public static Singleton<T> Distribution(T t) =>
new Singleton<T>(t);
  private Singleton(T t) => this.t = t;
  public T Sample() => t;
public IEnumerable<T> Support()
  {
  yield return t;
  }
  public int Weight(T t) =>
EqualityComparer<T>.Default
.Equals(this.t, t) ? 1 : 0;
  public override string ToString() =>
$”Singleton[{t}]”;
}

That is, the probability distribution where 100% of the time it returns the same value. You’ll also sometimes see distributions like this called the “Dirac delta” or the “Kronecker delta”, but we’re not doing quantum physics here, so I’m going to just call this the singleton distribution and be done with it.

Aside: I wish the Enumerable class had the corresponding method: take an item, return the sequence containing only that item. As we just saw, it’s easy to implement; because it is not in the framework, I’ve implemented it literally dozens of times! It is weird to me that the “extract the value of a singleton sequence” is a member of Enumerable, but the inverse operation is not.

Aside: This is the first distribution we’ve seen that does not necessarily have some sort of number as its output. You might be thinking that maybe our distribution type should never have been generic in the first place, if the only distributions I’m going to come up with are either numeric-valued or trivial. But worry not! In the next episodes we’ll see how we can naturally use more complex types in probability distributions.

Let’s look at another integer-valued distribution. The Bernoulli distribution considers the question “what if we flipped a possibly-unfair coin, and scored 1 for a head and 0 for a tail?” The parameter to the distribution is traditionally the probability of heads in our coin, between 0.0 and 1.0. However, as I noted last time, at least for now I’m going to try to avoid double weights. Instead, we’ll think of this distribution as “odds”. Instead of saying “0.7 chance of getting a 1”, we’ll say “we get three zeros for every seven ones”:

public sealed class Bernoulli :
IDiscreteDistribution<int>
{
  public static IDiscreteDistribution<int> Distribution(
int zero, int one)
  {
  if (zero < 0 || one < 0 || zero == 0 && one == 0)
throw new ArgumentException();
  if (zero == 0) return Singleton<int>.Distribution(1);
  if (one == 0) return Singleton<int>.Distribution(0);
  return new Bernoulli(zero, one);
  }
  public int Zero { get; }
  public int One { get; }
  private Bernoulli(int zero, int one)
  {
  this.Zero = zero;
  this.One = one;
  }
  public int Sample() =>
(SCU.Distribution.Sample() <=
((double)Zero) / (Zero + One)) ? 0 : 1;
  public IEnumerable<int> Support() =>
Enumerable.Range(0, 2);
public int Weight(int x) => x == 0 ? Zero : x == 1 ? One : 0;
  public override string ToString() =>
$”Bernoulli[{this.Zero}, {this.One}]”;
}

And sure enough, if we graph it out:

Bernoulli.Distribution(1, 3).Histogram()

We get what you’d expect: three times as many heads as tails:

0|*************
1|****************************************

Next time on FAIC: We’ll stop making new implementations of classic probability distributions and look at how to extend the ones we’ve got.

21 thoughts on “Fixing Random, part 5”

Is there a reason why `Singleton.Weight(T)` is implemented as `Support().Contains(t)`, instead of simply `this.t == t`?

Reply ↓

ericlippert on February 14, 2019 at 9:27 am said:

Try writing it the way you suggest and see what happens!

That said, your point is germane. I could have written EqualityComparer<T>.Default.Equals(this.t, t).

The real answer is: I copy-pasted the code from another class I had already written because I’m lazy.

Reply ↓
- Patrick Huizinga on February 14, 2019 at 9:33 am said:
  
  about == vs EqualityComparer, I had integers in mind, even though I registered the class was generic…
  
  Being lazy is a good reason though. 🙂
  
  Reply ↓
D. Ben Knoble on February 14, 2019 at 9:33 am said:

My guess: it relies on the interface more than the internals, meaning we could change the internals or the implementation Support without changing Weight

Reply ↓

I guess it’s technically true that System.Linq.Enumerable doesn’t have a method dedicated to the singleton enumeration. But, perhaps Microsoft decided that with the Repeat() method, that sufficed, and that it would clutter the API to add another method that was basically the equivalent to Repeat(x, 1).

Of course, that doesn’t explain why Empty() wound up being included, since one could produce the same result by calling Repeat(default(X), 0). Seems like a similar enough argument would apply.

But maybe that’s just where they decided to draw the line. After all, if a method to return a 1-element sequence was justified, maybe a method to return a 2-element sequence would be too. And a 3-element? Tuples as sequences anyone? It’s a slippery slope. 🙂

Reply ↓

Rhialto on February 14, 2019 at 11:20 am said:

My personal limit on these things is indicated by how I count: zero, one, many 🙂

Reply ↓
- Steve on February 18, 2019 at 10:33 am said:
  
  I feel like this is a troll, but maybe I’ve been reading too much Pratchett.
  
  Reply ↓
Adam M. on February 16, 2019 at 7:15 pm said:

The missing method that bugs me is IEnumerable.Concat(T). Why must I create some kind of sequence when I just want to append a single item? And also, why don’t we have “yield foreach” or “yield all”? 😛

Reply ↓
- mattzink on February 19, 2019 at 9:14 pm said:
  
  Append() does exactly that. while Prepend() adds its it to the front of the sequence.
  
  Reply ↓
  - jibalt on February 21, 2019 at 8:02 am said:
    
    Append and Prepend were added in .NET 4.7.1
    
    Reply ↓

I’m trying to copy the code out of the blog and run it in LINQPad, and apart from all the smart quotes, etc, causing grief, I can’t get it to compile. Is there a online repo that I could pull the code from to get it all in a compilable form?

Reply ↓

Sune on February 15, 2019 at 5:43 am said:

1. Convert EN DASH (U+2013) to HYPHEN-MINUS (HYPHEN-MINUS)
2. Convert LEFT DOUBLE QUOTATION MARK (U+201C) and RIGHT DOUBLE QUOTATION MARK (U+201D) to QUOTATION MARK (U+0022)
3. Convert LEFT SINGLE QUOTATION MARK (U+2018) and RIGHT SINGLE QUOTATION MARK (U+2019) to APOSTROPHE (U+0027)

4. Move using alias directives (using XXX = YYY) to ‘Additional Namespace Imports’ with the ‘using’ keyword removed (XXX = YYY)

Now it should compile with no issues

Reply ↓
- ericlippert on February 15, 2019 at 8:34 am said:
  
  WordPress automatically converts straight quotes to smart quotes, minus to en dash, and so on. It is infuriating. Sorry about that.
  
  Reply ↓
ericlippert on February 15, 2019 at 8:35 am said:

I’ve put the code up on GitHub and added a link. I’ll add a new branch each episode!

Reply ↓
- enigmativity on February 19, 2019 at 5:53 pm said:
  
  Thank you.
  
  Reply ↓

Pingback: Dew Drop – February 15, 2019 (#2900) | Morning Dew

Pingback: Fixing Random, part 6 | Fabulous adventures in coding

> I wish the Enumerable class had the corresponding method: take an item, return the sequence containing only that item.

Eric, isn’t that what the Take extension method does? (Specifically, enumerable.Take(1))?

Reply ↓

ericlippert on May 16, 2019 at 9:28 am said:

That turns a sequence into a shorter sequence; I want to turn an item not in a sequence into a sequence of one item. Enumerable.Repeat(item, 1) is the closest we’ve got.

Reply ↓

It seems to me that `new[] { singleItem }` is about as good as you can get converting a single item to an enumerable object. It’s only one line of code to make an explicit enumerator, but then you’d still be allocating an object of about the same size; the state machine isn’t free, and also lives on the heap. Are there reasons why constructing a single-element array is _not_ a good approach?

Specific to this code, are there any significant differences in performance or memory usage between this?:

public IEnumerable Support()
{
yield return t;
}

…and this?:

public IEnumerable Support() => new[] { t };

I made a naive performance comparison and the array appears to be marginally faster in practice. ¯\_(ツ)_/¯

Reply ↓

logiclrd on July 22, 2019 at 12:55 pm said:

Actually I ran the test further and it looks like it’s more than marginally better, I’m seeing about a 33% difference (that is to say, the state machine takes 133% the time of the array).

Reply ↓

Patrick Huizinga on February 14, 2019 at 9:17 am said:

Is there a reason why `Singleton.Weight(T)` is implemented as `Support().Contains(t)`, instead of simply `this.t == t`?

Reply ↓
- ericlippert on February 14, 2019 at 9:27 am said:
  
  Try writing it the way you suggest and see what happens!
  
  That said, your point is germane. I could have written EqualityComparer<T>.Default.Equals(this.t, t).
  
  The real answer is: I copy-pasted the code from another class I had already written because I’m lazy.
  
  Reply ↓
  - Patrick Huizinga on February 14, 2019 at 9:33 am said:
    
    about == vs EqualityComparer, I had integers in mind, even though I registered the class was generic…
    
    Being lazy is a good reason though. 🙂
    
    Reply ↓
- D. Ben Knoble on February 14, 2019 at 9:33 am said:
  
  My guess: it relies on the interface more than the internals, meaning we could change the internals or the implementation Support without changing Weight
  
  Reply ↓
pete.d on February 14, 2019 at 10:37 am said:

I guess it’s technically true that System.Linq.Enumerable doesn’t have a method dedicated to the singleton enumeration. But, perhaps Microsoft decided that with the Repeat() method, that sufficed, and that it would clutter the API to add another method that was basically the equivalent to Repeat(x, 1).

Of course, that doesn’t explain why Empty() wound up being included, since one could produce the same result by calling Repeat(default(X), 0). Seems like a similar enough argument would apply.

But maybe that’s just where they decided to draw the line. After all, if a method to return a 1-element sequence was justified, maybe a method to return a 2-element sequence would be too. And a 3-element? Tuples as sequences anyone? It’s a slippery slope. 🙂

Reply ↓
- Rhialto on February 14, 2019 at 11:20 am said:
  
  My personal limit on these things is indicated by how I count: zero, one, many 🙂
  
  Reply ↓
  - Steve on February 18, 2019 at 10:33 am said:
    
    I feel like this is a troll, but maybe I’ve been reading too much Pratchett.
    
    Reply ↓
- Adam M. on February 16, 2019 at 7:15 pm said:
  
  The missing method that bugs me is IEnumerable.Concat(T). Why must I create some kind of sequence when I just want to append a single item? And also, why don’t we have “yield foreach” or “yield all”? 😛
  
  Reply ↓
  - mattzink on February 19, 2019 at 9:14 pm said:
    
    Append() does exactly that. while Prepend() adds its it to the front of the sequence.
    
    Reply ↓
    - jibalt on February 21, 2019 at 8:02 am said:
      
      Append and Prepend were added in .NET 4.7.1
      
      Reply ↓
enigmativity on February 14, 2019 at 8:00 pm said:

I’m trying to copy the code out of the blog and run it in LINQPad, and apart from all the smart quotes, etc, causing grief, I can’t get it to compile. Is there a online repo that I could pull the code from to get it all in a compilable form?

Reply ↓
- Sune on February 15, 2019 at 5:43 am said:
  
  1. Convert EN DASH (U+2013) to HYPHEN-MINUS (HYPHEN-MINUS)
  2. Convert LEFT DOUBLE QUOTATION MARK (U+201C) and RIGHT DOUBLE QUOTATION MARK (U+201D) to QUOTATION MARK (U+0022)
  3. Convert LEFT SINGLE QUOTATION MARK (U+2018) and RIGHT SINGLE QUOTATION MARK (U+2019) to APOSTROPHE (U+0027)
  
  4. Move using alias directives (using XXX = YYY) to ‘Additional Namespace Imports’ with the ‘using’ keyword removed (XXX = YYY)
  
  Now it should compile with no issues
  
  Reply ↓
  - ericlippert on February 15, 2019 at 8:34 am said:
    
    WordPress automatically converts straight quotes to smart quotes, minus to en dash, and so on. It is infuriating. Sorry about that.
    
    Reply ↓
- ericlippert on February 15, 2019 at 8:35 am said:
  
  I’ve put the code up on GitHub and added a link. I’ll add a new branch each episode!
  
  Reply ↓
  - enigmativity on February 19, 2019 at 5:53 pm said:
    
    Thank you.
    
    Reply ↓
Pingback: Dew Drop – February 15, 2019 (#2900) | Morning Dew
Pingback: Fixing Random, part 6 | Fabulous adventures in coding
Rob on May 15, 2019 at 12:09 pm said:

> I wish the Enumerable class had the corresponding method: take an item, return the sequence containing only that item.

Eric, isn’t that what the Take extension method does? (Specifically, enumerable.Take(1))?

Reply ↓
- ericlippert on May 16, 2019 at 9:28 am said:
  
  That turns a sequence into a shorter sequence; I want to turn an item not in a sequence into a sequence of one item. Enumerable.Repeat(item, 1) is the closest we’ve got.
  
  Reply ↓
logiclrd on July 22, 2019 at 12:53 pm said:

It seems to me that `new[] { singleItem }` is about as good as you can get converting a single item to an enumerable object. It’s only one line of code to make an explicit enumerator, but then you’d still be allocating an object of about the same size; the state machine isn’t free, and also lives on the heap. Are there reasons why constructing a single-element array is _not_ a good approach?

Specific to this code, are there any significant differences in performance or memory usage between this?:

public IEnumerable Support()
{
yield return t;
}

…and this?:

public IEnumerable Support() => new[] { t };

I made a naive performance comparison and the array appears to be marginally faster in practice. ¯\_(ツ)_/¯

Reply ↓
- logiclrd on July 22, 2019 at 12:55 pm said:
  
  Actually I ran the test further and it looks like it’s more than marginally better, I’m seeing about a 33% difference (that is to say, the state machine takes 133% the time of the array).
  
  Reply ↓

Fabulous adventures in coding

Eric Lippert's blog

Fixing Random, part 5

21 thoughts on “Fixing Random, part 5”

Leave a reply to jibalt Cancel reply

Share this:

Related

21 thoughts on “Fixing Random, part 5”

Leave a reply to jibalt Cancel reply