[Code for this episode is here.]

So far in this series we’ve very briefly looked at continuous distributions on doubles, and spent a lot of time looking at discrete distributions with small supports. Let’s take a look at a completely different kind of distribution that doesn’t easily fit into our “small discrete” bucket but isn’t continuous either. Some motivating scenarios:

- Imagine you find yourself placed at random in a city and you want to take a stroll. At every intersection you make a random choice of which direction to go next. The random choice of starting points has a particular distribution, and the distribution of possible direction choices could be different at each intersection. The result of each stroll is a path through the city. What is the distribution of possible strolls?
- Imagine you find yourself at a random Wikipedia/tvtropes/whatever page, and you want to go down that rabbit hole. On each page you randomly select a link to another page within the site (if there is one). What is the distribution of possible sequences of pages?
- Imagine you are a gambler who starts with a certain random amount of cash in your pocket and a target for how much you want to win and how much you are willing to lose. If you’ve won or lost too much, you go home; otherwise, you make a random-sized bet on a slot machine with certain odds, where the distribution of bet sizes and machine choices is a function of how much money you have in your pocket right now. What is the distribution of possible sequences of cash in your pocket?

Each of these cases has a number of things in common:

- We start in a randomly chosen state.
- We randomly move to a new state *depending solely on our current state*, and *not* on any “memory” of how we got to that state.
- The new state might be a “stop state”, in which case the sequence of states is over.
- The question is: what is the distribution of *possible state sequences*?

A random process that has these properties is called a *Markov process*, after Russian mathematician Andrey Markov. Such processes are extremely useful and have been heavily studied for the last hundred years or so.

**Aside:** For this series I’m only going to look at Markov processes where each “state change” happens at no particular *time*; all we care about is the *sequence*. There is a whole subfield of studying Markov processes where we care about how long the gap is between events, and what the distribution of those gaps is; I’m not going to go there.

**Aside:** Note that there is no requirement in a Markov process that any of the distributions involved be discrete. The examples I gave all involved discrete quantities: intersections, web pages, dollar amounts. But you could certainly do a Markov process where you randomly chose a point on a line via some continuous distribution, and then chose the next point based on a continuous distribution that is a function of the current position. We will look at examples of such in later episodes; for now though we’ll stick to discrete examples.

**Aside:** Note that there is no requirement whatsoever that the sequence sampled from a Markov process be finite! It can be finite or infinite, depending on the details of the process.

**Aside:** Suppose we produce a sequence of dice rolls. We roll a die, and *no matter what we rolled*, we roll it again, every time. *Technically* that is a Markov process; the next roll depends on nothing, not even the previous roll. We will not be considering such “degenerate” Markov processes in this series; we’re interested in processes where the distribution of next states depends on the current state.

Let’s temporarily leave aside our discrete weighted distributions and go back to our underlying unweighted `IDistribution<T>` interface for this one. (Working out the “weight” associated with a particular sequence is an interesting problem, but not one that we’re going to cover in this series.) Here we have a distribution on *sequences* of `T`:

sealed class Markov<T> : IDistribution<IEnumerable<T>>
{
    private readonly IDistribution<T> initial;
    private readonly Func<T, IDistribution<T>> transition;
    public static Markov<T> Distribution(
            IDistribution<T> initial,
            Func<T, IDistribution<T>> transition) =>
        new Markov<T>(initial, transition);
    private Markov(
        IDistribution<T> initial,
        Func<T, IDistribution<T>> transition)
    {
        this.initial = initial;
        this.transition = transition;
    }
    public IEnumerable<T> Sample()
    {
        var current = this.initial;
        while (true)
        {
            if (current is Empty<T>)
                break;
            var s = current.Sample();
            yield return s;
            current = this.transition(s);
        }
    }
}
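The `Markov<T>` class leans on a few types built in earlier episodes of this series: the `IDistribution<T>` interface, the empty distribution, and the singleton distribution. For readers jumping in here, a rough reconstruction of those minimal pieces — my sketch of them, not necessarily the series’ exact code — looks like this:

```csharp
using System;

// A distribution is simply something we can sample from.
public interface IDistribution<T>
{
    T Sample();
}

// The empty distribution has nothing to sample; we use it as a "stop state".
// Attempting to sample it is an error.
public sealed class Empty<T> : IDistribution<T>
{
    public static readonly Empty<T> Distribution = new Empty<T>();
    private Empty() { }
    public T Sample() =>
        throw new InvalidOperationException("Cannot sample the empty distribution.");
}

// The singleton distribution always produces the same value.
public sealed class Singleton<T> : IDistribution<T>
{
    private readonly T t;
    public static Singleton<T> Distribution(T t) => new Singleton<T>(t);
    private Singleton(T t) => this.t = t;
    public T Sample() => t;
}
```

The series also has a coin-flip distribution `Flip()` and a `Select` projection combinator, used in the example below; those are omitted here for brevity.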

**Aside:** You probably noticed that the state transition function is just another likelihood function: given the current state, it tells you what next states are most likely.

Let’s look at a simplified version of our “gambler’s ruin” example.

- Our state is the current amount of money we have: an integer.
- We need a distribution on initial conditions. Let’s say we’ll start off with exactly `n` dollars every time, so that’s a singleton distribution.
- If the state ever gets to zero or to twice `n`, we quit. We represent quitting by producing an empty distribution; we cannot produce any more states because we cannot sample from the empty distribution!
- If we are not in a quitting state then we flip a coin. This causes our state to transition to either the current bankroll minus one, or plus one.

const int n = 5;
var initial = Singleton<int>.Distribution(n);
IDistribution<int> transition(int x) =>
    (x == 0 || x == 2 * n) ?
        Empty<int>.Distribution :
        Flip().Select(b => b ? x - 1 : x + 1);

All right, let’s take it for a spin:

var markov = Markov<int>.Distribution(initial, transition);
Console.WriteLine(markov.Sample().CommaSeparated());

The first time I ran this I got:

5,4,3,2,3,4,5,4,3,4,3,4,5,4,3,4,5,6,5,6,5,6,5,6,5,6,5,4,5,6,7,8,9,8,9,8,9,10

It took 38 flips to get from 5 to 10. (And we ended up back at the start nine times!)

**Exercise:** Should the length of this sequence surprise you? Is it longer than average, shorter than average, or does it seem like it should be a pretty typical sample from this distribution? The solution is below.

.

.

.

.

.

Let’s think about the possible lengths of sequences. A few things come to mind:

- The shortest possible games are 5, 4, 3, 2, 1, 0 and 5, 6, 7, 8, 9, 10, so we won’t expect to see anything below six.
- Every game must have an even number of items in the sequence; do you see why?
- The maximum length is… well, there is no maximum length. We could go 5, 6, 5, 6, 5, 6, … arbitrarily long before heading off to 10 or 0.

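The parity claim follows because every flip changes the parity of the bankroll, and we start at an odd number (5) but stop at an even one (0 or 10): the number of flips must therefore be odd, and the number of items in the sequence is one more than that, hence even. We can sanity-check both the parity claim and the minimum length empirically with a direct simulation of the coin-flip walk — a sketch that bypasses the distribution types entirely:

```csharp
using System;

static class ParityCheck
{
    public static void Run()
    {
        const int n = 5;
        var random = new Random();
        for (int trial = 0; trial < 10_000; trial += 1)
        {
            int x = n;
            int items = 1; // the initial state is part of the sequence
            while (x != 0 && x != 2 * n)
            {
                // A fair coin flip: step the bankroll down or up by one.
                x += random.Next(2) == 0 ? -1 : 1;
                items += 1;
            }
            if (items % 2 != 0 || items < 6)
                throw new Exception($"unexpected sequence length {items}");
        }
        Console.WriteLine("All sampled sequence lengths were even and at least 6.");
    }
}
```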
The easiest way to get a handle on typical sequence lengths is to just draw the histogram:

markov.Samples().Select(x => x.Count()).DiscreteHistogram()

 6|********************************
 8|**************************************
10|****************************************
12|************************************
14|********************************
16|*****************************
18|****************************
20|**************************
22|***********************
24|*******************
26|*******************
28|******************
30|****************
32|***************
34|***********
36|**********
38|**********
40|********
42|********
44|********
46|******
48|******
50|******
52|****
54|****
56|***
58|****
60|***
62|**
64|**
66|**
68|***
70|**
72|*

I cut it off there; the “long tail” in this particular run actually went out to 208!

We can see from the histogram that the *mode* is ten flips and *most* games end in fewer than 38 flips, but there is still a substantial percentage that take longer. We could work out exactly how much, but I think I will leave it there for now. You get the idea.
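If we did want to work out how much, a quick simulation gives a decent estimate; the helper below is my own sketch, not part of the series’ code. (Classical gambler’s-ruin theory says that a fair walk started at n with absorbing barriers at 0 and 2n takes n × n = 25 flips on average — so 38 flips is indeed a longer-than-average game.)

```csharp
using System;

static class LengthStats
{
    // Simulates the fair-coin gambler's walk many times and returns the
    // average sequence length (in items) and the fraction of games whose
    // sequence was longer than 38 items.
    public static (double mean, double fractionLong) Estimate(int trials)
    {
        const int n = 5;
        var random = new Random();
        long totalItems = 0;
        int longGames = 0;
        for (int trial = 0; trial < trials; trial += 1)
        {
            int x = n;
            int items = 1; // the initial state counts as an item
            while (x != 0 && x != 2 * n)
            {
                x += random.Next(2) == 0 ? -1 : 1;
                items += 1;
            }
            totalItems += items;
            if (items > 38)
                longGames += 1;
        }
        return ((double)totalItems / trials, (double)longGames / trials);
    }
}
```

With enough trials the mean should land near 26 items — 25 flips plus the initial state — with a substantial minority of games running longer than the 38-item sample above.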

We seem to be doing pretty well here; we’ve created an entire new class of distributions that we can explore using fluent programming in only a few lines of code.

**Next time on FAIC:** It takes only one Markovian monkey to produce an approximation of Hamlet.