Last time on FAIC I showed how you could produce the `m`

th permutation of `n`

elements, where `m`

is a number between zero and `n!-1`

that chooses from the `n!`

possible permutations. There's a very interesting way to represent numbers that are always less than `n!`

for a given `n`

that I thought I might talk about today.

First off, a quick refresher on how we represent numbers textually in the first place. A four-digit number in base ten, say 2345, is nothing more than a compact way of writing

`2 x 10`

^{3} + 3 x 10^{2} + 4 x 10^{1} + 5 x 10^{0}

If we wanted to write that number in a different base, we could. For example, that quantity in base eight is:

`4 x 8`

^{3} + 4 x 8^{2} + 5 x 8^{1} + 1 x 8^{0}

So in base eight that would be 4451_{8}. (Conventionally the subscript indicates the base; base 10 is assumed if not present.)

There's a fundamental rule here that often goes unmentioned because it is obvious: **the digits must be smaller than the base**.^{1} If we allowed the digits 0 through 9 in numbers written in base eight then we would be in the unfortunate situation that 9_{8} == 11_{8}. **We want the representation of a given integer to be unique.** By requiring that each digit be smaller than the base, we obtain the property that the largest number with n digits is one smaller than the smallest number with n+1 digits; 999 is one less than 1000, 777_{8} is one less than 1000_{8}, and so on.^{2}

Hold onto your butts because now is the bit where things get a bit weird. Who says that the "base" has to be the same for every digit? Who says each digit has to be multiplied by a power? We can instead say that the `n`

th digit is between zero and

, and multiplied by `n`

`n!`

:

`2345`

_{10} = 3 x 6! + 1 x 5! + 2 x 4! + 2 x 3! + 2 x 2! + 1 x 1! + 0 x 0!

That is, we can write the decimal number 2345 in **factoradic notation **as 3122210_{!}. This is simply *another way of uniquely notating every integer*.^{3} Just as any integer between zero and 10^{n}-1 can be represented by `n`

decimal digits, any integer between zero and `n!-1`

can be** uniquely **represented by n factoradic digits.^{4}

Suppose we wanted to generate a random `BigInteger`

between zero and `n!-1`

. All we need to do is generate the `n`

digits of a number in factoradic notation, and we're set:

// Produce a random number between 0 and n!-1 static BigInteger RandomFactoradic(int n, Random random) { BigInteger result = 0; BigInteger radix = 1; for (int i = 1; i < n; ++i) { // We need a digit between 0 and i. result += radix * random.Next(i + 1); radix *= (i + 1); } return result; }

Which means that we can generate a random permutation! Just generate a random number between 0 and `n!-1`

, that gives you back a `BigInteger`

, and you can feed that into the permutation generator we made last time. Awesome.

Hold on just a minute here. Something about this is looking really familiar. Where have I seen this technique for generating a random permutation before?

static void FischerYatesShuffle(T[] array, Random random) { for (int i = array.Length - 1; i > 0; i -= 1) { int item = random.Next(i + 1); T temp = array[i]; array[i] = array[item]; array[item] = temp; } }

This is the standard algorithm for uniformly shuffling an array, and it also chooses a sequence of random numbers `n`

or less, `n-1`

or less, and so on, to produce a random permutation. And in fact that's how we deduced that there were `n!`

permutations in the first place: you choose from `n`

items the first time, `n-1`

items the second time, and so on. So really the algorithm I came up with today is in a sense equivalent to the Fischer-Yates shuffle; both shuffles choose a sequence of numbers that could be digits in a factoradic representation of an integer less than `n!`

and use that to produce a permutation. Precisely which permutation they choose based on a given sequence is of course different, but ultimately they are doing the same thing: choosing an ordering for all the permutations and then picking one out based on an index into that ordering.

**Next time on FAIC:** I see my picture on the cover of a magazine. Then we'll resume this series, with the observation that these implementations lack certain properties that you might want in a high-grade shuffle. We'll explore those properties.

(I am in San Francisco today at Coverity's head office; this posting was pre-recorded.)

- An equally obvious property is that every digit smaller than the base must be legal; otherwise we end up in the opposite situation of having some integers with no representation in the notation. ↩
- Proving this fact for arbitrary base is left as an exercise. ↩
- Opinions vary as to whether you should put the final digit zero on there; since it cannot be anything other than zero it seems redundant. However I'll stick with it for the purposes of this blog post. ↩
- And notice that the property I mentioned before holds: 43210
_{!}is one less than 100000_{!}. ↩

> An equally obvious property is that every digit smaller than the base must be legal

What about balanced ternary? Its base is 3, but it doesn't have the digit 2, yet it still works.

"Balanced ternary" is of course not base 3. It is ... uhh ... "balanced ternary." A way of representing all integers, both positive and negative, without a negative sign. It does have 3 symbols that represent (-1, 0, 1) and the positions do represent powers of 3, but it is not the same as "base 3"

Does this mean that there are an unbounded number of factoradic digits?

Since say 11!-1 is equal to A9876543210!

... which makes an interesting problem when trying to "write down" the higher digits, so you probably would revert to a "normal" base-n notation (with a separator between the digits)

Just don't try to write the digits in factoradic notation, too

Why not write the digits in factoradic notation? Challenge Accepted!

(eschewing the terminal zero, using comma for place seperators)

11! - 1 == (120,111,110,101,100,21,20,11,10,1)

(all numbers on the left side are base 10, all numbers on the right, between commas are in factoradic, and the number as a whole is also factoradic. This is factoradic factoradic notation!

Pingback: The Morning Brew - Chris Alcock » The Morning Brew #1346

Ok, so we can represent any integer between 0 and n!-1 using n digits? Like, 50! has about 65 digits, but we would need only 50 digits to represent it? That's awesome!

Do you think that we could use this notation to compress binary files? I mean, since we can interpret large chuncks of bits as integers and apply this factoradic notation to it, having an infinite amount of artificial digits that we can create (kind of a dictionary). Would that be possible?

No, there's no compression technique here. The problem is: suppose you convert a binary file from it's current format, which is essentially a number in base 256, into a number in factoradic notation. Now what bytes are you going to write back out?

We know ahead of time that your proposed compression scheme is not actually a compression scheme because it does not have the key characteristic of a compression scheme. Suppose you interpret a binary file as nothing more than a huge number. Compression schemes work based on the observation that *some numbers are much more commonly encountered than others*. Those commonly encountered numbers are mapped to *smaller* numbers, which means that the uncommonly-encountered numbers must be mapped to *larger* numbers. Some documents get larger when you "compress" them; the trick is to choose a compression algorithm cleverly so that the documents that get larger are extremely rare because the algorithm for identifying common documents is so good.

Your proposal says nothing about identifying common numbers, so it cannot be a compression scheme.

Hi Eric! Thank you. You're absolutely right. I've made a little experiment this morning just to come to the same conclusion that you just wrote above. There's no compression when representing factoradic numbers in base 2.

The bottom line is, as you said: there's no identification of common numbers. In other words, and I apologize because this is kind of obvious, there's no way to uniquely represent n bits with less than n bits, unless some identification of common numbers/patterns is applied (RLE, Huffman, etc).

Nice property to know about factorial, but I dont see what is gained this way as opposed to calculating n! and then generating a random number within that limit. Can someone point me to right direction?