The dedoublifier, part one

Good Monday morning everyone; I hope my American readers had a lovely Thanksgiving. I sure did!

Well, enough chit-chat; here’s a problem I was pondering the other day. We know that doubles introduce some “representation error” when trying to represent most numbers. The number 3/4, or 0.75, for example, can be exactly represented by a double, but the number 3/5, or 0.6, cannot.

To digress for a moment with a brief refresher: a double can be thought of as a fraction where the numerator is a 52-bit integer and the denominator is a largish power of two. For the exact details, see the Wikipedia page on the subject or my old articles. Since 3/4 can clearly be represented by an integer divided by a power of two, it can be represented exactly. But there is no way to make 3/5 come out to have a power of two on the bottom.
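
To see the difference concretely, here is a tiny sketch (just an illustration for this post, not part of the helper we are about to build) that prints each value using the “G17” round-trip format, which shows enough digits to distinguish a double from its neighbors:

using System;

class RepresentationErrorDemo
{
  static void Main()
  {
    // 3/4 has a power of two in the denominator, so the double is exact.
    Console.WriteLine(0.75.ToString("G17")); // 0.75
    // 3/5 does not; the nearest double is just a hair below 0.6.
    Console.WriteLine(0.6.ToString("G17"));  // 0.59999999999999998
  }
}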

Now, fortunately we have conversions from double to decimal, and from double to base-ten strings, that allow us to easily answer the question “what decimal fraction is this double close to?” The question I am pondering is: suppose we have a double which approximates some weird fraction with neither a power of two nor a power of ten in the denominator, like 33/17 or 85/97 or whatever. Is there some way to easily get that fraction back out of the double?
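
To make the puzzle concrete, here is another little sketch; it shows the problem, not the solution, which is what the rest of this series is about:

using System;

class FractionPuzzle
{
  static void Main()
  {
    double q = 33.0 / 17.0;
    Console.WriteLine(q.ToString("G17")); // 1.9411764705882353
    // The puzzle: given only q, recover the numerator 33 and the denominator 17.
  }
}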

We’ll do this in four parts. In this first part I’ll make a new helper class to extract the parts of a double; in the second part I’ll make my own rational number type; and we’ll finish up by proposing some algorithms that search the space of rational numbers for the one that best matches a given double, and then implementing one of them.


Long-time readers will know that I’ve solved the first problem before, back in 2011. But today I am excited to try out some of the code-shortening features of C# 6.0. So let’s get right into it:

struct DoubleHelper
{
  public DoubleHelper(double d)
  {
    this.RawBits = (ulong)BitConverter.DoubleToInt64Bits(d);
  }

  public ulong RawBits { get; }
  // RawSign is the raw sign bit: 1 if negative or negative zero, 0 if positive or positive zero
  public int RawSign => (int)(RawBits >> 63);
  public int RawExponent => (int)(RawBits >> 52) & 0x7FF;
  public long RawMantissa => (long)(RawBits & 0x000FFFFFFFFFFFFF);
  public bool IsNaN => RawExponent == 0x7ff && RawMantissa != 0;
  public bool IsInfinity => RawExponent == 0x7ff && RawMantissa == 0;
  public bool IsZero => RawExponent == 0 && RawMantissa == 0;
  public bool IsDenormal => RawExponent == 0 && RawMantissa != 0;

  // Sign is 1 if positive, -1 if negative, 0 if zero
  public int Sign => IsZero ? 0 : 1 - RawSign * 2;
  public long Mantissa => IsZero || IsDenormal || IsNaN || IsInfinity ?
    RawMantissa :
    RawMantissa | 0x0010000000000000;

  public int Exponent
  {
    get
    {
      if (IsZero)
        return 0;
      else if (IsDenormal)
        return -1074;
      else if (IsNaN || IsInfinity)
        return RawExponent; 
      else
        return RawExponent - 1075;
    }
  }
}

You know, I was initially skeptical of the => syntax for simple read-only properties, but I really like this. With just these few lines of code I can now analyze a double both in terms of its raw bit fields and in terms of the meaning of the sign, exponent and mantissa parts.
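
Here is a quick usage sketch, assuming the struct above is in the same file; the expected outputs are the values for 0.6 worked out by hand:

using System;

class DoubleHelperDemo
{
  static void Main()
  {
    var h = new DoubleHelper(0.6);
    Console.WriteLine(h.Sign);     // 1
    Console.WriteLine(h.Mantissa); // 5404319552844595
    Console.WriteLine(h.Exponent); // -53
    // Sign * Mantissa * 2^Exponent is 5404319552844595 / 2^53, which is the
    // value just a hair below 0.6 that the double actually stores.
  }
}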

Next time on FAIC: we’ll use this little helper to implement arbitrary precision rational numbers.

30 thoughts on “The dedoublifier, part one”

  1. Sign => IsZero ? 0 : 1 - RawSign * 2

    Omg, Eric… playing with magic numbers there!

    May I suggest: -RawSign or RawSign == 1 ? -1 : 1.

    • Neither of those has the correct semantics. The semantics of Sign should be -1 for negative, 0 for zero and 1 for positive. The raw sign bit is 1 for not-positive and 0 for not-negative. So I have to have a test for zeroness in there somewhere.

      Now, IsZero ? 0 : RawSign == 1 ? -1 : 1 would be correct, but I think that is hardly easier to read.

      Regardless, the code could use some comments as this point is clearly confusing.

  2. I think the numerator in your analogy is actually 53 bits due to the hidden 1 bit prefixing the mantissa.

    I had to write code to convert doubles to/from decimal and string. It’s surprisingly hard to do both correctly and efficiently. (I gave up and just went the slow and correct route.) You’d expect such a fundamental problem to have been solved decades ago, but significant advances were made as recently as 2010.

    Anyway, while we’re talking about C# features, one of the things I think C# really messed up on is not having a “natural int” type. You’ve got stuff like Array.Length and Array.LongLength, with Length being “wrong” in 64-bit and LongLength being unnecessarily slow in 32-bit. All ICollection-derived classes only support 32-bit lengths. Stream.Read/Write only work with 32-bit indexes. Etc etc. There are so many 32-/64-bit problems in core .NET APIs. The CLR has a natural int type. It’s the perfect match for array indexes, etc! But we can’t use it. IntPtr exists but isn’t a normal numeric type… /rant

    • The more C++ code I write, the more the lack of an equivalent to size_t bothers me in C#. It just makes so much sense to have a type that is explicitly tied to the system word size!

      IntPtr/UIntPtr seem perfect for this. I never understood what the rationale was for excluding the usual arithmetic operators on those types.

      • In the way C# is supposed to be used, a native int makes no sense. If you need more than 2^31 elements in 64-bit, you may need more than that in 32-bit as well; it is not about how much memory your computer has, since a collection may load items from disk or may just be a function that generates the elements on the fly.

        But really few use cases actually need collections with more than 2^31 elements, so I can understand the choice the .NET designers made.

        • I wonder how often it really makes sense to have more than 2^31 discrete items, or any single homogeneous object larger than 2^31 bytes? In most cases, I would think that code which hits either limit would do well to either aggregate or subdivide things to bring the number of items and their size closer together.

    • Array.Length is only “wrong” in 64-bit if you use multidimensional arrays and turn on support for objects greater than 2 GB. This is true as of .NET 4.5; before that there was no support for objects greater than 2 GB at all. It remains true through .NET 4.6.2. I suppose it might change in the future, but I tend to doubt it.

      The max lengths of other collections following the max length of a single-dimensional array makes reasonable sense. The need for larger collections is not actually terribly common, so slowing down the common case to support them is questionable.

      The stream Read and Write methods you mention refer to offsets into a single-dimensional array, so using a long there would not be beneficial in any way. Streams themselves use a long for their Position property to fully support large streams.

    • I would suggest that “natural integer” types should have 31 or 63 value bits, with one padding bit. While it’s useful to have types that behave as algebraic rings of integers mod 2^n (which is how C treats unsigned types that are not smaller than “int”), subtracting the long natural number 5 from the long natural number 3 should yield the long integer -2, not 18446744073709551614, and not an exception.

      • If C taught us one thing it is that implicit conversions from signed to unsigned or vice versa are an incredibly bad idea.

        Also, how would you propose that to work anyhow? Should unsigned x - unsigned y really result in a signed result? (Which would then imply that x - (unsigned)0 != x; now isn’t that confusing?)

        Unsigned numbers in general just introduce many complications with very little benefit. For the vast majority of code, not having any unsigned numbers would be just fine (hell, I’ve worked on compilers in Java, and having no unsigned numbers was pretty much a non-issue with some helper functions).

  3. Pingback: The Morning Brew - Chris Alcock » The Morning Brew #1980

  4. I seem to have missed a language feature somewhere. Why is RawBits, which has no setter, assignable? I thought it was a typo, but then I compiled it.

      public DoubleHelper(double d)
      {
        this.RawBits = (ulong)BitConverter.DoubleToInt64Bits(d);
      }
    
      public ulong RawBits { get; }
    
  5. Your Exponent property could use the => syntax and a large ternary operator to be more concise (though you might argue it’s less readable).

    • For a language to usefully support machine-dependent integer sizes, they must be the primary integer types. While C could do a much better job with its so-called “fixed-sized” integers, it’s possible for a language with machine-dependent primary integer sizes to usefully include fixed-sized integers which behave as containers that either trap or clip when over-sized values are stored in them. But the benefits of having a machine-dependent integer size would be largely negated if existing methods use fixed-sized types, especially since overload resolution must be done before the CPU type is known.

  6. Shouldn’t “this.RawBits = (ulong)BitConverter.DoubleToInt64Bits(d);” be “this.RawBits = unchecked((ulong)BitConverter.DoubleToInt64Bits(d));”? It will fail for negative doubles otherwise.

    • Er, I forgot that the C# compiler does not check for overflow by default. But I still think it’s wise to wrap such casts in unchecked() no matter what setting is selected.

  7. Pingback: The week in .NET - 12/15/2015 - .NET Blog - Site Home - MSDN Blogs

  8. Pingback: The week in .NET – 12/15/2015-IT大道

  9. Pingback: 주간닷넷 2015년 12월 15일 - Korea Evangelist - Site Home - MSDN Blogs
