The dedoublifier, part one

Good Monday morning everyone; I hope my American readers had a lovely Thanksgiving. I sure did!

Well, enough chit-chat; here’s a problem I was pondering the other day. We know that doubles introduce some “representation error” when trying to represent most numbers. The number 3/4, or 0.75, for example, can be exactly represented by a double, but the number 3/5, or 0.6, cannot.

To digress for a moment with a brief refresher: a double can be thought of as a fraction where the numerator is a 52-bit integer and the denominator is a largish power of two. For the exact details, see the Wikipedia page on the subject or my old articles. Since 3/4 can clearly be represented by an integer divided by a power of two, it can be represented exactly. But there is no way to make 3/5 come out to have a power of two on the bottom.
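
To see the difference concretely, here is a tiny sketch (just an illustration for this post, not part of the helper we are about to build) that prints each value using the “G17” round-trip format, which shows enough digits to distinguish a double from its neighbors:

using System;

class RepresentationErrorDemo
{
  static void Main()
  {
    // 3/4 has a power of two in the denominator, so the double is exact.
    Console.WriteLine(0.75.ToString("G17")); // 0.75
    // 3/5 does not; the nearest double is just a hair below 0.6.
    Console.WriteLine(0.6.ToString("G17"));  // 0.59999999999999998
  }
}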

Now, fortunately we have conversions from double to decimal, and from double to base-ten strings, that allow us to easily answer the question “what decimal fraction is this double close to?” The question I am pondering is: suppose we have a double which approximates some weird fraction with neither a power of two nor a power of ten in the denominator, like 33/17 or 85/97 or whatever. Is there some way to easily get that fraction back out of the double?
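
To make the puzzle concrete, here is another little sketch; it shows the problem, not the solution, which is what the rest of this series is about:

using System;

class FractionPuzzle
{
  static void Main()
  {
    double q = 33.0 / 17.0;
    Console.WriteLine(q.ToString("G17")); // 1.9411764705882353
    // The puzzle: given only q, recover the numerator 33 and the denominator 17.
  }
}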

We’ll do this in four parts. In this first part I’ll make a new helper class to extract the parts of a double; in the second part I’ll make my own rational number type; and we’ll finish up by proposing some algorithms that search the space of rational numbers for the one that best matches a given double, and then implementing one of them.


Long-time readers will know that I’ve solved the first problem before, back in 2011. But today I am excited to try out some of the code-shortening features of C# 6.0. So let’s get right into it:

struct DoubleHelper
{
  public DoubleHelper(double d)
  {
    this.RawBits = (ulong)BitConverter.DoubleToInt64Bits(d);
  }

  public ulong RawBits { get; }
  // RawSign is the raw sign bit: 1 if negative or negative zero, 0 if positive or positive zero
  public int RawSign => (int)(RawBits >> 63);
  public int RawExponent => (int)(RawBits >> 52) & 0x7FF;
  public long RawMantissa => (long)(RawBits & 0x000FFFFFFFFFFFFF);
  public bool IsNaN => RawExponent == 0x7ff && RawMantissa != 0;
  public bool IsInfinity => RawExponent == 0x7ff && RawMantissa == 0;
  public bool IsZero => RawExponent == 0 && RawMantissa == 0;
  public bool IsDenormal => RawExponent == 0 && RawMantissa != 0;

  // Sign is 1 if positive, -1 if negative, 0 if zero
  public int Sign => IsZero ? 0 : 1 - RawSign * 2;
  public long Mantissa => IsZero || IsDenormal || IsNaN || IsInfinity ?
    RawMantissa :
    RawMantissa | 0x0010000000000000;

  public int Exponent
  {
    get
    {
      if (IsZero)
        return 0;
      else if (IsDenormal)
        return -1074;
      else if (IsNaN || IsInfinity)
        return RawExponent; 
      else
        return RawExponent - 1075;
    }
  }
}

You know, I was initially skeptical of the => syntax for simple read-only properties, but I really like this. With just these few lines of code I can now analyze a double both in terms of its raw bit fields and in terms of the meaning of the sign, exponent and mantissa parts.
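
Here is a quick usage sketch, assuming the struct above is in the same file; the expected outputs are the values for 0.6 worked out by hand:

using System;

class DoubleHelperDemo
{
  static void Main()
  {
    var h = new DoubleHelper(0.6);
    Console.WriteLine(h.Sign);     // 1
    Console.WriteLine(h.Mantissa); // 5404319552844595
    Console.WriteLine(h.Exponent); // -53
    // Sign * Mantissa * 2^Exponent is 5404319552844595 / 2^53, which is the
    // value just a hair below 0.6 that the double actually stores.
  }
}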

Next time on FAIC: we’ll use this little helper to implement arbitrary precision rational numbers.

30 thoughts on “The dedoublifier, part one”

  1. Sign => IsZero ? 0 : 1 - RawSign * 2

    Omg, Eric… playing with magic numbers there!

    May I suggest: -RawSign or RawSign == 1 ? -1 : 1.

    • Neither of those has the correct semantics. The semantics of Sign should be -1 for negative, 0 for zero and 1 for positive. The raw sign bit is 1 for not-positive and 0 for not-negative. So I have to have a test for zeroness in there somewhere.

      Now, IsZero ? 0 : RawSign == 1 ? -1 : 1 would be correct, but I think that is hardly easier to read.

      Regardless, the code could use some comments as this point is clearly confusing.

  2. I think the numerator in your analogy is actually 53 bits due to the hidden 1 bit prefixing the mantissa.

    I had to write code to convert doubles to/from decimal and string. It’s surprisingly hard to do both correctly and efficiently. (I gave up and just went the slow and correct route.) You’d expect such a fundamental problem to have been solved decades ago, but significant advances were made as recently as 2010.

    Anyway, while we’re talking about C# features, one of the things I think C# really messed up on is not having a “natural int” type. You’ve got stuff like Array.Length and Array.LongLength, with Length being “wrong” in 64-bit and LongLength being unnecessarily slow in 32-bit. All ICollection-derived classes only support 32-bit lengths. Stream.Read/Write only work with 32-bit indexes. Etc etc. There are so many 32-/64-bit problems in core .NET APIs. The CLR has a natural int type. It’s the perfect match for array indexes, etc! But we can’t use it. IntPtr exists but isn’t a normal numeric type… /rant

    • The more C++ code I write, the more the lack of an equivalent to size_t bothers me in C#. It just makes so much sense to have a type that is explicitly tied to the system word size!

      IntPtr/UIntPtr seem perfect for this. I never understood what the rationale was for excluding the usual arithmetic operators on those types.

      • In the way C# is supposed to be used, a native int makes no sense. If you need more than 2^31 elements in 64-bit, you may need more than that in 32-bit as well; it is not about how much memory your computer has, since a collection may load items from disk or may just be a function that generates the elements on the fly.

        But really few use cases actually need collections with more than 2^31 elements, so I can understand the choice the .NET designers made.

        • I wonder how often it really makes sense to have more than 2^31 discrete items, or any single homogeneous object larger than 2^31 bytes? In most cases, I would think that code which hits either limit would do well to either aggregate or subdivide things to bring the number of items and their size closer together.

    • Array.Length is only “wrong” in 64-bit if you use multidimensional arrays and turn on support for objects greater than 2 GB. This is true as of .NET 4.5; before that there was no support for objects greater than 2 GB at all. It remains true through .NET 4.6.2. I suppose it might change in the future, but I tend to doubt it.

      The max lengths of other collections following the max length of a single-dimensional array makes reasonable sense. The need for larger collections is not actually terribly common, so slowing down the common case to support them is questionable.

      The stream Read and Write methods you mention refer to offsets into a single-dimensional array, so using a long there would not be beneficial in any way. Streams themselves use a long for their Position property to fully support large streams.

    • I would suggest that “natural integer” types should have 31 or 63 value bits, with one padding bit. While it’s useful to have types that behave as algebraic rings of integers mod 2^n (which is how C treats unsigned types that are not smaller than “int”), subtracting the long natural number 5 from the long natural number 3 should yield the long integer -2, not 18446744073709551614, and not an exception.

      • If C taught us one thing it is that implicit conversions from signed to unsigned or vice versa are an incredibly bad idea.

        Also, how would you propose that to work anyhow? Should unsigned x - unsigned y really result in a signed result? (Which would then imply that x - (unsigned)0 != x; now isn’t that confusing?)

        Unsigned numbers in general just introduce many complications with very little benefit. For the vast majority of code, not having any unsigned numbers would be just fine (hell, I’ve worked on compilers in Java, and having no unsigned numbers was pretty much a non-issue with some helper functions).

  3. Pingback: The Morning Brew - Chris Alcock » The Morning Brew #1980

  4. I seem to have missed a language feature somewhere. Why is RawBits, which has no setter, assignable? I thought it was a typo, but then I compiled it.

      public DoubleHelper(double d)
      {
        this.RawBits = (ulong)BitConverter.DoubleToInt64Bits(d);
      }
    
      public ulong RawBits { get; }
    
  5. Your Exponent property could use the => syntax and a large ternary operator to be more concise (though you might argue it’s less readable).

    • For a language to usefully support machine-dependent integer sizes, they must be the primary integer types. While C could do a much better job with its so-called “fixed-sized” integers, it’s possible for a language with machine-dependent primary integer sizes to usefully include fixed-sized integers which behave as containers that either trap or clip when over-sized values are stored in them. But the benefits of having a machine-dependent integer size would be largely negated if existing methods use fixed-sized types, especially since overload resolution must be done before the CPU type is known.

  6. Shouldn’t “this.RawBits = (ulong)BitConverter.DoubleToInt64Bits(d);” be “this.RawBits = unchecked((ulong)BitConverter.DoubleToInt64Bits(d));”? It will fail for negative doubles otherwise.

    • Er, I forgot that the C# compiler does not check for overflow by default. But I still think it’s wise to wrap such casts in unchecked() no matter what setting is selected.

  7. Pingback: The week in .NET - 12/15/2015 - .NET Blog - Site Home - MSDN Blogs

  8. Pingback: The week in .NET – 12/15/2015-IT大道

  9. Pingback: 주간닷넷 2015년 12월 15일 - Korea Evangelist - Site Home - MSDN Blogs
