Today on the Coverity Development Testing Blog’s continuing series Ask The Bug Guys I’m turning it around and asking you to figure out why a seemingly correct and totally awesome implementation of Random.Next has a serious bug. That’s right, it’s everyone’s favourite game, Spot the Defect! Can you figure out where I wrote a bug without running the program? Check it out.
As always, if you have questions about a bug you’ve found in a C, C++, C# or Java program that you think would make a good episode of ATBG, please send your question along with a small reproducer of the problem to
TheBugGuys@Coverity.com. We cannot promise to answer every question or solve every problem, but we’ll take a selection of the best questions that we can answer and address them on the dev testing blog every couple of weeks.
Floating-point numbers are often excellent for producing quick calculations that are within acceptable tolerances, but extreme care is required whenever they are converted to discrete numbers, and even more care is required when they are converted to any other form of fractional representation. Anything that converts floating-point numbers to a decimal format for purposes of further machine processing raises alarm bells in my head. Even when formatting numbers to be human-readable, I’m more comfortable converting to a whole number, formatting that, and if need be adding punctuation [I wish formatting libraries would include an option to divide by a power of ten, so as to avoid the need to insert a decimal point into a string]. Otherwise there are way too many bugs that can creep in, either in user code or in libraries.
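For instance, a minimal Java sketch of that scale-then-format approach, assuming a non-negative value small enough that the hundredths fit in a `long` (`formatCents` is an illustrative name, not a library method): the floating-point value is converted exactly once, and the decimal point is inserted as pure string work.

```java
class ScaledFormat {
    static String formatCents(double value) {
        long cents = Math.round(value * 100);  // one conversion, done once
        return cents / 100 + "." + String.format("%02d", cents % 100);
    }

    public static void main(String[] args) {
        System.out.println(formatCents(9.97)); // "9.97"
        System.out.println(formatCents(3.5));  // "3.50"
    }
}
```

After the `Math.round` call, everything is integer arithmetic and string concatenation, so there is no second rounding step that could disagree with the first.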
I once had an engraving job end up with a couple of erroneous moves because of a bug in the Turbo C 2.0 printf: when formatting a number like 999.96 using %0.0f, it determined that the value was at least 100 but less than 1,000 and should thus be formatted using three characters, then determined that the value should be rounded to 1,000, and then proceeded to output the first three characters of “1000” [i.e. “100”]. Oops.
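Not the actual Turbo C source, of course, but a hedged Java reconstruction of the defect as described, with the field width computed from the *unrounded* value and the rounded digits then truncated to that width:

```java
class WidthBeforeRounding {
    static String buggyFormat(double v) {
        int width = (int) Math.floor(Math.log10(v)) + 1; // 999.96 -> 3 digits
        String rounded = String.valueOf(Math.round(v));  // "1000"
        return rounded.substring(0, width);              // "100" -- the bug
    }

    public static void main(String[] args) {
        System.out.println(buggyFormat(999.96));           // "100"
        System.out.println(String.format("%.0f", 999.96)); // "1000": round first
    }
}
```

The fix is an ordering constraint: round first, then measure.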
And of course, who can forget that the difference between Windows 3.1 and Windows 3.11 was, quite literally, 0.01. If I remember right, computing “3+0.11” on the Windows 3.1 calculator would output “3.10” but on Windows 3.11 it would output “3.11”. What happened was that the calculator determined that the result formatted one way was “3.1100000000000”, and decided it should output two digits after the decimal point. It then output the unrounded version of the number which was something like “3.109999999999997” with two digits after the decimal, thus yielding “3.10”.
What do you think of the base-16 floating-point representations? It took me a while to get used to them, and I’ve not seen any formatting methods that allow one to select a normalization convention [the value 0.2f might in some contexts be more usefully represented as 0x1.99999Ap-03f, or sometimes as 0x0.3333334p+00f], but provided code accepts the necessity of examining the exponent, round-trip conversions are guaranteed.
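Java’s built-in support, for what it’s worth, always normalizes normal values to the 0x1.xxx form; a quick sketch of the guaranteed round trip:

```java
class HexFloats {
    public static void main(String[] args) {
        String hex = Float.toHexString(0.2f);  // "0x1.99999ap-3"
        System.out.println(hex);
        float back = Float.parseFloat(hex);    // hex literals parse, too
        System.out.println(back == 0.2f);      // true: exact round trip
        System.out.println(Double.toHexString(0.1)); // "0x1.999999999999ap-4"
    }
}
```

Since every hex digit maps to exactly four significand bits, no rounding happens in either direction, which is what makes the round trip exact.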
Funny you should mention these sorts of issues, as literally one minute ago I noticed a bug in the script which outputs test timings. It had just output a misaligned timing line.
You see what happened there? The Blah test took 9.97 seconds; the formatter said “if it is less than 10.0, add a space before the number”, but it made that decision *before* rounding. Printed with one decimal place, 9.97 becomes 10.0, so the line got both the padding space and the extra digit. Obviously not a critical bug, but still, the kind of problem you run into when mixing math and string logic carelessly.
Converting to a whole number of tenths before formatting would have easily prevented that sort of problem. Except when using a strict-floating-point context, code should avoid requiring anything better than +/-1lsb accuracy or consistency from floating-point maths. If a floating-point value is converted to an integer once and that integer is used twice, one can guarantee that the same value will be observed both times. By contrast, if it is converted to an integer on two separate occasions, it’s possible that the two conversions might yield different results even if the floating-point value didn’t change.
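The real script isn’t shown here, but in Java the bug pattern and the tenths-based fix might look something like this (`padBuggy` and `padFixed` are illustrative names):

```java
class PadTiming {
    static String padBuggy(double seconds) {
        String s = String.format("%.1f", seconds);
        return (seconds < 10.0 ? " " : "") + s; // 9.97 -> " 10.0": padded AND rounded up
    }

    static String padFixed(double seconds) {
        long tenths = Math.round(seconds * 10); // convert to whole tenths once
        String s = tenths / 10 + "." + tenths % 10;
        return (tenths < 100 ? " " : "") + s;   // 9.97 -> "10.0", no stray space
    }

    public static void main(String[] args) {
        System.out.println(padBuggy(9.97)); // " 10.0"
        System.out.println(padFixed(9.97)); // "10.0"
    }
}
```

In the fixed version the padding test and the digit generation both look at the same integer, so they can’t disagree about whether the value reached 10.0.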
BTW, I wish Java and C# hadn’t used C-style casts for truncating float-to-integer coercion. Having instance methods for `FloorInt32`, `FloorInt64`, `RoundInt32`, etc. and reserving casts for situations where the floating-point value is *expected* to represent an exact whole number (and failure to do so would indicate an error) would have helped make clear when particular behaviors are expected.
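Those methods don’t exist in Java or C#; a sketch of what they might look like for doubles, with the names lowercased to Java convention:

```java
class Convert {
    static int floorInt32(double v) {
        double f = Math.floor(v);
        if (Double.isNaN(f) || f < Integer.MIN_VALUE || f > Integer.MAX_VALUE)
            throw new ArithmeticException("no int floor for " + v);
        return (int) f;
    }

    static int roundInt32(double v) {
        return Math.toIntExact(Math.round(v)); // throws on int overflow
    }

    public static void main(String[] args) {
        System.out.println(floorInt32(-0.5)); // -1: floor, not truncation
        System.out.println(roundInt32(9.97)); // 10
    }
}
```

Each name says exactly which conversion is wanted, and each method fails loudly instead of silently saturating or truncating.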
I wish Java and C# hadn’t used C-style casts for truncating float-to-integer coercion. Having instance methods for `FloorInt32`, `FloorInt64`, `RoundInt32`, etc.
I like it.
reserving casts for situations where the floating-point value is *expected* to represent an exact whole number (and failure to do so would indicate an error) would have helped make clear when particular behaviors are expected.
Nah, then the success of the cast is very much a hit-and-miss affair. Very implementation dependent. I would not want (int) (10.0*0.1) to succeed but (int) (5.0*0.2) to fail. (Or some similar case.) I think it would be better to let the coder use a round() method and not have casts at all. Never allow the assumption that an exact integer representation is possible, since it isn’t in the general case.
Conversion methods give the coder the opportunity to say exactly what kind of conversion they really want, rather than leaving it implicit and promoting math errors.
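One concrete case, under ordinary IEEE-754 double arithmetic, where the implicit truncating cast and an explicit rounding method disagree:

```java
class CastVsRound {
    public static void main(String[] args) {
        double v = (0.1 + 0.7) * 10;          // 7.999999999999999..., not 8.0
        int truncated = (int) v;              // 7 -- the cast silently truncates
        int rounded = (int) Math.round(v);    // 8 -- what the coder likely meant
        System.out.println(truncated + " vs " + rounded);
    }
}
```

Neither result is “wrong” in isolation; the problem is that the cast makes the choice without the coder ever stating one.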
Sometimes Double is used to hold things that are supposed to represent exact whole numbers, and the code isn’t prepared to deal with anything else; I would suggest that *casting* should be reserved for that case. To be sure, the usage scenarios where one would want fractional values to throw an exception are not as common as those where one would want a rounded or floored value. But if code which expects to always receive a whole number hasn’t explicitly requested a rounded, floored, or truncated value, having it fail fast when unexpectedly given a value like (1000*1.001) would be safer than having it treat the number as either 1000 or 1001 [or, as could very easily happen, having some code treat it as 1000 and some code treat it as 1001!].
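A sketch of that fail-fast behavior (`toExactInt` is a hypothetical name, not an existing API): accept a double only when it already holds an exact whole number in int range.

```java
class ExactCast {
    static int toExactInt(double v) {
        if (v != Math.rint(v) || v < Integer.MIN_VALUE || v > Integer.MAX_VALUE)
            throw new ArithmeticException("not an exact int: " + v);
        return (int) v;
    }

    public static void main(String[] args) {
        System.out.println(toExactInt(1000.0)); // 1000
        try {
            toExactInt(1000 * 1.001);           // fractional -> fails fast
        } catch (ArithmeticException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The `v != Math.rint(v)` comparison also rejects NaN, since NaN compares unequal to everything including itself.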
Further, I think there would be great value in having a consistent rule that an *explicit* conversion operator from X to Y should, for all types X and Y, *always* either yield a Y which can be converted back to X to yield the original value, or throw an exception if it cannot. Round-trip conversions aren’t terribly common, but having a consistent way to assert their correctness with all combinations of types where they are permitted would be helpful.
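For double-to-long, that rule might be sketched like so (`checkedLong` is a hypothetical helper, not a language feature): convert, convert back, and throw if the round trip doesn’t reproduce the original value.

```java
class RoundTrip {
    static long checkedLong(double v) {
        long l = (long) v;
        if ((double) l != v)
            throw new ArithmeticException("lossy conversion: " + v);
        return l;
    }

    public static void main(String[] args) {
        System.out.println(checkedLong(2.0));   // 2: round-trips exactly
        try {
            checkedLong(2.5);                   // would lose the fraction
        } catch (ArithmeticException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The same shape works for any X-to-Y pair where equality is well defined, which is what makes it attractive as a uniform rule rather than a per-type special case.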
Sometimes Double is used to hold things that are supposed to represent exact whole numbers, and the code isn’t prepared to deal with anything else
To which I say, with a tip of my beret, “Well there’s your problem.” Such cases should be rare enough that dealing with them by means of library conversion functions instead of casts should be no great burden. I think casts are really for conversions that you are sure will work.