I briefly discussed copy-paste errors in code earlier; though this is a rich area of defects that I will probably at some point go into more detail on, that’s not for today.
Though this is a trivial little issue, I think it is worthwhile to illustrate how to think about these sorts of defects.
What is the defect?
Last week Jon Skeet “tweeted” humorously that in his work on the ECMA committee that is standardizing C#, they had found a mistake in the specification that was probably my fault; some commenters suggested that perhaps hell had also frozen over.
Though amusing, of course I wrote many errors in the specification, most of which were removed by Mads and the other specification reviewers. Specifications are hard to get right, particularly when they are written in an informal language.
In this particular case the offending line is in section 6.1.7, on conversions:
A value type has a boxing conversion to an interface type I if it has a boxing conversion to an interface or delegate type I0 and I0 is variance-convertible to I.
The intention here is that if we have a struct which implements
IEnumerable<Giraffe> then it may legally be boxed directly to
IEnumerable<Animal>. C# does not require you to box to the first interface, and then do a reference conversion to the second.
The error is the “or delegate”.
Why is this a defect?
It is nonsensical in two ways. First, no value type boxes to any delegate type. And second, the “variance-convertible” rules only allow conversions from delegate to delegate or interface to interface, not delegate to interface.
Does the sentence have to be false to be a defect?
From a logical perspective the sentence is not false. 100% of value types that can be boxed to a delegate that can be converted to an interface are convertible to that interface because 100% of zero is zero. But from a pedagogical perspective the sentence is confusing and should be fixed. The reader reasonably expects that if the spec says something is legal, that it is also possible.
How did the defect arise?
The fragment “interface or delegate type” appears a dozen times in the C# 4.0 specification, and most if not all of those occurrences needed editing when variance conversions were added. This is almost certainly a copy-paste error. Though it could have been Mads — we were both editing the specification at the time — it seems likely to be my bad.
What defect prevention mechanisms failed?
Plainly this should not have survived the initial review, but it is as easy to overlook a copy-paste error in review as it is to make one initially.
Worse though, is the fact that I wrote all the conversion logic in Roslyn, and I did so for the most part by carefully following the specification. This had two purposes; first, to be sure that I was implementing the actual specification, and second, as yet another chance to ensure that the specification made sense. For much of the conversion code I have methods where each method implements one line of the specification, for this reason.
What should have happened is: I should have written a method for the offending line of the specification, and then when I tried to write a method that tested whether a value type is convertible to a delegate, I would have realized that the specification was wrong. I did not do this for boxing conversions, probably because they seemed so straightforward compared to, say, user-defined conversions. The code for boxing conversions in Roslyn does the right thing: it enumerates all the interfaces of the value type and checks whether any is convertible to the target interface type. It does not mess around with delegates at all.
In my long experience with the C# compiler I’ve seen that it is very frequently the case that a compiler bug or spec bug is the result of the compiler developer taking shortcuts like this, and implementing what they believe the spec means, not what it says. I shouldn’t have cut this corner.
What is the remediation?
The “or delegate” can simply be removed from the specification.
I don’t recommend that we change the working, debugged source code to have a structure that more clearly matches the spec for this case.
This honestly looks more like what I would call a Muscle Memory Error. It’s the same thing that makes me type a semicolon after an ‘if’ condition (close_paren-semicolon-enter just rolls off the hand too easily);
or makes me hit F5 to attempt to do pretty much everything: sending email, save file, etc), or makes me turn left when leaving my driveway, regardless of which direction I actually need to go.
Pingback: The Morning Brew - Chris Alcock » The Morning Brew #1868
Pingback: Dew Drop – May 27, 2015 (#2022) | Morning Dew
Look at the bright side… If your errors are so miniscule that it takes someone of Jon Skeet’s calibre to find them (spec OR code), you’re doing well 🙂
Pingback: Visual Studio Enterprise - The Daily Six Pack: May 28, 2015
Great post and re-assuring to us lesser beings that you guys make mistakes too. You said “For much of the conversion code I have methods where each method implements one line of the specification, for this reason.” Is this a form of unit testing for the specification? Are these “unit tests” available publicly? I’m just thinking they could be great examples for learning and understanding the specification more accurately.