Avoiding C# defects talk Tuesday

Hello all, I have been crazy busy these last few weeks either traveling for work or actually programming with Roslyn — woo hoo! — and have not had time to blog. I’ve been meaning to do a short tour of the Roslyn codebase, now that it is open-sourced, but that will have to wait for later this summer.

Today I just want to mention that tomorrow, July 15th, at 8:30 AM Pacific Daylight Time, I’ll be doing a live talk broadcast on the internet where I’ll describe how the Coverity static analyzer works and what some of the most common defect patterns we find are. In particular I’m very excited by a new concurrency issue checker that looks for incorrect implementations of double-checked locking, and other “I avoided a lock when I should not have” defects. My colleague Kristen will also be talking about the new “desktop” mode of the analyzer.

If you’re interested, please register beforehand at this link. Thanks to Visual Studio Magazine for sponsoring this event.

If you missed it: the webcast will be recorded and the recording will be posted on the Coverity blog in a couple of days. The recording will also be posted on the Visual Studio Magazine site link above for 90 days.

Real world async/await defects

Today I’m once again asking for your help.

The headliner feature for C# 5 was of course the await operator for asynchronous methods. The intention of the feature was to make it easier to write asynchronous programs where the program text is not a mass of inside-out code that emphasizes the mechanisms of asynchrony, but rather straightforward, easy-to-read code; the compiler can deal with the mess of turning the code into a form of continuation passing style.

However, asynchrony is still hard to understand, and making methods that allow other code to run before their postconditions are met makes for all manner of interesting possible bugs. There are a lot of great articles out there giving good advice on how to avoid some of the pitfalls, such as:

These are all great advice, but it is not always clear which of these potential defects are merely theoretical, and which you have seen and fixed in actual production code. That’s what I am very interested to learn from you all: what mistakes were made by real people, how were they discovered, and what was the fix?

Here’s an example of what I mean. This defect was found in real-world code; obviously the extraneous details have been removed:

Frob GetFrob()
{
  Frob result = null;
  var networkDevice = new NetworkDevice();
  networkDevice.OnDownload += 
    async (state) => { result = await Frob.DeserializeStateAsync(state); };
  networkDevice.GetSerializedState(); // Synchronous
  return result; 
}

The network device synchronously downloads the serialized state of a Frob. When it is done, the delegate stored in OnDownload runs synchronously, and is passed the state that was just downloaded. But since it is itself asynchronous, the event handler starts deserializing the state asynchronously, and returns immediately. We now have a race between GetFrob returning null, and the mutation of closed-over local result, a race almost certainly won by returning null.

If you’d rather not leave comments here — and frankly, the comment system isn’t much good for code snippets anyways — feel free to email me at eric@lippert.com. If I get some good examples I’ll follow up with a post describing the defects.

ATBG: Why do enumerators avoid a bounds check?

I am back from a week in which I visited England, Holland, Belgium, France and Hungary; this is by far the most countries I’ve been to in so little time. It was great to meet so many customers; more on bicycling around Budapest in a future episode. For today though, I’ve posted on the Coverity Development Testing Blog’s continuing series Ask The Bug Guys a follow up to the previous episode. I’ll discuss why the struct-based list enumerator does not do bounds checking in its implementation of the Current property. Check it out! Thanks to reader Xi Chen for the question. Continue reading

What are the fundamental rules of pointers?

A lot of questions I see in the C tag on StackOverflow are from beginners who have never been taught the fundamental rules of pointers. (I note that these rules apply to C# as well, though it is rare to use raw pointers in C#.) A lot of introductions to pointers get caught up on the implementation details of what pointers are for a particular compiler targeting a particular architecture, so I want to be a bit more abstract than that. So, without further ado, here are the fundamentals: Continue reading

ATBG: Why does my code not crash?

For a change of page, today on the Coverity Development Testing Blog’s continuing series Ask The Bug Guys I’ll talk about mostly C and C++, with a little Java and C# thrown in at the end. I’ll discuss a very common question I see on StackOverflow in the “C” and “C++” tags: “here’s a clearly buggy program that I wrote; why does it not AV / segfault / crash when I run it?” Check it out! Continue reading

Lowering in language design, part two

Last time on FAIC I described how language designers and compiler writers use “lowering” to transform a high-level program written in a particular language into a lower-level program written using simpler features of the same language. Did you notice what this technique presupposes? To illustrate, let’s take a look at a line of the C# specification:

the operation e as T produces the same result as e is T ? (T)(e) : (T)null except that e is only evaluated once.

This seems bizarre; why does the specification say that a higher-level operation is exactly the same as a lower-level operation, except that it isn’t? Because there is no way to express the exact semantics of as in legal C# in any lowered form! The feature “evaluate a subexpression to produce its side effects and value once and then use that value several times throughout its containing expression” does not exist in C#, so the specification is forced to describe the lowering in this imprecise and roundabout way.

A nice principle of programming language design that C# does not adhere to is that all the higher-level features of the program can be expressed as lower-level features. Why is this nice to have? Well, not only is it a big convenience for writers of the specification, it also is a big convenience for writers of language analyzers. If you can programmatically transform a high-level language into an exactly equivalent program in a language with far fewer complicated concepts then your analyzer gets simpler. Continue reading

Lowering in language design, part one

Programming language designers and users talk a lot about the “height” of language features; some languages are considered to be very “high level” and some are considered to be very “low level”. A “high level” language is generally speaking one which emphasizes the business concerns of the program, and a low-level language is one that emphasizes the mechanisms of the underlying hardware. As two extreme examples, here’s a program fragment in my favourite high-level language, Inform7:
Continue reading