How do we ensure that method type inference terminates?

Here’s a question I got from a coworker recently:

It is obviously important that the C# compiler not go into infinite loops. How do we ensure that the method type inference algorithm terminates?

The answer is quite straightforward actually, but if you are not familiar with method type inference then this article is going to make no sense. You might want to watch this video if you need a refresher. Continue reading

Optional argument corner cases, part four

Last time we discussed how some people think that an optional argument generates a bunch of overloads that call each other. People also sometimes incorrectly think that

void M(string format, bool b = false) 
{ 
  Console.WriteLine(format, b); 
}

is actually a syntactic sugar for something morally like:

void M(string format, bool? b) 
{ 
  bool realB = b ?? false; 
  Console.WriteLine(format, realB); 
}

Continue reading

Optional argument corner cases, part three

A lot of people seem to think that this:

void M(string x, bool y = false) 
{ 
  ... whatever ... 
}

is actually a syntactic sugar for the way you used to have to write this in C#, which is:

void M(string x) 
{ 
  M(x, false); 
} 
void M(string x, bool y) 
{ 
  ... whatever ... 
}

But it is not. Continue reading

Optional argument corner cases, part one

In C# 4.0 we added “optional arguments”; that is, you can state in the declaration of a method’s parameter that if certain arguments are omitted, then constants can be substituted for them:

void M(int x = 123, int y = 456) { }

can be called as M(), M(0) and M(0, 1). The first two cases are treated as though you’d said M(123, 456) and M(0, 456) respectively.

This was a controversial feature for the design team, which had resisted adding this feature for almost ten years despite numerous requests for it and the example of similar features in languages like C++ and Visual Basic. Though obviously convenient, the convenience comes at a pretty high price of bizarre corner cases that the compiler team definitely needs to deal with, and customers occasionally run into by accident. I thought I might talk a bit about a few of those bizarre corner cases.

Continue reading

Implementing the virtual method pattern in C#, part one

If you’ve been in this business for any length of time you’ve undoubtedly seen some of the vast literature on “design patterns” — you know, those standard solutions to common problems with names like “factory” and “observer” and “singleton” and “iterator” and “composite” and “adaptor” and “decorator” and… and so on. It is frequently useful to be able to take advantage of the analysis and design skills of others who have already given considerable thought to codifying patterns that solve common problems. However, I think it is valuable to realize that everything in high-level programming is a design pattern. Some of those patterns are so good, we’ve baked them right into the language so thoroughly that most of us don’t even think of them as examples patterns anymore, patterns like “type” and “function” and “local variable” and “call stack” and “inheritance”.

I was asked recently how virtual methods work “behind the scenes”: how does the CLR know at runtime which derived class method to call when a virtual method is invoked on a variable typed as the base class? Clearly it must have something upon which to make a decision, but how does it do so efficiently? I thought I might explore that question by considering how you might implement the “virtual and instance method pattern” in a language which did not have virtual or instance methods. So, for the rest of this series I am banishing virtual and instance methods from C#. I’m leaving delegates in, but delegates can only be to static methods. Our goal is to take a program written in regular C# and see how it can be transformed into C#-without-instance-methods. Along the way we’ll get some insights into how virtual methods really work behind the scenes.
Continue reading

What’s the difference between conditional compilation and the Conditional attribute?

User: Why does this program not compile correctly in the release build?

class Program 
{ 
#if DEBUG 
    static int testCounter = 0; 
#endif 
    static void Main(string[] args) 
    { 
        SomeTestMethod(testCounter++); 
    } 
    [Conditional("DEBUG")] 
    static void SomeTestMethod(int t) { } 
}

Eric: This fails to compile in the release build because testCounter cannot be found in the call to SomeTestMethod.

User: But that call site is going to be omitted anyway, so why does it matter? Clearly there’s some difference here between removing code with the conditional compilation directive versus using the conditional attribute, but what’s the difference?

Eric: You already know the answer to your question, you just don’t know it yet. Let’s get Socratic; let me turn this around and ask you how this works. How does the compiler know to remove the method call site?

User: Because the method called has the Conditional attribute on it.

Eric: You know that. But how does the compiler know that the method called has the Conditional attribute on it?

User: Because overload resolution chose that method. If this were a method from an assembly, the metadata associated with that method has the attribute. If it is a method in source code, the compiler knows that the attribute is there because the compiler can analyze the source code and figure out the meaning of the attribute.

Eric: I see. So fundamentally, overload resolution does the heavy lifting. How does overload resolution know to choose that method? Suppose hypothetically there were another method of the same name with different parameters.

User: Overload resolution works by examining the arguments to the call and comparing them to the parameter types of each candidate method and then choosing the unique best match of all the candidates.

Eric: And there you go. Therefore the arguments must be well-defined at the point of the call, even if the call is going to be removed. In fact, the call cannot be removed unless the arguments are extant! But in the release build, the type of the argument cannot be determined because its declaration has been removed.

So now you see that the real difference between these two techniques for removing unwanted code is what the compiler is doing when the removal happens. At a high level, the compiler processes a text file like this. First it “lexes” the file. That is, it breaks the string down into “tokens” — sequences of letters, numbers and symbols that are meaningful to the compiler. Then those tokens are “parsed” to make sure that the program conforms to the grammar of C#. Then the parsed state is analyzed to determine semantic information about it; what all the types are of all the expressions and so on. And finally, the compiler spits out code that implements those semantics.

The effect of a conditional compilation directive happens at lex time; anything that is inside a removed #if block is treated by the lexer as a comment. It’s like you simply deleted the whole contents of the block and replaced it with whitespace. But removal of call sites depending on conditional attributes happens at semantic analysis time; everything necessary to perform that semantic analysis must be present. 

User: Fascinating. Which parts of the C# specification define this behavior?

Eric: The specification begins with a handy “table of contents”, which is very useful for answering such questions. The table of contents states that section 2.5.1 describes “Conditional compilation symbols” and section 17.4.2 describes “The Conditional attribute”.

User: Awesome!

An inheritance puzzle, part one

Once more I have returned from my ancestral homeland, after some weeks of sun, rain, storms, wind, calm, friends and family. I could certainly use another few weeks, but it is good to be back too.

Well, enough chit-chat; back to programming language design. Here’s an interesting combination of subclassing with nesting. Before trying it, what do you think this program should output?

public class A<T> 
{
  public class B : A<int> 
  {
    public void M() 
    {
      System.Console.WriteLine(typeof(T).ToString());
    }
    public class C : B { }
  }
}
class MainClass 
{
  static void Main() 
  {
    A<string>.B.C c = new A<string>.B.C();
    c.M();
  }
}

Should this say that T is int, string or something else? Or should this program not compile in the first place?

It turned out that the actual result is not what I was expecting at least. I learn something new about this language every day.

Can you predict the behaviour of the code? Can you justify it according to the specification? (The specification is really quite difficult to understand on this point, but in fact it does all make sense.)

The answer is in the next episode!

A face made for email, part three

It has happened again: someone has put me on video talking about programming languages.

This time our fabulous C# Community Program Manager Charlie Calvert was good enough to put together a little half-hour-long video of me talking about the scenarios which justify changes to the type inference algorithm for C# 3.0.

The video is here. Enjoy!