
About ericlippert

http://ericlippert.com

Dynamic contagion, part two

This is part two of a two-part series on dynamic contagion. Part one is here.


Last time I discussed how the dynamic type tends to spread through a program like a virus: if an expression of dynamic type “touches” another expression then that other expression often also becomes of dynamic type. Today I want to describe one of the least well understood aspects of method type inference, which also uses a contagion model when dynamic gets involved. Continue reading

Dynamic contagion, part one

This is part one of a two-part series on dynamic contagion. Part two is here.


Suppose you’re an epidemiologist modeling the potential spread of a highly infectious disease. The straightforward way to model such a series of unfortunate events is to assume that the population can be divided into three sets: the definitely infected, the definitely healthy, and the possibly infected. If a member of the healthy population encounters a member of the definitely infected or possibly infected population, then they become a member of the possibly infected population. (Or, put another way, the possibly infected population is closed transitively over the exposure relation.) A member of the possibly infected population becomes classified as either definitely healthy or definitely infected when they undergo some sort of test. And an infected person can become a healthy person by being cured.

This sort of contagion model is fairly common in the design of computer systems. For example, suppose you have a web site that takes in strings from users, stores them in a database, and serves them up to other users. Like, say, this blog, which takes in comments from you, stores them in a database, and then serves them right back up to other users. That’s a Cross Site Scripting (XSS) attack waiting to happen right there. A common way to mitigate the XSS problem is to use data tainting, which uses the contagion model to identify strings that are possibly hostile. Whenever you do anything to a potentially-hostile string, like, say, concatenate it with a non-hostile string, the result is a possibly-hostile string. If the string is determined via some test to be benign, or can have its potentially hostile parts stripped out, then it becomes safe.
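To make the tainting idea concrete, here is a minimal sketch. The UserText type and its members are hypothetical names invented for this illustration, not part of any real library, and a real system would need a far more careful sanitizer than a single HTML-encoding call.

```csharp
using System;
using System.Net;

// Hypothetical illustration type: a string paired with a taint flag.
public readonly struct UserText
{
    public string Value { get; }
    public bool IsTainted { get; }

    private UserText(string value, bool isTainted)
    {
        Value = value;
        IsTainted = isTainted;
    }

    // Anything that arrives from a user starts out possibly hostile.
    public static UserText FromUser(string s) => new UserText(s, true);

    // Text written by the site's own developers is definitely healthy.
    public static UserText Trusted(string s) => new UserText(s, false);

    // Exposure: the result is tainted if either operand is tainted.
    public static UserText operator +(UserText a, UserText b) =>
        new UserText(a.Value + b.Value, a.IsTainted || b.IsTainted);

    // The cure: HTML-encode the text, after which it is treated as safe.
    public UserText Sanitize() =>
        new UserText(WebUtility.HtmlEncode(Value), false);
}
```

With this sketch, UserText.Trusted("<p>") + UserText.FromUser(comment) is tainted, while UserText.Trusted("<p>") + UserText.FromUser(comment).Sanitize() is not. The taint flag only ever clears through Sanitize, which is the "test or cure" step of the epidemiological model.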

The “dynamic” feature in C# 4 and above has a lot in common with these sorts of contagion models. As I pointed out last time, when an argument of a call is dynamic then odds are pretty good that the compiler will classify the result of the call as dynamic as well; the taint spreads. In fact, when you use almost any operator on a dynamic expression, the result is of dynamic type, with a few exceptions. (“is” for example always returns a bool.)  You can “cure” an expression to prevent it spreading dynamicism by casting it to object, or to whatever other non-dynamic type you’d like; casting dynamic to object is an identity conversion.
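Here is a small sketch of the spread and the cure. The class and method names are made up for illustration; the InferredType helper simply reports which type argument was inferred, which reveals whether inference ran at compile time or at run time.

```csharp
using System;

public static class ContagionDemo
{
    // Reports the type argument that was inferred for the call.
    public static Type InferredType<T>(T value) => typeof(T);

    public static object Spread()
    {
        dynamic d = 2;
        var sum = d + 1;   // static type of sum: dynamic (the taint spreads)
        sum = "three";     // compiles only because sum is dynamic
        return sum;
    }

    public static bool IsTest()
    {
        dynamic d = 2;
        var test = d is int;   // "is" produces a bool, not a dynamic
        return test;
    }

    public static Type TypeSeenForDynamic()
    {
        dynamic d = 2;
        // The argument is dynamic, so this call is bound at run time
        // using the runtime type of d.
        return InferredType(d);
    }

    public static Type TypeSeenAfterCure()
    {
        dynamic d = 2;
        var cured = (object)d;   // identity conversion; cured is object
        // Bound at compile time, where all we know is "object".
        return InferredType(cured);
    }
}
```

TypeSeenForDynamic returns typeof(int) because the runtime binder sees the boxed int, whereas TypeSeenAfterCure returns typeof(object) because the cast moved the whole analysis back to compile time.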

The way that dynamic is contagious is an emergent phenomenon of the rules for working out the types of expressions in C#. There is, however, one place where we explicitly use a contagion model inside the compiler in order to correctly work out the type of an expression that involves dynamic types: it is one of the most arcane aspects of method type inference. Next time I’ll give you all the rundown on that.



A method group of one

I’m implementing the semantic analysis of dynamic expressions in Roslyn this week, so I’m fielding a lot of questions within the team on the design of the dynamic feature of C# 4. A question I get fairly frequently in this space is as follows:

public class Alpha
{
  public int Foo(string x) { ... }
}
  ...
  dynamic d = whatever;
  Alpha alpha = MakeAlpha();
  var result = alpha.Foo(d);

How is this analyzed? More specifically, what’s the type of local result?

If the receiver (that is, alpha) of the call were of type dynamic then there would be little we could do at compile time. We’d analyze the compile-time types of the arguments and emit a dynamic call site that caused the semantic analysis to be performed at runtime, using the runtime type of the dynamic expression. But that’s not the case here. We know at compile time what the type of the receiver is. One of the design principles of the C# dynamic feature is that if we have a type that is known at compile time, then at runtime the type analysis honours that. In other words, we only use the runtime type of the things that were actually dynamic; for everything else we use the compile-time type. If MakeAlpha() returns a class derived from Alpha, and that derived class has more overloads of Foo, we don’t care.
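The "we don’t care" rule can be seen in a small variation on the example. This is a sketch under assumptions: Foo here takes object and returns string so that the chosen overload is visible, and DerivedAlpha is a hypothetical class standing in for whatever MakeAlpha() might return.

```csharp
using System;

public class Alpha
{
    public string Foo(object x) => "Alpha.Foo(object)";
}

// Hypothetical derived class with an extra, more specific overload.
public class DerivedAlpha : Alpha
{
    public string Foo(string x) => "DerivedAlpha.Foo(string)";
}

public static class ReceiverDemo
{
    public static Alpha MakeAlpha() => new DerivedAlpha();

    public static string StaticReceiver()
    {
        dynamic d = "hello";
        Alpha alpha = MakeAlpha();
        // The receiver's compile-time type is Alpha, so only Alpha's
        // methods are candidates when the call is bound at run time.
        return alpha.Foo(d);
    }

    public static string DynamicReceiver()
    {
        dynamic d = "hello";
        dynamic alpha = MakeAlpha();
        // Now the receiver is dynamic as well, so its runtime type
        // (DerivedAlpha) is used and the extra overload is considered.
        return alpha.Foo(d);
    }
}
```

StaticReceiver returns "Alpha.Foo(object)" even though the object is really a DerivedAlpha; DerivedReceiver-style behaviour only appears in DynamicReceiver, which returns "DerivedAlpha.Foo(string)".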

Because we know that we’re going to be doing overload resolution on a method called Foo on an instance of type Alpha, we can do a “sanity check” at compile time to determine whether we know for sure that this is going to fail at runtime. So we do overload resolution, but instead of doing the full overload resolution algorithm (eliminate inapplicable candidates, determine the unique best applicable candidate, perform final validation of that candidate), we do a partial overload resolution algorithm. We get as far as eliminating the inapplicable candidates, and if that leaves one or more candidates then the call is bound dynamically. If it leaves zero candidates then we report an error at compile time, because we know that nothing is going to work at runtime.

Now, a seemingly reasonable question to ask at this point is: overload resolution in this case could determine that there is exactly one applicable candidate in the method group, and therefore we can determine statically that the type of result is int, so why do we instead say that the type of result is dynamic?

That appears to be a reasonable question, but think about it a bit more. If you and I and the compiler know that overload resolution is going to choose a particular method then why are we making a dynamic call in the first place? Why haven’t we cast d to string? This situation is rare, unlikely, and has an easy workaround by inserting casts appropriately (either casting the call expression to int or the argument to string). Situations that are rare, unlikely and easily worked around are poor candidates for compiler optimizations. You asked for a dynamic call, so you’re going to get a dynamic call.
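For reference, here is a sketch of the two workarounds next to the uncast call. The body of Foo is a placeholder of my own (the original elides it), and the CastDemo class is a hypothetical wrapper for the illustration.

```csharp
using System;

public class Alpha
{
    // Placeholder body; the original snippet elides the implementation.
    public int Foo(string x) => x.Length;
}

public static class CastDemo
{
    public static int CastArgument()
    {
        dynamic d = "hello";
        Alpha alpha = new Alpha();
        // Casting the argument makes this an ordinary static call,
        // so result is of type int at compile time.
        var result = alpha.Foo((string)d);
        return result;
    }

    public static int CastResult()
    {
        dynamic d = "hello";
        Alpha alpha = new Alpha();
        // The call is still bound dynamically; the cast converts the
        // dynamic result to int afterwards.
        var result = (int)alpha.Foo(d);
        return result;
    }

    public static object NoCast()
    {
        dynamic d = "hello";
        Alpha alpha = new Alpha();
        var result = alpha.Foo(d);   // static type of result: dynamic
        result = "anything";         // compiles only because result is dynamic
        return result;
    }
}
```

Casting the argument is usually the better choice when you know the type, because the whole call is then checked at compile time; casting the result keeps the dynamic dispatch and only pins down the type afterwards.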

That’s reason enough to not do the proposed feature, but let’s think about it a bit more deeply by exploring a variation on this scenario that I glossed over above. Eta Corporation produces:

public class Eta {}

and Zeta Corporation extends this code:

public class Zeta : Eta
{
  public int Foo(string x){ ... }
}
  ...
  dynamic d = whatever;
  Zeta zeta = new Zeta();
  var result = zeta.Foo(d);

Suppose we say that the type of result is int because the method group has only one member. Now suppose that in the next version, Eta Corporation supplies a new method:

public class Eta
{
  public string Foo(double x){...}
}

Zeta Corporation recompiles their code, and hey presto, suddenly result is of type dynamic! Why should Eta Corporation’s change to the base class cause the semantic analysis of code that uses a derived class to change? This seems unexpected. C# has been carefully designed to avoid these sorts of “Brittle Base Class” failures; see my other articles on that subject for examples of how we do that.

We can make a bad situation even worse. Suppose Eta’s change is instead:

public class Eta
{
  protected string Foo(double x){...}
}

Now what happens? Should we say that the type of result is int when the code appears outside of class Zeta, because overload resolution produces a single applicable candidate, but dynamic when it appears inside, because overload resolution produces two such candidates? That would be quite bizarre indeed.

The proposal is simply too much cleverness in pursuit of too little value. We’ve been asked to perform a dynamic binding, and so we’re going to perform a dynamic binding; the result should in turn be of type dynamic. The benefit of being able to statically deduce the types of dynamic expressions does not pay for the costs, so we don’t attempt to do so. If you want static analysis then don’t turn it off in the first place.


Next time on FAIC: The dynamic taint of method type inference.

Is C# a strongly typed or a weakly typed language?

Presented as a dialogue, as is my wont!

Is C# a strongly typed or a weakly typed language?

Yes.

That is unhelpful.

I don’t doubt it. Interestingly, if you rephrased the question as an “and” question, the answer would be the same.

What? You mean, is C# a strongly typed and a weakly typed language?

Yes, C# is a strongly typed language and a weakly typed language.

I’m confused.

Me too. Perhaps you should tell me precisely what you mean by “strongly typed” and “weakly typed”.

Um. I don’t actually know what I mean by those terms, so perhaps that is the question I should be asking. What does it really mean for a language to be “weakly typed” or “strongly typed”?

“Weakly typed” means “this language uses a type verification system that I find distasteful”, and “strongly typed” means “this language uses a type system that I find attractive”.

No way!

Way, dude.

Really?

These terms are meaningless and you should avoid them. Wikipedia lists eleven different meanings for “strongly typed”, several of which contradict each other. Any time two people use “strongly typed” or “weakly typed” in a conversation about programming languages, odds are good that they have two subtly or grossly different meanings in their heads for those terms, and are therefore automatically talking past each other.

But surely they mean something other than “unattractive” or “attractive”!

I do exaggerate somewhat for comedic effect. So let’s say: a more-strongly-typed language is one that has some restriction in its type system that a more-weakly-typed language it is being compared to lacks. That’s all you can really say without more context.

How can I have sensible conversations about languages and their type systems then?

You can provide the missing context. Instead of using “strongly typed” and “weakly typed”, actually describe the restriction you mean. For example, C# is for the most part a statically typed language, because the compiler determines facts about the types of every expression. C# is for the most part a type safe language because it prevents values of one static type from being stored in variables of an incompatible type (and other similar type errors). And C# is for the most part a memory safe language because it prevents accidental access to bad memory.

Thus, someone who thinks that “strongly typed” means “the language encourages static typing, type safety and memory safety in the vast majority of normal programs” would classify C# as a “strongly typed” language. C# is certainly more strongly typed than languages that do not have these restrictions in their type systems.

But here’s the thing: because C# is a pragmatic language there is a way to override all three of those safety systems. Cast operators and “dynamic” in C# 4 override compile-time type checking and replace it with runtime type checking, and “unsafe” blocks allow you to turn off type safety and memory safety should you need to. Someone who thinks that “strongly typed” means “the language absolutely positively guarantees static typing, type safety and memory safety under all circumstances” would quite rightly classify C# as “weakly typed”. C# is not as strongly typed as languages that do enforce these restrictions all the time.
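A short sketch of the first two overrides follows; the class and method names are hypothetical, and "unsafe" blocks are left out because they require a compiler switch. In both methods the compiler accepts code that it cannot prove correct, and the check that it skipped is performed at run time instead.

```csharp
using System;
using Microsoft.CSharp.RuntimeBinder;

public static class RuntimeCheckDemo
{
    public static string CastCheck()
    {
        object o = "hello";
        try
        {
            int i = (int)o;   // compiles; the type check happens at run time
            return i.ToString();
        }
        catch (InvalidCastException)
        {
            return "InvalidCastException";
        }
    }

    public static string DynamicCheck()
    {
        dynamic d = "hello";
        try
        {
            d.NoSuchMethod();   // compiles; member lookup happens at run time
            return "no error";
        }
        catch (RuntimeBinderException)
        {
            return "RuntimeBinderException";
        }
    }
}
```

Note that neither override abandons type checking; each one moves it from compile time to run time, which is why both methods end up in their catch blocks.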

So which is it, strong or weak? It is impossible to say because it depends on the point of view of the speaker, it depends on what they are comparing it to, and it depends on their attitude towards various language features. It’s therefore best to simply avoid these terms altogether, and speak more precisely about type system features.


Next time on FAIC: What happens when a dynamic call’s method group has a single member?

High altitude

No computer programming stuff today; just some fun for Friday.

As I’m writing this, Felix Baumgartner’s attempt to set the world record for skydiving height by diving from a helium balloon has been scrubbed due to bad weather. This attempt has got me thinking of my good friend JB, who back in 1982 set the world record for hang gliding height by similarly using a helium balloon. (It’s in the 1988 Guinness Book of World Records.)

JB is one of those people who proves the truth of the saying that you really can do anything you put your mind to, as he’s been a world-record breaking hang glider pilot, skydiver, balloonist, airplane pilot, ultra-marathon runner, shuttle astronaut candidate (his microgravity experiment ended up flying on the Vomit Comet rather than the shuttle), upper-atmosphere physicist, microgravity physicist, nuclear physicist, father, and I’m probably missing a dozen more accomplishments in there. And teacher! When I was a child he taught me useful skills like how to estimate large numbers, how to do trigonometry, and how to do calculus, usually by pointing out things on the beach and then doing math in the sand, like Archimedes. How many grains of sand are on this beach? How far away is the horizon when you stand on the roof of the cottage? What shape path does this rock make in the air when you throw it? These sorts of questions fascinated me as a child, and, I suppose, still do.

Anyway, I recently learned that JB has uploaded the short film his brother Bims made to document the successful attempt at the record. Check it out, and enjoy the hairstyles of the 1980s.

Does not compute

One of the most basic ways to think about a computer program is that it is a device which takes in integers as inputs and spits out integers as outputs. The C# compiler, for example, takes in source code strings, and those source code strings are essentially nothing more than enormous binary numbers. The output of the compiler is either diagnostic text, or strings of IL and metadata, which are also just enormous binary numbers.  Because the compiler is not perfect, in some rare cases it terminates abnormally with an internal error message. But those fatal error messages are also just big binary numbers. So let’s take this as our basic model of a computer program: a computer program is a device that either (1) runs forever without producing output, or (2) computes a function that maps one integer to another.
Continue reading

How do we ensure that method type inference terminates?

Here’s a question I got from a coworker recently:

It is obviously important that the C# compiler not go into infinite loops. How do we ensure that the method type inference algorithm terminates?

The answer is quite straightforward actually, but if you are not familiar with method type inference then this article is going to make no sense. You might want to watch this video if you need a refresher. Continue reading

Mistakes were made, part three

This post is from my series on building a backyard foundry.

You remember back when I said in part two of this series that I was temporarily using a flimsy stainless steel tub as a crucible until I managed to obtain a 3 1/2 inch (nominal) pipe nipple? Turns out that when you think “I can probably get one more melt out of this thing before it is destroyed”, that is the time to throw it away. The crucible failed. Fortunately, the crucible was still in the furnace. Continue reading

Big boxes

As someone who owns an old house and likes “do it yourself” projects, I spend a lot of time in “big box” warehouse stores. I try my absolute best to interact with as few employees as possible when I go to these stores because it never seems to go well. Here are a few conversational highlights from over the years:

 

—— Holy trash bags, Batman ——

Me: Hi there, you probably don’t have these but Western Safety is closed today. Do you have large four or six mil tear-resistant trash bags?

Big Box Store Employee: I don’t think so; what do you need them for?

Me: I’m tearing up a hundred-year-old sub-floor and the test for asbestos contamination has come back positive. The toxic waste dump won’t take asbestos contaminated waste unless it is properly bagged and labelled.

BBSE: Well, I’d just bag it and throw it out in the regular trash and not tell anyone.

(I went to Western Safety.)

—— Circular is the round one ——

Me, speaking to the guy at the tool counter: Hi, I need an eight inch abrasive cutoff wheel suitable for cutting thin, soft steel with a chop saw or circular saw.

BBSE: You mean these? 

Me: Those are reciprocating saw blades. Circular saw blades are circles.

BBSE: Oh, so you mean these?

Me: Those are ten inch wood cutting blades.

BBSE: Hmm. You mean these?

Me: Those are concrete cutting wheels.

BBSE: How about these?

Me: Those are metal cutting wheels but those are four inches wide. I need eight.

BBSE: Maybe you should try Lowes.

(I tried Ace, successfully.)


—— It’s not a nuclear reactor, it’ll come back online easily enough ——

Me, fifteen minutes before the store closes: Can I have this twelve foot board sawed into two six foot boards? I need two six foot shelves, and a twelve foot board won’t fit in my car.

BBSE: Sorry, the saw is already shut down for the night.

Me, speaking to the store manager 90 seconds later: I have a question for you: is it the policy of this store that the saw “shuts down for the night” at some time before closing?

Manager: Uh, no… who told you that?

Me: I think it was the guy who just vacuumed up the sawdust and doesn’t want to do it again.

Manager: I know just who you mean.

(They sawed my board but boy, were they not happy about it.)

—— That’s just smurfy ——

Me, talking to a guy in the electrical aisle: Can you tell me where to find one-inch diameter flexible electrical conduit? It is made of thin, ridged plastic and is sometimes called “smurf tube” because it’s that colour of blue.

BBSE: Sorry, we don’t carry anything like that.

As I turned to leave I realized that of course the smurf tube was directly behind me; the BBSE was looking at it as he was telling me he didn’t have it.

—— How useful! ——

Me, talking to a (different) guy at the tool counter: Where are the rivets?

BBSE: Rivets?

Me: Rivets.

BBSE: I’ve never heard that word before; what’s a rivet?

Me: A rivet is a metal fastener usually used to attach metal objects together. You insert the rivet through the objects you wish to fasten together and then deform one end of the rivet by peening it with a special tool. If you have access to both sides of the objects you can use solid rivets, otherwise you can use hollow rivets.

BBSE: Wow, that sure sounds useful!

(Another employee knew what rivets were, and, bonus, where in the store they were.)


—— Take a number ——

Now, I understand that the people hired at big box stores have no experience whatsoever using any product that they sell, and, as we’ve just seen, often no knowledge of what they sell in the first place. I know that if I want knowledgeable conversation about a tool with an expert I should go to Hardwick’s, which is like paradise for hardware geeks. The trouble is that they don’t have convenient hours; they’re closed by the time I get home from work, and not open Sundays. I try to go to local small-box stores as much as I can. Which is why this experience I had at my local small-business lumber yard yesterday was so disappointing:

Me: Hi there, I need three dozen eight foot two-by-fours and three sheets of quarter inch drywall.

Cashier standing by the front door: I think we have those.

Me: I’m quite sure that you do, since this is a lumber store. Are the two-bys and sheet rock in this building, or in the warehouse across the street?

Cashier: I don’t know. I think you’ll have to ask someone else.

Me: You don’t know where the two-by-fours are?

Cashier: This is only my fifth day on the job. Take a number and someone will help you.

I would have thought that “where are the two-by-fours” is the kind of thing you’d sort out on day one at the lumber store, but, whatever.

At this point I note that I am the only customer in the store. Behind the counter there are five employees. Three are talking amongst themselves. One is typing on a computer. One is on the phone. As instructed, I take a number, and walk over to the paint aisle to browse spray paint while I wait for one of the five people behind the counter to call my number.

They do so immediately. The moment my number is called, the three employees who were talking amongst themselves immediately leave the building by the back entrance, and the guy on the phone hangs up and leaves by the front entrance, leaving in the building me, the guy on the computer, and the cashier who does not know where the lumber store keeps its two-by-fours. I point out to the guy on the computer that my number has just been called, and he says that someone else will help me shortly.

I waited ten minutes watching him silently ignore me, typing away, and then I left and went to the big box store at the other end of town; I knew where the two-bys were there. 

Attention small business owners: I am doing my best to give you my money. Stop making it so hard.

Attention big box store owners: You run vast multinational corporations with huge profits. You can afford to hire and/or train employees to familiarize them with the products you sell and their basic functions.

——————

UPDATE

——————

I emailed the last portion of this blog entry to the owner of the small business involved, and:

——————

I really appreciate the opportunity to address such an egregious example of poor service. I won’t bore you with the details but it was a bad intersection of shift changes, yard service people hanging out at the counter and too few sales people. We watched the tape of your arrival and departure and have talked it over with everyone involved. Please let me tell you we’re embarrassed and ashamed of the way we treated you. Please accept my sincere apology.

——————

The owner also offered me a discount on my next order and free delivery, which was, I think, a very nice gesture. As I have said often, you can tell the quality of customer service at an organization by how they deal with mistakes. Good service means recognizing the mistake, taking ownership of it, identifying the structural problem that allowed it to happen, and making a gesture of goodwill to the customer; this is an example of really excellent customer service, and I appreciate that very much.

Static analysis of “is”

Returning now to the subject we started discussing last time on FAIC: sometimes the compiler can know via static analysis (that is, analysis done knowing only the compile-time types of expressions, rather than knowing their possibly more specific run-time types) that an is operator expression is guaranteed to produce a particular result.
Continue reading