Monads, part two

Last time on FAIC I set out to explore monads from an object-oriented programmer's perspective, rather than delving into the functional programmer's perspective immediately. The "monad pattern" is a design pattern for types, and a "monad" is a type that uses that pattern. Rather than describing the pattern itself, let's start by listing some monad-ish types that you are almost certainly very familiar with, and see what they have in common [1]:

  • Nullable<T> -- represents a T that could be null [2]
  • Func<T> -- represents a T that can be computed on demand
  • Lazy<T> -- represents a T that can be computed on demand once, then cached
  • Task<T> -- represents a T that is being computed asynchronously and will be available in the future, if it isn't already
  • IEnumerable<T> -- represents an ordered, read-only sequence of zero or more Ts
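
To make the list concrete, here is a minimal sketch that wraps the same int in each of the five types and then gets it back out again. The class and method names (MonadishTypes, UnwrapAll) are made up for illustration; only the framework types come from the list above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class MonadishTypes
{
    // Hypothetical helper: wraps 42 in each type, then unwraps it again.
    public static int[] UnwrapAll()
    {
        Nullable<int> maybe = 42;                      // an int that could be null
        Func<int> onDemand = () => 42;                 // recomputed on every call
        Lazy<int> cached = new Lazy<int>(() => 42);    // computed once, then cached
        Task<int> future = Task.FromResult(42);        // an already-completed asynchronous result
        IEnumerable<int> sequence = new[] { 42 };      // a sequence containing one int

        // Each wrapper has its own way of handing back the underlying value.
        return new[]
        {
            maybe.Value,
            onDemand(),
            cached.Value,
            future.Result,
            sequence.First()
        };
    }

    public static void Main()
    {
        foreach (int value in UnwrapAll())
        {
            Console.WriteLine(value);
        }
    }
}
```

In every case there is an underlying T and some extra behaviour layered around it; the five types differ only in what that extra behaviour is.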


  1. These five types are the ones that immediately come to my mind; I am probably missing some. If you have an example of a commonly-used C# type that is monadic in nature, please leave a comment
  2. As I've discussed before, null in a value type is typically interpreted as "the thing has a value but I don't know what it is". That is, there is a decimal that is the net profits for December, I just don't know what that decimal is right now so I'll say "null". It can also be interpreted as "the thing doesn't even have a value". It's not that we don't know the height of the king of France right now, it's that there is no king of France in the first place, so the height of the king of France is null. The exact semantics are not particularly relevant to our discussion of monadic types however.

Monads, part one

Lots of other bloggers have attempted this, but what the heck, I'll give it a shot too. In this series I'm going to attempt to answer the question:

I'm a C# programmer with no "functional programming" background whatsoever. What is this "monad" thing I keep hearing about, and what use is it to me?

Bloggers often attempt to tackle this problem by jumping straight into the functional programming usage of the term, and start talking about "bind" and "unit" operations, and higher-order functional programming with higher-order types. Even worse is to go all the way back to the category theory underpinning monads and start talking about "monoids in the category of endofunctors" and the like. I want to start from a much more pragmatic, object-oriented, C#-type-system focussed place and move towards the rarefied heights of functional programming as we go.

Static constructors, part four

We'll finish up this series on static constructors by finally getting to the question which motivated me to write the series in the first place: should you use a static constructor like this?

public class Sensitive
{
  static Sensitive()
  {
    VerifyUserHasPermissionToUseThisClass();
  }  
  public static void Dangerous()
  {
    DoSomethingDangerous();
  }
  ...

The intention here is clear. The static constructor is guaranteed to run exactly once, and before any static or instance method. Therefore it will do the authorization check before any dangerous operation in any method. If the user does not have permission to use the class then the class will not even load. If they do, then the expense of the security check is only incurred once per execution, no matter how many methods are called.

If you've read the rest of this series, or anything I've written on security before, you know what I'm going to say: I strongly recommend that you do not use static constructors "off label" in this manner.

First off, as we've seen so far in this series, static constructors are a dangerous place to run fancy code. If they end up delegating any work to other threads then deadlocks can easily result. If they take a long time and are accessed from multiple threads, contention can result. If an exception is thrown out of the static constructor then you have very little ability to recover from the exception and the type will be forever useless in this appdomain.

Second, the security semantics here are deeply troubling. If this code is running on the user's machine then this appears to be a case of the developer not trusting the user. But the .NET security system was designed with the principle that the user is the source of trust decisions. It is the user who must trust the developer, not the other way around! If the user is hostile towards the developer then nothing is stopping them from decompiling the code to IL, removing the static constructor, recompiling, and running the dangerous method without the security check. [1]

Moreover, the pattern here assumes that security checks can be performed once and the result is then cached for the lifetime of the appdomain. What if the initial security check fails, but the program was going to impersonate a more trusted user? It might be difficult to ensure that the static constructor does not run until after the impersonation. What if different threads are associated with different users? Now we have a race to see which user's context is used for the security check. What if a user's permissions are revoked while the program is running? The check might be performed while permission is granted, and then the dangerous code runs after it has been revoked.

In short: Static constructors should be used to quickly initialize important static data, and that's pretty much it. The only time that I would use the mechanisms of a static constructor to enforce an invariant would be for a very simple invariant like ensuring that a singleton is lazily initialized, as Jon describes. Complex policy mechanisms like security checks should probably use some other mechanism.
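
For reference, here is a minimal sketch of the kind of simple invariant meant here: a singleton whose creation is tied to the static constructor. It follows a widely used form of the pattern; whether it matches the exact version Jon describes is an assumption, and the class name Singleton is just a placeholder.

```csharp
using System;

public sealed class Singleton
{
    // The field initializer becomes the prologue of the static constructor below.
    private static readonly Singleton instance = new Singleton();

    // An explicit static constructor, even an empty one, means initialization
    // is triggered by the first use of the type rather than at the runtime's discretion.
    static Singleton() { }

    private Singleton()
    {
        Console.WriteLine("Singleton created");
    }

    public static Singleton Instance
    {
        get { return instance; }
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine("Main started");
        Singleton first = Singleton.Instance;   // triggers the static constructor
        Singleton second = Singleton.Instance;  // reuses the same instance
        Console.WriteLine(object.ReferenceEquals(first, second));
    }
}
```

The static constructor does nothing fancy here: no threads, no I/O, no policy decisions. It just sets up one piece of static data, which is the kind of work the mechanism is suited to.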


Next time on FAIC: I'm going to join the throngs of tech bloggers who have tried to explain what a monad is.

  1. The resulting program will of course not be strong-named or code-signed by the developer anymore, but who cares?

Static constructors, part three

Earlier in this series I recommended that you brush up on how instance constructors work; if you did, then you'll recall that instance field initializers are essentially moved into the beginning of an instance constructor at a point before the call to the base class constructor. You might think that static field initializers work the same: a static field initializer is silently inserted at the beginning of the static constructor. And that's true. Mostly.

In the case where there is already a static constructor, even an empty one, that's what happens: the field initializers become the prologue to the body of the static constructor, and the usual rules for static constructors then apply. (That is, the static constructor is invoked immediately before the first static member access or instance constructor access.) But suppose there is no user-supplied static constructor. Then what happens?

The C# compiler is not bound by the rules of static constructors in this case, and in fact, does not treat your program as though there were an empty static constructor that has static field initializers in it. Rather, it tells the runtime that the runtime may choose when to run static field initializers, entirely at its discretion, just so long as all the fields are initialized [1] before they are used. In this scenario the runtime is permitted to run static field initializers as late as possible; it could wait until a static field is actually accessed, rather than waiting for any static member or instance constructor to be accessed. The runtime is also permitted to run static field initializers as early as possible; it could decide to run all the field initializers at once at the beginning of the program, even if the class in question was never used. It is up to the runtime implementation to decide.
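
A small sketch of the two shapes being contrasted may help. The names Log, Relaxed and Precise are hypothetical. Relaxed has no static constructor, so the moment its initializer runs is up to the runtime; Precise has an empty one, so its initializer runs immediately before first use. Because the timing for Relaxed depends on the runtime version, no particular output order is claimed for it.

```csharp
using System;
using System.Collections.Generic;

public static class Log
{
    public static readonly List<string> Entries = new List<string>();

    // Records that an initializer ran, then hands back the value to store.
    public static int Record(string message, int value)
    {
        Entries.Add(message);
        Console.WriteLine(message);
        return value;
    }
}

public class Relaxed
{
    // No static constructor: the runtime chooses when this initializer runs,
    // as long as it happens before X is first read.
    public static int X = Log.Record("Relaxed.X initialized", 123);
}

public class Precise
{
    // The empty static constructor forces the usual static constructor rules:
    // this initializer runs immediately before the first use of Precise.
    public static int Y = Log.Record("Precise.Y initialized", 456);
    static Precise() { }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine("Main started");
        Console.WriteLine(Relaxed.X);
        Console.WriteLine(Precise.Y);
    }
}
```

The only guarantee to rely on for Relaxed is that X holds 123 by the time it is read. For Precise, the message is also guaranteed to appear after "Main started", because nothing touches Precise before that line.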

Pre-.NET 4 implementations of the runtime make an interesting choice: they run static field initializers of classes that have no static constructors when the first method that refers to the class is jitted. If we have:

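class Alpha 
{
  static int x = 123;           // field initializer; Alpha has no static constructor
  public static void M() { }    // public so that Charlie can call it
}
class Bravo
{
  static int y = 456;           // field initializer; Bravo has no static constructor
  public static void N() { }    // public so that Charlie can call it
}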
class Charlie
{
  static void Q(bool b)
  {
    if (b) Alpha.M(); else Bravo.N();
  }
  static void Main() 
  { 
    Q(true); 
  }
}

Then when Q is jitted, the field initializers for both x and y will be executed, even though Bravo.N is never actually called. If Bravo had a static constructor then Bravo would not be initialized at that point, because a static constructor is only triggered by the actual execution of the call.

The runtime makes this rather odd-seeming choice as an optimization; this way it doesn't have to generate code on every static member access that checks to see if the field initializers have run yet! It knows that if you get as far as accessing a static member, then the method that does so must have been jitted, and therefore the field initializers have been run already.

I originally believed this to be the case in .NET 4 as well, but Jon informs me that current versions of the runtime are even lazier than that. See Jon's comment below for details. (Thanks Jon!)

I find this to be one of the strangest C#/CLR features and for many years I did not understand it at all well -- and, since Jon has corrected my understanding of it, apparently I still don't! Every time I encountered a question about it, I just referred the questioner to Jon's excellent page on the subject. For more information and discussion on this rather odd feature, check it out.


Next time on FAIC: We'll finish up this series by looking at an abuse of static constructors.

  1. The runtime is still responsible for ensuring that fields are initialized in the right order, because one static field might be initialized based on the contents of another. This is a bad idea, but it is legal, so the compiler has to honour that.

Free beer!

OK, that got your attention.

Most of the Coverity C# analysis team is going to be in Seattle celebrating the opening of our new office on Wednesday February 20th. We'll be at the Tap House Grill on 6th Avenue in Seattle, starting about 6:15. If you're in Seattle at that time, over 21 years old, and want to hang out with me and the team then please stop by! [1]

There are a very small number of free drink tickets available; if you want one, email me (Eric@Lippert.com) and I'll send you instructions on how to sign up. Supplies are limited, so serious enquiries only please.

  1. Note that this is the Tap House Grill in Seattle, not the one in Bellevue.

Static constructors, part two

Previously on FAIC I gave a quick overview of the basic operation of a static constructor. Today, three unusual corner cases.

The first odd case I want to talk about involves static methods. Take a look at the sample program from last time. Now suppose we edited the Main method to say:

static void Main() 
{
  D.M();
}

First off, is that even legal? Sure! Inheritance means that all inheritable members of B are also members of D. M is an inheritable member of B, so it is a member of D, right?

Unfortunately, this corner case is the one that exposes the leaky abstraction. The compiler generates code as though you had said B.M();, and therefore D's static constructor is not called even though "a member of D" has been invoked. This actually makes a fair amount of sense. The method B.M is going to be called, and there's no reason to go to all the work of running D's static constructor when B.M probably does not depend on any work done by D's constructor. And it would seem strange if calling the same method by two different syntaxes would result in different static constructor invocations.
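
Here is a self-contained sketch of this first case, using cut-down versions of B and D from the sample program (the instance constructors and D.N are omitted because they play no part here).

```csharp
using System;

public class B
{
    static B() { Console.WriteLine("B cctor"); }
    public static void M() { Console.WriteLine("B.M"); }
}

public class D : B
{
    static D() { Console.WriteLine("D cctor"); }
}

public static class Program
{
    public static void Main()
    {
        // Written as D.M(), but compiled as a call to B.M,
        // so only B's static constructor is triggered.
        D.M();
    }
}

// Prints:
// B cctor
// B.M
```

"D cctor" never appears, even though the source code names D.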

Now let's consider a second case involving static method invocation. Suppose now we edited Main to say:

static void Main() 
{
  D.N();
}

Clearly D's static constructor must be invoked. What about B? Is its static constructor invoked? No! A static constructor is triggered by a usage of a static member, or by the creation of an instance. Invoking D.N does not use any static member of B and it does not create an instance of B, so B's static constructor is not invoked. People sometimes expect that static constructors of base classes will always be invoked before static constructors of derived classes, but that's not the case.
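
The second case can be sketched the same way, again with cut-down versions of B and D from the sample program.

```csharp
using System;

public class B
{
    static B() { Console.WriteLine("B cctor"); }
    public static void M() { Console.WriteLine("B.M"); }
}

public class D : B
{
    static D() { Console.WriteLine("D cctor"); }
    public static void N() { Console.WriteLine("D.N"); }
}

public static class Program
{
    public static void Main()
    {
        // Uses a static member of D only; no static member of B is touched
        // and no instance is created, so B's static constructor never runs.
        D.N();
    }
}

// Prints:
// D cctor
// D.N
```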

Our third odd case is: what happens when a static constructor throws an exception?

Absolutely nothing good! First off, of course if the exception goes unhandled then all bets are off. The runtime is permitted to do anything it likes if there is an unhandled exception, including such options as starting up a debugger, terminating the appdomain immediately, terminating the application after running finally blocks, and so on. And an exception in a static constructor can easily go unhandled; trying to wrap every possible first usage of a type with a try-catch block is onerous.

And even if by some miracle the exception gets handled the first time, odds are very good that your program is now in such a damaged state that it is going to go down in flames soon. Remember, I said that a static constructor runs once, and by that I meant once; if it throws, you don't get a second chance. Instead, when a static constructor terminates abnormally, the runtime marks the type as unusable, and every attempt by your program to use that type results in another exception.

An interesting fact about static constructors that throw exceptions is that when the runtime detects that a static constructor has terminated abnormally, it wraps the exception in its own exception, a TypeInitializationException, and throws that instead. Check out this StackOverflow answer, where Jon demonstrates this in action.
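
A minimal sketch of both behaviours, the wrapping and the "no second chance" rule, might look like this. The class name Fragile and its members are hypothetical; this is not the code from the linked answer.

```csharp
using System;

public class Fragile
{
    static Fragile()
    {
        // The static constructor fails on its one and only run.
        throw new InvalidOperationException("initialization failed");
    }

    public static void Use() { }
}

public static class Program
{
    public static void Main()
    {
        for (int attempt = 1; attempt <= 2; attempt++)
        {
            try
            {
                Fragile.Use();
            }
            catch (TypeInitializationException ex)
            {
                // The original exception is available as the inner exception.
                Console.WriteLine("Attempt " + attempt + ": "
                    + ex.GetType().Name + " wrapping "
                    + ex.InnerException.GetType().Name);
            }
        }
    }
}

// Prints:
// Attempt 1: TypeInitializationException wrapping InvalidOperationException
// Attempt 2: TypeInitializationException wrapping InvalidOperationException
```

The second attempt fails in the same way as the first. The static constructor is not run again; the type simply stays unusable.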


Next time on FAIC: I'll defer to Jon again when I discuss how the runtime is permitted to optimize some static constructors.

Digital pain (rerun)

Today, another fun-for-your-Friday rerun from the past decade of FAIC.


When you bang your finger with a hammer or burn it on the stove, somehow the pain has to get from your finger to your brain via a nerve. That's an immense distance on the cellular level. What possible mechanisms are there for that?

A nerve carries the pain signal from your finger to your brain like a wire carries electrical current. Perhaps zero "voltage" on that nerve would represent "no pain", and then the "voltage" would vary smoothly up to some maximum that represented "extreme pain". That's a plausible mechanism.

Or, you could have a system where zero "voltage" meant "no pain", 100% meant "severe pain", and any lesser amount of pain is measured by the average power delivered over a period of time, say, a millisecond. If the nerve was on 100% for 250 microseconds, then off for 750 microseconds, that indicates a 25% level of pain. If it was then on for 220 microseconds, off for 780, then on for 200, off for 800, that would indicate that pain was decreasing from 25% to 22% to 20%. The granularity of a millisecond might not be quite right, but in principle this would work. [1]
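
The arithmetic in that second scheme is just the fraction of each period during which the signal is on. A tiny sketch, with a made-up helper name and the on/off times from the paragraph above:

```csharp
using System;

public static class PulseWidth
{
    // Percentage of one period during which the signal is on.
    // Integer arithmetic keeps the results exact for these inputs.
    public static int DutyCyclePercent(int onMicroseconds, int offMicroseconds)
    {
        return onMicroseconds * 100 / (onMicroseconds + offMicroseconds);
    }

    public static void Main()
    {
        Console.WriteLine(DutyCyclePercent(250, 750)); // 25
        Console.WriteLine(DutyCyclePercent(220, 780)); // 22
        Console.WriteLine(DutyCyclePercent(200, 800)); // 20
    }
}
```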

Both of those are analog systems: in an analog system the possible signals smoothly vary from 0% to 100%. But neither of them is how nervous systems actually work. If you measure the "signal strength" on pain nerves you see that they actually send groups of extremely short bursts, where the number of bursts per unit time indicates the level of pain. That is, the nervous system communicates pain by sending an integer from the source of the pain to the brain! It's discrete, not analog.

Why is that?

You know what my favourite scene in the movie version of The Return of the King is? It's the one where Gandalf needs to send a message to Rohan in a hurry, so he has Pippin climb up the side of a mountain to light a signal beacon. We then see this great sequence as the beacon wardens set off the seven signal fires on Amon Din, Eilenach, Nardol, Erelas, Min-Rimmon, Calenhad and Halifirien, one after the other.

Listen carefully to the soundtrack at this point. It echoes Gandalf's "White Rider" theme, which was established earlier in a visually parallel set of shots, as Gandalf and Pippin ride up the seven levels of Minas Tirith. But to give it some additional fire, they add these kick-ass violin arpeggios on top of the basso continuo. The whole thing works perfectly; it's a triumph of cinematography!

But I digress.

The Gondorian war beacon system sends a single binary bit at extremely high speed over the huge distance from the White Tower in Gondor to Meduseld in Rohan. Suppose the Gondorians wanted to send more information than just "we need help!" -- like, say, how big the opposing army was. They could come up with an analog convention. Perhaps a really big signal fire indicates a really big army, a small signal fire indicates a small army, a medium sized signal fire indicates a medium-sized army.

Obviously that doesn't work. There are seven beacon fires. The beacon keepers would have to figure out how big the previous fire was and then set their fire accordingly. Error would accumulate along each part of the process, and the final result might bear little relation to the original input.

To send information accurately over long distances without error you need some kind of discrete system. You need signals that are clearly either ON or OFF. Once you've got ON and OFF, you can use them to transmit integers, letters, Morse code, whatever you want.
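
A toy simulation shows the difference. Everything in it is an assumption made for illustration: seven relay stages, each keeper overestimating the fire they see by 10%, and a halfway threshold for deciding that a fire counts as lit.

```csharp
using System;
using System.Globalization;

public static class BeaconRelay
{
    // Analog convention: each keeper copies the size they think they saw,
    // so the per-stage error compounds along the chain.
    public static double RelayAnalog(double size, int stages, double errorFactor)
    {
        for (int i = 0; i < stages; i++)
        {
            size *= errorFactor;
        }
        return size;
    }

    // Discrete convention: each keeper only decides "lit" or "not lit" and then
    // builds a standard fire, which wipes out the error at every stage.
    public static bool RelayDigital(bool lit, int stages, double errorFactor)
    {
        double fire = lit ? 1.0 : 0.0;
        for (int i = 0; i < stages; i++)
        {
            double seen = fire * errorFactor;   // still misjudged by the same factor
            fire = seen >= 0.5 ? 1.0 : 0.0;     // but regenerated as clearly ON or OFF
        }
        return fire == 1.0;
    }

    public static void Main()
    {
        double analog = RelayAnalog(1.0, 7, 1.1);
        Console.WriteLine(analog.ToString("F2", CultureInfo.InvariantCulture));
        Console.WriteLine(RelayDigital(true, 7, 1.1));
        Console.WriteLine(RelayDigital(false, 7, 1.1));
    }
}
```

Under these assumptions the analog signal arrives at nearly twice its original size (1.1 multiplied by itself seven times is roughly 1.95), while the ON and OFF signals arrive exactly as they were sent.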

That is why nerves are digital, not analog. Nerves need to transmit huge amounts of complex data over the vast distance from your toes to your brain without accruing any error along the way as the signal is picked up by new nerves and forwarded on. All the sensory nerves work this way: smell, touch, taste and so on are all digital. We are digital machines; we're just digital machines made out of meat instead of silicon.


Next time on FAIC: We'll resume the current series on uses and abuses of the static constructor.

  1. Lighting circuit dimmer switches work using these two mechanisms. Old-style dimmer switches work by adding a resistance to the line that decreases the voltage overall, and modern dimmer switches work by cutting out a certain percentage of the power signal.

Static constructors, part one

Previously on FAIC we saw how easy it was to deadlock a program by trying to do something interesting in a static constructor. [1] Static constructors and destructors [2] are the two really weird kinds of methods, and you should do as little as possible in them.

Before I expound further on that topic though, a look at how static constructors work is in order. And before I do that, it's probably a good idea that you get a refresher on how instance constructors work. My article "Why do initializers run in the opposite order of constructors?" provides a detailed look at constructor semantics, so maybe check that out if you have a few minutes. Part one is here and part two is here.
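
As a quick refresher on the ordering that article's title refers to, here is a minimal sketch. The Base, Derived and Tracer names are hypothetical.

```csharp
using System;

public static class Tracer
{
    // Prints a message and returns a dummy value, so that it can be
    // called from a field initializer.
    public static int Note(string message)
    {
        Console.WriteLine(message);
        return 0;
    }
}

public class Base
{
    private readonly int baseField = Tracer.Note("Base field initializer");

    public Base()
    {
        Tracer.Note("Base constructor body");
    }
}

public class Derived : Base
{
    private readonly int derivedField = Tracer.Note("Derived field initializer");

    public Derived()
    {
        Tracer.Note("Derived constructor body");
    }
}

public static class Program
{
    public static void Main()
    {
        new Derived();
    }
}

// Prints:
// Derived field initializer
// Base field initializer
// Base constructor body
// Derived constructor body
```

Initializers run from most derived to least derived, and constructor bodies run from least derived to most derived.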

OK, now that you know how instance constructors work, let's dig into static constructors. The idea is pretty simple: a static constructor is triggered to run immediately before the first static method on its class is called, or immediately before the first instance of its class is created. As we saw previously, the runtime tracks when a static constructor is "in flight" and uses that mechanism to ensure that each static constructor is invoked no more than once.

Now that you know all of that, you can predict the output of this simple program:

using System;
class B
{
  static B() { Console.WriteLine("B cctor"); }
  public B() { Console.WriteLine("B ctor"); }
  public static void M() { Console.WriteLine("B.M"); }
}
class D : B
{
  static D() { Console.WriteLine("D cctor"); }
  public D() { Console.WriteLine("D ctor"); }
  public static void N() { Console.WriteLine("D.N"); }
}
class P 
{
  static void Main()
  {
    System.Console.WriteLine("Main");
    new D();
  }  
}

We know that B's instance constructor must be invoked before D's instance constructor, and we know that D's static constructor must be invoked before D's instance constructor. The only interesting question here is "when will B's static constructor be invoked?" An instance of D is also an instance of B, so B's static constructor has to be invoked at some point.

As you know from reading my article on instance constructors, what actually happens is that the compiler generates D's instance constructor so that the first thing it does is call B's instance constructor; that's how we get the appearance that B's instance constructor runs first. Thus, the actual order of events here can be best conceptualized like this:

  • Main starts. It prints out its message and then tries to invoke D's instance constructor on a new instance of D.
  • The runtime detects that D's instance constructor is about to be invoked, so it invokes D's static constructor.
  • D's instance constructor invokes B's instance constructor. The runtime detects that, so it invokes B's static constructor.
  • B's instance constructor runs and returns control to D's instance constructor, which finishes normally.

Pretty straightforward. Let's mix it up a little.


Next time on FAIC: A brief digression for fun on a Friday. Then next week we'll resume this series and take a look at a few less straightforward cases.

  1. Static constructors are also called "class constructors". Since the actual method generated has the name .cctor they are often also called "cctors". Since "static constructor" is the jargon used in the C# specification, that's what I'll stick to.
  2. Astonishingly, I've never blogged about how difficult it is to write a correct destructor, though it has come up on StackOverflow. That's a good topic for a future fabulous adventure.

The view from Columbia Center

Today we set up Coverity's Seattle office! I've spent the day unpacking boxes and booting machines and discovering why it is a bad idea to ship desktop machines with the hard disks in them. Aside from a minor hard disk mishap, everything has gone very smoothly. Special thanks to my colleagues Deidre and Jeff, who came up from San Francisco to get the network humming and make sure everything was taken care of.

The view from my office is, to say the least, awesome.

I haven't gotten around to untangling all the wires yet, but we're basically good to go here.


Next time on FAIC: I'll start a short series on the uses and abuses of the static constructor.