Nostalgia, horror, and a very old bug

My next article about graph traversal is pre-empted by this breaking news; I’ll pick up that series again soon.

Yesterday morning a coworker forwarded to me an article about a recently patched security hole in Windows, and wondered if I had any thoughts on it. Oh, did I! I read about the exploit with an odd mixture of nostalgia — because I worked on the code in question back in the 1990s — and horror at how long this exploitable bug had been in Windows.

To be clear, I did not write the actual exploitable code; it predates my time at Microsoft. But I was worried while I was reading the article that it might turn out to be my bad! This is the second time that has happened to me, and it is not a pleasant feeling.

Coverity has a research team devoted specifically to security-impacting bugs, and they were kind enough to ask me to write up my thoughts for their blog. You can read about my guess at what the buggy code looked like here.

If you have examples of “missing restore”-style bugs — security-impacting or not — in real-world code in any language, I would love to see them. Please leave examples in the comments here or on the security blog. Thanks!

About these ads

Heartbleed and static analysis

In the wake of the security disaster that is the Heartbleed vulnerability, a number of people have asked me if Coverity’s static analyzer detects defects like this. It does not yet, but you’d better believe our security team is hard at work figuring out ways to detect and thereby prevent similar defects. (UPDATE: We have shipped a hotfix release that has a checker that will find defects like the HeartBleed defect. The security team works fast!)

I’ll post some links to some articles below, but they’re a big jargonful, so I thought that a brief explanation of this jargon might be appropriate. The basic idea is as follows: Continue reading

Static constructors, part four

We’ll finish up this series on static constructors by finally getting to the question which motivated me to write the series in the first place: should you use a static constructor like this?

public class Sensitive
{
  static Sensitive()
  {
    VerifyUserHasPermissionToUseThisClass();
  }  
  public static void Dangerous()
  {
    DoSomethingDangerous();
  }
  ...

The intention here is clear. The static constructor is guaranteed to run exactly once, and before any static or instance method. Therefore it will do the authorization check before any dangerous operation in any method. If the user does not have permission to use the class then the class will not even load. If they do, then the expense of the security check is only incurred once per execution, no matter how many methods are called.

If you’ve read the rest of this series, or anything I’ve written on security before, you know what I’m going to say: I strongly recommend that you do not use static constructors “off label” in this manner.

First off, as we’ve seen so far in this series, static constructors are a dangerous place to run fancy code. If they end up delegating any work to other threads then deadlocks can easily result. If they take a long time and are accessed from multiple threads, contention can result. If an exception is thrown out of the static constructor then you have very little ability to recover from the exception and the type will be forever useless in this appdomain.

Second, the security semantics here are deeply troubling. If this code is running on the user’s machine then this appears to be a case of the developer not trusting the user. But the .NET security system was designed with the principle that the user is the source of trust decisions. It is the user who must trust the developer, not the other way around! If the user is hostile towards the developer then nothing is stopping them from decompiling the code to IL, removing the static constructor, recompiling, and running the dangerous method without the security check. [1. The resulting program will of course not be strong-named or code-signed by the developer anymore, but who cares?]

Moreover, the pattern here assumes that security checks can be performed once and the result is then cached for the lifetime of the appdomain. What if the initial security check fails, but the program was going to impersonate a more trusted user? It might be difficult to ensure that the static constructor does not run until after the impersonation. What if different threads are associated with different users? Now we have a race to see which user’s context is used for the security check. What if a user’s permissions are revoked while the program is running? The check might be performed while permission is granted, and then the dangerous code runs after it has been revoked.

In short: Static constructors should be used to quickly initialize important static data, and that’s pretty much it. The only time that I would use the mechanisms of a static constructor to enforce an invariant would be for a very simple invariant like ensuring that a singleton is lazily initialized, as Jon describes. Complex policy mechanisms like security checks should probably use some other mechanism.


Next time on FAIC: I’m going to join the throngs of tech bloggers who have tried to explain what a monad is.

Defect spotting, part two

I had a great time hanging out with my colleagues Bob and Amie yesterday at the HUB, talking with students, spotting defects and handing out yo-yos. Thanks to all who came out, and to the SWE for putting on a great event.

To follow up on the puzzle I posted yesterday: the terrible flaw, which most people spotted right away, was that the expression geteuid != 0 was of course intended to be geteuid() != 0. The code as written compares the function pointer to null, which it never is, and therefore the right side is always true, and therefore the conditional falls into the “fail with an error” branch more often than it ought to. The program succeeds if the user really is root, or if they are “sudo” root. It is intended to succeed also if the user is “effectively” root, but it does not. Thank goodness in this case the program fails to a secure mode! It is not at all difficult to imagine a situation where such an accidental function pointer usage causes the program to fail into the insecure mode. In any event, Coverity’s checker catches this one. (And of course more modern languages like C# do not allow you to use methods in a context other than a call or delegate conversion.)

There are of course any number of other flaws in this fragment. First, it’s now considered bad form to check for root like this; rather, check to see if the user is granted an appropriate permission. Second, the code is hard to read if you do not know the convention that the root user gets magical id zero by default; the code could be much more self-documenting. And so on; several people made good observations in the comments.


Next time on FAIC: You can build a hollow house out of solid bricks, and you can build a deadlocking program out of threadsafe methods too.

Defect spotting at the HUB

The University of Washington chapter of the Society of Women Engineers is putting on their quarterly career fair today at the Husky Union Building, and I’ll be there with a couple of my fabulous Coverity colleagues. If you happen to be a UW student reading this and want to stop on by for a round of my favourite game, “Spot the Defect” please do. We’ll be presenting a whole pile of buggy code; some of the defects will be quite straightforward and some of them will be subtle, but they all have a lesson.

If you want to play along at home, here’s one of the easy ones; this code was in a real used-by-customers product when Coverity’s static analysis engine discovered it:

{
  ...
  if (!strcmp(argv[i], "-configure"))
  {
    if (getuid() != 0 && geteuid != 0)
    {
      puts("Only root can use -configure.n");
      exit(1);
    }
  }
  xf86DoConfigure = TRUE;
  xf86AllowMouseOpenFail = TRUE;
  return 1;
}

Can you spot the defect and describe its consequence?

Anyways, like I said, if you’re a UW student then stop on by the booth and we’ll chat. The fair is open 12:30 to 5:30; if you’re a SWE member then you get early admission and can come in any time after 12:00. Hope to see you there!


Next time on FAIC: The solution to today’s puzzle.

Dynamic contagion, part one

This is part one of a two-part series on dynamic contagion. Part two is here.


Suppose you’re an epidemiologist modeling the potential spread of a highly infectious disease. The straightforward way to model such a series of unfortunate events is to assume that the population can be divided into three sets: the definitely infected, the definitely healthy, and the possibly infected. If a member of the healthy population encounters a member of the definitely infected or possibly infected population, then they become a member of the possibly infected population. (Or, put another way, the possibly infected population is closed transitively over the exposure relation.) A member of the possibly infected population becomes classified as either definitely healthy or definitely infected when they undergo some sort of test. And an infected person can become a healthy person by being cured.

This sort of contagion model is fairly common in the design of computer systems. For example, suppose you have a web site that takes in strings from users, stores them in a database, and serves them up to other users. Like, say, this blog, which takes in comments from you, stores them in a database, and then serves them right back up to other users. That’s a Cross Site Scripting (XSS) attack waiting to happen right there. A common way to mitigate the XSS problem is to use data tainting, which uses the contagion model to identify strings that are possibly hostile. Whenever you do anything to a potentially-hostile string, like, say, concatenate it with a non-hostile string, the result is a possibly-hostile string. If the string is determined via some test to be benign, or can have its potentially hostile parts stripped out, then it becomes safe.

The “dynamic” feature in C# 4 and above has a lot in common with these sorts of contagion models. As I pointed out last time, when an argument of a call is dynamic then odds are pretty good that the compiler will classify the result of the call as dynamic as well; the taint spreads. In fact, when you use almost any operator on a dynamic expression, the result is of dynamic type, with a few exceptions. (“is” for example always returns a bool.)  You can “cure” an expression to prevent it spreading dynamicism by casting it to object, or to whatever other non-dynamic type you’d like; casting dynamic to object is an identity conversion.

The way that dynamic is contagious is an emergent phenomenon of the rules for working out the types of expressions in C#. There is, however, one place where we explicitly use a contagion model inside the compiler in order to correctly work out the type of an expression that involves dynamic types: it is one of the most arcane aspects of method type inference. Next time I’ll give you all the rundown on that.


This is part one of a two-part series on dynamic contagion. Part two is here.