What’s the difference? sizeof and Marshal.SizeOf

I often see StackOverflow answers that confuse the sizeof operator with the Marshal.SizeOf method. The operator and the method do different things and can return different results, so it is important to know which is which.

In a nutshell, the difference is: the sizeof operator takes a type name and tells you how many bytes of managed memory need to be allocated for an instance of that struct.[1. I don’t have to tell long-time readers of this blog that of course this is not necessarily stack memory; structs are allocated off the heap when they are array elements, fields of a class, and so on.] By contrast, Marshal.SizeOf takes either a type object or an instance of the type, and tells you how many bytes of unmanaged memory need to be allocated. These can be different for a variety of reasons. The name of the type gives you a clue: Marshal.SizeOf is intended to be used when marshaling a structure to unmanaged memory.

Another difference between the two is that the sizeof operator can only take the name of an unmanaged type; that is, a struct type whose fields are only integral types, Booleans, pointers and so on. (See the specification for an exact definition.) Marshal.SizeOf by contrast can take any class or struct type.
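
To see the difference in action, here is a minimal sketch (compile with /unsafe, since sizeof on a user-defined struct requires an unsafe context). A bool field is the classic case: a managed bool occupies a single byte, but the default marshaling converts it to a four-byte Win32 BOOL, so the two sizes disagree.

using System;
using System.Runtime.InteropServices;

struct OneBool
{
    public bool Flag; // one byte in managed memory; marshals as a 4-byte BOOL by default
}

static class SizeDemo
{
    static unsafe void Main()
    {
        Console.WriteLine(sizeof(OneBool));                 // managed size: typically 1
        Console.WriteLine(Marshal.SizeOf(typeof(OneBool))); // unmanaged size: 4
    }
}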

Nullable micro-optimizations, part one

Which is faster, Nullable<T>.Value or Nullable<T>.GetValueOrDefault()?

Before I answer that question, my standard response to “which horse is faster?” questions applies. Read that first.

.
.
.

Welcome back. But again, before I answer the question I need to point out that the potential performance difference between these two mechanisms for obtaining the non-nullable value of a nullable value type is a consequence of the fact that they are not semantically equivalent. The former may legally be called only if you are sure that the nullable value is non-null; put another way, calling Value without knowing that HasValue is true produces a boneheaded exception. The latter may be called on any nullable value. A glance at a simplified version of the source code illustrates the difference.

struct Nullable<T> where T : struct
{
  private bool hasValue;
  private T value;
  public Nullable(T value)
  {
    this.hasValue = true;
    this.value = value;
  }
  public bool HasValue { get { return this.hasValue; } }
  public T Value
  {
    get
    {
      if (!this.HasValue) throw new InvalidOperationException("Nullable object must have a value.");
      return this.value;
    }
  }
  public T GetValueOrDefault() 
  {
    return this.value; 
  }
  // ... and then all the other conversion gear and so on ...
}

The first thing to notice is that a nullable value type’s ability to represent a “null” integer or decimal or whatever is not magical. (Nullable value types are magical in other ways; for example, there’s no way to write your own struct that has the strange boxing behaviour of a nullable value type; an int? boxes to either an int or null, never to a boxed int?. But let’s not worry about these magical features today.) A nullable value type is nothing more than an instance of the value type plus a bool saying whether it’s null or not.

If a variable of nullable value type is initialized with the default constructor then the hasValue field will be its default value, false, and the value field will be default(T). If it is initialized with the declared constructor then of course the hasValue field is true and the value field is any legal value, including possibly T’s default value. Thus, the implementation of GetValueOrDefault() need not check the flag; if the flag is true then the value field is set correctly, and if it is false, then it is set to the default value of T.
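
To make that invariant concrete, a tiny sketch:

int? none = new int?(); // hasValue is false, value is default(int), which is 0
int? zero = 0;          // hasValue is true, value is 0

Console.WriteLine(none.GetValueOrDefault()); // 0; no flag check needed
Console.WriteLine(zero.GetValueOrDefault()); // 0
// Console.WriteLine(none.Value);            // would throw InvalidOperationException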

Looking at the code it should be clear that Value is almost certainly not faster than GetValueOrDefault() because obviously the former does exactly the same work as the latter in the success case, plus the additional work of the flag check. Moreover, because GetValueOrDefault() is so brain-dead simple, the jitter is highly likely to perform an inlining optimization.

An inlining optimization is where the jitter eliminates an unnecessary “call” and “return” instruction by simply generating the code of the method body “inline” in the caller. This is a great optimization because doing so can make code both smaller and faster in some cases, though it does make it harder to debug because the debugger has no good way to generate breakpoints inside the inlined method.

How the jitter chooses to inline or not is an implementation detail, but it is reasonable to assume that it is less likely to perform an inlining optimization on code that contains more than one “basic block” and has a throw in it.

A “basic block” is a region of code where you know that the code will execute from the top of the block to the bottom without any “normal” branches in or out of the middle of the block. (A basic block may of course have exceptions thrown out of it.) Many optimizing compilers use “basic blocks” as an abstraction because doing so abstracts away the unnecessary details of what the block actually does and treats it solely as a node in a control flow graph.

It should also be clear that though the relative performance difference might be large, the absolute difference is small. In the typical case the difference amounts to a call, a field fetch, a conditional jump and a return, and each of those costs only nanoseconds.
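
If you are determined to get numbers anyway, you can measure it yourself; here is a rough sketch of a harness, with the usual caveats that the results depend on the jitter, the hardware and the optimization settings:

using System;
using System.Diagnostics;

static class Bench
{
    static void Main()
    {
        int? n = 42;
        const int iterations = 100000000;
        long sum = 0;

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            sum += n.Value;
        sw.Stop();
        Console.WriteLine("Value:               {0} ms", sw.ElapsedMilliseconds);

        sw.Restart();
        for (int i = 0; i < iterations; i++)
            sum += n.GetValueOrDefault();
        sw.Stop();
        Console.WriteLine("GetValueOrDefault(): {0} ms", sw.ElapsedMilliseconds);

        Console.WriteLine(sum); // keep sum live so the loops are not optimized away
    }
}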

Now, this is of course not to say that you should willy-nilly change all your calls to Value to GetValueOrDefault() for performance reasons. Read my rant again if you have the urge to do that! Don’t go changing working, debugged, tested code in order to obtain a performance benefit that is (1) highly unlikely to be a real bottleneck, and (2) highly unlikely to be your worst performance problem.

And besides, using Value has the nice property that if you have made a mistake and fetched the value of a null, you’ll get an exception that informs you of where your bug is! Code that draws attention to its faults is a good thing.

Finally, I note that here we have one of those rare cases where the framework design guidelines have been deliberately bent. We have a “Get” method that is actually faster than a property getter, and the property getter throws! Normally you expect the opposite: the “Get” method is usually the one that is slow and can throw, and the property is the one that is fast and never throws. Though this is somewhat unfortunate, remember, the design guidelines are our servants, not our masters, and they are guidelines, not rules.


Next time on FAIC: How does the C# compiler use its knowledge of the facts discussed today to your advantage? Have a great Christmas everyone; we’ll pick up this subject again in a week.

Atomicity, volatility and immutability are different, part one

I get a fair number of questions about atomicity, volatility, thread safety, immutability and the like; the questions illustrate a lot of confusion on these topics. Let’s take a step back and examine each of these ideas to see what the differences are between them.

First off, what do we mean by “atomic”? From the Greek ἄτομος, meaning “not divisible into smaller parts”, an “atomic” operation is one which is always observed to be done or not done, but never halfway done. The C# specification clearly defines what operations are atomic in section 5.5. The atomic operations are: reads and writes of variables of any reference type, or, effectively, any built-in value type that takes up four bytes or less, like int, short and so on. Reads and writes of variables of value types that take more than four bytes, like double, long and decimal, are not guaranteed to be atomic by the C# language. (There is no guarantee that they are not atomic! They might in practice be atomic on some hardware. Or they might not.)
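
To illustrate, here is a sketch of a program that can observe a “torn” read of a long; whether you actually see tearing depends on the platform. You are most likely to observe it in a 32-bit process, and you may never observe it in a 64-bit process where the field happens to be aligned.

using System;
using System.Threading;

static class TearingDemo
{
    static long shared; // eight bytes: reads and writes are not guaranteed atomic

    static void Main()
    {
        var writer = new Thread(() =>
        {
            var flip = false;
            while (true)
            {
                shared = flip ? -1L : 0L; // all-zero bits or all-one bits
                flip = !flip;
            }
        });
        writer.IsBackground = true;
        writer.Start();

        for (int i = 0; i < 100000000; i++)
        {
            long observed = shared;
            if (observed != 0L && observed != -1L) // a half-written value was seen
            {
                Console.WriteLine("Torn read: 0x{0:X16}", observed);
                return;
            }
        }
        Console.WriteLine("No tearing observed this run.");
    }
}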


What’s the difference between conditional compilation and the Conditional attribute?

User: Why does this program not compile correctly in the release build?

using System.Diagnostics;

class Program 
{ 
#if DEBUG 
    static int testCounter = 0; 
#endif 
    static void Main(string[] args) 
    { 
        SomeTestMethod(testCounter++); 
    } 
    [Conditional("DEBUG")] 
    static void SomeTestMethod(int t) { } 
}

Eric: This fails to compile in the release build because testCounter cannot be found in the call to SomeTestMethod.

User: But that call site is going to be omitted anyway, so why does it matter? Clearly there’s some difference here between removing code with the conditional compilation directive versus using the conditional attribute, but what’s the difference?

Eric: You already know the answer to your question, you just don’t know it yet. Let’s get Socratic; let me turn this around and ask you how this works. How does the compiler know to remove the method call site?

User: Because the method called has the Conditional attribute on it.

Eric: You know that. But how does the compiler know that the method called has the Conditional attribute on it?

User: Because overload resolution chose that method. If this were a method from an assembly, the metadata associated with that method has the attribute. If it is a method in source code, the compiler knows that the attribute is there because the compiler can analyze the source code and figure out the meaning of the attribute.

Eric: I see. So fundamentally, overload resolution does the heavy lifting. How does overload resolution know to choose that method? Suppose hypothetically there were another method of the same name with different parameters.

User: Overload resolution works by examining the arguments to the call and comparing them to the parameter types of each candidate method and then choosing the unique best match of all the candidates.

Eric: And there you go. Therefore the arguments must be well-defined at the point of the call, even if the call is going to be removed. In fact, the call cannot be removed unless the arguments are extant! But in the release build, the type of the argument cannot be determined because its declaration has been removed.

So now you see that the real difference between these two techniques for removing unwanted code is what the compiler is doing when the removal happens. At a high level, the compiler processes a text file like this. First it “lexes” the file. That is, it breaks the string down into “tokens” — sequences of letters, numbers and symbols that are meaningful to the compiler. Then those tokens are “parsed” to make sure that the program conforms to the grammar of C#. Then the parsed state is analyzed to determine semantic information about it; what all the types are of all the expressions and so on. And finally, the compiler spits out code that implements those semantics.

The effect of a conditional compilation directive happens at lex time; anything that is inside a removed #if block is treated by the lexer as a comment. It’s like you simply deleted the whole contents of the block and replaced it with whitespace. But removal of call sites depending on conditional attributes happens at semantic analysis time; everything necessary to perform that semantic analysis must be present. 
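
So one way out of the user’s predicament, as a sketch, is to declare the field unconditionally and let the attribute do the removing. Note that when the compiler removes the call site it removes the evaluation of the arguments too, so the testCounter++ side effect disappears along with the call in the release build.

using System.Diagnostics;

class Program
{
    static int testCounter = 0; // declared in both builds, so the call site can be analyzed

    static void Main(string[] args)
    {
        // In the release build the compiler removes this entire call,
        // including the evaluation of testCounter++.
        SomeTestMethod(testCounter++);
    }

    [Conditional("DEBUG")]
    static void SomeTestMethod(int t) { }
}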

User: Fascinating. Which parts of the C# specification define this behavior?

Eric: The specification begins with a handy “table of contents”, which is very useful for answering such questions. The table of contents states that section 2.5.1 describes “Conditional compilation symbols” and section 17.4.2 describes “The Conditional attribute”.

User: Awesome!

What’s the difference? fixed versus fixed

I got an email the other day that began:

I have a question about fixed sized buffers in C#:

unsafe struct FixedBuffer 
{ 
  public fixed int buffer[100];
}

Now by declaring buffer as fixed it is not movable…

And my heart sank. This is one of those deeply unfortunate times when subtle choices made in the details of language design encourage misunderstandings.

When doing pointer arithmetic in unsafe code on a managed object, you need to make sure that the garbage collector does not move the memory you’re looking at. If a collection on another thread happens while you’re doing pointer arithmetic on an object, the pointer can get all messed up. Therefore, C# classifies all variables as “fixed” or “movable”. If you want to do pointer arithmetic on a movable object, you can use the fixed keyword to say “this local variable contains data which the garbage collector should not move.” When a collection happens, the garbage collector needs to look at all the local variables for in-flight calls (because of course, stuff that is in local variables needs to stay alive); if it sees a “fixed” local variable then it makes a note to itself to not move that memory, even if that fragments the managed heap. (This is why it is important to keep stuff fixed for as little time as possible.) So typically, we use “fixed” to mean “fixed in place”.
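
Here is the familiar fixed-in-place usage, as a minimal sketch:

using System;

static class PinningDemo
{
    // "fixed" meaning fixed in place: pin a movable array while doing pointer arithmetic.
    unsafe static int Sum(int[] items)
    {
        int sum = 0;
        fixed (int* p = items) // the garbage collector must not move items inside this block
        {
            for (int i = 0; i < items.Length; i++)
                sum += p[i];
        }
        return sum; // the pin is released here; keep fixed regions as short as possible
    }

    static void Main()
    {
        Console.WriteLine(Sum(new[] { 1, 2, 3 })); // 6
    }
}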

But that’s not what “fixed” means in this context; here it means “the buffer in question is fixed in size to be one hundred ints” — basically, it’s the same as generating one hundred int fields in this structure.

Obviously we often use the same keyword to mean conceptually the same thing. For example, we use the keyword internal in many ways in C#, but all of them are conceptually the same. It is only ever used to mean “accessibility to some entity is restricted to only code in this assembly”.

Sometimes we use the same keyword to mean two completely different things, and rely upon context for the user to figure out which meaning is intended. For example:

var results = from c in customers where c.City == "London" select c;

versus

class C<T> where T : IComparable<T>

It should be clear that where is being used in two completely different ways: to build a filter clause in a query, and to declare a type constraint on a generic type parameter.

We cause people to run into trouble when one keyword is used in two different ways but the difference is subtle, like our example above. The user’s email went on to ask a whole bunch of questions which were predicated on the incorrect assumption that a fixed-in-size buffer is automatically fixed in place in memory.

Now, one could say that this is just an unfortunate confluence of terms; that “fixed in size” and “fixed in place” just happen to both use the word “fixed” in two different ways, how vexing. But the connection is deeper than that: you cannot safely access the data stored in a fixed-in-size buffer unless the container of the buffer has been fixed in place. The two concepts are actually quite strongly related in this case, but they are not at all the same.
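
A sketch makes the connection concrete; Holder here is a hypothetical class I am introducing to contain the FixedBuffer struct from the email:

unsafe struct FixedBuffer
{
    public fixed int buffer[100]; // "fixed" meaning fixed in size
}

class Holder
{
    public FixedBuffer Data; // a Holder instance lives on the movable managed heap
}

unsafe class Demo
{
    static void Main()
    {
        var holder = new Holder();
        fixed (int* p = holder.Data.buffer) // "fixed" meaning fixed in place
        {
            p[0] = 123; // safe only while the containing object is pinned
        }
    }
}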

On the one hand it might have been less confusing to use two keywords, say pinned and fixed. But on the other hand, both usages of fixed are only valid in unsafe code. A key assumption of all unsafe code features is that if you are willing to use unsafe code in C#, then you are already an expert programmer who fully understands memory management in the CLR. That’s why we make you write unsafe on the code; it indicates that you’re turning off the safety system and you know what you’re doing.

A considerable fraction of the keywords of C# are used in two or more ways: fixed, into, partial, out, in, new, delegate, where, using, class, struct, true, false, base, this, event, return and void all have at least two different meanings. Most of those are clear from the context, but at least the first three — fixed, into and partial — have caused enough confusion that I’ve gotten questions about the differences from perplexed users. I’ll take a look at into and partial next.