Taking responsibility

Today I answer the question "what's the deal with the fixed statement?" in the form of a dialogue, as is my wont. So:

What's the deal with the fixed statement?

As I noted back in 2009, the purpose of the fixed statement is to tell the garbage collector that your code has made an unsafe, unmanaged pointer into a block of managed memory. Since the garbage collector reserves the right to move that memory around, it is important that you inform the garbage collector that it needs to "pin in place" that memory until you tell it otherwise.

Suppose I am calling unmanaged code from my C# program and I need to pass the code a pointer to a managed array. Eventually control will leave the fixed statement; what if the unmanaged code holds onto that pointer and uses it after the memory becomes unpinned?

Describing what happens in that scenario is not interesting because you are required to not get into that situation in the first place. As the C# specification helpfully points out:

It is the programmer’s responsibility to ensure that pointers created by fixed statements do not survive beyond execution of those statements. For example, when pointers created by fixed statements are passed to external APIs, it is the programmer’s responsibility to ensure that the APIs retain no memory of these pointers.

If you abdicate that responsibility then arbitrarily bad things can happen to your computer; the program can literally do anything that the current process has the right to do, including erasing all your files.

So what if I do that anyway? How do I prevent that undefined behaviour?

If it hurts when you do that then don't do that. Asking "how do I not die from fatally shooting myself?" is a non-starter; don't fatally shoot yourself in the first place if you'd prefer to not die!

No, really, I need to solve this problem! I really do have unmanaged code that captures the pointers I hand to it and dereferences them at an unknown time in the future. What can I do that is responsible?

There are a number of ways to mitigate this terrible situation.

First, you could ensure that control never leaves the fixed block. This is essentially throwing a wrench into the GC performance and also makes it quite difficult to write your program, so I don't recommend it.

Second, you could make a GCHandle object and use it to pin the array in place. It will stay pinned until you free the handle. This will, again, throw a wrench into the garbage collector because there will now be a pinned block that cannot move; the garbage collector will literally have to work around it.1

Third, you could allocate the array out of fixed-in-place unmanaged storage in the first place. For example, you could use AllocHGlobal and FreeHGlobal to do your own memory management. That's what I'd probably do if faced with this unfortunate situation.


Next time on FAIC: What I did on my Kauai vacation.

  1. To mitigate the performance problem you could make the array really big. Large arrays go on a large object heap, which is not compacted like the regular heap is, so the penalty of having an immovable block in the middle of the heap goes away. Of course, making an array far, far larger than it needs to be in order to solve a performance problem is likely to cause performance problems of its own. And also, the behaviour of the large object heap is an implementation detail subject to change at any time, not a contract you can rely on. This is, again, probably a bad idea, but it will work.

What's the difference? fixed versus fixed

I got an email the other day that began:

I have a question about fixed sized buffers in C#:

unsafe struct FixedBuffer 
{ 
  public fixed int buffer[100];

Now by declaring buffer as fixed it is not movable...

And my heart sank. This is one of those deeply unfortunate times when subtle choices made in the details of language design encourage misunderstandings.

When doing pointer arithmetic in unsafe code on a managed object, you need to make sure that the garbage collector does not move the memory you're looking at. If a collection on another thread happens while you're doing pointer arithmetic on an object, the pointer can get all messed up. Therefore, C# classifies all variables as "fixed" or "movable". If you want to do pointer arithmetic on a movable object, you can use the fixed keyword to say "this local variable contains data which the garbage collector should not move." When a collection happens, the garbage collector needs to look at all the local variables for in-flight calls (because of course, stuff that is in local variables needs to stay alive); if it sees a "fixed" local variable then it makes a note to itself to not move that memory, even if that fragments the managed heap. (This is why it is important to keep stuff fixed for as little time as possible.) So typically, we use "fixed" to mean "fixed in place".

But that's not what "fixed" means in this context; this means "the buffer in question is fixed in size to be one hundred ints" -- basically, it's the same as generating one hundred int fields in this structure.

Obviously we often use the same keyword to mean conceptually the same thing. For example, we use the keyword internal in many ways in C#, but all of them are conceptually the same. It is only ever used to mean "accessibility to some entity is restricted to only code in this assembly".

Sometimes we use the same keyword to mean two completely different things, and rely upon context for the user to figure out which meaning is intended. For example:

var results = from c in customers where c.City == "London" select c;

versus

class C<T> where T : IComparable<T>

It should be clear that where is being used in two completely different ways: to build a filter clause in a query, and to declare a type constraint on a generic type parameter.

We cause people to run into trouble when one keyword is used in two different ways but the difference is subtle, like our example above. The user's email went on to ask a whole bunch of questions which were predicated on the incorrect assumption that a fixed-in-size buffer is automatically fixed in place in memory.

Now, one could say that this is just an unfortunate confluence of terms; that "fixed in size" and "fixed in place" just happen to both use the word "fixed" in two different ways, how vexing. But the connection is deeper than that: you cannot safely access the data stored in a fixed-in-size buffer unless the container of the buffer has been fixed in place. The two concepts are actually quite strongly related in this case, but not at allthe same.

On the one hand it might have been less confusing to use two keywords, say pinned and fixed. But on the other hand, both usages of fixed are only valid in unsafe code. A key assumption of all unsafe code features is that if you are willing to use unsafe code in C#, then you are already an expert programmer who fully understands memory management in the CLR. That's why we make you write unsafe on the code; it indicates that you're turning off the safety system and you know what you're doing.

A considerable fraction of the keywords of C# are used in two or more ways: fixed, into, partial, out, in, new, delegate, where, using, class, struct, true, false, base, this, event, return and void all have at least two different meanings. Most of those are clear from the context, but at least the first three -- fixed, into and partial -- have caused enough confusion that I've gotten questions about the differences from perplexed users. I'll take a look at into and partial next.