Taking responsibility

Today I answer the question "what's the deal with the fixed statement?" in the form of a dialogue, as is my wont. So:

What's the deal with the fixed statement?

As I noted back in 2009, the purpose of the fixed statement is to tell the garbage collector that your code has made an unsafe, unmanaged pointer into a block of managed memory. Since the garbage collector reserves the right to move that memory around, it is important that you inform the garbage collector that it needs to "pin in place" that memory until you tell it otherwise.
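As a minimal sketch of what that looks like in practice, the following assumes the project is compiled with unsafe code enabled. The class and method names are hypothetical and chosen only for illustration.

```csharp
using System;

public static class FixedSketch
{
    // Hypothetical helper: sums an array by reading it through a raw pointer.
    // Requires compiling with unsafe code enabled (AllowUnsafeBlocks).
    public static unsafe int Sum(int[] values)
    {
        int total = 0;

        // The array is pinned only while control is inside this block.
        fixed (int* p = values)
        {
            for (int i = 0; i < values.Length; i++)
            {
                total += p[i];
            }
        }

        // p is out of scope here; the collector is free to move the array again.
        return total;
    }
}
```

Note that the pointer never escapes the block: nothing stores p anywhere that outlives the fixed statement.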

Suppose I am calling unmanaged code from my C# program and I need to pass the code a pointer to a managed array. Eventually control will leave the fixed statement; what if the unmanaged code holds onto that pointer and uses it after the memory becomes unpinned?

Describing what happens in that scenario is not interesting because you are required to not get into that situation in the first place. As the C# specification helpfully points out:

It is the programmer’s responsibility to ensure that pointers created by fixed statements do not survive beyond execution of those statements. For example, when pointers created by fixed statements are passed to external APIs, it is the programmer’s responsibility to ensure that the APIs retain no memory of these pointers.

If you abdicate that responsibility then arbitrarily bad things can happen to your computer; the program can literally do anything that the current process has the right to do, including erasing all your files.

So what if I do that anyway? How do I prevent that undefined behaviour?

If it hurts when you do that then don't do that. Asking "how do I not die from fatally shooting myself?" is a non-starter; don't fatally shoot yourself in the first place if you'd prefer to not die!

No, really, I need to solve this problem! I really do have unmanaged code that captures the pointers I hand to it and dereferences them at an unknown time in the future. What can I do that is responsible?

There are a number of ways to mitigate this terrible situation.

First, you could ensure that control never leaves the fixed block. This essentially throws a wrench into the garbage collector's performance and also makes it quite difficult to write your program, so I don't recommend it.

Second, you could make a GCHandle object and use it to pin the array in place. It will stay pinned until you free the handle. This will, again, throw a wrench into the garbage collector because there will now be a pinned block that cannot move; the garbage collector will literally have to work around it. [1]
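A rough sketch of the GCHandle approach might look like the following. The PinnedBuffer wrapper is a hypothetical type invented for this example; only GCHandle.Alloc, GCHandleType.Pinned, AddrOfPinnedObject and Free are real framework members.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper: keeps a managed byte[] pinned until Dispose is called.
public sealed class PinnedBuffer : IDisposable
{
    // Not readonly: GCHandle is a struct and Free() mutates it.
    private GCHandle handle;

    public byte[] Data { get; }
    public IntPtr Address { get; }

    public PinnedBuffer(int size)
    {
        Data = new byte[size];
        // From here on the collector may not move Data.
        handle = GCHandle.Alloc(Data, GCHandleType.Pinned);
        Address = handle.AddrOfPinnedObject();
    }

    public void Dispose()
    {
        // After this call Address must never be dereferenced again.
        if (handle.IsAllocated)
        {
            handle.Free();
        }
    }
}
```

The address can be handed to unmanaged code while the managed side keeps using Data directly, with no copy. The sketch has no finalizer, so forgetting to call Dispose leaves the array pinned for as long as the handle exists. It is still the caller's responsibility to call Dispose only once the unmanaged code is finished with the pointer.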

Third, you could allocate the array out of fixed-in-place unmanaged storage in the first place. For example, you could use AllocHGlobal and FreeHGlobal to do your own memory management. That's what I'd probably do if faced with this unfortunate situation.
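Here is one way the unmanaged-storage approach could be sketched. UnmanagedBuffer is a hypothetical helper written for this example; the Marshal methods it calls (AllocHGlobal, FreeHGlobal, Copy) are the real ones.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper: a block of unmanaged memory the collector never sees or moves.
public sealed class UnmanagedBuffer : IDisposable
{
    public IntPtr Address { get; private set; }
    public int Size { get; }

    public UnmanagedBuffer(int size)
    {
        Size = size;
        // The contents are uninitialized until something is copied in.
        Address = Marshal.AllocHGlobal(size);
    }

    // Copies managed bytes into the unmanaged block.
    public void CopyFrom(byte[] source)
    {
        if (source.Length > Size)
        {
            throw new ArgumentException("Source does not fit in the buffer.", nameof(source));
        }
        Marshal.Copy(source, 0, Address, source.Length);
    }

    // Copies the whole unmanaged block back into a new managed array.
    public byte[] ToArray()
    {
        var result = new byte[Size];
        Marshal.Copy(Address, result, 0, Size);
        return result;
    }

    public void Dispose()
    {
        if (Address != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(Address);
            Address = IntPtr.Zero;
        }
    }
}
```

The trade-off is the copying in CopyFrom and ToArray: the data lives outside the managed heap, so managed code has to copy it in and out or read it through the Marshal class. As with the pinned handle, freeing the block while the unmanaged code still holds the pointer puts you right back in undefined-behaviour territory.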


Next time on FAIC: What I did on my Kauai vacation.

  1. To mitigate the performance problem you could make the array really big. Large arrays go on a large object heap, which is not compacted like the regular heap is, so the penalty of having an immovable block in the middle of the heap goes away. Of course, making an array far, far larger than it needs to be in order to solve a performance problem is likely to cause performance problems of its own. And also, the behaviour of the large object heap is an implementation detail subject to change at any time, not a contract you can rely on. This is, again, probably a bad idea, but it will work.

16 thoughts on “Taking responsibility”

    • If the object is in a higher generation then it is going to be checked for collection less often and compacted less often, which means that you are less likely to take the penalty for pinning it. Still, it is a bad idea to rely on implementation details of the GC; there is no requirement that the GC be generational, or even that it be mark-and-sweep.

      • A bit of speculation about GC internals.

        Am I correct that if something is pinned in, say, Gen0 and a collection occurs in the meantime, the object in question will still remain in Gen0 (and thus the developer has a bit of influence on the promotion process)?

        I'm well aware that it's better to leave such things for the GC to decide, but I am still curious :)

  3. Good article. I have to deal with this all the time in NAudio as a lot of Windows audio APIs hang on to pointers you give them and use them later. I've used a mixture of pinned GCHandles and AllocHGlobal / AllocCoTaskMem. On the whole I agree that AllocHGlobal is probably the best approach, but one nice thing about pinned GCHandles is that you can create a pinned byte[], meaning you can avoid an unnecessary array copy. It's a shame you can't bind a byte[] or a struct to memory allocated this way (unless I am missing a trick) or I would move to exclusively using AllocHGlobal.

    • A nice-to-have alternative API would be some sort of hint or flag you could pass to GCHandle to tell the GC that the object will be long-lived. At that point the GC could decide to relocate the object to somewhere more appropriate, such as the same area AllocHGlobal uses.

  4. I'd like a special allocation method that can allocate objects (even if it's just byte arrays) from an unmoving heap.

    -----

    One interesting variant that's unrelated to unmanaged code is storing cryptographic keys in memory.
    Often you want to be able to destroy those keys once you don't need them anymore. But the GC moving them around might keep a copy somewhere else. This could be avoided by fixing the key array in memory, but that causes the GC issues you mentioned.

    (Of course there are some more issues, such as crash dumps or the swap/hibernation file. I haven't found a good solution yet.)

      • SecureString solves the GC issue by using unmanaged memory, but it still has a few issues:
        1) It's designed for strings i.e. character sequences not byte sequences. You can abuse it to store bytes, but that's certainly not idiomatic.
        2) You need unsafe code to access its contents, so you can't use it in sandboxed scenarios.
        3) It's extremely underdocumented, so it's hard to evaluate its security properties.
        4) The encryption it uses is obviously not semantically secure. How weak it is depends on the chaining mode. Assuming AES-CBC, it's probably not a big problem for keys, but for low-entropy secret data it can be a problem.

        There is also the ProtectedMemory class, but it has its share of issues too.

  5. Does this relate to cases where P/Invoke is not used? I mean, since the P/Invoke marshaller pins automatically, even things like GetFunctionPointerForDelegate calls are taken care of; of course the unmanaged code has to call the proper cleanup routines on its side.
    Then the Marshal class has methods like Marshal.Copy with which you can interoperate with unmanaged code and vice versa.
    So I guess my question is: when do you use the fixed keyword, and when do you use P/Invoke and Marshal?

  6. For what it's worth, a better analogy wouldn't include the word "fatally". Instead use something like:

    Asking "how do I not die from shooting myself through the head?" is a non-starter; don't shoot yourself through the head in the first place if you'd prefer to not die!

    The quibble is that "fatally" implies determinism, while undefined behaviour doesn't. You can survive a bullet through the head (possibly still functional, possibly as a vegetable) or maybe you won't. However, once you commit to the undefined behaviour, the results are out of your hands - it's up to the team of the bullet and your brain at that point.
