Construction destruction

Take a look at this little program skeleton:

class Foo 
{ 
    private int x;
    private int y;
    public Foo(int x, int y) 
    {
        this.x = x;
        this.y = y;
        SideEffects.Alpha(); // Notice: does not use "this"
    }
    ~Foo() 
    { 
        SideEffects.Charlie(); 
    }
}
static class SideEffects
{
    public static void Alpha() { ... }
    public static void Bravo() { ... }
    public static void Charlie() { ... }
    public static void M()
    {
        Foo foo = new Foo(1, 2); 
        Bravo();
    }
}

Let’s suppose we have three side effects: Alpha, Bravo and Charlie. What precisely they are is not important.

The question is: what do we know about the order in which Alpha, Bravo and Charlie execute when a call to M() occurs?

First off, clearly Alpha must happen before Bravo. The C# compiler and the jit compiler are not permitted to make any optimization that would cause the side effects of a single-threaded program to appear out of order. The construction of foo must be complete before control is passed to Bravo, which means that Alpha has already run.

You might think that Charlie has to happen after Bravo by the following reasoning: Charlie does not run until the object referred to by foo is garbage collected. The local foo is alive until the end of M and therefore it cannot be collected until after Bravo.

This reasoning is incorrect. The C# compiler and the jitter are both permitted to notice that foo is never read from after its initial write. The object it refers to can therefore be garbage collected before the end of M, and in particular before Bravo.

But didn’t I just say that the compilers were not permitted to make this optimization? No, I did not. I said that they were not permitted to make this optimization in a single-threaded program, but the garbage collector and the finalizer queue run on their own threads! The C# language makes very few guarantees about how side effects are ordered in a multi-threaded program; only very special side effects like volatile writes, thread creation, and so on, are guaranteed to be observed in a particular order.

So it is legal for Charlie to happen before (or during!) Bravo. But surely (there it is again) Charlie happens after Alpha?

Nope! This isn’t guaranteed either. The jitter can notice that foo is never read from and is therefore useless; it can throw it away entirely. The jitter can then notice that neither the reference to this nor fields x and y are ever used after this.y = y;, and mark the object as a candidate for finalization before Alpha. It is possible for an object to be destructed while its constructor is running on another thread!

This is of course extremely unlikely, but it is legal, and therefore you have to assume the worst. This is yet another reason why it is so difficult to write a correct destructor in .NET; you can’t assume that the constructor finished before the destructor runs, so any invariants that you’re setting up in your constructor are not necessarily valid in the destructor.
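One defensive way to cope with this can be sketched as follows. This is only a sketch under stated assumptions: the `initialized` flag, the `Sum` property and the counters in `SideEffects` are hypothetical names introduced for illustration, standing in for the elided side effects. The constructor sets the flag as its very last action, and the destructor does nothing unless it sees the flag.

```csharp
using System;
using System.Threading;

static class SideEffects
{
    // Hypothetical stand-ins for the elided side effects: they only count calls.
    public static int AlphaCount;
    public static int CharlieCount;
    public static void Alpha() { Interlocked.Increment(ref AlphaCount); }
    public static void Charlie() { Interlocked.Increment(ref CharlieCount); }
}

class Foo
{
    private readonly int x;
    private readonly int y;

    // Hypothetical flag. It stays false if the constructor throws or has not
    // yet reached its last statement. Writing it uses "this", so the object
    // is still in use at that point.
    private volatile bool initialized;

    public Foo(int x, int y)
    {
        this.x = x;
        this.y = y;
        SideEffects.Alpha();
        initialized = true; // last action of the constructor
    }

    public int Sum { get { return x + y; } }

    ~Foo()
    {
        // The destructor may see a partially constructed object,
        // so it checks the flag before relying on any invariant.
        if (!initialized) return;
        SideEffects.Charlie();
    }
}
```

The design choice here is to make the destructor tolerate a half-built object rather than to try to prevent that situation from arising.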

So what do you do if you’re in this crazy situation and you require Charlie to run after Bravo?

    public static void M()
    {
        Foo foo = new Foo(1, 2); 
        Bravo();
        GC.KeepAlive(foo);
    }

The GC.KeepAlive method is a very special method that tells the jitter and the garbage collector that the lifetime of foo must be extended to at least this point. Make sure you understand that: when you call KeepAlive, the object is alive before the call and possibly dead after the call.
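If you want to observe the difference, a weak reference can serve as a probe. The sketch below is an experiment, not a specification: `LifetimeProbe` and its two methods are hypothetical names, and this `Foo` is a stripped-down stand-in. What the first method returns depends on the runtime and on whether optimizations are enabled; only the second result is guaranteed.

```csharp
using System;

class Foo
{
    ~Foo() { Console.WriteLine("Charlie"); }
}

static class LifetimeProbe
{
    // No KeepAlive: foo is never read after the WeakReference is created,
    // so the runtime is permitted to treat the object as dead during the
    // collection below. This may return true or false.
    public static bool WithoutKeepAlive()
    {
        Foo foo = new Foo();
        WeakReference probe = new WeakReference(foo);
        GC.Collect();
        GC.WaitForPendingFinalizers();
        return probe.IsAlive;
    }

    // With KeepAlive: the object must stay alive at least until the
    // KeepAlive call, so the probe still sees it after the collection.
    public static bool WithKeepAlive()
    {
        Foo foo = new Foo();
        WeakReference probe = new WeakReference(foo);
        GC.Collect();
        GC.WaitForPendingFinalizers();
        bool alive = probe.IsAlive;
        GC.KeepAlive(foo);
        return alive;
    }
}
```

Comparing the two results in a debug build and in a release build is a reasonable way to see the optimization for yourself.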

Of course, the far better thing to do would be to make Foo expose its cleanup (possibly via IDisposable, if the side effects in question are cleaning up an unmanaged resource) and simply say:

    public static void M()
    {
        Foo foo = new Foo(1, 2);
        Bravo();
        foo.Close();
    }

and then have Charlie run in the Close. Now we’re on one thread again and the side effects must be correctly ordered. If the order of the calls is important for the correct functioning of the program then let’s write a program that obviously calls them in the right order.
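Here is a minimal sketch of what that could look like if Foo goes the IDisposable route. The logging bodies of Alpha, Bravo and Charlie, the `Log` list and the `disposed` flag are assumptions made for illustration; Close simply forwards to Dispose so that both spellings work.

```csharp
using System;
using System.Collections.Generic;

static class SideEffects
{
    // Hypothetical bodies: each side effect just records its name.
    public static readonly List<string> Log = new List<string>();
    public static void Alpha() { Log.Add("Alpha"); }
    public static void Bravo() { Log.Add("Bravo"); }
    public static void Charlie() { Log.Add("Charlie"); }

    public static void M()
    {
        using (Foo foo = new Foo(1, 2))
        {
            Bravo();
        } // foo.Dispose() runs here, on this thread, after Bravo
    }
}

sealed class Foo : IDisposable
{
    private readonly int x;
    private readonly int y;
    private bool disposed;

    public Foo(int x, int y)
    {
        this.x = x;
        this.y = y;
        SideEffects.Alpha();
    }

    public void Close() { Dispose(); }

    public void Dispose()
    {
        if (disposed) return; // make repeated cleanup harmless
        disposed = true;
        SideEffects.Charlie();
    }

    public override string ToString() { return "Foo(" + x + ", " + y + ")"; }
}
```

Because everything happens on the calling thread, a call to `SideEffects.M()` records Alpha, then Bravo, then Charlie, in that order.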


48 thoughts on “Construction destruction”

  1. Surely if Foo’s constructor is rewritten so that SideEffects.Alpha() is called first, then that will guarantee Alpha happening before Charlie? I say “surely” as it wouldn’t surprise me if my assumption was incorrect.

    • I don’t see any reason why it would make a difference. How can SideEffects.Alpha() be influenced by the created instance? We don’t pass this to it, so the only way would be globalVar (well not necessarily global, but you get the idea) = this in the constructor before the sideeffects call, in which case we can be certain that the destructor isn’t called on the object anyhow.

      But otherwise? Seems perfectly legitimate to me, if extremely unlikely.

      • Alpha would have to be called before any side effects associated with the assignments. Here they’re just ints assigned to ints, so the jitter may be able to deduce there are none, but it’s possible that such a line would involve an overloaded implicit conversion, and that method might have side effects.

  2. That case where Charlie happens before Alpha .. does that actually have a chance of ever happening or is it in the same category of “legal” as “dereferencing a null-pointer in C++ can format your harddisk”?

    • Unless Foo is sealed, it’s possible for a derived class to have a class constructor which throws an exception without executing `base()`. In that scenario, Charlie will run even though neither Alpha nor Bravo ever gets to do so. If the derived class is written in a language other than C#, such a derived class could also override `Finalize` in such a way that it persists a reference to the never-properly-constructed object but exits without chaining to the base implementation. One could guard against such a thing somewhat by having a type `FinalizerBlockerBase` which simply declared a `new protected virtual Finalize()` method, and have `Foo` inherit from that and declare a `protected sealed Finalize()` method [in CIL, I would expect it should be possible to declare a new virtual sealed method, but in C# it requires an extra level of inheritance]. Note that if `Finalize` is the name of a protected sealed virtual method, no derived class can define a constructor.

  3. @Joel: in that case, it starts getting really messy.
    The JIT compiler (or CPU) may re-order instructions and move the assignments after the call (assume the JIT inlines Alpha and detects that Alpha does not use the x/y fields).
    Then we’re back in the original situation, so Alpha and Charlie may run concurrently.
    [This is assuming the weak memory model in the ECMA specification. The current .NET implementation provides stronger guarantees, but I think the problem may still occur depending on the implementation of Alpha.]

    If you want a guarantee that ‘this’ is kept around, use GC.KeepAlive, anything else the optimizer might mess with.
    A good rule is that any object with destructors should call GC.KeepAlive(this) at the end of every method (including the constructor).
    A better rule is to not write destructors, ever. (use SafeHandle instead)

    • “A better rule is to not write destructors, ever. (use SafeHandle instead).”

      It took me a minute to reconcile that statement with my (admittedly limited) knowledge of IDisposable. I think what you are implying is that even for types that implement IDisposable, it is not necessary to write a destructor because any managed resources can either be disposed by explicitly calling Dispose() or by the GC. Any unmanaged resources (wrapped in a SafeHandle) would be disposed during the same explicit Dispose() call or when the SafeHandle instance is GC’ed. Since the SafeHandle will be GC’ed there is no reason to write a destructor to ensure the unmanaged resources will be disposed in your class. Basically, SafeHandle writes the destructor for you.

      Am I too far off the mark there?

      • Some kinds of resources cannot be handled adequately with SafeHandle, but even when using such resources one should define an object for each separately-created resource whose sole purpose is to wrap that resource and ensure its cleanup. Indeed, I would posit that just about every “legitimate” use of Object.Finalize() could have been accomplished just as well if Finalize only existed for objects derived from some class FinalizableObject. As it is, I would posit that only classes which derive directly from System.Object should ever have destructors or override Finalize.
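To make the SafeHandle suggestion in this thread concrete, here is a minimal sketch. The type names `UnmanagedBufferHandle` and `NativeBuffer` are hypothetical, and a block of memory from `Marshal.AllocHGlobal` stands in for whatever unmanaged resource you actually have. The point is that the class owning the resource needs no destructor of its own; the handle type carries the finalization logic.

```csharp
using System;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

// Hypothetical handle type: its sole purpose is to wrap one unmanaged
// allocation and guarantee that it is released.
sealed class UnmanagedBufferHandle : SafeHandleZeroOrMinusOneIsInvalid
{
    public UnmanagedBufferHandle(int byteCount) : base(true)
    {
        SetHandle(Marshal.AllocHGlobal(byteCount));
    }

    // Called by the runtime at most once, either from Dispose or from
    // the SafeHandle's own finalizer.
    protected override bool ReleaseHandle()
    {
        Marshal.FreeHGlobal(handle);
        return true;
    }
}

// The owning class has no destructor; it only forwards Dispose.
sealed class NativeBuffer : IDisposable
{
    private readonly UnmanagedBufferHandle buffer;

    public NativeBuffer(int byteCount)
    {
        buffer = new UnmanagedBufferHandle(byteCount);
    }

    public bool IsOpen
    {
        get { return !buffer.IsClosed && !buffer.IsInvalid; }
    }

    public void Dispose() { buffer.Dispose(); }
}
```

If a caller forgets to dispose a `NativeBuffer`, the handle is still released when the `UnmanagedBufferHandle` itself is finalized.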

  4. “so any invariants that you’re setting up in your constructor are not necessarily valid in the destructor.”

    It seems to me that this scenario is a special-case of the more general problem of abuse of side-effects.

    Invariants related directly to the object instance (i.e. affecting values in the object itself) would not lead to this problem. They would require the object instance to set up, and so would keep the object alive long enough to ensure against the destructor executing before they are ready.

    At the same time, it would be a mistake to write a destructor that depended on anything other than object-specific invariants (or variants, for that matter!).

    “So what do you do if you’re in this crazy situation and you require Charlie to run after Bravo?”

    IMHO, you fix your code so that either it doesn’t matter when Charlie() is executed, or that it’s not executed in the destructor.

    Yes, one can e.g. follow the IDisposable pattern and force the sequence of execution one wants. But that hasn’t fixed the underlying maintainability issue that’s in the code. It’s just swept it under the rug, leaving it there to be stumbled over by the next person who comes along.

    Ironically, this question of side-effects could lead naturally back to a discussion of F# (or other functional languages) where side-effects are much more naturally discouraged (or even prohibited). Like “goto” statements, side-effects aren’t inherently evil, and unlike “goto” statements may actually be required in some cases. But both lead the programmer right up to the cliff overlooking the Valley of Bugs.

    The less often one stands on the edge like that, the less likely one will be to slip and fall in!

    • It’s not the standing near the edge that bothers me. It’s the walking *closer* to the edge every day believing that doing so is safe because you didn’t fall over the cliff yesterday. That’s an algorithm for finding the edge of a cliff very precisely, but you still go over the cliff.

      We lost *two* space shuttles using that algorithm; now we know *precisely* how cold is too cold to launch, and *precisely* how much damage a wing can sustain from foam before total loss of vehicle and crew is the result.

    • That’s not very clever, teaching them a language that’s horribly obsolete and unlike most other languages (so the skills are not as transferable).

      C++ has stuff like this at every turn but that doesn’t mean you shouldn’t be taught it.

        • Agreed, Peter. I enjoy reading Eric’s musings on the various nuances and caveats of C#, but I find myself thinking that there are far too many of them a lot of the time…

        • Not really. The ability of a Standard-compliant C++ optimizer to track & limit the lifetimes of under-used objects is pretty much the same as described here for C#. (However, it’s much more likely that such optimizations are actually implemented in the C++ compiler; the .NET jitter is just beginning to catch up.)

          • “Not really. The ability of a Standard-compliant C++ optimizer to track & limit the lifetimes of under-used objects is pretty much the same as described here for C#.”

            Could you provide a concrete example of this sentence?

          • @Fernando,
            Oddly, the blog will let me reply to my message, but not yours.

            I’ve taken Eric’s code and translated it into C++, shown here on Github:
            https://gist.github.com/jamescurran/5858463

            If we compile that using the VS2012 C++ compiler, in release mode, and look at the assembler output for the function:

            int _tmain(int argc, _TCHAR* argv[])
            {
                SideEffects::M();

                return 0;
            }

            you’ll see that the code generated is actually that of:

            int _tmain(int argc, _TCHAR* argv[])
            {
                printf("Alpha\n");
                printf("Bravo\n");
                printf("Charlie\n");
                return 0;
            }

            Output viewable here: https://gist.github.com/jamescurran/5858529

          • @James

            Your C++ example doesn’t show any behaviour that backs up your statement that a conforming C++ compiler can limit and change the lifetime of an object like C#. It shows the constructor being called, Bravo() being called, and the destructor being called.

            Then it demonstrates the compiler’s ability to optimise the resulting code without changing its meaning.

        • Yeah, because all C++ programmers completely understand the nuances and guarantees that memory barriers in C++ have. If you want to start a nice argument, quickly write something that initializes a singleton object in C++ and then try to get a bunch of C++ programmers to agree about what you did wrong.

          If you generally follow the rule that “Finalizers are best left unwritten” then none of this really matters. Yes, sometimes a Finalizer is required. If it is, then it’s *very* important that whoever sits down to write that code understands that he/she is writing code for a very constrained environment and that there are lots of edge cases skirting lots of cliffs.

    • This is exactly why I teach my students to avoid using C# and instead teach them VB6.

      1. First of all, the example code you are considering is not practical at all. It might have only theoretical value… if any.
      2. @ “This is exactly why I teach my students to avoid using C# and instead teach them to VB6”
      It may cripple their brains. Better to teach them algorithms and data structures, and give them the main ideas of new technologies on top. Use a simple language for it. By the time current students need a language, any language they learned in school will be old. C# is excellent, the most advanced in its area of application, but it evolves too quickly and brings too many new abstract and volatile technologies on board. VB has always had a reputation as an unserious language for unserious programming.

  5. Charlie can happen before Bravo, and even before Alpha, in a case like this:

    class Foo
    {
        public Foo(int x, int y)
        {
            GC.Collect(); // can happen at any time
            Thread.Sleep(1000); // Some long operation
            SideEffects.Alpha(); // Notice: does not use "this"
        }
        ~Foo()
        {
            SideEffects.Charlie();
        }
    }

    In the release configuration, Charlie is called before Alpha.

    • Hi Ruslan,

      These optimizations are done by the JIT in release builds only. This will never be the case in debug builds.

      That is why we must test applications in both Debug and Release as the behavior might be unexpected in these cases.

  6. How can the destructor run if the constructor is still executing? If the constructor is active, the object is still referenced, and should not be destructed. Is C# or the jitter so broken as to allow destructing an object that’s still in use?

    • Even though the constructor is still running, the object being constructed is never used again. Since the object is not in use, it is therefore a candidate for garbage collection.

      The best analogy I can think is having a gun explode when you fire it. You pull the trigger (invoke the constructor), and the bullet begins to move (constructor execution begins). Once the bullet leaves the barrel (no more references to the object being constructed), the gun is destroyed (the object is GCed). This can happen before or after the bullet reaches its target (before the constructor completes its execution), but either way, the bullet’s travel is unaffected (the constructor completes).

    • I spent the entire article explaining this; apparently I was insufficiently clear.

      Your contention that “if the ctor is executing then the object is still referenced” is simply wrong. There’s no requirement whatsoever that the “this” reference be a root of the garbage collector once the last read or write to “this” has happened.

      • “There’s no requirement whatsoever that the “this” reference be a root of the garbage collector once the last read or write to “this” has happened”

        OK so you have said that perfectly clearly, but it still seems crazy. Why isn’t “this” a root?

        I guess if it matters in your program you could put GC.KeepAlive(this) at the end of your ctor, but it seems like you should be able to assume that the object lifetime is a superset of the time spent in its methods.

        • Let me turn the question around on you. Why should any local variable continue to be a root if it can be statically determined to be unused after a certain point? Is “this” more special than any other local variable?

  7. This is all perfectly true and possible. But it does require the otherwise unstated assumption that it *matters* that the methods of SideEffects run in a predictable order.

    Nice of a static analyzer to point this out. But with great odds that you’ll end up saying “Thanks, I know/it doesn’t matter” and click the warning away. A great analyzer would detect that order matters ;)

  8. Or just use C++ and be in control of these things.

    I know actual management of objects and when to delete them is a pain… But it forces clearer thinking.

    • I’ve heard this argument before, and frankly, it’s a weak argument. It forces “clearer thinking” about memory management, but that takes away time you could be spending thinking about other aspects of the program’s behavior. I hear it less and less now that .NET has been widely adopted, but there is a common impression among pre-.NET/Java programmers that the GC is a license to be lazy. It’s true that unskilled programmers can cope more easily in a garbage collected language than in a non-garbage collected one, but it’s also true that it can make a skilled programmer much more productive. Not having to worry about explicit memory management so much, I can focus my brain power on more important aspects of the program. I can implement design approaches that are more object-oriented and modular than would be practical to implement in C++ because of memory management constraints.

      The aspect of the GC described in this article is quirky, true, but 99.9% of the time, it doesn’t matter as it’s extremely isolated. I write C# code on a daily basis, and I have not written a destructor in the last 6 years at least. The only people who should need to write destructors are those that are writing interfaces to native APIs, and those programmers can handle a bit of quirkiness.

      • Explicit object creation and destruction is relatively easy and robust for things that have a single well-defined owner. It can be very difficult and bug-prone for things that do not have a single well-defined owner. Garbage collection does a good job of handling things that don’t have well-defined ownership, and is also an efficient means of memory reclamation even for things where ownership would be well-defined.

        Garbage collection has two problems as a primary creation and destruction paradigm. Its inability to ensure the timely release of resources other than heap memory is widely recognized. Less widely recognized, but IMHO no less important, is the fact that GC languages often fail to make a distinction between reference-type storage locations which “own” the state of the object referred to thereby, and those which identify objects owned by someone else. Such distinctions may not matter to the compiler, but they certainly matter to anyone wanting to write correct code.

      • “Not having to worry about explicit memory management so much, I can focus my brain power on more important aspects of the program. I can implement design approaches that are more object-oriented and modular than would be practical to implement in C++ because of memory management constraints.”

        I think this sentence is unfounded. Could you provide a concrete example?

        • Having assisted in the writing of C# compilers written in both C++ and C#, I can assure you that there are many situations in which “manual” memory management is a complete pain in the neck. In the “native” C# compiler for instance we had to be extremely careful to ensure that any algorithm which relied upon caching for its performance did not hold on to objects past their lifetimes. In the native compiler, a great many methods ended up taking a pointer to the heap upon which its temporary objects were to be allocated; for architectural reasons this often had to be “tramp data” rather than a field of the class. Essentially the “mechanism” code — the mechanical process of allocating and freeing memory correctly — ended up being expressed in the architecture just as much as the “meaning” code that described the algorithms. This is not good. It was like a breath of fresh air in the Roslyn compiler to realize that I could build a cache anywhere I wanted.

          Now this is not to say that garbage-collected languages are a panacea. One of the nice things about manually-managed storage is that garbage collection in the native compiler was extremely cheap. We knew, for example, that almost every object allocated during the analysis of a method body could be thrown away when the analysis was done. Again, this necessitated passing around several heaps as tramp data, and it made for a huge number of bugs when a method analyzer would allocate a long-lived object on the short-lived heap. It was awful code to write, but it was fast. In the managed compiler we ended up pursuing strategies to minimize collection pressure because collecting large graphs of long-lived objects is expensive. We also discovered that our allocation patterns while parsing tended to produce alternating long-lived and short-lived objects, which meant that during collection, you’d end up with the maximum possible number of holes to fill in, which was again not cheap. Fortunately it was not hard to implement a generic pool and use it to recycle managed objects upon collection.
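To give a flavor of what a generic pool for recycling managed objects can look like, here is a minimal sketch. It is not the pool used in the managed compiler, whose details are not given here; `ObjectPool<T>`, `Rent`, `Return` and `maxRetained` are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Concurrent;

// A small thread-safe pool: callers rent an instance instead of allocating,
// and hand it back when done so that it can be reused.
sealed class ObjectPool<T> where T : class
{
    private readonly ConcurrentBag<T> items = new ConcurrentBag<T>();
    private readonly Func<T> factory;
    private readonly Action<T> reset;
    private readonly int maxRetained;

    public ObjectPool(Func<T> factory, Action<T> reset, int maxRetained = 32)
    {
        if (factory == null) throw new ArgumentNullException("factory");
        if (reset == null) throw new ArgumentNullException("reset");
        this.factory = factory;
        this.reset = reset;
        this.maxRetained = maxRetained;
    }

    // Reuse a pooled instance if one is available, otherwise create one.
    public T Rent()
    {
        T item;
        return items.TryTake(out item) ? item : factory();
    }

    // Clear the instance and keep it for reuse. The size check is only
    // approximate under concurrency; surplus instances are simply dropped
    // and left to the garbage collector.
    public void Return(T item)
    {
        reset(item);
        if (items.Count < maxRetained) items.Add(item);
    }
}
```

A caller might pool `StringBuilder` instances with `new ObjectPool<System.Text.StringBuilder>(() => new System.Text.StringBuilder(), sb => sb.Clear())`, renting one per operation and returning it afterwards. The trade-off is the one described above: fewer allocations and less collection pressure, at the cost of having to remember to return objects.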

  9. Could it be that the subtle complexities and “gotchas” that are involved in these issues of construction, destruction, garbage collection, and so forth, are indications of a fundamental defect in the .NET architecture? I always find it frustrating when I am forced to pay what seems to be excessive attention to the idiosyncrasies of a language or software platform at the expense of algorithm and data structure development – the stuff that makes computer programming fascinating in the first place.

    • I understand the kind of frustration you talk about. I experienced it a lot back when I was using C++.
      However, in C# you seldom write destructors, and when you do it is for wrapping native objects. If you find yourself calling a constructor and destructor only for their side-effects without using the new instance at all, then you are misusing them and need to learn about the IDisposable pattern which was designed for scoped resource disposal.

  10. Hmmm… This is a very isolated scenario, in that foo is never referenced after it’s created. In reality, though, one would expect it to do something, at least, or have a public property or method of some type which is used after the object is created. If this is true (let’s say that x is exposed) you can persist the object simply by saying something like foo.x++; at the end of M()… although, to be honest, I don’t think I’ve ever encountered a real-world situation in which an object’s entire existence and point is encapsulated completely in its constructor. Interesting, though.

    • I think you can also prevent this happening by switching off certain optimizations in the compiler. For example, I think this will work as shown in the default “release” scenario of Visual Studio, but not in the default “debug” one, which has less optimization. Of course, it could be argued that for code you want to release, this is hardly optimal.

  11. Is there any real-world use for finalizers (destructors) in C# if you’re not interacting with unmanaged code?

  12. Did you account for the construction and destruction involved in the assignment statement “Foo foo = new Foo(1, 2);”?

  13. How is this safe?

    I would have thought that “this” was special in the sense that I may be performing non-thread-safe operations in my c’tor or d’tor. If the c’tor and d’tor are running in separate threads, what’s the guarantee that an exception won’t occur?

    Am I expected to put mutexes in my c’tors and d’tors?

    • It’s not safe. Your assumption that “this” is special is simply wrong. The constructor and destructor can certainly be running on different threads at the same time; it’s rare, but it can happen. That is yet another reason why it is so hard to write a correct destructor.

  14. Can one do GC.KeepAlive(this); at the end of a constructor to ensure that it’s always completed? I’m not suggesting this as a valid solution/good pattern, just wondering if it’s viable/allowed.

    • That would guarantee that “this” is not collected before the end of the constructor when the constructor terminates normally. Of course it would not guarantee that you make it to the end of the constructor if, say, someone throws a thread abort exception onto your thread before the end of the constructor. In that case the finalizer will still see a partially constructed object.
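The idea discussed in this thread can be sketched as follows. The counting bodies of Alpha and Charlie are hypothetical stand-ins for the elided side effects. As noted above, this only helps when the constructor terminates normally; if it exits early through an exception, the destructor can still see a partially constructed object.

```csharp
using System;
using System.Threading;

static class SideEffects
{
    // Hypothetical stand-ins for the elided side effects: they only count calls.
    public static int AlphaCount;
    public static int CharlieCount;
    public static void Alpha() { Interlocked.Increment(ref AlphaCount); }
    public static void Charlie() { Interlocked.Increment(ref CharlieCount); }
}

class Foo
{
    private int x;
    private int y;

    public Foo(int x, int y)
    {
        this.x = x;
        this.y = y;
        SideEffects.Alpha(); // does not use "this"

        // Extends the lifetime of "this" to at least this point, so the
        // destructor cannot run before Alpha has returned, provided that
        // control actually reaches this line.
        GC.KeepAlive(this);
    }

    public int X { get { return x; } }
    public int Y { get { return y; } }

    ~Foo()
    {
        SideEffects.Charlie();
    }
}
```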
