foreach vs ForEach

A number of people have asked me why there is no Microsoft-provided ForEach sequence operator extension method. The List<T> class has such a method already of course, but there’s no reason why such a method could not be created as an extension method for all sequences. It’s practically a one-liner:

using System;
using System.Collections.Generic;

// Extension methods must live in a static class; the class name is arbitrary.
public static class SequenceExtensions
{
  public static void ForEach<T>(
    this IEnumerable<T> sequence,
    Action<T> action)
  {
    // argument null checking omitted
    foreach (T item in sequence)
      action(item);
  }
}

My usual response to “why is feature X not implemented?” is to point out that, as Raymond Chen said, “Features start out nonexistent and somebody has to make them happen.” All features are unimplemented until someone designs, implements, tests, documents and ships the feature, and no one has yet spent the money to do so in this case.

It’s not much money, though. Yes, I have famously pointed out that even small features can have large costs, but this one really is dead easy: obviously correct, easy to test, and easy to document. Cost is always a factor of course, but the costs for this one really are quite small.

Of course, that cuts the other way too. If it is so cheap and easy, then you can do it yourself if you need it. And really what matters is not the low cost, but rather the net benefit. As we’ll see, I think the benefits are also very small, and therefore the net benefit might in fact be negative. But we can go a bit deeper here. I am philosophically opposed to providing such a method, for two reasons.

The first reason is that doing so violates the functional programming principles that all the other sequence operators are based upon. Clearly the sole purpose of a call to this method is to cause side effects. The purpose of an expression is to compute a value, not to cause a side effect. The purpose of a statement is to cause a side effect. The call site of this thing would look an awful lot like an expression (though, admittedly, since the method is void-returning, the expression could only be used in a “statement expression” context.) It does not sit well with me to make the one and only sequence operator that is only useful for its side effects.

The second reason is that doing so adds zero new representational power to the language. Doing this lets you rewrite this perfectly clear code:

foreach(Foo foo in foos){ statement involving foo; }

into this code:

foos.ForEach((Foo foo)=>{ statement involving foo; });

which uses almost exactly the same characters in slightly different order. And yet the second version is harder to understand, harder to debug, and introduces closure semantics, thereby potentially changing object lifetimes in subtle ways.
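To see the closure point concretely, here is a sketch reusing the hypothetical Foo and foos from above; the array and its size are invented for the example:

byte[] expensive = new byte[100 * 1024 * 1024];
Action<Foo> action = (Foo foo) =>
  Console.WriteLine("{0} {1}", foo, expensive.Length);
foos.ForEach(action);
// Because the lambda captures "expensive", the compiler hoists the array
// reference into a compiler-generated closure class; the array now lives
// as long as "action" does, not just as long as the enclosing block.

With the plain foreach statement there is no capture, so no such hoisting occurs.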

When we provide two subtly different ways to do exactly the same thing, we produce confusion in the industry, we make it harder for people to read each other’s code, and so on. Sometimes the benefit added by having two different textual representations for one operation (like query comprehensions versus their underlying method call form, or + versus String.Concat) is so huge that it’s worth the potential confusion. But the compelling benefit of query expressions is their readability; this new form of foreach is certainly no more readable than the normal form and is arguably worse.

If you don’t agree with these philosophical objections and find practical value in this pattern, by all means, go ahead and write this trivial one-liner yourself.


Notes from 2020

This episode was, oddly enough, one of the more controversial things I’ve posted on this blog; it sparked a lot of comments and a lot of inbound links from people arguing about the points raised here. Some of the points made in the comments to the original were:

  • The ForEach method does exist on lists, as I noted. Is there anything different about lists, as opposed to arbitrary sequences, that makes the method more defensible there? My answer was that I found it distasteful on lists too, but with lists at least we can argue that lists are intended to be mutable data structures, so maybe it makes more sense for a method designed to cause side effects to operate on a mutable data structure. But that’s a pretty weak argument, because of course you’re still not supposed to mutate a list while you are iterating over it!
  • A commenter pointed out that the list implementation of the ForEach method iterates by index in a for loop; ironically it does not use a foreach.
  • A ForEach extension method is allegedly more readable at the end of a long chain of sequence operators than a foreach statement is at the top.
  • A ForEach extension method that introduces different semantics than regular foreach would have a clear benefit — like AsParallel().ForEach() does. Good point, though of course that is a sharp tool.
  • You could make the extension method non-void by causing a side effect on each element and then yielding it. There are uses for such a thing; in particular I have used that method for “dump this sequence to the debugger display”. It’s nice to be able to insert a debugging hook like that into the middle of a long chain of sequence operators that is going wrong; a sketch of such an operator follows this list. But I would still feel icky about putting it into production code.
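Here is a minimal sketch of that last idea, assuming we call the hypothetical operator Do:

public static IEnumerable<T> Do<T>(
  this IEnumerable<T> sequence,
  Action<T> action)
{
  // argument null checking omitted, as before
  foreach (T item in sequence)
  {
    action(item);      // perform the side effect...
    yield return item; // ...then pass the element along unchanged
  }
}

A debugging tap then looks like query.Do(x => Debug.WriteLine(x)).Where(predicate). Note that iterator blocks are lazy, so the side effect happens as the resulting sequence is enumerated, not when Do is called.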

Reserved and contextual keywords

Many programming languages, C# included, treat certain sequences of letters as “special”.

Some sequences are so special that they cannot be used as identifiers. Let’s call those the “reserved keywords”, and the remaining special sequences we’ll call the “contextual keywords”. They are “contextual” because the character sequence might have one meaning in a context where the keyword is expected and another in a context where an identifier is expected.[1. An unfortunate consequence of this definition is that using is said to be a reserved keyword even though its meaning depends on its context; whether the using begins a directive or a statement determines its meaning. And similarly for other keywords that have multiple usages, like fixed.]
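For example, value is a contextual keyword: inside a property setter it means the value being assigned, but anywhere an identifier is expected it is an ordinary name:

class Example
{
  private int value; // legal: here "value" is an ordinary identifier
  public int Property
  {
    get { return this.value; }
    set { this.value = value; } // here "value" is the contextual keyword
  }
}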

The C# specification defines the following reserved keywords:

abstract    as          base        bool        break       byte
case        catch       char        checked     class       const
continue    decimal     default     delegate    do          double
else        enum        event       explicit    extern      false
finally     fixed       float       for         foreach     goto
if          implicit    in          int         interface   internal
is          lock        long        namespace   new         null
object      operator    out         override    params      private
protected   public      readonly    ref         return      sbyte
sealed      short       sizeof      stackalloc  static      string
struct      switch      this        throw       true        try
typeof      uint        ulong       unchecked   unsafe      ushort
using       virtual     void        volatile    while

The Stack Is An Implementation Detail, Part Two

A number of people have asked me, in the wake of my earlier posting about value types being on the stack, why it is that value types go on the stack but reference types do not.

The short answer is “because they can”. And since the stack is cheap, we do put them on the stack when possible.

The long answer is… long.

I’ll need to give a high-level description of the memory management strategies that we call “stack” and “heap”. Heap first.

The CLR’s garbage-collected heap is a marvel of engineering with a huge amount of detail; this sketch is not actually how it works, but it’s a good enough approximation to get the idea.

The idea is that there is a large block of memory reserved for instances of reference types. This block of memory can have “holes” – some of the memory is associated with “live” objects, and some of the memory is free for use by newly created objects. Ideally though we want to have all the allocated memory bunched together and a large section of “free” memory at the top.

If we’re in that situation when new memory is allocated then the “high water mark” is bumped up, eating up some of the previously “free” portion of the block. The newly-reserved memory is then usable for the reference type instance that has just been allocated. That is extremely cheap; just a single pointer move, plus zeroing out the newly reserved memory if necessary.

If we have holes then we can maintain a “free list” – a linked list of holes. We can then search the free list for a hole of appropriate size and fill it in. This is a bit more expensive since there is a list search. We want to avoid this suboptimal situation if possible.
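Here is a toy sketch of those two allocation paths. The names are invented for illustration and, again, this is not how the CLR actually works:

using System;
using System.Collections.Generic;

class ToyHeap
{
  private const int Size = 1024 * 1024;
  private int highWaterMark = 0;
  private readonly LinkedList<(int Start, int Length)> freeList =
    new LinkedList<(int Start, int Length)>();

  public int Allocate(int length)
  {
    // The ideal case: bump the high water mark. Extremely cheap.
    if (highWaterMark + length <= Size)
    {
      int address = highWaterMark;
      highWaterMark += length;
      return address;
    }
    // The suboptimal case: search the free list for a big-enough hole.
    // (A real allocator would split the hole and keep the remainder.)
    for (var node = freeList.First; node != null; node = node.Next)
    {
      if (node.Value.Length >= length)
      {
        int address = node.Value.Start;
        freeList.Remove(node);
        return address;
      }
    }
    throw new OutOfMemoryException(); // time to collect and compact
  }
}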

When a garbage collection is performed there are three phases: mark, sweep and compact. In the “mark” phase, we assume that everything in the heap is “dead”. The CLR knows what objects were “guaranteed alive” when the collection started, so those guys are marked as alive. Everything they refer to is marked as alive, and so on, until the entire transitive closure of live objects is marked. In the “sweep” phase, all the dead objects are turned into holes. In the “compact” phase, the block is reorganized so that it is one contiguous block of live memory, free of holes.
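The mark phase in this sketch is just a transitive closure computation over the object graph; roughly, with invented types:

using System.Collections.Generic;

class HeapObject
{
  public bool Marked; // everything starts out "dead"
  public List<HeapObject> References = new List<HeapObject>();
}

static class Collector
{
  public static void Mark(IEnumerable<HeapObject> roots)
  {
    // The roots are the objects known to be "guaranteed alive".
    var pending = new Stack<HeapObject>(roots);
    while (pending.Count > 0)
    {
      HeapObject current = pending.Pop();
      if (current.Marked)
        continue; // already visited
      current.Marked = true; // alive
      foreach (HeapObject reference in current.References)
        pending.Push(reference); // everything it refers to is alive too
    }
    // Sweep: every object still unmarked becomes a hole.
    // Compact: move the live objects together so the free space is contiguous.
  }
}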

This sketch is complicated by the fact that there are actually three such arenas; the CLR collector is generational. Objects start off in the “short lived” heap. If they survive they eventually move to the “medium lived” heap, and if they survive there long enough, they move to the “long lived” heap. The GC runs very often on the short lived heap and very seldom on the long lived heap; the idea is that we do not want to have the expense of constantly re-checking a long-lived object to see if it is still alive. But we also want short-lived objects to be reclaimed swiftly.

The GC has a huge amount of carefully tuned policy that ensures high performance; it attempts to balance the memory and time costs of having a Swiss-cheesed heap against the high cost of the compaction phase. Extremely large objects are stored in a special heap that has a different compaction policy. And so on. I don’t know all the details, and fortunately, I don’t need to. (And of course, I have left out lots of additional complexity that is not germane to this article – pinning and finalization and weak refs and so on.)

Now compare this to the stack. The stack is like the heap in that it is a big block of memory with a “high water mark”. But what makes it a “stack” is that the memory on the bottom of the stack always lives longer than the memory on the top of the stack; the stack is strictly ordered. The objects that are going to die first are on the top, the objects that are going to die last are on the bottom. And with that guarantee, we know that the stack will never have holes, and therefore will not need compacting. We know that the stack memory will always be “freed” from the top, and therefore do not need a free list. We know that anything low-down on the stack is guaranteed alive, and so we do not need to mark or sweep.

On a stack, the allocation is just a single pointer move – the same as the best (and typical) case on the heap. But because of all those guarantees, the deallocation is also a single pointer move! And that is where the huge performance savings are. A lot of people seem to think that heap allocation is expensive and stack allocation is cheap. They are actually about the same, typically. It’s the deallocation costs – the marking and sweeping and compacting and moving memory from generation to generation – that are massive for heap memory compared to stack memory.
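Compare the toy heap sketch above with a toy stack allocator; both operations are now a single pointer move:

class ToyStack
{
  private int top = 0;

  public int Allocate(int length)
  {
    int address = top;
    top += length; // push: bump the pointer
    return address;
  }

  public void Deallocate(int length)
  {
    // Memory is always freed from the top, in strict reverse order of
    // allocation, so there is no marking, sweeping, compacting or free
    // list; just move the pointer back.
    top -= length;
  }
}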

Clearly it is better to use a stack than a GC heap if you can. When can you? Only when you can guarantee that all the conditions that make a stack work are actually met. Local variables and formal parameters of value type are the sweet spot that achieves that. The locals of frames at the bottom of the stack clearly live longer than the locals of frames at the top of the stack. Locals of value type are copied by value, not by reference, so the local is the only thing that references the memory; there is no need to track who is referencing a particular value to determine its lifetime. And the only way to take a ref to a local is to pass it as a ref or out parameter, which just passes the ref on up the stack. The local is going to be alive anyway, so the fact that there is a ref to it “higher up” the call stack doesn’t change its lifetime.


A few asides:

  • This explains why you cannot make a “ref int” field. If you could then you could store a ref to the value of a short-lived local inside a long-lived object. Were that legal then using the stack as a memory management technique would no longer be a viable optimization; value types would be just another kind of reference type and would have to be garbage collected.
  • Anonymous function closures and iterator block closures are implemented behind the scenes by turning locals and formal parameters into fields. So now you know why it is illegal to capture a ref or out formal parameter in an anonymous function or iterator block: the hoisted parameter would have to become a field of ref type, and that is not a legal field. (A sketch follows these asides.) Of course we do not want to have ugly and bizarre rules in the language like “you can close over any local or value parameter but not a ref or out parameter”. But because we want to be able to get the optimization of putting value types on the stack, we have chosen to put this odd restriction into the language. Design is, as always, the art of finding compromises.

  • Finally, the CLR does allow “ref return types”; you could in theory have a method “ref int M() { … }” that returned a reference to an integer variable. If for some bizarre reason we ever decided to allow that in C#, we’d have to fix up the compiler and verifier so that they ensured that it was only possible to return refs to variables that were known to be on the heap, or known to be “lower down” on the stack than the callee. (NOTE FROM 2021: This feature was added to C# 7.)
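To make the second aside concrete, here is a sketch that does not compile:

static Func<int> MakeGetter(ref int variable)
{
  // error CS1628: a ref or out parameter cannot be used inside an
  // anonymous method or lambda expression. The lambda would have to
  // hoist "variable" into a field of a closure class, and a field of
  // ref type was not legal.
  return () => variable;
}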

So there you go. Local variables of value type go on the stack because they can. They can because (1) “normal” locals have strict ordering of lifetime, and (2) value types are always copied by value and (3) it is illegal to store a reference to a local anywhere that could live longer than the local. By contrast, reference types have a lifetime based on the number of living references, are copied by reference, and those references can be stored anywhere. The additional flexibility that reference types give you comes at the cost of a much more complex and expensive garbage collection strategy.

But again, all of this is an implementation detail. Using the stack for locals of value type is just an optimization that the CLR performs on your behalf. The relevant feature of value types is that they have the semantics of being copied by value, not that sometimes their deallocation can be optimized by the runtime.