I often see StackOverflow answers that confuse the sizeof operator with the Marshal.SizeOf method. These two do different things and can return different results, so it is important to know which is which.
In a nutshell, the difference is: the sizeof operator takes a type name and tells you how many bytes of managed memory need to be allocated for an instance of that struct.[1. I don’t have to tell long-time readers of this blog that of course this is not necessarily stack memory; structs are allocated off the heap when they are array elements, fields of a class, and so on.] By contrast, Marshal.SizeOf takes either a type object or an instance of the type, and tells you how many bytes of unmanaged memory need to be allocated. These can be different for a variety of reasons. The name of the type gives you a clue: Marshal.SizeOf is intended to be used when marshaling a structure to unmanaged memory.
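A quick illustration of the gap, using two well-known cases (the values shown reflect the desktop CLR's default marshaling behavior; they can vary with explicit marshaling attributes):

```csharp
using System;
using System.Runtime.InteropServices;

class SizeDemo
{
    static void Main()
    {
        // Managed size: a char is a UTF-16 code unit, two bytes.
        Console.WriteLine(sizeof(char));                    // 2
        // Unmanaged size: char marshals to a one-byte ANSI char by default.
        Console.WriteLine(Marshal.SizeOf(typeof(char)));    // 1

        // A managed bool is one byte; it marshals to a four-byte Win32 BOOL.
        Console.WriteLine(sizeof(bool));                    // 1
        Console.WriteLine(Marshal.SizeOf(typeof(bool)));    // 4
    }
}
```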
Another difference between the two is that the sizeof operator can only take the name of an unmanaged type; that is, a struct type whose fields are only integral types, Booleans, pointers and so on. (See the specification for an exact definition.) Marshal.SizeOf, by contrast, can take any class or struct type.
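To make the restriction concrete, here is a small sketch (the struct names are made up for illustration; compile with /unsafe since sizeof on a user-defined struct requires an unsafe context):

```csharp
using System;
using System.Runtime.InteropServices;

struct Point { public int X, Y; }                    // unmanaged: only integral fields
struct Named { public int Id; public string Name; }  // contains a managed reference

class Restriction
{
    unsafe static void Main()
    {
        Console.WriteLine(sizeof(Point));                  // fine: Point is unmanaged
        // Console.WriteLine(sizeof(Named));               // does not compile: Named
                                                           // contains a managed type
        Console.WriteLine(Marshal.SizeOf(typeof(Named)));  // fine: Name marshals as a
                                                           // pointer to an unmanaged string
    }
}
```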
Cool. I guess it signals me as a newer reader, but I wasn’t aware that structs could be allocated on the heap in such ways. I guess it makes sense, but I never really thought about it much.
I think that there should be some law which bans people who have said or written “value types/primitives are allocated on the stack” in books, blog posts, etc. from ever teaching programming in any form.
Yet I believe the official .NET 2.0 Foundation MSDN book teaches just that, and obviously would seem to be Microsoft endorsed. Doh!
That is (one reason(*)) why I don’t use “stack” and “heap” when discussing .NET memory management; I use “over here” and “over there”. When talking about a method’s local variables, “over here” and “over there” correspond to the stack and heap. When talking about the members of an object, they correspond to different spots in the heap.
(*) Another reason is that “The Stack” may not be a stack, and “The Heap” probably isn’t a heap.
Do you happen to know why the sizeof operator is more restrictive than the underlying sizeof IL instruction? The IL instruction seems capable of dealing with any type that you care to throw at it, unlike the C# operator.
(See http://stackoverflow.com/a/16522565/55847 for an example.)
I’ve wondered about that, too: http://www.informit.com/guides/content.aspx?g=dotnet&seqNum=728
Let me turn the question around on you. Suppose I got out my magic wand and changed the compiler so that you could do sizeof on any struct, including structs that contained managed references. What are you planning on doing with your new super power? Why is the feature useful to you? I can’t think of any reason why it would be useful, which is probably why it doesn’t exist.
Although List(Of T) stores its entire contents in a single array, other implementations of `IList(Of T)` may want to partition their contents into arrays no larger than 84,999 bytes so as to avoid having them insta-promoted to Gen2. How should code do that without knowing the size of the items?
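A sketch of what such partitioning might look like, assuming the commonly cited 85,000-byte large-object-heap threshold (an implementation detail of the desktop CLR, not a contract). The helper is hypothetical, and its use of Marshal.SizeOf is only an estimate, since that reports the unmanaged rather than managed element size:

```csharp
using System;
using System.Runtime.InteropServices;

static class Chunking
{
    const int LohThreshold = 85000;  // desktop CLR implementation detail
    const int ArrayOverhead = 32;    // generous allowance for the array header

    // Hypothetical helper: a chunk length intended to keep each backing array
    // off the LOH. Because Marshal.SizeOf gives the unmanaged size, treat the
    // result as an approximation, not a guarantee.
    public static int MaxChunkLength<T>() where T : struct
    {
        int elementSize = Marshal.SizeOf(typeof(T));
        return (LohThreshold - ArrayOverhead) / elementSize;
    }
}
```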
I have definitely been in situations where getting the in-memory size of an arbitrary managed structure would have made it easier to decide when to do things like purge caches of data from memory. Weak references are somewhat helpful here, but if you want precise control over how much memory a cache requires you need a true managed sizeof(). Due to the number of factors involved, particularly as a program evolves over time (or if you’re trying to implement caching as a generic service) it’s not sufficient to guesstimate this value on a per-type basis.
Another example that comes to mind here is being able to build light-weight memory profiling into an application. It’s true that there are sophisticated memory profilers out there that can give you lineage, and per-generation utilization, but they can be tricky and time-consuming to use. Sometimes all you want to do is traverse an object graph, add up the memory used by some subset of the objects in this graph and track it over several runs (or perhaps in all Debug runs, for instance). That’s hard to do without sizeof().
@Eric: So… does that mean the answer to the question is “because we didn’t think anyone would need it”?
And that’d be the answer to sooo many design decisions in a variety of products.
That’s also the real reason behind so many “workaround” usages. Things get used for something not originally intended all the time (hint: do a search for “unintended use” and see how many results that produces 🙂 )
@Eric: I asked mainly out of curiosity. I don’t have any desperate day-to-day need for this super power.
Having said that, the distinction does seem slightly arbitrary. For example, sizeof(int) is allowed but sizeof(int?) is not — I can’t see that one of those is more useful than the other.
I suspect the answer you would get from Eric here (I’m sure he’ll correct me if he disagrees) would be that while consistency is important, every feature is by default unimplemented and has to pass the bar of utility and sufficient value to incur the high cost of implementation, testing, documentation, etc. However, as I mention in my comment above, I do think there are some specific cases where this capability would actually have value – whether enough other developers agree is a matter that requires broader perspective and more data than just my own opinion.
In my case I was working with very large arrays of structures and needed to know how large I could make the arrays. sizeof works great for simple structures, but it doesn’t work for generics even when the generic types are restricted to value types.
I imagine my case was a little out of the ordinary. Not that many C# programmers are (or were at the time, anyway) working with arrays that contain hundreds of millions of items. Beyond that one project, I’ve rarely needed sizeof in any case, and certainly not for anything but simple types. I imagine not many people really need that SizeOf functionality.
At the time it seemed odd that the compiler implements what appears to be a crippled sizeof, and I wondered why.
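The workaround most people reach for in the generic case is sketched below (the helper class is hypothetical), with the caveat that Marshal.SizeOf reports the unmanaged size, which can differ from the managed one:

```csharp
using System;
using System.Runtime.InteropServices;

static class SizeHelper
{
    public static int SizeOf<T>() where T : struct
    {
        // return sizeof(T);   // does not compile: the struct constraint is not
                               // enough to convince the compiler T is unmanaged
        return Marshal.SizeOf(typeof(T));  // unmanaged size; may differ from the
                                           // managed size for some types
    }
}
```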
I would like to have it for measuring the size of data structures; I have a couple of data structures that offer a MeasureSize() method that figures out how much memory they use. Since there is no access to the necessary IL instruction, MeasureSize() has to take a “sizeOfT” parameter. And since there is no way to measure the on-heap size of a class, I rely on my own knowledge for that (e.g. I know that normal objects have a 2-word header and that arrays have a 3- or 4-word header depending on whether the array type is a struct or class.)
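A minimal sketch of the pattern described, with hypothetical names; the header size below reflects the commenter's observations about CLR object layout, which are undocumented implementation details:

```csharp
using System;

static class Measure
{
    // Since C# offers no sizeof for an arbitrary T, the caller supplies the
    // element size. The assumed 3-word array header is an observation about
    // the CLR, not a documented guarantee.
    public static long ArrayBytes<T>(T[] array, int sizeOfT)
    {
        long header = 3L * IntPtr.Size;
        return header + (long)array.Length * sizeOfT;
    }
}
```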
Measuring the true heap size of a class is actually a tricky proposition. Is it sufficient to just measure the space that the fields of the class occupy? For references, do you traverse to the referenced object and measure its size? If you do, then you probably want to perform a transitive closure that tracks all of the references you visit (to avoid double-counting them) – and that can be very expensive. Would you traverse weak references? Arrays? Finding an implementation at the language level that is intuitive, efficient, and has few undesirable qualities (e.g. impacting garbage collection) could be quite hard. Having said that, I have definitely run across cases myself where being able to at least get the ‘local’ size of an arbitrary type (which is what you seem to be suggesting) could be useful.
I would expect that `sizeof(T)` should report the number of bytes used by a *storage location* of type T. If T is a reference type or interface type, that would be the number of bytes required to store a heap reference (4 or 8). In 32-bit mode, if `T` is a class type, the largest `n` for which a `T[n]` can avoid a LOH allocation will be about 21,240 for any class type `T`, regardless of whether each `T` instance takes 20 bytes or 20 megabytes.
Why is it that (IIRC) sizeof is not allowed outside of an unsafe block? I’ve occasionally wanted to make a top-level decision based on sizeof(IntPtr) while not actually doing anything ‘unsafe’ until much later.
> structs are allocated off the heap when they are array elements, fields of a class, and so on.
Or, most simply of all, when they are boxed.
If you want to know the size of an IntPtr then why aren’t you using the aptly-named IntPtr.Size?
If what you really want to know is if you are on a 64 bit operating system then you shouldn’t be messing around with IntPtr at all; you should be using the Is64BitProcess and/or Is64BitOperatingSystem properties.
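For example, both of these are in the framework already (the Is64Bit* properties live on System.Environment):

```csharp
using System;

class Bitness
{
    static void Main()
    {
        // Pointer size of the current process: 4 in 32-bit, 8 in 64-bit.
        Console.WriteLine(IntPtr.Size);
        // Clearer intent when the real question is "what kind of process is this?"
        Console.WriteLine(Environment.Is64BitProcess);
        Console.WriteLine(Environment.Is64BitOperatingSystem);
    }
}
```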
Sometimes you can structure your code in such a way as to use IntPtr.Size as a parameter to change the way an algorithm works for 32-bit and 64-bit systems. This is particularly useful when you release both versions and they interact with system libraries which have both 32-bit and 64-bit variants. Primarily with packing and unpacking data going to and from those native libraries. Saves on having to maintain two sides of an if-else branch.
Note that when a struct is boxed, it just involves the creation of a new object with a field of the type of the boxed value. The value being boxed is set as that field’s value. Boxing can thus be thought of (conceptually, at least) as a special case of the “field of a class” option listed.
Pingback: The Morning Brew - Chris Alcock » The Morning Brew #1377
Pingback: Fun with __makeref – Xenoprimate's Dev Blog