One of the primary design goals of C# in the early days was to be familiar to C and C++ programmers, while eliminating many of the “gotchas” of C and C++. It is interesting to see what different choices were possible when trying to reduce the dangers of certain idioms while still retaining both familiarity and performance. I thought I’d talk a bit about one of those today, namely, how integer arithmetic works in C#.
One of the biggest problems with C and C++ is that you don’t even know for sure what the range of an integer is; every compiler can differ, which makes it tricky to write truly portable code. C# solves this problem by simply removing the ambiguity. In C#, an int is a 32-bit two’s-complement number, end of story. So that problem is solved. But many more remain.
The fundamental problem is that integer arithmetic in C, C++ and C# behaves only superficially like the integer arithmetic you learned in school. In normal arithmetic there are nice properties like “adding two positive numbers results in a third positive number”, which do not hold in these languages because of integer overflow. Even the very useful mathematical property that there is no highest integer does not hold. (A mathematician would note that the integer arithmetic that we have in C# is a commutative ring, but few developers have studied ring theory.)
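To make that concrete, here is a small sketch showing the first property failing in C#’s default (unchecked) arithmetic; note that the overflow happens at runtime through a variable, since the compiler rejects overflowing constant expressions at compile time:

```csharp
using System;

class OverflowDemo
{
    static void Main()
    {
        int big = int.MaxValue;  // 2147483647, the largest int
        int sum = big + 1;       // two positives; wraps around in unchecked arithmetic
        Console.WriteLine(sum);  // prints -2147483648, a negative number
    }
}
```

Adding one to the highest int does not produce a higher integer; it silently wraps around to the lowest one.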
This is bad because it leads to bugs. It is good because the vast majority of integer arithmetic done in any of these languages involves integers whose magnitudes are tiny compared to the possible range of the integer type, and because this kind of arithmetic can be done extremely quickly by computers. So the question for the designers of C# is: how do we keep the desirable high performance while still enabling developers to detect and prevent bugs?
Of course, one choice would be to simply reject the premise that speed is the most important thing, and make math work correctly across the board. A “big integer” could be the default integer type, as it is in some other languages. Frankly, I spend billions of nanoseconds waiting for stuff to stream down from the network every day; I don’t really care if my arithmetic takes a few extra nanoseconds. It might be worthwhile to say that the default type is big integers, and if you want high performance integers, then you have to use a special type.
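As it happens, C# did eventually get such a type, System.Numerics.BigInteger, though as an opt-in library type rather than the default. A sketch of the difference:

```csharp
using System;
using System.Numerics;

class BigIntegerDemo
{
    static void Main()
    {
        // A BigInteger never overflows; it simply grows as needed.
        BigInteger big = new BigInteger(int.MaxValue);
        Console.WriteLine(big + 1);  // prints 2147483648, not a wraparound
    }
}
```

The cost is exactly the tradeoff described above: each operation may allocate and is considerably slower than a single machine instruction.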
But when C# was developed, I doubt that this was even considered for a moment. Keeping the performance up, being able to interface easily with existing libraries of unmanaged code, and leveraging the existing knowledge of developers used to 32-bit integers were all high priorities. And C# was designed in a world where high latency was mostly due to the CPU taking a long time, not to waiting on network I/O.
The decision the C# designers actually made was to have two kinds of integer arithmetic: checked, and unchecked. Unchecked arithmetic has all the speed and danger that you have learned to love from C and C++, and checked arithmetic throws an exception on overflow. It is slightly slower, but a lot safer.
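The two kinds of arithmetic can be selected expression by expression with the checked and unchecked operators; a minimal sketch of the difference:

```csharp
using System;

class CheckedDemo
{
    static void Main()
    {
        int big = int.MaxValue;

        // Checked arithmetic: overflow throws rather than wrapping.
        try
        {
            int sum = checked(big + 1);
            Console.WriteLine(sum);
        }
        catch (OverflowException)
        {
            Console.WriteLine("overflow detected");  // this path runs
        }

        // Unchecked arithmetic: overflow wraps around silently.
        int wrapped = unchecked(big + 1);
        Console.WriteLine(wrapped);  // prints -2147483648
    }
}
```

Both also come in statement-block form, checked { ... } and unchecked { ... }, and the compiler’s /checked option flips the project-wide default to checked.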
How exactly is that safer? Because it is better to crash the program and bring the bug to the attention of the users and the developers than to muddle on through. If the programmer believed that all arithmetic would operate on integers that are tiny compared to the range of the integer type, and in fact they are not, then something is fundamentally wrong with the program, it is unreliable, and it should be stopped before it does any more harm.
So it seems pretty clear why we would want to have a “checked” keyword in C#; it says “in this bit of the code, I assert that my integer arithmetic does not overflow, and if it does, I’d rather crash the program than muddle on through”. But what possible use is the unchecked keyword? Unchecked arithmetic is the default!
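One answer: the unchecked keyword lets code that deliberately relies on wraparound keep working even when the surrounding context, or the project-wide /checked compiler option, makes arithmetic checked. A sketch, using the classic example of combining hash codes (the multiplier 31 here is just an illustrative choice, not anything prescribed by C#):

```csharp
using System;

class HashDemo
{
    // Combining hash codes deliberately relies on integer wraparound;
    // unchecked ensures this still works under a checked default.
    static int CombineHashes(int h1, int h2)
    {
        unchecked
        {
            return h1 * 31 + h2;  // may overflow; for hashing, that is fine
        }
    }

    static void Main()
    {
        Console.WriteLine(CombineHashes(int.MaxValue, 17));
    }
}
```

Here an overflow is not a bug to be detected; it is part of the intended computation, and the unchecked keyword is how you document that intent.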