
You Want Salt With That? Part Three: Salt The Hash

Last time we were considering what happens if an attacker gets access to your server’s password file. If the passwords themselves are stored in the file, then the attacker’s work is done. If they’re hashed and then stored, and the hash algorithm is strong, then there’s not much to do other than to hash every string and look through the password file for that hash. If there’s a match, then you’ve discovered the user’s password.

You don’t have to look through the vast space of strings in alphabetical order of course. An attacker will start with a dictionary of likely password strings. We want to find some way to make that attacker work harder. Setting a policy which disallows common dictionary words as passwords would be a good idea. Another technique is to spice up the hashes a bit with some salt.

System #3

For every user name we generate a random unique string of some fixed length. That string is called the “salt”. We now store the username, the salt and the hash of the string formed by concatenating the user’s password to the salt. If user Alpha’s password is “bigsecret” and the salt is “Q3vd” then we’ll hash “Q3vdbigsecret”.

Since every user has their own unique random salt, two users who happen to have the same password get different salted hashes. And the dictionary attack is foiled; the attacker cannot compute the hashes of every word in a dictionary once and then check every hash in the table for matches anymore. Rather, the attacker is going to have to re-hash the entire dictionary anew for every salt. A determined attacker who has compromised the server will have to mount an entire new dictionary attack against every user’s salted hash, rather than being able to quickly scan the list for known hashes.

Salting essentially makes it less feasible to attack every user at once when the password file is compromised; the attacker must start a whole new attack for each user. Still, given enough time and weak passwords, an attacker can recover passwords.

In this system the client sends the username and password to the server, the server appends the password to the salt, hashes the result, and compares the result to the salted hash in the table.

This answers the original question posed by the JOS poster; the salt can be public because it is just a random string. Ideally, both the salt and the salted hash would be kept private so that an attacker would not be able to mount a dictionary attack against that salt. But there is no way to deduce any information whatsoever just from the salt.

And of course, it’s better to not get into this situation in the first place — don’t allow your password list to be stolen! But it’s a good idea for a security system to not rely on other security systems for its own security. We call this idea “defense in depth”. You want to make the attacker have to do many impossible things to compromise your security, so that if just one of those impossible things turns out to be possible after all, you’re not sunk.

But what about the fact that the password goes over the wire in the clear, where anyone can eavesdrop? That’s now the weak point of this system. Can we do something about that? Tune in next time and we’ll see what we can come up with.


MSFT archive of original post is here.

You Want Salt With That? Part Two: We Need A Hash

OK, we want to sketch out an authentication system which is sufficiently secure against common attacks even if all the details of the system are known to the attacker. Let’s start with a simple system, take a look at what its vulnerabilities are, and see if we can mitigate them:

System #1

The client transmits the username and password “in the clear” to the server. The server consults a list of usernames and passwords, and grants or denies access accordingly.

There are two big problems with such a system. First, it’s susceptible to eavesdropping if the client and the server are not the same machine. If someone can listen in on the message sent from the client to the server with a network packet sniffer then the eavesdropper learns a valid username and password. Second, if the server is ever compromised and the password list is read, the attacker learns everyone’s username and password.

Can we mitigate these problems? Let’s look at the second vulnerability first. Does the server need to store the password? How could it NOT store the password? What would it compare against?

We can eliminate the need for the server to store the password if we have a hash function. A hash function is a function which takes a string as its argument and produces a number, usually a hundred-or-so-bit integer, as its result.

A good hash algorithm has the property that slightly different inputs produce very different outputs. A one-bit change in the input should cause on average 50% of the output bits to change. Because of this, it should be extremely difficult to deduce an input that produces a given output, even given partial information about the input. (There are other hash algorithm properties that are important for cryptographic operations such as document signing, but that’s a topic for another day.)

Proving that a given algorithm actually has this property can be quite tricky, but we have some industry-standard hash algorithms which have withstood rigorous testing and deep analysis by professional cryptographers and are widely believed to be “one way functions” — it’s easy to go from string to number, and very hard to go backwards.

System #2

The client sends the username and password to the server. The server has a list of usernames and the hashes of passwords, but not the passwords themselves. (When the user first created the password, the system hashed the password and then discarded it, saving only the hash.) The server hashes the client-supplied password and compares the hashes.

This is better; now the server is not storing the passwords, just the hashes. If an attacker compromises the server, they can’t easily go from the hash back to the password. (It also has the nice property that every entry in the password table is now the same size. Longer passwords do not have longer hashes.)

But there are two new problems with this system. First, any two users who have the same password have the same hash. If one of those users is evil and compromises the server, they immediately learn who has the same password as they do.

Second, this system is susceptible to dictionary attacks. An attacker can hash every word in a large dictionary, compromise the server, and then compare every password hash to every word in the dictionary. Since dictionary words are likely passwords, the attacker will probably be able to figure out at least a few passwords.

And of course we still haven’t mitigated the fact that eavesdroppers could be listening in on the conversation between the client and the server.

Next time, we’ll add a little salt to the mix in an attempt to mitigate the dictionary attack and same-password vulnerabilities. Then we’ll see if we can use some of the hash technology to mitigate the eavesdropping attack.


MSFT archive of original post is here.

You Want Salt With That? Part One: Security vs Obscurity

A poster to one of the Joel On Software fora the other day asked what a “salt” was (in the cryptographic sense, not the chemical sense!) and why it’s OK to make salts public knowledge. I thought I might talk about that a bit over the next few entries.

But before I do, let me give you all my standard caution about rolling your own cryptographic algorithms and security systems: don’t.   It is very, very easy to create security systems which are almost but not quite secure. A security system which gives you a false sense of security is worse than no security system at all! This blog posting is for informational purposes only; don’t think that after you’ve read this series, you have enough information to build a secure authentication system!

OK, so suppose you’re managing a resource which belongs to someone — a directory full of files, say.  A typical way to ensure that the resource is available only to the authorized users is to implement some authentication and authorization scheme.  You first authenticate the entity attempting to access the resource — you figure out who they are — and then you check to see whether that entity is authorized to delete the file, or whatever.

A standard trick for authenticating a user is to create a shared secret. If only the authentication system and the individual know the secret then the authentication system can verify the identity of the user by asking for the secret.

But before I go on, I want to talk a bit about the phrase “security through obscurity” in the context of shared secrets. We usually think of “security through obscurity” as badness. A statistician friend of mine once asked me why security systems that depend on passwords or private keys remaining secret are not examples of bad “security through obscurity”.

By “security through obscurity” we mean that the system remains secure only if the implementation details of how the security system itself works are not known to attackers. Systems are seldom obscure enough of themselves to provide any real security; given enough time and effort, the details of the system can be deduced. Lack of source code, clever obfuscators, software that detects when it is being debugged, all of these things make algorithms more obscure, but none of these things will withstand a determined attacker with lots of time and resources. A login algorithm with a “back door” compiled into it is an example of security through obscurity; eventually someone will debug through the code and notice the backdoor algorithm, at which point the system is compromised.

A strong authentication system should be resistant to attack even if all of its implementation details are widely known. The time and resources required to crack the system should be provably well in excess of the value of the resource being protected.

To put it another way, the weakest point in a security system which works by keeping secrets should be the guy keeping the secret, not the implementation details of the system. Good security systems should be so hard to crack that it is easier for an attacker to break into your house and install spy cameras that watch you type than to deduce your password by attacking the system, even if all the algorithms that check the password are widely known. Good security systems let you leverage a highly secured secret into a highly secured resource.

One might think that ideally we’d want it both ways: a strong security system with unknown implementation details. There are arguments on both sides; on the one hand, security plus obscurity seems like it ought to make it especially hard on the attackers. On the other hand, the more smart, non-hostile people who look at a security system, the more likely that flaws in the system can be found and corrected. It can be a tough call.

Now that we’ve got that out of the way, back to our scenario. We want to design a security system that authenticates users based on a shared secret. Over the next few entries we’ll look at five different ways to implement such a system, and what the pros and cons are of each.


MSFT archive of original post is here.

The attribute of manliness

This is a technical, not a political, current-events, linguistic or academic blog. (You know of course that as soon as I say that, it’s because I’m about to post something that is political, timely, linguistic and academic. Foreshadowing: your sign of a quality blog!) Despite all that, I was so struck by this passage I read last night that I felt I had to share it. We’ll get back to error handling in VBScript or some such topic later this week.

The writer is discussing semantics, specifically how word meanings and popular opinions change in political debates during wartime. The writer is… well, I’ll just let him say it, and talk about the writer afterwards.

Words had to change their ordinary meaning:

  • reckless audacity came to be considered the courage of a loyal ally; prudent hesitation, specious cowardice
  • moderation was held to be a cloak for unmanliness, frantic violence became the attribute of manliness
  • ability to see all sides of a question, inaptness to act on any
  • cautious plotting, a justifiable means of self-defense
  • the advocate of extreme measures was always trustworthy; his opponent was a man to be suspected
  • the fair proposals of an adversary were met with jealous precautions by the stronger of the two, and not with a generous confidence
  • revenge also was held of more account than self-preservation

The cause of all these evils was the lust for power arising from greed and ambition; and from these passions proceeded violence.

Thus Thucydides of Athens, 2435 years ago. (Translation by Richard Crawley. I’ve changed the formatting and trimmed it a bit — Crawley gets a little wordy, but I love the balanced sentences.)

The first reaction I had upon reading this was “isn’t it astonishing how modern Thucydides sounds across the ages? If he’d only thought to coin the snappy term ‘doublespeak’, he’d have scooped Orwell by a couple millennia!”

And then I gave my head a shake, because of course I was reasoning backwards. This shouldn’t be astonishing in the least; I live in a culture where general opinions on government, politics, warfare, sports and art are more or less just as they were in Classical Greece. It would be more astonishing if Thucydides’ insights into human nature were not applicable today.


Commentary from 2020:

I do not recall precisely what triggered this post but it was some disingenuous statement by President Bush or another federal politician on the subject of the ongoing then, and ongoing now, pointless then and pointless now, American invasion of Afghanistan.

The comments on this article were mostly from other fans of ancient Greece rather than engaging with the modern political situation that inspired it. I did mention in the original comments a funny conversation I once had with a member of the VB compiler team when I was an intern:


I recall having a meeting when I was an intern. One of the devs said:

“We can’t ask the users to understand this Byzantine documentation. It’s a Sisyphean task!”

A silence fell over the conference room. Finally, the intern piped up.

“Did you by any chance attend a private school in England when you were growing up?”

“Why yes, but what does that have to do with anything? And how did you know?”

What I did on my summer vacation

I’m back, and I’ve almost made it through the 525 not-automatically-sorted email messages, caught up on my blog reading, and so on.  There are a number of interesting technical questions in my backlog that I’ll start getting to later this week once I dig myself out of the pile of bug reports that accumulated during my absence.

Until then, again, this was just too precious to not share.  If you only want technical stuff, stop reading now.

One of the highlights of my twice-annual return to my ancestral home is spending time with my cousins.  My five-year-old cousin Zephy takes great delight in taunting me.  Every year she teaches the small army of munchkins that she hangs out with some ditty which is to be shouted repeatedly whenever I come into view.  This year it was “Eric is evil!  Eric is evil!  Eeeeevil!”  It’s quite the experience, believe me.  I suspect that the root of this behaviour has something to do with the fact that I once convinced her that Lake Huron is chock-full of Great Canadian Beaver-Sharks — giant buck-toothed, flat-tailed sharks which subsist on a diet of driftwood, canoe paddles, wooden sailboats and little girls — and then repeatedly threatened to throw her in the lake. In retrospect, maybe that wasn’t such a good idea.

Her older sister Victoria does not believe in Beaver-Sharks.  At one point she and her friend Kelsey ran up to me (ten year old girls run everywhere) to ask if they could borrow my pair of kayaks.  “Sure.  You can always borrow the kayaks even if I’m not around as long as you tell a responsible adult that you’re going out on the lake,” I said.  Kelsey got a slightly worried look — “Is my mother a responsible adult?” she deadpanned.  

For future reference: unless otherwise noted, all mothers are responsible adults.  

And finally, Vic has a “mad crush” on a boy, who will remain unnamed.  She wasn’t sure what to do about that, and since apparently I’m an internationally recognized expert on getting boys to like you, she asked my advice.  I wasn’t sure what to say — the first girl I ever had a mad crush on I ended up dating for seven years, which is probably atypical — so I’ve started surveying every 8-12 year old girl that I meet as to what they do about mad crushes.  I met an eight-year-old girl named Heather at a barbecue over the weekend and asked her.  Her detailed off-the-cuff reply showed that she’d already put a lot of thought into this question, though she had not actually needed to test her theories yet.  Allow me to quote from memory:

There are two things you can do if you have a mad crush on a boy, you can ask him to propose marriage and if he won’t, then beat him up, then send him to an island, then surround the island with huge rocks so that he can’t escape, then send him Valentine’s cards that say ‘I HATE YOU!’ but if he does propose marriage then you can kiss him and marry him and move into an apartment and have a baby and bake him a cake that says ‘YOU ARE MY FAVOURITE BOYFRIEND’ in the icing.

Sounds like a good plan! Any current or former 8-12 girls out there who have additional advice for surviving a mad crush (who I suppose happen to also be interested in programming language design if you’re reading my blog…) please leave comments and I’ll forward them on.  Run-on sentences are fine.

Customer service is not rocket science

I was down at Fry’s Electronics yesterday — huge electronics warehouse. Geek paradise. Everything from DVD boxed sets to multimeters. Why I was there is unimportant; let’s just say that the connector conspiracy is after me in a big way. This was my second time in Fry’s, Leah’s first.

It took a while to figure out which Monster widget I needed to fix my problem; it was getting into the afternoon and we were getting kind of peckish. “There’s a sandwich shop right here in the store,” said Leah. Perfect.

These guys think of everything — you want people to spend all day shopping in your store, you’ve got to feed them. Smart business move.

Or is it? Maybe not.

We go into the sandwich shop, stand in line for a while, order chicken salad for Leah; pastrami, hot, no mustard for me. Sit down at a table.

The line was quite slow, but, whatever. No problem yet.

We wait. And wait. And wait. And wait some more. The place has other customers, but not so many that it should take twenty minutes to put together a couple of sandwiches. I ask the guy who took our order what was up with the sandwiches.

“Yeah, they’re coming.”

We now have a problem, obviously, but this problem is merely Vexing. I can live with Vexing.

A couple minutes later, there’s still no sign of Leah’s sandwich, but mine arrives — covered in mustard.

Now we have a more serious problem. Three problems actually: where’s Leah’s sandwich, why is mine covered in mustard, and why is this all taking so long? We have moved from the Vexing category to Boneheaded.

We are now seriously low on blood sugar and getting cranky.

I once more go up to the guy and point out that my sandwich has mustard on it.

I am not making this up: the very first thing out of his mouth is

“That’s not my fault. You saw me write down ‘no mustard’.”

OK, now we have a BIG problem. We have rapidly left Boneheaded far behind and are firmly ensconced into the Fatal problem category. We now have a meta-problem. This guy wants to argue with me about whose fault it is, rather than making me a new sandwich. I am not particularly interested in having that conversation.

“Look, you know what, I could have driven home and made a sandwich in the amount of time we’ve been waiting. Leah’s still hasn’t shown up. Void out the transaction and we’ll just eat at home.”

“I need my manager to void a transaction.”

“You do what you have to do, Zach.”

At this point we start watching the clock with growing interest.

A solid five minutes later, a young woman ambles in, who is apparently the manager. She completely ignores me — she does not speak a single word to me throughout this entire encounter, though she does attempt a feeble, unconvincing justification to Leah on the subject of why it is that a chicken salad takes so long.

She chews out Zach for writing down Leah’s order with no table number.

She then chews out the sandwich makers — who, I gather from her conversation with them, found an order for a chicken salad, made it, discovered that there was no table number written on the order, and therefore stuck it behind the counter and ignored it.

Obviously this belies her earlier ridiculous explanation that chicken salad takes a long time to prepare, but she chooses to ignore this little contradiction.

Now, I used to make sandwiches for a living, and let me tell you, even if you have only a few simple sandwich making skills, it’s not that hard to figure out that someone probably wants to eat that sandwich, and that if you don’t know to whom it belongs, it behooves you to find out. I mean, what did they think would be the outcome of hiding it? (If you guessed “they’d give up, go home and blog about it” — you’re right!)

When she’s done chewing out her staff, she admits that actually, she has no idea how to work the cash register and therefore cannot void out the transaction. She needs her manager.

So far, I have encountered zero competent employees, and a considerable number of incompetent employees. We sit back to watch the clock again.

You know those old Star Trek TNG episodes where Picard goes “hostile aliens are loose in the Engineering Room! Riker, Worf, take care of it!” and then instead of, oh, I don’t know, beaming themselves instantaneously into engineering, they kind of walk — briskly — the quarter mile from the bridge to the engine room? My initial conjecture was that things were taking so long because everyone was really, really busy serving other customers. Based on the speed that they actually move, I’m now starting to think that they’re just plain slow.

Anyway, ANOTHER solid five minutes later, the manager’s manager ambles in. This guy attempts to save the day. After he’s brought up to speed by the cashier and the manager, the first thing out of his mouth is “I’m sorry this happened.”

“You know, you’re the first person to say that in the last fifteen minutes.”

“Oh. Well. I’m sorry about that too.”

We are, for the first time, on the right track. Can he pull it off? Tragically, no. He tries to do the right thing, but he screws it up. How he screws it up is interesting. Thus far, every mistake made has been due to total incompetence. Let’s break it down:

First order mistakes: Hide a sandwich when you don’t know whose it is. Put mustard on a sandwich where the order clearly says no mustard. Fail to understand how your own cash register works.

Second order mistakes: When given a customer problem, engage in blame shifting. Argue back to the customer. Ignore the customer’s problem while you concentrate on process. Don’t take responsibility for your mistakes. Don’t apologize. Don’t do anything to actually SOLVE the PRIMARY problem (two hungry people with a basket full of high margin widgets that they’d like to buy). Call in multiple levels of management to solve a simple sandwich making issue.

These are all ridiculous and obvious mistakes that should be covered in the first day of new employee orientation at a business that so heavily depends on repeat customers.

What was the manager’s final, deal breaking mistake?

“Is there anything we can do to make it up to you?”

This last mistake is subtle. Clearly he meant well and knew what to do — apologize, take responsibility, mollify the customer — but not how to do it.

The problem is that I’ve already told them what I want them to do — I want them to give me a sandwich, and, if they cannot, to give me my money back so that I can stop spending my incredibly busy day with time-wasting idiots.

Engaging in a negotiation with management over what would be an appropriate level of contrition for them to display for the disaster they’ve managed to embroil me in is not how I want to spend another second of my day. We are in this situation BECAUSE I want to stop talking to them.

Figuring out what they can do for me when they screw up is management’s job, not the customer’s job! Thus, I said “Thanks, but I’m just going home,” put down my basket, and left.

Customer service at a sandwich shop is not rocket science. I said in an earlier entry:

My father has been in the restaurant business for many years. Something he taught me at an early age is that one measure of the quality of a restaurant is how few mistakes they make, but a more important measure is how they treat the customer once a mistake has been made. Do they apologize, take responsibility, and immediately act to correct the mistake, or do they engage in cover-ups, blame-shifting and foot-dragging? I don’t go back to the second kind of restaurant.

When that happens to me at a restaurant where the core competency is in serving food, the restaurant probably loses tens or hundreds of dollars of business from me. In a restaurant business where the core competency is actually separating technology-loving geeks like me from thousands of dollars at a time, the opportunity cost of making a customer relations disaster out of a sandwich is considerably higher.

Finally, let me make this very clear: though obviously it is fun to vent, that’s not my primary purpose here. I want to call attention to this problem in a public way because Fry’s sells Microsoft products and therefore I want them to succeed. Even if I cannot, in good conscience, ever shop there again, I want other people to have a pleasant shopping experience there, and buy lots of computers and XBox games. If I didn’t want them fixed, I wouldn’t point out the problems.

I’m going to send a link to this to upper management at Fry’s, and I invite them to respond with details of how they’re solving these problems.


Update from 2022: They never responded to my post and went out of business in 2021.

Multi-cast delegates the evil way

A lot of people have asked me over the years how various kinds of event binding work.  Basically, event binding works like this:

1) Someone clicks on a button,
2) then a miracle happens, and…
3) the button’s event handlers execute.

It’s that second step that people struggle with.

First, some terminology. I studied applied mathematics, and some things we talked about quite a bit were sources and sinks. Sources produce something — a faucet produces water at a certain rate, for example. A sink takes that water away. We’ll borrow this terminology for our discussion of events. An event source is something that produces events, like a button or a timer. An event sink is something that consumes events, like an event handler function. (Event sinks are also sometimes called “listeners”, which mixes metaphors somewhat, but that’s hardly unusual in this profession.)

This terminology leads to a rather unfortunate homonymy — when I first heard “this method sinks the click event”, I heard “this method syncs the click event”.  When we talk about event sinks, we’re talking about the consumer of something, not about synchronizing two things in time.  (Sinks, of course, can be asynchronous…)

The miracle actually isn’t that miraculous. Implementing event sources and sinks requires two things: first, a way to wrap up a function as an object, such that when the source wants to “fire” the event, all it does is invoke the sink’s wrapper. Second, a way for the thread to detect that the button, or whatever, has been pressed and thereby know to trigger the sink wrappers.

An explanation of the magic behind the latter would take us fairly far afield.  Suffice to say that in IE, the details of how that mouse press gets translated into windows messages and how those messages are dispatched by the COM message loops behind the scenes are miracles that I don’t want to talk about in this article.  I’m more interested in those wrappers.

In the .NET world, an object that can be invoked to call a function is called a delegate. In JScript Classic, all functions are first-class objects, so in a sense, all functions are delegates. How does the source know that the developer wishes a particular delegate (i.e., an event sink) to be invoked when the event is sourced?

Well, in IE, it’s quite straightforward:

function doSomething() {  }
button1.onclick = doSomething;  // passes the function object, does not call the function

But here’s an interesting question — what if you want TWO things to happen when an event fires?  You can’t say

function doSomething() {  }
function doOtherThing() {  }
button1.onclick = doSomething;
button1.onclick = doOtherThing;

because that will just replace the old sink with the new one.  The DOM only supports “single-cast” delegates, not “multi-cast” delegates.  A given event can have no more than one handler in this model.

What to do then?  The obvious solution is to simply combine the two.

function doSomething() {  }
function doOtherThing() {  }
function doEverything() { doSomething(); doOtherThing(); }
button1.onclick = doEverything;

But what if you want to dynamically add new handlers at runtime?  I recently saw an inventive, clever, and incredibly horribly awful solution to this problem.  Some code has been changed to protect the guilty.

function addDelegate( delegate, statement) 
{
  var source = delegate.toString() ;
  var body = source.substring(
    source.indexOf('{')+1,   
    source.lastIndexOf('}'));
  return new Function(body + statement);
}

Now you can do something like this:

function doSomething() { /* whatever */ }
button1.onclick = doSomething;
// ... later ...
button1.onclick = addDelegate(button1.onclick, "doOtherThing();");

That will then decompile the current delegate, extract the source code, append the new source code, recompile a new delegate using “eval”, and assign the new delegate back.

OK, people, pop quiz.  You’ve been reading this blog for a while.  What’s wrong with this picture?  Put your ideas in comments and I’ll discuss them in my next entry.

This is a gross abuse of the language, particularly considering that this is so easy to solve in a much more elegant way.  The way to build multi-cast delegates out of single-cast delegates is to — surprise — build multi-cast delegates out of single cast delegates.  Not decompile the single-cast delegate, modify the source code in memory, and then recompile it!  There are lots of ways to do this.  Here’s one:

function blur1() { /* whatever */ }
function blur2() { /* whatever */ }

var onBlurMethods = new Array();

function onBlurMultiCast() {
  for (var i in onBlurMethods)
    onBlurMethods[i]();
}

blah.onBlur = onBlurMultiCast;
onBlurMethods.push(blur1);
onBlurMethods.push(blur2);
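Here’s another (also just a sketch; combineHandlers is a name I made up, and the fake button object merely stands in for a DOM element so the example is self-contained):

```javascript
// Returns a sink that invokes the existing handler, if any, then the new one.
function combineHandlers(existing, added) {
  return function () {
    if (existing) existing();
    added();
  };
}

// A fake event source standing in for a DOM button.
var button1 = { onclick: null };
var log = [];
function doSomething() { log.push("doSomething"); }
function doOtherThing() { log.push("doOtherThing"); }

button1.onclick = combineHandlers(button1.onclick, doSomething);
button1.onclick = combineHandlers(button1.onclick, doOtherThing);

button1.onclick(); // "fire" the event: doSomething runs, then doOtherThing
console.log(log);  // [ 'doSomething', 'doOtherThing' ]
```

No decompiling, no eval; just closures doing what closures do.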

I’ll talk about VBScript and JScript .NET issues with event binding another time.

The JScript Type System Part Eight: The Last Blog Entry About Arrays, I Promise

Recall that I defined a type as consisting of two things: a set of values, and a rule for associating values outside of that set with values inside the set. In JScript .NET, assigning a value outside of a type to a variable annotated with that type restriction does that coercion if possible:

var s : String = 123; // Converts 123 to a String

Similarly, I already discussed what happens when you assign a JScript array to a hard-typed CLR array variable:

var sysarr : int[] = [10, 20, 30]; // Create new int[3] and copy

and what happens when you assign a one-dimensional CLR array to a JScript array variable:

var jsarr : Array = sysarr; // Wrap sysarr

But what happens when you assign a hard-typed CLR array to a variable annotated with a different CLR array type?

var intarr : int[] = [10, 20, 30];
var strarr : String[] = intarr;

You might think that this does the string coercion on every element, but in fact this is simply not legal. Rather than creating a copy with every element coerced to the proper type, the compiler simply gives up and reports that these types are not compatible. If you find yourself in this situation, you will have to write the code that does the copy yourself.  Something like this would work:

function copyarr(source : System.Array) : String[]
{
  var dest : String[] = new String[source.Length];
  for(var index : int in source)
    dest[index] = source.GetValue(index);
  return dest;
}

There are a few notable things about this example. First, notice that this copies a rank-one array of any element type to an array of strings. This is one of the times when it comes in handy to have the System.Array “any hard-typed array” type!

Second, notice that you can use the for-in loop with hard-typed CLR arrays. The for-in loop enumerates all the indices of an array rather than the contents of the array. Since CLR arrays are always indexed by integers the index can be annotated as an int. The loop above is effectively the same as

for (var index : int = 0 ; index < source.Length ; ++index)

but the for-in syntax is less verbose and possibly more clear.
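The same index-enumeration behavior is easy to see in plain JScript, with one wrinkle: without the `int` annotation, the enumerated indices come back as strings:

```javascript
var arr = [10, 20, 30];
var keys = [];
for (var i in arr)
  keys.push(i);   // for-in yields the indices "0", "1", "2", not the values
```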

Third, you might recall that GetValue (and SetValue) take an array of indices because the array might be multidimensional. But we’re not passing in an array here.  Fortunately, there are also overloads that take a single integer index when the array is one-dimensional.
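For comparison, in plain JScript, with no type annotations to drive the coercion, the element-wise copy must likewise be written by hand. A minimal sketch (`copyToStrings` is a hypothetical name):

```javascript
// Copy any array to a new array of strings, coercing each element explicitly.
function copyToStrings(source) {
  var dest = new Array(source.length);
  for (var index = 0; index < source.length; index++)
    dest[index] = String(source[index]);   // explicit per-element coercion
  return dest;
}

// copyToStrings([10, 20, 30]) returns ["10", "20", "30"]
```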

Generally speaking, hard-typed array types are incompatible with each other. There is an exception to this rule, which I’ll discuss later when I talk about what exactly “subclassing” means in JScript .NET.

A grammatical aside

I just wrote in a comment to my previous entry, “The ability to rate one’s knowledge of a subject accurately is strongly correlated with one’s knowledge.”

Wait a minute.  “One’s”???  Word’s grammar checker didn’t blink at that.  But nor does it blink at “ones”.  According to the OED, “one’s” is the genitive declension of “one”.  Let’s sum up:

Pronoun   Genitive
-----------------
Me        My
You       Your
Us        Our
Him       His
Her       Hers
Them      Their
Thou      Thine
It        Its
One       One's

I always thought that the reason “its” doesn’t take an apostrophe-s was that the rule “add an apostrophe-s to form a possessive” applied only to noun phrases, not to pronouns. (And of course, we all know that apostrophe-s does not itself form a genitive noun; otherwise, in the sentence “The First Lady is the President of America’s wife,” Laura Bush would be associated with America, not with President Bush.)

What the heck is going on here?  Surely there is some grammar pedant out there who can justify this.  My faith in English grammar has been sorely tried.


Update from 2023: My erstwhile colleague Mike Pope, who still writes an entertaining blog about English usage, gave some fascinating historical context.


Mike in 2003:

Well, let’s work backward.

In the phrase “The First Lady is the President of America’s wife”, the possessive is applied to the entire phrase: “(the President of America)’s wife.” This is common; here’s a nice example: “The woman I went to school with’s daughter” (http://www.chessworks.com/ling/papers/myths/mb003.htm).

FWIW, the ability to add a possessive to a noun phrase and not just to a noun is a comparatively recent development in English: “Until well into Middle English times what Jespersen calls the ‘group genitive’, i.e. ‘[the king of England]’s’ nose did not exist, but the usual type was ‘[the king]’s nose of England’. In Old English the usual structure, before the use of the of-possessive, would have been ‘the king’s nose England’s’.” (http://www.linguistlist.org/issues/5/5-524.html)

What’s actually interesting to contemplate is why the hell we have an apostrophe for the possessive at all. Possessive is just the genitive case; as such, it’s a normal noun declension, and has no more need for an apostrophe than the plural does. Nothing is elided with the possessive/genitive. And as noted, pronouns manage without it. German likewise has an -s for the genitive and manages without a possessive marvelously well. So whence the flingin-flangin possessive apostrophe, which does little more these days than confuse and annoy people?


Eric in 2023: I and other commenters pointed out that historically, possessives were formed by adding “es”, and the apostrophe indicates that the “e” has been removed, the same way an apostrophe indicates removal of letters in other contractions. “Its” was originally “ites”.


Mike again:

You’re on the right track with the “e” being elided with an apostrophe – that is indeed the origin of the use of an apostrophe as the indication of the genitive. What seems to have happened is that “ites” got elided, as frequently-used words tend to, but this happened much earlier in the history of English than the elision that happened to all other genitive forms. (Presumably because it was a widely-used word – the workhorses of a language are the ones that tend to get streamlined first, which is why the verb ‘to be’ is highly irregular in most languages.) So the progression looks like this: originally the word was “ites”, then it became “its” at a time when apostrophes were apparently not required on such elisions, and then quite a lot later, we started to elide *all* the genitive forms, but by then it was considered correct to indicate such elisions with an apostrophe.

[…]

The issue of elision of the vowel in genitive -es only partly explains the possessive apostrophe; the -as ending was also used for the plural of masculine strong nouns in OE (nice declension and conjugation chart here: http://www.engl.virginia.edu/OE/courses/handouts/magic.pdf), which suggests that many noun plurals once had, as did the genitive singular, an unstressed vowel to go with their -s. Granted, it has less to do with how things really were than how they were perceived to be when our not-quite-rational system of orthography was being codified. As I sort of opined earlier, IMO the apostrophe is more trouble than it’s worth for possessives; even educated people are confused about its use, if my email Inbox is any evidence. In historical linguistics, mass confusion about forms is often a prelude to an evolutionary change. 🙂

Six out of ten ain’t bad

Occasionally I interview C++ developers. I’m always interested in how people rate themselves, so I’ll occasionally ask a candidate, “On a scale from one to ten, how do you rate your C++ skills?”

The point of the question is actually not so much to see how good a programmer the candidate is — I’m going to ask a bunch of coding questions to determine that. Rather, it’s sort of a trick question. What I’m actually looking for — what I’m looking for in almost every question I ask — is “how does the candidate handle a situation where there is insufficient information available to successfully solve a problem?” Because lemme tell ya, that’s what every single day is like here on the Visual Studio team: hard technical problems, insufficient data, deal with it!

The question has insufficient data to answer it because we have not established what “ten” is and what “one” is, or for that matter, whether the scale is linear or logarithmic. Does “ten” mean “in the 90th percentile” or “five standard deviations from the mean” or what? Is a “one” someone who knows nothing about C++? Who’s a ten?

Good candidates will clarify the question before they attempt to answer it. Bad candidates will say “oh, I’m a nine, for sure!” without saying whether they are comparing themselves against their “CS360: Algorithmic Design” classmates or Stanley Lippman.

I mention this for two reasons — first of all, my favourite question to ask the “I’m a nine out of ten” people actually came up in a real-life conversation today: OK, smartypants: what happens when a virtual base class destructor calls a virtual method overridden in the derived class? And how would you implement those semantics if you were designing the compiler? (Funny how that almost never comes up in conversation, and yet, as today proved, it actually is useful knowledge in real-world situations.)

The second reason is that ten-out-of-ten C++ guru Stanley Lippman has started blogging. Getting C++ to work in the CLR environment was a major piece of design work, of a difficulty that makes porting JScript to JScript .NET look like a walk in the park on a summer day.

Compared to Stanley Lippman, I give myself a six.


Update from 2023:

Two things:

First, a commenter on the original post mentioned an interviewing technique which I immediately adopted. When the candidate says “oh, I’m an 8” or whatever, without calibrating the scale, the right follow-up question is: what is something that you found difficult when you were a 6 or 7? Make the candidate calibrate their own scale; that then gives you signal on how well they should be able to handle the coding problems which follow.

Second, I was being somewhat tongue-in-cheek when I said that I’d follow up with a trivia question about the specification. Trivia questions are not great interview questions; as I noted in a comment to the original post, what I’m really looking for is not whether the candidate can regurgitate the specification on command, but rather whether they know that compilers are not magical: compilers need to generate code which implements the specification, and there are common techniques for doing so. Does the candidate know what those techniques are? It’s all about gaining signal on how productive the candidate could be when solving problems we will actually face on the job.