A commenter on part two asked “can you explain the logic that a string is not always a String but a regexp is always a RegExp? What is the recommended way of determining if a value is a string?”
Indeed, the commenter is correct:
print(/foo/ instanceof RegExp); // true
print(new RegExp("foo") instanceof RegExp); // true
print("bar" instanceof String); // false
print(new String("bar") instanceof String); // true
print(typeof("bar")); // string
print(typeof(new String("bar"))); // object
Why’s that? First off, the question about strings.
In JScript there is this bizarre feature where primitive values (Booleans, strings, numbers) can be “wrapped up” into objects. Doing so leads to some odd situations. The type of a wrapped primitive is always an object type, not a primitive type. Also, comparing two wrapped primitives with == uses object equality, not value equality:
print(new String("bar") == new String("bar")); // false
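A few more of those situations, as a minimal sketch. The variable names here are mine, purely for illustration. Unwrapping with valueOf gets you back to value equality, comparing a wrapped string against a primitive converts the object first, and a wrapped false is still an object, so it counts as true in a condition:

```javascript
// Illustrative names only; none of these appear in the original post.
var a = new String("bar");
var b = new String("bar");

var wrappedPitfalls = {
  // Two distinct objects, so == compares identity.
  sameObject: a == b,                                      // false
  // Unwrapping first gives ordinary value equality.
  sameValue: a.valueOf() === b.valueOf(),                  // true
  // Object against primitive: the object is converted to "bar" first.
  looseAgainstPrimitive: a == "bar",                       // true
  // Every object is truthy, even one wrapping false.
  wrappedFalseIsTruthy: new Boolean(false) ? true : false  // true
};
```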
I strongly recommend against using wrapped primitives. Why do they exist? The reasoning has kind of been lost in the mists of time, but one good reason is to make the prototype inheritance system consistent. If "bar" is not an object then how is it possible to say
print("bar".toUpperCase());
? From the point of view of the specification, this is just syntactic sugar for
print((new String("bar")).toUpperCase());
Of course as an implementation detail we do not actually cons up a new object every time you call a method on a primitive value! That would be a performance nightmare. The runtime engine is smart enough to realize that it has a value and that it ought to pass it as the this object to the appropriate method on String.prototype, and everything just kind of works out.
This also explains why it is possible to stick properties onto value types that magically disappear. When you say
var bar = "bar";
bar.hello = "hello";
print(bar.hello); // nada!
what is happening is logically equivalent to:
var bar = "bar";
(new String(bar)).hello = "hello";
print((new String(bar)).hello); // nada!
The magical temporary object is just that — magical and temporary. Once you’ve used it, poof, it disappears.
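If you want to catch the temporary object in the act, here is one way to sketch it. The helper name getReceiver is hypothetical, and I am assuming an engine where the Function constructor is available. A function built that way is non-strict code, so a primitive passed as this gets wrapped on the way in, and each call gets its own fresh wrapper:

```javascript
// Hypothetical helper, not from the original post. The Function constructor
// produces non-strict code, so a primitive receiver is wrapped in an object.
var getReceiver = new Function("return this;");

var first = getReceiver.call("bar");
var second = getReceiver.call("bar");

var wrapperReport = {
  typeOfReceiver: typeof first,            // "object"
  isStringObject: first instanceof String, // true
  sameObjectTwice: first === second        // false: a new wrapper per call
};
```

Strict-mode functions, which arrived in later versions of the language, receive the primitive unwrapped instead.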
But this magical temporary object does not appear when the typeof or instanceof operators are involved. The instanceof operator says “hey, this thing isn’t even an object, so it can’t possibly be an instance of anything”. For both consistency and usability, it would have been nice if "bar" instanceof String logically created a temporary object and hence said yes, it is an instance of String. But for whatever reason, that’s not the specification that the committee came up with.
The question about regular expressions is easily answered now that we know what is going on with strings. The difference between regular expressions and strings is that regular expressions are not primitives. Just because you have the ability to express a regular expression as a literal does not mean that it is a primitive! That thing is always an object, so there is no behaviour difference between the compile-time-literal syntax and the runtime syntax.
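A quick way to see that a regular expression literal is a real object, sketched with illustrative names of my own: a property stuck onto it does not disappear the way it does on a string.

```javascript
// Illustrative names only; none of these appear in the original post.
var literal = /foo/;
var constructed = new RegExp("foo");

// Unlike the string case, there is no temporary wrapper here:
// the property is stored on the regular expression object itself.
literal.hello = "hello";

var regexpReport = {
  literalIsRegExp: literal instanceof RegExp,         // true
  constructedIsRegExp: constructed instanceof RegExp, // true
  propertySurvives: literal.hello                     // "hello"
};
```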
The question about how to determine whether something is a string is surprisingly tricky. If typeof returns "string" then obviously it is a string, end of story. But what if typeof returns "object"? How can you tell if that thing is a wrapped string?
It’s not easy. instanceof String doesn’t tell you whether that thing is a string, it tells you whether String.prototype is on the prototype chain. There’s nothing stopping you from saying
function MyString() {}
MyString.prototype = String.prototype;
var s = new MyString();
// See part two for why this happens:
print(s.constructor == String); // true
print(s instanceof String); // true
print(String.prototype.isPrototypeOf(s)); // true
So now what are you going to do? JScript is excessively dynamic! You can’t rely on any object being what it says it is. JScript forces people to be operationalists. (Operationalism is the philosophical belief that if it walks like a duck and quacks like a duck, it is a duck.) In the face of the kind of weirdness described above, all you can do is try to use the thing like a string, and if it acts like a string, it’s a string.
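Here is one way that “try to use it like a string” might look in practice. This is a sketch, and actsLikeString is a hypothetical helper of mine, not an official test. It leans on the fact that String.prototype.valueOf is specified to throw a TypeError unless it is handed a genuine string or wrapped string. It is still only a heuristic, since this is a language where someone can replace String.prototype.valueOf out from under you.

```javascript
// Hypothetical helper, not from the original post.
function actsLikeString(value) {
  if (typeof value === "string") {
    return true; // a primitive string, end of story
  }
  try {
    // Throws a TypeError unless value is a real wrapped string.
    String.prototype.valueOf.call(value);
    return true;
  } catch (e) {
    return false;
  }
}

// The impostor: String.prototype is on its chain, but it holds no string.
function MyString() {}
MyString.prototype = String.prototype;

var stringChecks = {
  primitive: actsLikeString("bar"),             // true
  wrapped: actsLikeString(new String("bar")),   // true
  impostor: actsLikeString(new MyString()),     // false
  number: actsLikeString(42)                    // false
};
```

Note that the impostor passes instanceof String but fails the moment you actually ask it to behave like a string.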
Commentary from 2020
- A commenter pointed out that I must be a “Lisp geek” because I used “cons” to mean “allocate”. I am not much of a Lisp programmer but I’m willing to use Lisp jargon to get street cred from genuine Lisp geeks. 🙂 (If you are a casual user of Lisp you might think of cons as the function which pushes an item onto the head of a list, but a better way to think of it is that it is an allocator for a head, tail pair. Such a pair is called a “cons cell” for historical reasons.)
- The title obviously refers to “duck typing” which is usually thought of as the characteristic of a type system where what we care about is the existence of the right “shape”; we don’t care if it is a duck, we care if it is a thing that can quack. What I wanted to illustrate here is that JavaScript carries that concept even farther than you might think. It’s not just “does this thing have the properties of a duck?” It’s that in some situations, there is no by-design way to even get a reliable answer to the question “is this a duck or not?” The JavaScript type system is weird and I hope that anyone building a new type system these days has the good sense to not create a situation where you cannot even reliably tell if a string is a string.