About ericlippert

http://ericlippert.com

The JScript Type System, part one

I thought I might spend a few days talking about the JScript and JScript .NET type systems, starting with some introductory material.

Consider a JScript variable:

var myVar;

Now think about the possible values you could store in the variable. A variable may contain any number, any string or any object. It can also be true or false or null or even undefined. This is a rather large set of possible values. In fact, the set of all legal values is infinite. Countably infinite, and in practice limited by available memory, but in theory there is no upper limit.

A type is characterized by two things, a set and a rule. First, a type consists of a subset (possibly infinitely large) of the set of all possible values. Second, a type defines a rule for transforming values outside the set into values in the set. (This rule may specify that certain values are not convertible and hence produce “type mismatch” errors.)

For example, String is a type. The set of all possible strings is an (infinite) subset of the set of all possible values, and there are rules for determining how all non-string values are converted into strings.
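To make the "rule" half of a type concrete, here is a minimal sketch of the String conversion rule applied to values from the other types. It is written in plain JavaScript, and the variable name `conversions` is made up for this illustration.

```javascript
// Each entry applies the String type's conversion rule to a non-string value.
var conversions = [
  String(42),         // a Number            -> "42"
  String(true),       // a Boolean           -> "true"
  String(null),       // the Null value      -> "null"
  String(undefined),  // the Undefined value -> "undefined"
  String([1, 2, 3])   // an Object (array)   -> "1,2,3"
];
```

Every value in the array is now a member of the String type, whatever type it started in.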

JScript Classic is a dynamically typed language. This means that any value of any type may be assigned to any variable without restriction. It is often said — inaccurately — that “JScript has only one type”. This is true only in the sense that JScript has no restrictions on what data may be assigned to any variable, and in that sense every variable is “the same type” – namely, the “any possible value” type. However, the statement is misleading because it implies that JScript supports no types at all, when in fact it supports six built-in types.

JScript .NET, by contrast, is an optionally statically-typed language. A JScript .NET variable may be given a type annotation which restricts the values which may be stored in the variable. This annotation is optional; an unannotated variable acts like a JScript variable and may be assigned any value.

JScript has the property that a value can always describe its own type at runtime. This is not true in, say, C, where you can have a void* and no way of asking it “are you pointing to an integer or a string variable?” In JScript, you can always ask a value what its type is and it will tell you.

The concept of subtyping is not particularly important in JScript Classic though it will become quite useful when we discuss JScript .NET classes later. Essentially a type T1 is a subtype of another type T2 if T1’s set of values is a subset of T2’s set of values. A type consisting of the set of all integers might be a subtype of a type consisting of all the numbers, for instance. (This is not how the integers are traditionally construed; the C type system makes integers and floats disjoint types, where the integer 1 and the float 1.0 are different values that happen to compare as equal — but comparison across types is a subject for a later blog entry.)

Anyway, JScript Classic has six built-in types, all of which are disjoint. They are as follows:

  • The Number type contains all floating-point numbers as well as positive Infinity, negative Infinity and a special Not-a-Number (“NaN”) value. It may seem odd that “Not-a-Number” is a Number but this does in fact make sense. NaN is the value returned when an operation logically must return a number but no actual number makes sense. For example, when trying to convert the string "banana" to a Number, NaN is the result. Because numbers in JScript are actually represented by a 64 bit floating point number there are a finite number of possible Number values. The number of numbers is very large (in fact there are 18437736874454810627 possible numbers, which is just shy of 2^64.) Numbers have approximately fifteen decimal digits of precision and can range from as tiny as 2.2 x 10^-308 to as large as 1.7 x 10^308.
  • The String type contains all Unicode strings of any length (including zero-length empty strings.) The string type is for all practical purposes infinite, as the length of a string is limited only by the ability of the operating system to allocate enough memory to hold it.
  • The Boolean type has two values: true and false.
  • The Null type has one value: null.
  • The Undefined type has one value: undefined. All uninitialized JScript variables are automatically set to undefined.
  • The Object type has an infinite number of values. An object is essentially a collection of named properties where each property can be a value of any type. In JScript many things are objects: functions, dates, arrays and regular expressions are all objects.
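As a rough sketch of how the six disjoint types partition the values, the helper below names the type a given value belongs to. The function name `builtInTypeOf` is hypothetical, and the sketch deliberately ignores value kinds added to JavaScript long after JScript Classic.

```javascript
// Hypothetical helper: names which of the six built-in types a value belongs to.
function builtInTypeOf(value) {
  if (value === null) return "Null";            // checked first on purpose
  if (value === undefined) return "Undefined";
  switch (typeof value) {
    case "number":  return "Number";   // includes NaN and both Infinities
    case "string":  return "String";
    case "boolean": return "Boolean";
    default:        return "Object";   // functions, dates, arrays, regexps...
  }
}

var bananaType = builtInTypeOf(Number("banana"));   // NaN is still a Number
var functionType = builtInTypeOf(function () {});   // functions are Objects
```

Because the types are disjoint, exactly one branch applies to any value.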

Types are not themselves “first class” objects in JScript, though they are in JScript .NET. I’ll discuss that, along with the differences between prototype and class inheritance, in later entries.


Notes from 2020

It would be interesting to revisit this series in the context of the TypeScript type system, which is a great example of how rich and powerful a gradually typed language can be. The JScript.NET gradual type system was very simple compared to the TS system!

This series got a lot of good comments from readers. A few highlights from this first episode:

  • the typeof operator in JScript bizarrely does not follow the rules of the type system that I just laid out. It identifies null as an object, but does not identify functions as objects, for example. External functions, like DOM functions, may not be identified as functions. It can also return “unknown” in some rare cases.
  • Also confusing: typeof(3) is number, but typeof(new Number(3)) is object.
  • And also confusing, typeof(this) when called from the body of a user-defined function on the prototype of Number is object.
  • A number of inconsistencies and confusions regarding prototypes were also raised; I discuss these in a later episode.
  • In the earliest published version of this article I used the terms “strong” and “weak” with respect to the type system, and readers rightly took me to task for that. This was the beginning of a realization that I have since strongly expressed: “strong” and “weak” are so vague that they are meaningless. I haven’t checked lately, but one time when I read the Wikipedia article on strong typing, it listed eleven different contradictory meanings. I’ve since expressed the (strong!) opinion that “strongly typed” simply means “a type system that I admire”. Instead of characterizing type systems as “strong” or “weak”, instead say what properties they really have: what restrictions do they impose, when are they imposed, and in what ways can those restrictions be violated, and what are the consequences?
  • Readers similarly noted that “untyped” is vague; it is often used to mean “no restriction is placed on the possible values of variables” (as is the case in classic JavaScript) but is also used to mean “no type system is imposed at all on any values” (as is the case in, say, untyped lambda calculus, where all values are functions from functions to functions; there are no integers or strings at all.)
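The typeof mismatches that readers raised are easy to observe. This small sketch records what the operator reports in a standard JavaScript engine; the `observed` object is just a container invented for the example, and the host-specific cases (DOM functions, “unknown”) are not reproduced here.

```javascript
var observed = {
  nullValue: typeof null,            // "object", although null belongs to the Null type
  func:      typeof function () {},  // "function", although functions are Objects
  primitive: typeof 3,               // "number"
  wrapped:   typeof new Number(3)    // "object": the wrapper is an Object, not a Number
};
```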

 

Eval is evil, part two

As I promised, more information on why eval is evil. (We once considered having T-shirts printed up that said “Eval is evil!” on one side and “Script happens!” on the other, but the PMs never managed to tear themselves away from their web browsing long enough to order them.)

Incidentally, a buddy of mine who is one of those senior web developer kinda guys back in Waterloo sent me an email yesterday saying “Hello, my name is Robert and I am an evalaholic”. People, it wasn’t my intention to start a twelve step program, but hey, whatever works!

As I discussed the other day, eval on the client is evil because it leads to sloppy, hard-to-debug-and-maintain programs that consume huge amounts of memory and run unnecessarily slowly even when performing simple tasks. But like I said in my performance rant, if it’s good enough, then hey, it’s good enough.  Maybe you don’t need to write maintainable, efficient code. Seriously! Script is often used to write programs that are used a couple of times and then thrown away, so who cares if they’re slow and inelegant?

But eval on the server is an entirely different beast. First off, server scenarios are generally a lot more performance sensitive than client scenarios. On a client, once your code runs faster than a human being can notice the lag, there’s usually not much point in making it faster. But as I mentioned earlier, ASP goes to a lot of work to ensure that for a given page, the compiler only runs once. An eval defeats this optimization by making the compiler run every time the page runs! On a server, going from 25 ms to 40 ms to serve a page means going from 40 pages a second to 25 pages a second, and that can be expensive in real dollar terms.

But that’s not the most important reason to eschew eval on the server.  Any use of eval (or its VBScript cousins Eval, Execute and ExecuteGlobal) is a potentially enormous security hole:

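<%
  var Processor_ProductList;
  var Software_ProductList;
  var HardDisk_ProductList;
  // ...
  // The category name comes straight from the client's request.
  var CategoryName = Request.QueryString("category");
  // DANGER: eval compiles and runs whatever string the client sent.
  var ProductList = eval(CategoryName + "_ProductList");
  // ...
%>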

What’s wrong with this picture?  The server assumes that the client is not hostile.  Is that a warranted assumption?  Probably not!  You know nothing about the client that sent the request.  Maybe your client page only sends strings like “Processor” and “HardDisk” to the server, but anyone can write their own web page that sends

((new ActiveXObject('Scripting.FileSystemObject')).
DeleteFile('C:*.*',true)); 
Processor

which will cause eval to evaluate

((new ActiveXObject('Scripting.FileSystemObject')).
DeleteFile('C:*.*',true)); 
Processor_ProductList

Obviously that’s a pretty unsophisticated attack.  The attacker can put any code in there that they want, and it will run in the context of the server process.  Hopefully the server process is not a highly privileged one, but still, there’s vast potential for massive harm here just by screwing up the logic on your server.

Never trust the input to a server, and try to never use eval on a server.  Eval injection makes SQL injection look tame!
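One way to get the same category lookup without eval is a table with an explicit membership check. This is a minimal sketch under stated assumptions: the product data and the names `productLists` and `lookupProductList` are hypothetical, and a real page would read the category from the request rather than from a literal.

```javascript
// Hypothetical product data; the keys mirror the category names used above.
var productLists = {
  Processor: ["P-100", "P-200"],
  Software:  ["S-1"],
  HardDisk:  ["HD-40", "HD-80"]
};

// Returns the list for a known category, or null for anything else.
// The client's string is only ever used as a lookup key, never compiled.
function lookupProductList(categoryName) {
  var key = String(categoryName);
  if (Object.prototype.hasOwnProperty.call(productLists, key)) {
    return productLists[key];
  }
  return null;
}

var hostile = "((new ActiveXObject('Scripting.FileSystemObject'))." +
              "DeleteFile('C:*.*',true)); Processor";
var safeResult = lookupProductList(hostile);   // null: no such category
```

The hostile string is simply not a key in the table, so nothing it contains ever runs.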

To try and mitigate these sorts of problems, JScript .NET has some restrictions on its implementation of eval, but that’s a topic for another entry.


Notes from 2020

The attack I’m briefly describing here is of course only one of a great many “hostile client gets bad string onto the server” attack patterns. During my time at Coverity I got to take a deep look at the tools which attempt to detect code paths where a string goes from an untrusted source to a dangerous sink. In terms of the complexity of the analyzed control flows, these were probably the most sophisticated checkers we had, and among the most prone to false positives.

Defeating injection attacks is a hard problem, and I wish we at Microsoft had solved it in the type system, rather than creating a market for expensive third-party solutions that use symbolic execution to effectively do the work of imposing a type system post hoc on an existing program.

A parable

Once upon a time I was in high school. Ah, the halcyon days of my youth. One day I was sitting in class, minding my own business when the teacher said: “Does anyone have a thin metal ruler?”

No answer. Apparently no one had a thin metal ruler.

“No? How about a nail file?”

No answer. Now, I cannot imagine that of all the girls in the class, not one of them had a nail file. But I can well imagine that none of them wanted to share it with a teacher.

“No? Hmm.”

So I piped up: “What do you need a nail file for?”

“I have this big staple in this document that I need to remove.”

Upon which point one of my classmates mentioned that he had a staple remover. Problem solved.

Over and over again I find that script customers (both internal consumers here at Microsoft and third-party developers) frequently ask questions like my teacher. That is, they have a preconceived notion of how the problem is going to be solved, and then ask the necessary questions to implement their preconceived solution. And in many cases this is a pretty good technique! Had someone actually brought a thin metal ruler to class, the problem would have been solved. But by choosing a question that emphasizes the solution over the problem, the questioner loses all ability to leverage the full knowledge of the questionees.

When someone asks me a question about the script technologies I almost always turn right around and ask them why they want to know. I might be able to point them at some tool that better solves their problem. And I might also learn something about what problems people are trying to solve with my tools.

Joel Spolsky once said that people don’t want drills, they want holes. As a drill provider, I’m fascinated to learn what kinds of holes people want to put in what kinds of materials, so to speak. Sometimes people think they want a drill when in fact they want a rotary cutter.


Commentary from 2019:

First off, I misattributed that quotation. “People don’t want to buy a quarter-inch drill, they want a quarter-inch hole.” is a quote from the economist Theodore Levitt. At the time I wrote this, I was sure that I had read about this idea in a Joel On Software article, but if I did, I cannot find it now. Apologies for the error.

Second, I did not know at the time that we have a name for this pattern of “have a problem, get a crazy idea about a solution, ask baffling questions about the crazy idea, rather than stating the problem directly” that we see so often on StackOverflow. It is an “XY problem”, which strikes me as a terrible name.

Third, I am reminded of a story about the time I was helping Morton Twillingate put a roof on his shed. “Hand me the screw driver there b’y,” he said, so I handed him a Phillips head screwdriver. “Sweet t’underin’ Jaysus b’y, give me the screw driver!” he said, pointing at the hammer in my other hand, “If I’d wanted the screw remover I’d have said so!”

 

Eval is evil, part one

The eval method — which takes a string containing JScript code, compiles it and runs it — is probably the most powerful and most misused method in JScript. There are a few scenarios in which eval is invaluable. For example, when you are building up complex mathematical expressions based on user input, or when you are serializing object state to a string so that it can be stored or transmitted, and reconstituted later.

However, these worthy scenarios make up a tiny percentage of the actual usage of eval. In the majority of cases, eval is used like a sledgehammer swatting a fly — it gets the job done, but with too much power. It’s slow, it’s unwieldy, and tends to magnify the damage when you make a mistake. Please spread the word far and wide: if you are considering using eval then there is probably a better way. Think hard before you use eval.

Let me give you an example of a typical usage.

<span id="myspan1"></span>
<span id="myspan2"></span>
<span id="myspan3"></span>
<script>
function setspan(num, text)
{
  eval("myspan" + num + ".innerText = '" + text + "'");
}
</script>

Somehow the program is getting its hands on a number, and it wants to map that to a particular span. What’s wrong with this picture?

Well, pretty much everything. This is a horrid way to implement these simple semantics. First off, what if the text contains an apostrophe? Then we’ll generate

myspan1.innerText = 'it ain't what you do, it's the way thacha do it';

Which isn’t legal JScript. Similarly, what if it contains stuff interpretable as escape sequences? OK, let’s fix that up.

eval("myspan" + num).innerText = text;

If you have to use eval, eval as little of the expression as possible, and only do it once. I’ve seen code like this in real live web sites:

if (eval(foo) != null && eval(foo).blah == 123)
  eval(foo).baz = "hello";


Yikes! That calls the compiler three times to compile up the same code! People, eval starts a compiler. Before you use it, ask yourself whether there is a better way to solve this problem than starting up a compiler!
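If eval really is unavoidable, the triple compilation can at least be collapsed into one. The sketch below assumes a hypothetical object named `widget` and a string `foo` that evaluates to it; neither comes from a real site.

```javascript
var widget = { blah: 123, baz: "" };  // hypothetical object
var foo = "widget";                   // a string of code that evaluates to it

var target = eval(foo);               // start the compiler once...
if (target != null && target.blah == 123)
  target.baz = "hello";               // ...and reuse the result
```

This behaves the same as the three-eval version for a side-effect-free expression like this one, but the string is compiled a single time.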

Anyway, our modified solution is much better but still awful. What if num is out of range? What if it isn’t even a number? We could put in checks, but why bother? We need to take a step back here and ask what problem we are trying to solve.

We have a number. We would like to map that number onto an object. How would you solve this problem if you didn’t have eval? This is not a difficult programming problem! Obviously an array is a far better solution:

var spans = new Array(null, myspan1, myspan2, myspan3);
function setspan(num, text)
{
  if (spans[num] != null)
    spans[num].innerText = text;
}

Since JScript has string-indexed associative arrays, this generalizes to far more than just numeric scenarios. Build any map you want. JScript even provides a convenient syntax for maps!

var spans = { 1 : myspan1, 2 : myspan2, 12 : myspan12 };
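The same idea works with string keys. Here is a minimal sketch that runs outside a browser, so plain objects stand in for the page elements; the names `panels` and `setPanelText` are made up for the illustration.

```javascript
// Stand-ins for page elements; in a browser these would be real span objects.
var panels = {
  header:  { innerText: "" },
  sidebar: { innerText: "" },
  footer:  { innerText: "" }
};

// Sets the text of the named panel; returns false if there is no such panel.
function setPanelText(name, text) {
  // hasOwnProperty skips inherited members such as "toString".
  if (!Object.prototype.hasOwnProperty.call(panels, name)) return false;
  panels[name].innerText = text;
  return true;
}

var ok = setPanelText("footer", "it ain't what you do");  // apostrophes are harmless here
var missing = setPanelText("toString", "oops");           // false: not a panel
```

Because the text is assigned as a value rather than spliced into source code, apostrophes and escape sequences need no special handling.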

Let’s compare these two solutions on a number of axes:

Debuggability: What is easier to debug, a program that dynamically generates new code at runtime, or a program with a static body of code? What is easier to debug, a program that uses arrays as arrays, or a program that compiles up a small new program every time it needs to map a number to an object?

Maintainability: What’s easier to maintain, a table or a program that dynamically spits new code?

Speed: which do you think is faster, a program that dereferences an array, or a program that starts a compiler?

Memory: which uses more memory, a program that dereferences an array, or a program that starts a compiler and compiles a new chunk of code every time you need to access an array?

There is absolutely no reason to use eval to solve problems like mapping strings or numbers onto objects. Doing so dramatically lowers the quality of the code on pretty much every imaginable axis.

It gets even worse when you use eval on the server, but that’s another post.


Notes from 2020

This was my first deliberately-multi-episode topic.

There were many great comments on this article on the original blog site; to summarize a few of them:

  • There are a number of scenarios where you want to dynamically create a new function, but “new Function” is the appropriate choice rather than “eval” most of the time.
  • However, the scoping rules for “new Function” and “eval” are different — thanks, JavaScript — and so sometimes there are scenarios where you are forced to eval a new function.
  • I was not then and am not now an expert on the browser’s object model. I have many times noted the irony that as a developer of the JS compiler, I was an expert on the inner workings of the JS compiler, and not on how it was used in practice. A reader pointed out that none of my solutions were good practice compared with the expediency of:
var span = document.all("myspan" + num);
if (span != null) span.innerText = text;

or, equivalently, getElementById, on browsers which supported it at the time.
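The scoping difference between “new Function” and “eval” mentioned in the comments can be seen in a small sketch. The names `scopeDemo` and `hiddenLocal` are invented for the example, and it assumes no global variable called `hiddenLocal` exists.

```javascript
function scopeDemo() {
  var hiddenLocal = "visible only inside scopeDemo";  // hypothetical local

  // eval compiles its code in the caller's scope, so it can see the local.
  var seenByEval = eval("typeof hiddenLocal");

  // new Function compiles its body as if at global scope, so it cannot.
  var seenByFunction = new Function("return typeof hiddenLocal;")();

  return [seenByEval, seenByFunction];
}

var scopes = scopeDemo();
```

Here `scopes[0]` is "string" and `scopes[1]` is "undefined", which is why code that needs the enclosing locals ends up forced back to eval.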

Functions are not frames

I just realized that on my list of features missing from JScript.NET “fast mode” I forgot about the caller property of functions. In compatibility mode you can say

function foo(){ bar(); }
function bar(){ print(bar.caller); }
foo();

In fast mode this prints null; in compatibility mode it prints function foo(){ bar(); }.

Eliminating this feature does make it possible to generate faster code — keeping track of the caller of every function at all times adds a fair amount of complexity to the code generation. But just as importantly, this feature is simply incredibly broken by its very design. The problem is that the function object is completely the wrong object to put the caller property upon in the first place. For example:

function foo(x){ bar(x-1); }
function bar(x)
{
  if (x > 0)
    foo(x-1);
  else
  {
    print(bar.caller.toString().substring(9,12));
    print(bar.caller.caller.toString().substring(9,12));
    print(bar.caller.caller.caller.toString().substring(9,12));
    print(bar.caller.caller.caller.caller.toString().substring(9,12));
  }
}

function blah(){ foo(3); }

blah();

 

This silly example is pretty straightforward — the global scope calls blah. blah calls foo(3), which calls bar(2), which calls foo(1), which calls bar(0), which prints out the call stack.

So the call stack at this point should be foo, bar, foo, blah, right? So why does this print out foo, bar, foo, bar?

Because the caller property is a property of the function object and it returns a function object. bar.caller and bar.caller.caller.caller are the same object, so of course they have the same caller property!

Clearly this is completely broken for recursive functions. What about multi-threaded programs, where there may be multiple callers on multiple threads? Do you make the caller property different on different threads?

These problems apply to the arguments property as well. Essentially the problem is that the notion we want to manipulate is activation frame, not function object, but function object is what we’ve got. To implement this feature properly you need to access the stack of activation frames, where an activation frame consists of a function object, an array of arguments, and a caller, where the caller is another activation frame. Now the problem goes away — each activation frame in a recursive, multi-threaded program is unique. To gain access to the frame we’d need to add something like the this keyword — perhaps a frame keyword that would give you the activation frame at the top of the stack.
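To illustrate the frame-based design, here is a sketch that keeps an explicit stack of activation frames by hand. It is not how any engine implements calls; `frameStack`, `callWithFrame` and `currentFrame` are hypothetical helpers, and the `name` property of functions is a modern JavaScript convenience rather than a JScript Classic one.

```javascript
var frameStack = [];   // hypothetical explicit stack of activation frames

// Calls fn with args, recording a frame that points at the calling frame.
function callWithFrame(fn, args) {
  var caller = frameStack.length ? frameStack[frameStack.length - 1] : null;
  frameStack.push({ fn: fn, args: args, caller: caller });
  try { return fn.apply(null, args); }
  finally { frameStack.pop(); }
}

// The frame at the top of the stack, playing the role of a "frame" keyword.
function currentFrame() { return frameStack[frameStack.length - 1]; }

var trace = [];
function foo(x) { callWithFrame(bar, [x - 1]); }
function bar(x) {
  if (x > 0) callWithFrame(foo, [x - 1]);
  else
    for (var f = currentFrame().caller; f != null; f = f.caller)
      trace.push(f.fn.name);   // walk caller frames, not function objects
}
function blah() { callWithFrame(foo, [3]); }

callWithFrame(blah, []);
```

Each recursive call gets its own frame, so the walk yields foo, bar, foo, blah instead of cycling between foo and bar.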

That’s how I would have designed this feature, but in the real world we’re stuck with being backwards compatible with the original Netscape design. Fortunately, the .NET reflection code lets you walk stack frames yourself if you need to. Though it doesn’t integrate perfectly smoothly with the JScript .NET notion of functions as objects, at least it manipulates frames reasonably well.


Notes from 2020

My then-colleague and partner in mayhem Peter Torr pointed out to my embarrassment that I had completely forgotten that though, yes, the caller property on a function object is completely broken and useless, the caller property on an arguments object is what we want: per frame. He also pointed out that in some versions of JS, the arguments object is writable and actually gives access to the real frame, not a copy of its values! That is, if we have something like

function f(x)
{
  print(x);
  print(arguments[0]);
  danger();
  print(x);
  print(arguments[0]);
}
function danger()
{
  arguments.caller[0] = "goodbye";
} 
f("hello");

then whether or not the value of x is observed to change depends on what version of JavaScript you are using. Rather terrifying.
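A related effect is still observable in current engines without arguments.caller: within a single function, whether writing to arguments[0] changes x depends on whether the function is strict-mode code. The sketch below is an assumption-laden illustration of that aliasing, not a reproduction of the cross-function case; `aliased` and `notAliased` are made-up names.

```javascript
// Built with new Function so the body is ordinary (non-strict) code
// regardless of where this snippet itself runs.
var aliased = new Function("x", "arguments[0] = 'goodbye'; return x;");

function notAliased(x) {
  "use strict";               // strict code gets an independent arguments object
  arguments[0] = "goodbye";
  return x;
}

var sloppyResult = aliased("hello");     // "goodbye": the write went through to x
var strictResult = notAliased("hello");  // "hello": x is unaffected
```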

Global State On Servers Considered Harmful

The other day I noted that extending the built-in objects in JScript .NET is no longer legal in “fast mode”. Of course, this is still legal in “compatibility mode” if you need it, but why did we take it out of fast mode?

As several readers have pointed out, this is actually a kind of compelling feature. It’s nice to be able to add new methods to prototypes:

String.prototype.frobnicate = function(){/* whatever */}
var s1 = "hello";
var s2 = s1.frobnicate();

It would be nice to extend the Math object, or change the implementation of toLocaleString on Date objects, or whatever.

Unfortunately, it also breaks ASP.NET, which is the prime reason we developed fast mode in the first place. Ironically, it is not the additional compiler optimizations that a static object model enables which motivated this change! Rather, it is the compilation model of ASP.NET.

I discussed earlier how ASP uses the script engines — ASP translates the marked-up page into a script, which it compiles once and runs every time the page is served up. ASP.NET’s compilation model is similar, but somewhat different. ASP.NET takes the marked-up page and translates it into a class that extends a standard page class. It compiles the derived class once, and then every time the page is served up it creates a new instance of the class and calls the Render method on the class.

So what’s the difference? The difference is that multiple instances of multiple page classes may be running in the same application domain. In the ASP Classic model, each script engine is an entirely independent entity. In the ASP.NET model, page classes in the same application may run in the same domain, and hence can affect each other. We don’t want them to affect each other though — the information served up by one page should not depend on stuff being served up at the same time by other pages.

I’m sure you see where this is going. Those built-in objects are shared by all instances of all JScript objects in the same application domain. Imagine the chaos if you had a page that said:

String.prototype.username = FetchUserName();
String.prototype.appendUserName = 
  function() { return this + this.username; };
var greeting = "hello";
Response.Write(greeting.appendUserName());

We’ve created a race condition. Multiple instances of the page class running on multiple threads in the same appdomain might all try to change the prototype object at the same time, and the last one is going to win. Suddenly you’ve got pages that serve up the wrong data! That data might be highly sensitive, or the race condition may introduce logical errors in the script processing — errors which will be nigh-impossible to reproduce and debug.

A global writable object model in a multi-threaded appdomain where class instances should not interact is a recipe for disaster, so we made the global object model read-only in this scenario. If you need the convenience of a writable object model, there is always compatibility mode.
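The race can be simulated deterministically by interleaving two "pages" by hand. This is only a sketch: a plain object stands in for String.prototype, and `beginPage` and `beginIsolatedPage` are hypothetical names, not ASP.NET APIs.

```javascript
var sharedPrototype = {};   // stands in for String.prototype

// "Starts" a page: stores the user name in shared state, returns the render step.
function beginPage(userName) {
  sharedPrototype.username = userName;
  return function render() { return "hello " + sharedPrototype.username; };
}

// Safer variant: each page instance keeps its own copy of the name.
function beginIsolatedPage(userName) {
  var ownName = userName;
  return function render() { return "hello " + ownName; };
}

var renderAlice = beginPage("alice");
var renderBob = beginPage("bob");   // runs before alice's page has rendered
var leaked = renderAlice();         // alice is greeted with bob's name

var isolatedAlice = beginIsolatedPage("alice");
var isolatedBob = beginIsolatedPage("bob");
var correct = isolatedAlice();      // per-instance state is unaffected
```

On a real server the interleaving is decided by thread scheduling rather than statement order, which is what makes the bug so hard to reproduce.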


Notes from 2020

There were some good questions posted as comments on the original instance of this article, which I will briefly summarize here.

  • Why does fast mode also require use of var for declarations? Is the reasoning the same as for disallowing global modifications?

Yes — enforcing var improves clarity, improves optimizations and prevents accidental fouling of the global namespace.

  • Is JScript .NET being adopted?

At the time, I had no idea. Since the project was cancelled shortly after this blog was written, apparently not. It was very frustrating.

  • How does JScript .NET perform on the server compared to C# and VB.NET?

At the time, in typical realistic line-of-business benchmarks VB.NET and JS.NET were running about 5% slower throughput than C#, and that gap was closing. I have no idea what the figures are like now.

  • Should “fast mode” really be called “ASP.NET mode”?

I take the point, but in general it is a good idea to describe a feature by its characteristics, and not name it after the constituency whose scenarios motivated the feature.

I have many times since made the joke that it would have been just as accurate to name “fast mode” and “compatible mode” as instead “broken mode” and “slow mode”. I think we can be forgiven some editorializing in the choice of names.

How many Microsoft employees does it take to change a lightbulb?

UPDATE: This article was featured in The Best Software Writing I. Thanks Joel!


Joe Bork has written a great article explaining some of the decisions that go into whether a bug is fixed or not. This means that I can cross that one off my list of potential future entries. Thanks Joe!

But while I’m at it, I’d like to expand a little on what Joe said. His comments generalize to more than just bug fixes. A bug fix is one kind of change to the behaviour of the product, and all changes have similar costs and go through a similar process.

Back when I was actually adding features to the script engines on a regular basis, people would send me mail asking me to implement some new feature. Usually the feature was a “one-off” — a feature that solved their particular problem. Like, “I need to call ChangeLightBulbWindowHandleEx, but there is no ActiveX control that does so and you can’t call Win32 APIs directly from script, can you add a ChangeLightBulbWindowHandleEx method to the VBScript built-in functions? It would only be like five lines of code!”

I’d always tell these people the same thing — if it is only five lines of code then go write your own ActiveX object! Because yes, you are absolutely right — it would take me approximately five minutes to add that feature to the VBScript runtime library. But how many Microsoft employees does it actually take to change a lightbulb?

  • One dev to spend five minutes implementing ChangeLightBulbWindowHandleEx.
  • One program manager to write the specification.
  • One localization expert to review the specification for localizability issues.
  • One usability expert to review the specification for accessibility and usability issues.
  • At least one dev, tester and PM to brainstorm security vulnerabilities.
  • One PM to add the security model to the specification.
  • One tester to write the test plan.
  • One test lead to update the test schedule.
  • One tester to write the test cases and add them to the nightly automation.
  • Three or four testers to participate in an ad hoc bug bash.
  • One technical writer to write the documentation.
  • One technical reviewer to proofread the documentation.
  • One copy editor to proofread the documentation.
  • One documentation manager to integrate the new documentation into the existing body of text, update tables of contents, indexes, etc.
  • Twenty-five translators to translate the documentation and error messages into all the languages supported by Windows. The managers for the translators live in Ireland (European languages) and Japan (Asian languages), which are both severely time-shifted from Redmond, so dealing with them can be a fairly complex logistical problem.
  • A team of senior managers to coordinate all these people, write the cheques, and justify the costs to their Vice President.

None of these take very long individually, but they add up, and this is for a simple feature. You’ll note that I haven’t added all the things that Joe talks about, like what if there is a bug in those five lines of code? That initial five minutes of dev time translates into many person-weeks of work and enormous costs, all to save one person a few minutes of whipping up a one-off VB6 control that does what they want. Sorry, but that makes no business sense whatsoever. At Microsoft we try very, very hard to not release half-baked software. Getting software right — by, among other things, ensuring that a legally blind Catalan-speaking Spaniard can easily use the feature without worrying about introducing a new security vulnerability — is rather expensive! But we have to get it right because when we ship a new version of the script engines, hundreds of millions of people will exercise that code, and tens of millions will program against it.

Any new feature which does not serve a large percentage of those users is essentially stealing valuable resources that could be spent implementing features, fixing bugs or looking for security vulnerabilities that DO impact the lives of millions of people.


Notes from 2020

This article generated a lot of interest and feedback; I was very pleased to be included in Best Software Writing 1, and sad that there was never a part two.

Most of the feedback that I got could be summed up as: “You are making an argument for open source”

Absolutely I was not, and I was mystified then, and continue to be mystified now at this comment. If I were making any comment on open source here — which was emphatically not my intention — it would be that the problems of releasing half-baked software are exacerbated by a “drive by contribution” model of open source.

Regardless of whether source code is available or hidden, and regardless of whether a project accepts or rejects contributions from community members, there are design, specification, implementation, testing, documentation and education costs to all code changes, and when we ignore those costs, we can easily make software that is brittle, disorganized, unsupported, unscalable, incompatible, and non-compliant with important real-world considerations such as privacy regulations, accessibility, internationalization, and so on.

JScript Goes All To Pieces

My entry the other day about fast mode in JScript .NET sparked a number of questions which deserve fuller explanations.  I’ll try to get to them in my next couple of blog entries.

For example, when I said that it was no longer legal to redefine a function, I wasn’t really clear on what I meant. JScript .NET still has closures, anonymous functions, and prototype inheritance.  We didn’t remove any of those.  Furthermore, it is very important to emphasize that we implemented compatibility mode so that anyone who does need these features in JScript .NET can still get them – they will pay a performance penalty, but that’s their choice to make.

What I meant was simply that this is now illegal:

function foo() { return 1; }
function foo() { return 2; }

whereas that is perfectly legal in JScript Classic. In JScript Classic this means “discard the first definition”.

Pop quiz: what output does this produce?

function foo(){ alert(1); }
foo();
function foo(){ alert(2); }
foo();

Of course that produces “2” twice, because in JScript Classic, function and variable declarations are always treated as though they came at the top of the block of code, no matter where they are found lexically in the block.

Obviously this is bizarre, makes debugging tricky, and is totally bug-prone.  The earlier definition is completely ignored, and yet it sits there in the source code, confusing maintenance programmers who do not see the redefinition, which might be a thousand lines later.  Thus, it is illegal in JScript .NET.

But we only made this kind of redefinition illegal.  Other kinds of redefinition, like

var foo = function() { return 1; }
print(foo());
foo = function() { return 2; }
print(foo());

continue to work as you’d expect.

So why was this ever legal?  Do language designers get some kind of perverse kick out of larding languages with “gotcha” idioms?  No, actually there was a pretty good reason for these semantics.  Two reasons actually.  The first is our old friend “muddle on through when you get an error”.  However, since this error can be caught at compilation time, this is not a very convincing point.  The more important point is this one:

<script language="JScript">
function foo(){ alert(1); }
foo();
</script>
<script language="JScript">
function foo(){ alert(2); }
foo();
</script>

Aha!  Now we see what’s going on here.  I said “function and variable declarations are always treated as though they came at the top of the block of code”, and here we have two blocks.  The browser will compile and run the first block, and then compile and run the second block, so this really will display “1” and then “2”.  The browser compilation model allows for piecewise execution of scripts. This scenario requires the ability to redefine functions on the fly, so, there you go.
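
To make the difference concrete, here is a rough simulation in plain JavaScript. It is only a sketch under assumptions: each string stands in for the contents of one script block, `report` and `runBlock` are hypothetical names, and indirect `eval` is used merely as a stand-in for a host that compiles and runs one chunk of global code at a time.

```javascript
// Each string stands in for the contents of one <script> block.
var firstBlock = "function foo() { report(1); }\nfoo();";
var secondBlock = "function foo() { report(2); }\nfoo();";

var seen = [];
// Hypothetical stand-in for alert(): records what would have been displayed.
globalThis.report = function (n) { seen.push(n); };

// Calling eval through another name makes it an indirect eval, which
// compiles and runs the text as a separate chunk of global code.
var runBlock = eval;

// Piecewise: compile and run the first block, then the second.
seen = [];
runBlock(firstBlock);
runBlock(secondBlock);
var piecewise = seen.slice();

// All at once: both definitions are in the same chunk, so both are hoisted
// and the later definition wins before either call runs.
seen = [];
runBlock(firstBlock + "\n" + secondBlock);
var allAtOnce = seen.slice();

console.log(piecewise.join(","));  // 1,2
console.log(allAtOnce.join(","));  // 2,2
```

The same source text gives different results depending only on where the host draws the boundaries between chunks, which is exactly the behaviour a host with all the source available up front does not need.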

However, ASP does not have a piecewise compilation model, and neither does ASP.NET.  When we designed JScript .NET we removed this feature from fast mode because we knew that most “normal” hosts have all the source code at once and do not ever need to dynamically pull down new chunks from the internet after old chunks have already run.  By disallowing piecewise execution, we can do a lot more optimizations because we know that once you have a function, you’ve got it and no one is going to redefine it later.

The Most Boring Story Ever

The other day a reader suggested:

Make a blogentry about how you started at MS and so on!

You asked, but I’m warning you: it’s the most boring story ever.

I grew up in Waterloo, Ontario, which was a piece of luck as Waterloo has the best computer science school in Canada. I studied applied mathematics and computer science from 1991 to 1996.

Among its many claims to fame, UW has the largest cooperative education program on the planet. For my fourth, fifth and sixth work terms I was an intern on the VBA team here at Microsoft. On the strength of my internship the VBA team extended me a job offer, which I accepted. I worked full-time on the scripting technology for five years.

Then the VBA, Scripting and Microsoft Office Developer teams were reorganized into one large team (the “Trinity” team) tasked with modernizing and improving the Office developer story. I’ve been working on that for about two years now. We’ve just shipped “Microsoft Visual Studio .NET Tools For The Microsoft Office System 2003”, which I actually did very little work on; that was Peter Torr’s baby, so read his blog if you want details.

I’ve been working on the next version, which, of course, I can’t talk about except to say that I hope the name is shorter. Also, I do a fair amount of work still on scripting — not implementing new features of course, but ongoing work like attending security reviews, helping out our product support and sustaining engineering teams, and (obviously) writing a blog.


Commentary from 2019

It was an easy choice to go to Waterloo; I could live at home, I had family on staff, I already knew some of the professors, and it was and still is the best school for computer science and mathematics. The co-op program literally changed my life; it’s pretty unlikely that I’d be living in Seattle were it not for those work terms.

We had an all-hands Trinity team meeting the day that the official product name was announced, and people laughed. I was one of them. The team manager was known to have a sense of humour and I figured that this had to be a parody of the clunky-stream-of-nouns approach to product naming that happened at Microsoft. But no, management was serious, and this was the newest and most egregious example of bad product naming ever. “Microsoft” is in there twice for goodness’ sake!

The best product name that came out of that team belonged to a little helper application that did… something. Maybe it set up Office interop security policy or something like that? I don’t remember. But it was the Microsoft Office Helper for Interop Technology, or MOHIT.EXE. That it was written by my colleague Mohit Gupta was a total coincidence, I’m sure.

Compatibility vs. Performance

Earlier I mentioned that two of the design goals for JScript .NET were high performance and compatibility with JScript Classic. Unfortunately these are somewhat contradictory goals! JScript Classic has many dynamic features which make generation of efficient code difficult. Many of these features are rarely used in real-world programs. Others are programming idioms which make programs hard to follow, difficult to debug and slow.

JScript .NET therefore has two modes: compatibility mode and fast mode. In compatibility mode there should be almost no JScript program which is not a legal JScript .NET program. Fast mode restricts the use of certain seldom-used features and thereby produces faster programs.

The JSC.EXE command-line compiler and ASP.NET both use fast mode by default. To turn fast mode off in JSC use the /fast- switch.

Fast mode puts the following restrictions on JScript .NET programs:

  • All variables must be declared with the var keyword. As I discussed earlier, in JScript Classic it is sometimes legal to use a variable without declaring it. In those situations, the JScript Classic engine automatically creates a new global variable, but in fast mode JScript .NET does not. This is a good thing: not only is the code faster, but the compiler can now catch spelling errors in variable names.
  • Functions may not be redefined. In JScript Classic it is legal to have two or more function definitions with the same name which do different things. Only the last definition is actually used. This is not legal in JScript .NET in fast mode. This is also goodness, as it eliminates a source of confusion and bugs.
  • Built-in objects are entirely read-only. In JScript Classic it is legal to add, modify and (if you are perverse) delete some properties on the Math object, the String prototype and the other built-in objects.
  • Attempting to write to read-only properties now produces errors. In JScript Classic writing to a read-only property fails silently, in keeping with the design principle I discussed earlier: muddle on through.
  • Functions no longer have an arguments property. The primary use of the arguments property is to create functions which take a variable number of arguments. JScript .NET has a specific syntax for creating such a function. This makes the arguments object unnecessary. To create a JScript .NET function which takes any number of arguments the syntax is:
function MyFunction(... args : Object[])
{
  // now use args.length, args[0], etc.
}
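
For comparison, here is a minimal sketch in plain JavaScript of the classic idiom this restriction targets, next to the rest-parameter form that ECMAScript 2015 later standardized. The function names are hypothetical, and the second function is JavaScript rather than JScript .NET; it is included only because it plays the same role as the `... args` syntax above.

```javascript
// Classic idiom: no declared parameters, everything comes from `arguments`.
function sumWithArguments() {
  var total = 0;
  for (var i = 0; i < arguments.length; i++) {
    total += arguments[i];
  }
  return total;
}

// ES2015 rest parameter: the variable-length part is a named, real array.
function sumWithRest(...args) {
  var total = 0;
  for (var i = 0; i < args.length; i++) {
    total += args[i];
  }
  return total;
}

console.log(sumWithArguments(1, 2, 3)); // 6
console.log(sumWithRest(1, 2, 3));      // 6
```

In the second form the signature itself says the function is variadic, so neither a reader nor a compiler has to scan the body for uses of `arguments` to find out.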

Generally speaking, unclear code is slow code. If the compiler is unable to generate good code it is usually because the restrictions on the objects described in the code are so loose as to make optimization impossible. These few restrictions not only let JScript .NET generate faster code, they also enforce good programming style without overly damaging the “scripty” nature of the language. And if you must run code which has undeclared variables, redefined functions, modified built-in objects or reflection on the function arguments, then there is always compatibility mode to fall back upon.
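
Two of the restrictions above replace silent behaviour with errors. As a loose analogy only, ECMAScript 5 strict mode in today's JavaScript makes a similar trade; it is not JScript .NET fast mode, and the names below are hypothetical. The sketch uses the `Function` constructor because its body is compiled as non-strict code unless it opts in, which makes the contrast easy to show in one file.

```javascript
// Non-strict body: stands in for the permissive JScript Classic behaviour.
var permissive = new Function(
  "Math.PI = 3;           // write to a read-only property: silently ignored\n" +
  "misspelledTotal = 42;  // no var: silently creates a global variable\n" +
  "return Math.PI > 3.14 && typeof misspelledTotal === 'number';"
);

// Strict body: the same two mistakes are reported instead of ignored.
var strict = new Function(
  "'use strict';\n" +
  "var errors = [];\n" +
  "try { Math.PI = 3; } catch (e) { errors.push(e.name); }\n" +
  "try { anotherMisspelledTotal = 42; } catch (e) { errors.push(e.name); }\n" +
  "return errors;"
);

console.log(permissive());        // true
console.log(strict().join(","));  // TypeError,ReferenceError
```

In the permissive version the write to `Math.PI` has no effect and the misspelled name quietly becomes a new global, so both mistakes survive until something downstream misbehaves.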

JScript .NET also provides warnings when programming idioms could potentially produce slow code. For example, recall my earlier article on string concatenation.  Using the += operator on strings now produces a warning which suggests using a StringBuilder instead. JScript .NET also produces warnings when code is likely to be incorrect. For example, using a variable before initializing it produces a warning, as does branching out of a finally block, and so on.


Commentary from 2020

This post generated some good feedback from JS experts who read my blog regularly back in the day.

  • The ability to extend the standard string, number and function capabilities by messing around with the prototype chain was seen by them as a strength of JS, and they therefore suggested weakening the “do not modify the built-in objects” restriction. I do not recall if we followed this advice, but I think not.
  • These restrictions basically make JS.NET in fast mode into another syntax for C#. Well yeah! C# was designed to be fast and understandable, so if you want to make a slow, unpredictable language fast and understandable, making it act more like C# is a sensible way to do that. However, the commenter makes a great point: the only selling point of JS.NET over C# then becomes “attractive to people who know JS but not C#”. But if you know JS already you can easily pick up C#.
  • It would be nice to have a language like JS.NET in the browser. Yes, it really would; a lot of the JS.NET features eventually made it into ES6 so that wish was fulfilled a mere couple decades later.