Functions are not frames

Posted on October 31, 2003 by ericlippert

I just realized that on my list of features missing from JScript.NET “fast mode” I forgot about the caller property of functions. In compatibility mode you can say

function foo(){ bar(); }
function bar(){ print(bar.caller); }
foo();

In fast mode this prints null, in compatibility mode it prints function foo(){bar();}.

Eliminating this feature does make it possible to generate faster code — keeping track of the caller of every function at all times adds a fair amount of complexity to the code generation. But just as importantly, this feature is simply incredibly broken by its very design. The problem is that the function object is completely the wrong object to put the caller property upon in the first place. For example:

function foo(x){ bar(x-1); }
function bar(x)
{
  if (x > 0)
    foo(x-1);
  else
  {
    print(bar.caller.toString().substring(9,12));
    print(bar.caller.caller.toString().substring(9,12));
    print(bar.caller.caller.caller.toString().substring(9,12));
    print(bar.caller.caller.caller.caller.toString().substring(9,12));
  }
}

function blah(){ foo(3); }

blah();

This silly example is pretty straightforward — the global scope calls blah. blah calls foo(3), which calls bar(2), which calls foo(1), which calls bar(0), which prints out the call stack.

So the call stack at this point should be foo, bar, foo, blah, right? So why does this print out foo, bar, foo, bar?

Because the caller property is a property of the function object and it returns a function object. bar.caller and bar.caller.caller.caller are the same object, so of course they have the same caller property!

Clearly this is completely broken for recursive functions. What about multi-threaded programs, where there may be multiple callers on multiple threads? Do you make the caller property different on different threads?

These problems apply to the arguments property as well. Essentially the problem is that the notion we want to manipulate is activation frame, not function object, but function object is what we’ve got. To implement this feature properly you need to access the stack of activation frames, where an activation frame consists of a function object, an array of arguments, and a caller, where the caller is another activation frame. Now the problem goes away — each activation frame in a recursive, multi-threaded program is unique. To gain access to the frame we’d need to add something like the this keyword — perhaps a frame keyword that would give you the activation frame at the top of the stack.
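To make the frame idea concrete, here is a minimal sketch in plain JavaScript. Everything in it is hypothetical: there is no frame keyword, so a made-up helper called callWithFrame and a made-up variable called currentFrame stand in for what the runtime would do on every call. It reuses the foo/bar/blah example from above.

```javascript
// Hypothetical sketch: an explicit stack of activation frames.
// callWithFrame and currentFrame are invented names for illustration.
var currentFrame = null; // the frame at the top of the stack

function callWithFrame(fn, args) {
  // Every call gets its own frame, even if fn is already on the stack.
  var frame = { fn: fn, args: args, caller: currentFrame };
  currentFrame = frame;
  try {
    return fn.apply(null, args);
  } finally {
    currentFrame = frame.caller; // pop the frame on the way out
  }
}

function foo(x) { return callWithFrame(bar, [x - 1]); }

function bar(x) {
  if (x > 0) return callWithFrame(foo, [x - 1]);
  // Walk the callers' frames, innermost first.
  var names = [];
  for (var f = currentFrame.caller; f !== null; f = f.caller) {
    names.push(f.fn.name);
  }
  return names;
}

function blah() { return callWithFrame(foo, [3]); }

var callers = callWithFrame(blah, []);
console.log(callers.join(", "));
```

Because each frame records its own caller, the walk reports foo, bar, foo, blah rather than looping between foo and bar.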

That’s how I would have designed this feature, but in the real world we’re stuck with being backwards compatible with the original Netscape design. Fortunately, the .NET reflection code lets you walk stack frames yourself if you need to. Though it doesn’t integrate perfectly smoothly with the JScript .NET notion of functions as objects, at least it manipulates frames reasonably well.

Notes from 2020

My then-colleague and partner in mayhem Peter Torr pointed out to my embarrassment that I had completely forgotten that though, yes, the caller property on a function object is completely broken and useless, the caller property on an arguments object is what we want: per frame. He also pointed out that in some versions of JS, the arguments object is writable and actually gives access to the real frame, not a copy of its values! That is, if we have something like

function f(x)
{
  print(x);
  print(arguments[0]);
  danger();
  print(x);
  print(arguments[0]);
}
function danger()
{
  arguments.caller[0] = "goodbye";
} 
f("hello");

then whether or not the value of x is observed to change depends on what version of JavaScript you are using. Rather terrifying.
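The arguments.caller property is not something you can rely on in current engines, but the underlying aliasing question can still be seen inside a single function. The sketch below is my own illustration, not Peter's example. It builds the two functions with the Function constructor only so that the strict and non-strict variants behave the same way however the file is loaded.

```javascript
// Function constructor bodies are non-strict unless they opt in.
var sloppy = new Function("x", "arguments[0] = 'goodbye'; return x;");
var strict = new Function("x", "'use strict'; arguments[0] = 'goodbye'; return x;");

var sloppyResult = sloppy("hello"); // x is aliased to arguments[0]
var strictResult = strict("hello"); // arguments is a detached copy

console.log(sloppyResult);
console.log(strictResult);
```

In the non-strict function the write through arguments changes x, so it returns "goodbye"; in the strict one x keeps the value "hello".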

Global State On Servers Considered Harmful

Posted on October 29, 2003 by ericlippert

The other day I noted that extending the built-in objects in JScript .NET is no longer legal in “fast mode”. Of course, this is still legal in “compatibility mode” if you need it, but why did we take it out of fast mode?

As several readers have pointed out, this is actually a kind of compelling feature. It’s nice to be able to add new methods to prototypes:

String.prototype.frobnicate = function(){/* whatever */}
var s1 = "hello";
var s2 = s1.frobnicate();

It would be nice to extend the Math object, or change the implementation of toLocaleString on Date objects, or whatever.

Unfortunately, it also breaks ASP.NET, which is the prime reason we developed fast mode in the first place. Ironically, it is not the additional compiler optimizations that a static object model enables which motivated this change! Rather, it is the compilation model of ASP.NET.

I discussed earlier how ASP uses the script engines — ASP translates the marked-up page into a script, which it compiles once and runs every time the page is served up. ASP.NET’s compilation model is similar, but somewhat different. ASP.NET takes the marked-up page and translates it into a class that extends a standard page class. It compiles the derived class once, and then every time the page is served up it creates a new instance of the class and calls the Render method on the class.

So what’s the difference? The difference is that multiple instances of multiple page classes may be running in the same application domain. In the ASP Classic model, each script engine is an entirely independent entity. In the ASP.NET model, page classes in the same application may run in the same domain, and hence can affect each other. We don’t want them to affect each other though — the information served up by one page should not depend on stuff being served up at the same time by other pages.

I’m sure you see where this is going. Those built-in objects are shared by all instances of all JScript objects in the same application domain. Imagine the chaos if you had a page that said:

String.prototype.username = FetchUserName();
String.prototype.appendUserName = 
  function() { return this + this.username; };
var greeting = "hello";
Response.Write(greeting.appendUserName());

We’ve created a race condition. Multiple instances of the page class running on multiple threads in the same appdomain might all try to change the prototype object at the same time, and the last one is going to win. Suddenly you’ve got pages that serve up the wrong data! That data might be highly sensitive, or the race condition may introduce logical errors in the script processing — errors which will be nigh-impossible to reproduce and debug.
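Here is a small sketch of how that goes wrong. It is an illustration under assumptions: a plain object named sharedPrototype stands in for String.prototype, makePage is a made-up helper, and the "threads" are just an explicit ordering of calls so that the bad schedule is reproducible.

```javascript
// Stand-in for state shared by every page instance in one appdomain.
var sharedPrototype = {};

// Hypothetical page, split into two steps so the interleaving is explicit.
function makePage(userName) {
  return {
    setUp: function () { sharedPrototype.username = userName; },
    render: function () { return "hello " + sharedPrototype.username; }
  };
}

var alicePage = makePage("alice");
var bobPage = makePage("bob");

// One possible schedule: both pages finish set-up before either renders.
alicePage.setUp();
bobPage.setUp();
var aliceOutput = alicePage.render();
var bobOutput = bobPage.render();

console.log(aliceOutput);
console.log(bobOutput);
```

With this schedule both pages render "hello bob": the last writer wins, and Alice is served Bob's name.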

A global writable object model in a multi-threaded appdomain where class instances should not interact is a recipe for disaster, so we made the global object model read-only in this scenario. If you need the convenience of a writable object model, there is always compatibility mode.

Notes from 2020

There were some good questions posted as comments on the original instance of this article, which I will briefly summarize here.

Why does fast mode also require use of var for declarations? Is the reasoning the same as for disallowing global modifications?

Yes — enforcing var improves clarity, improves optimizations and prevents accidental fouling of the global namespace.

Is JScript .NET being adopted?

At the time, I had no idea. Since the project was cancelled shortly after this blog was written, apparently not. It was very frustrating.

How does JScript .NET perform on the server compared to C# and VB.NET?

At the time, in typical realistic line-of-business benchmarks VB.NET and JS.NET were running about 5% slower throughput than C#, and that gap was closing. I have no idea what the figures are like now.

Should “fast mode” really be called “ASP.NET mode”?

I take the point, but in general it is a good idea to describe a feature by its characteristics, and not name it after the constituency whose scenarios motivated the feature.

I have many times since made the joke that it would have been just as accurate to name “fast mode” and “compatibility mode” “broken mode” and “slow mode” instead. I think we can be forgiven some editorializing in the choice of names.

How many Microsoft employees does it take to change a lightbulb?

Posted on October 28, 2003 by ericlippert

UPDATE: This article was featured in The Best Software Writing I. Thanks Joel!

Joe Bork has written a great article explaining some of the decisions that go into whether a bug is fixed or not. This means that I can cross that one off my list of potential future entries. Thanks Joe!

But while I’m at it, I’d like to expand a little on what Joe said. His comments generalize to more than just bug fixes. A bug fix is one kind of change to the behaviour of the product, and all changes have similar costs and go through a similar process.

Back when I was actually adding features to the script engines on a regular basis, people would send me mail asking me to implement some new feature. Usually the feature was a “one-off” — a feature that solved their particular problem. Like, “I need to call ChangeLightBulbWindowHandleEx, but there is no ActiveX control that does so and you can’t call Win32 APIs directly from script, can you add a ChangeLightBulbWindowHandleEx method to the VBScript built-in functions? It would only be like five lines of code!”

I’d always tell these people the same thing — if it is only five lines of code then go write your own ActiveX object! Because yes, you are absolutely right — it would take me approximately five minutes to add that feature to the VBScript runtime library. But how many Microsoft employees does it actually take to change a lightbulb?

One dev to spend five minutes implementing ChangeLightBulbWindowHandleEx.
One program manager to write the specification.
One localization expert to review the specification for localizability issues.
One usability expert to review the specification for accessibility and usability issues.
At least one dev, tester and PM to brainstorm security vulnerabilities.
One PM to add the security model to the specification.
One tester to write the test plan.
One test lead to update the test schedule.
One tester to write the test cases and add them to the nightly automation.
Three or four testers to participate in an ad hoc bug bash.
One technical writer to write the documentation.
One technical reviewer to proofread the documentation.
One copy editor to proofread the documentation.
One documentation manager to integrate the new documentation into the existing body of text, update tables of contents, indexes, etc.
Twenty-five translators to translate the documentation and error messages into all the languages supported by Windows. The managers for the translators live in Ireland (European languages) and Japan (Asian languages), which are both severely time-shifted from Redmond, so dealing with them can be a fairly complex logistical problem.
A team of senior managers to coordinate all these people, write the cheques, and justify the costs to their Vice President.

None of these take very long individually, but they add up, and this is for a simple feature. You’ll note that I haven’t added all the things that Joe talks about, like what if there is a bug in those five lines of code? That initial five minutes of dev time translates into many person-weeks of work and enormous costs, all to save one person a few minutes of whipping up a one-off VB6 control that does what they want. Sorry, but that makes no business sense whatsoever. At Microsoft we try very, very hard to not release half-baked software. Getting software right — by, among other things, ensuring that a legally blind Catalan-speaking Spaniard can easily use the feature without worrying about introducing a new security vulnerability — is rather expensive! But we have to get it right because when we ship a new version of the script engines, hundreds of millions of people will exercise that code, and tens of millions will program against it.

Any new feature which does not serve a large percentage of those users is essentially stealing valuable resources that could be spent implementing features, fixing bugs or looking for security vulnerabilities that DO impact the lives of millions of people.

JScript Goes All To Pieces

Posted on October 27, 2003 by ericlippert

My entry the other day about fast mode in JScript .NET sparked a number of questions which deserve fuller explanations. I’ll try to get to them in my next couple of blog entries.

For example, when I said that it was no longer legal to redefine a function, I wasn’t really clear on what I meant. JScript .NET still has closures, anonymous functions, and prototype inheritance. We didn’t remove any of those. Furthermore, it is very important to emphasize that we implemented compatibility mode so that anyone who does need these features in JScript .NET can still get them – they will pay a performance penalty, but that’s their choice to make.

What I meant was simply that this is now illegal:

function foo() { return 1; }
function foo() { return 2; }

whereas that is perfectly legal in JScript Classic. In JScript Classic this means “discard the first definition”.

Pop quiz: what output does this produce?

function foo(){ alert(1); }
foo();
function foo(){ alert(2); }
foo();

Of course that produces “2” twice, because in JScript Classic, function and variable declarations are always treated as though they came at the top of the block of code, no matter where they are found lexically in the block.
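The same hoisting behaviour can still be observed in ordinary JavaScript. This is a small sketch of the quiz above, with the two definitions wrapped in a function called quiz (my own name) and alert replaced by return values so that the result is easy to inspect.

```javascript
function quiz() {
  var output = [];
  function foo() { return 1; }
  output.push(foo());
  // This later declaration is hoisted too, and replaces the first one
  // before any statement in quiz has run.
  function foo() { return 2; }
  output.push(foo());
  return output;
}

var quizResult = quiz();
console.log(quizResult.join(", "));
```

Both calls see the second definition, so the result is 2, 2.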

Obviously this is bizarre, makes debugging tricky, and is totally bug-prone. The earlier definition is completely ignored, and yet it sits there in the source code, confusing maintenance programmers who do not see the redefinition, which might be a thousand lines later. Thus, it is illegal in JScript .NET.

But we only made this kind of redefinition illegal. Other kinds of redefinition, like

var foo = function() { return 1; }
print(foo());
foo = function() { return 2; }
print(foo());

continue to work as you’d expect.

So why was this ever legal? Do language designers get some kind of perverse kick out of larding languages with “gotcha” idioms? No, actually there was a pretty good reason for these semantics. Two reasons actually. The first is our old friend “muddle on through when you get an error”. However, since this error can be caught at compilation time, this is not a very convincing point. The more important point is this one:

<script language="JScript">
function foo(){ alert(1); }
foo();
</script>
<script language="JScript">
function foo(){ alert(2); }
foo();
</script>

Aha! Now we see what’s going on here. I said “function and variable declarations are always treated as though they came at the top of the block of code”, and here we have two blocks. The browser will compile and run the first block, and then compile and run the second block, so this really will display “1” and then “2”. The browser compilation model allows for piecewise execution of scripts. This scenario requires the ability to redefine methods on the fly, so, there you go.
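A rough simulation of the difference, as a sketch only: each string below stands in for one script block, and the Function constructor stands in for the browser compiling and running it. Unlike a real browser, each block gets its own scope here instead of sharing the global one, and the log callback is a made-up replacement for alert.

```javascript
// Each string stands in for one script block on the page.
var blocks = [
  "function foo(){ log(1); } foo();",
  "function foo(){ log(2); } foo();"
];

// Piecewise: compile and run the blocks one at a time.
var piecewise = [];
for (var i = 0; i < blocks.length; i++) {
  var run = new Function("log", blocks[i]);
  run(function (value) { piecewise.push(value); });
}

// All at once: the same source compiled as a single block.
var merged = [];
var runAll = new Function("log", blocks.join("\n"));
runAll(function (value) { merged.push(value); });

console.log(piecewise.join(", "));
console.log(merged.join(", "));
```

Run piecewise, the output is 1, 2. Compiled as a single block, hoisting makes the second definition win and the output is 2, 2.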

However, ASP does not have a piecewise compilation model, and neither does ASP.NET. When we designed JScript .NET we removed this feature from fast mode because we knew that most “normal” hosts have all the source code at once and do not ever need to dynamically pull down new chunks from the internet after old chunks have already run. By disallowing piecewise execution, we can do a lot more optimizations because we know that once you have a function, you’ve got it and no one is going to redefine it later.

The Most Boring Story Ever

Posted on October 27, 2003 by ericlippert

The other day a reader suggested:

Make a blogentry about how you started at MS and so on!

You asked, but I’m warning you: it’s the most boring story ever.

I grew up in Waterloo, Ontario, which was a piece of luck as Waterloo has the best computer science school in Canada. I studied applied mathematics and computer science from 1991 to 1996.

Among UW’s many claims to fame is that it has the largest cooperative education program on the planet. For my fourth, fifth and sixth work terms I was an intern on the VBA team here at Microsoft. On the strength of my internship the VBA team extended me a job offer, which I accepted. I worked full-time on the scripting technology for five years.

Then the VBA, Scripting and Microsoft Office Developer teams were reorganized into one large team (the “Trinity” team) tasked with modernizing and improving the Office developer story. I’ve been working on that for about two years now. We’ve just shipped “Microsoft Visual Studio .NET Tools For The Microsoft Office System 2003”, which I actually did very little work on — that was Peter Torr’s baby, so read his blog if you want details.

I’ve been working on the next version, which, of course, I can’t talk about except to say that I hope the name is shorter. Also, I do a fair amount of work still on scripting — not implementing new features of course, but ongoing work like attending security reviews, helping out our product support and sustaining engineering teams, and (obviously) writing a blog.

Commentary from 2019

It was an easy choice to go to Waterloo; I could live at home, I had family on staff, I already knew some of the professors, and it was and still is the best school for computer science and mathematics. The co-op program literally changed my life; it’s pretty unlikely that I’d be living in Seattle were it not for those work terms.

We had an all-hands Trinity team meeting the day that the official product name was announced, and people laughed. I was one of them. The team manager was known to have a sense of humour and I figured that this had to be a parody of the clunky-stream-of-nouns approach to product naming that happened at Microsoft. But no, management was serious, and this was the newest and most egregious example of bad product naming ever. “Microsoft” is in there twice for goodness’ sake!

The best product name to come out of that team belonged to a little helper application that did… something. Maybe it set up Office interop security policy or something like that? I don’t remember. But it was the Microsoft Office Helper for Interop Technology, or MOHIT.EXE. That it was written by my colleague Mohit Gupta was a total coincidence, I’m sure.

Compatibility vs. Performance

Posted on October 24, 2003 by ericlippert

Earlier I mentioned that two of the design goals for JScript .NET were high performance and compatibility with JScript Classic. Unfortunately these are somewhat contradictory goals! JScript Classic has many dynamic features which make generation of efficient code difficult. Many of these features are rarely used in real-world programs. Others are programming idioms which make programs hard to follow, difficult to debug and slow.

JScript .NET therefore has two modes: compatibility mode and fast mode. In compatibility mode there should be almost no JScript program which is not a legal JScript .NET program. Fast mode restricts the use of certain seldom-used features and thereby produces faster programs.

The JSC.EXE command-line compiler and ASP.NET both use fast mode by default. To turn fast mode off in JSC use the /fast- switch.

Fast mode puts the following restrictions on JScript .NET programs:

All variables must be declared with the var keyword. As I discussed earlier, in JScript Classic it is sometimes legal to use a variable without declaring it. In those situations, the JScript Classic engine automatically creates a new global variable but when in fast mode, JScript .NET does not. This is a good thing — not only is the code faster but the compiler can now catch spelling errors in variable names.
Functions may not be redefined. In JScript Classic it is legal to have two or more function definitions with the same name which do different things. Only the last definition is actually used. This is not legal in JScript .NET in fast mode. This is also goodness, as it eliminates a source of confusion and bugs.
Built-in objects are entirely read-only. In JScript Classic it is legal to add, modify and (if you are perverse), delete some properties on the Math object, the String prototype and the other built-in objects.
Attempting to write to read-only properties now produces errors. In JScript Classic writing to a read-only property fails silently, in keeping with the design principle I discussed earlier: muddle on through.
Functions no longer have an arguments property. The primary use of the arguments property is to create functions which take a variable number of arguments. JScript .NET has a specific syntax for creating such a function. This makes the arguments object unnecessary. To create a JScript .NET function which takes any number of arguments the syntax is:

function MyFunction(... args : Object[])
{
  // now use args.length, args[0], etc.
}
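For comparison, standard JavaScript later gained a similar feature, rest parameters. The sketch below is modern JavaScript rather than JScript .NET, so there is no type annotation, and countAndJoin is a made-up name.

```javascript
function countAndJoin(...args) {
  // args is a real array holding every argument that was passed
  return args.length + ": " + args.join(" ");
}

var joined = countAndJoin("a", "b", "c");
console.log(joined);
```

This yields "3: a b c" without touching the arguments object at all.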

Generally speaking, unclear code is slow code. If the compiler is unable to generate good code it is usually because the restrictions on the objects described in the code are so loose as to make optimization impossible. These few restrictions not only let JScript .NET generate faster code, they also enforce good programming style without overly damaging the “scripty” nature of the language. And if you must run code which has undeclared variables, redefined functions, modified built-in objects or reflection on the function arguments, then there is always compatibility mode to fall back upon.

JScript .NET also provides warnings when programming idioms could potentially produce slow code. For example, recall my earlier article on string concatenation. Using the += operator on strings now produces a warning which suggests using a StringBuilder instead. JScript .NET also produces warnings when code is likely to be incorrect. For example, using a variable before initializing it produces a warning. Branching out of a finally block also produces a warning, and so on.

Commentary from 2020

This post generated some good feedback from JS experts who read my blog regularly back in the day.

The ability to extend the standard string, number and function capabilities by messing around with the prototype chain was seen by them as a strength of JS, and they therefore suggested weakening the “do not modify the built-in objects” restriction. I do not recall if we followed this advice, but I think not.
These restrictions basically make JS.NET in fast mode into another syntax for C#. Well yeah! C# was designed to be fast and understandable, so if you want to make a slow, unpredictable language fast and understandable, making it act more like C# is a sensible way to do that. However, the commenter makes a great point: the only selling point of JS.NET over C# then becomes “attractive to people who know JS but not C#”. But if you know JS already you can easily pick up C#.
It would be nice to have a language like JS.NET in the browser. Yes, it really would; a lot of the JS.NET features eventually made it into ES6 so that wish was fulfilled a mere couple decades later.

Michael’s Security Blog is online

Posted on October 23, 2003 by ericlippert

Michael Howard has started blogging. If you’re interested in writing secure code (and these days, who isn’t?) you could do worse than to read anything he writes.

Commentary from 2019

Michael was a lot of fun to work with over the years; he has a deep understanding of security, strong opinions, and a willingness to share both. I was particularly honoured to be asked to review the C# sections of Writing Secure Code 2, which is excellent.

I have not read his blog for years but I am delighted to discover that he is still writing it in 2019; the link above has been updated. I have many years of posts to catch up on it seems!

Attention passengers: Flight 0703 is also known as Flight 451

Posted on October 23, 2003 by ericlippert

I hate octal. Octal causes bugs. I hate bugs, particularly stupid “gotcha” bugs. C programmers do things like

int arr_flight = 0703;

not realizing that this does not assign the number 703, but rather 7 * 64 + 3 = 451.

Even worse, JScript programmers do things like

var arr_flight = 0708;
var dep_flight = 0707;

not realizing that the former is a decimal literal but the latter is an octal literal.

Yes, in JScript it really is the case that if a literal begins with 0, consists of only digits and contains an 8 or a 9 then it is decimal but if it contains no 8 or 9 then it is octal! The first version of the JScript lexer did not implement those rules, but eventually we changed it to be compatible with Netscape’s implementation.
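The rule is easier to see written down as code. This is a sketch of the rule as just described, not the lexer's actual implementation; legacyLiteralValue is a made-up helper that takes the literal as text.

```javascript
// Sketch of the rule for digit-only literals, given as text.
function legacyLiteralValue(text) {
  if (!/^[0-9]+$/.test(text)) throw new Error("digits only: " + text);
  var hasLeadingZero = text.length > 1 && text.charAt(0) === "0";
  var hasEightOrNine = /[89]/.test(text);
  // Leading zero and no 8 or 9: octal. Anything else: decimal.
  return parseInt(text, hasLeadingZero && !hasEightOrNine ? 8 : 10);
}

console.log(legacyLiteralValue("0708")); // read as decimal
console.log(legacyLiteralValue("0707")); // read as octal
console.log(legacyLiteralValue("0703")); // read as octal
```

The three results are 708, 455 and 451: two flight numbers that look adjacent end up 253 apart.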

This is in keeping with the design principle that I mentioned earlier, namely “Got a problem? Muddle on through!” However, since this problem can be caught at compile time, I think that the decision to make illegal octal literals into decimals was a poor one.

It’s just a mess. Octal literals and escape sequences have been removed from the ECMAScript specification, though of course they live on in actual implementations for backwards compatibility.

This is why I added code to JScript .NET so that any use of an integer decimal or octal literal that begins with zero yields a compiler warning, with one exception. Obviously x = 0; does not produce a warning!

Commentary from 2020

I still hate octal. Fortunately it seems to have fallen out of favour when designing new programming languages.

A commenter asked how warnings work in JScript .NET; I noted that in JScript Classic there was a facility to report errors back to the host but not warnings. JScript .NET’s hosting APIs supported both errors and warnings.

Making Sense of HRESULTS

Posted on October 22, 2003 by ericlippert

Every now and then — like, say, this morning — someone sends me this mail:

I’m getting an error in my JScript program. The error number is -2147024877. No description. Help!

Making sense of those error numbers requires some delving into the depths of how COM represents errors — the HRESULT.

An HRESULT is a 32 bit unsigned integer where the high bit indicates whether it is an error or a success. The remaining bits in the high word indicate the “facility” of the error — into what broad category does this error fall? The low word indicates the specific error for that facility.

HRESULTS are therefore usually talked about in hex, as the bit structure is a lot easier to read in hex! Consider 0x80070013, for example. The high bit is set, so this is an error. The facility code is 7 and the error code is 0x0013 = 19 in decimal.

Unfortunately, JScript interprets the 32 bit error code as a signed integer and displays it in decimal. No problem — just convert that thing back to hex, right?

var x = -2147024877;
print(x.toString(16))

Whoops, not quite. JScript doesn’t know that you want this as an unsigned number, so it converts it to a signed hex number, -0x7ff8ffed. We need to convert this thing to the value it would have been had JScript interpreted it as an unsigned number in the first place. A handy fact to know is that the difference between an unsigned number interpreted as a signed number and the same number interpreted as an unsigned number is always 0x100000000 if the high bit is set, 0 otherwise.

var x = -2147024877;
print((x<0?x+0x100000000:x).toString(16))

There we go. That prints out 80070013. Or, even better, we could just write a program that takes the error apart:

function DumpHR(hr)
{
  // JScript hands us a signed value; shift negatives into the unsigned range.
  if (hr < 0) hr += 0x100000000;
  // High bit set means this is an error.
  if (hr & 0x80000000)
    print("Error code");
  else 
    print("Success code");
  var facility = (hr & 0x7FFF0000) >> 16;
  print("Facility " + facility);
  var scode = hr & 0x0000FFFF;
  print("SCode " + scode);
}
DumpHR(-2147024877);

The facility codes (in decimal) are as follows:

FACILITY_NULL 0
FACILITY_RPC 1
FACILITY_DISPATCH 2
FACILITY_STORAGE 3
FACILITY_ITF 4
FACILITY_WIN32 7
FACILITY_WINDOWS 8
FACILITY_SECURITY 9
FACILITY_CONTROL 10
FACILITY_CERT 11
FACILITY_INTERNET 12
FACILITY_MEDIASERVER 13
FACILITY_MSMQ 14
FACILITY_SETUPAPI 15
FACILITY_SCARD 16
FACILITY_COMPLUS 17
FACILITY_AAF 18
FACILITY_URT 19
FACILITY_ACS 20
FACILITY_DPLAY 21
FACILITY_UMI 22
FACILITY_SXS 23
FACILITY_WINDOWS_CE 24
FACILITY_HTTP 25
FACILITY_BACKGROUNDCOPY 32
FACILITY_CONFIGURATION 33
FACILITY_STATE_MANAGEMENT 34
FACILITY_METADIRECTORY 35

So you can see that our example is a Windows operating system error (facility 7), and looking up error 19 we see that this is ERROR_WRITE_PROTECT — someone is trying to write to a write-protected floppy probably.
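The lookup can be folded into the decoding step. The following is a sketch in modern JavaScript rather than JScript: decodeHresult and FACILITY_NAMES are names I made up, the table holds only a few entries copied from the list above, and the facility mask follows the description in this article (every bit of the high word except the top one).

```javascript
// A few facility names copied from the table above.
var FACILITY_NAMES = {
  0: "FACILITY_NULL",
  4: "FACILITY_ITF",
  7: "FACILITY_WIN32",
  10: "FACILITY_CONTROL"
};

function decodeHresult(hr) {
  var unsigned = hr >>> 0; // reinterpret the low 32 bits as unsigned
  var facility = (unsigned & 0x7FFF0000) >>> 16;
  return {
    hex: "0x" + unsigned.toString(16).toUpperCase().padStart(8, "0"),
    isError: unsigned >= 0x80000000,
    facility: facility,
    facilityName: FACILITY_NAMES[facility] || "unknown",
    code: unsigned & 0xFFFF
  };
}

var decoded = decodeHresult(-2147024877);
console.log(decoded.hex, decoded.facilityName, decoded.code);
```

For the number from the mail this gives 0x80070013, FACILITY_WIN32 and code 19. The unsigned right shift by zero does the same job as adding 0x100000000 to negative values.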

All the errors generated by the script engines — syntax errors, for example — are FACILITY_CONTROL, and the error numbers vary between script engines. VB also uses FACILITY_CONTROL, but fortunately VBScript assigns the same meanings to the errors as VB does. But in general, if you get a FACILITY_CONTROL error you need to know what control generated the error — VBScript, JScript, a third party control, what? Because each control can define its own errors, and there may be collisions.

Finally, here are some commonly encountered HRESULTs:

E_UNEXPECTED 0x8000FFFF “Catastrophic failure” — something completely unexpected has happened
E_NOTIMPL 0x80004001 “Not implemented” — the developer never got around to writing the method you just called!
E_OUTOFMEMORY 0x8007000E pretty obvious what happened here (remember that out of memory means you ran out of address space, not RAM!)
E_INVALIDARG 0x80070057 you passed a bad argument to a method
E_NOINTERFACE 0x80004002 COM is asking an object for an interface it does not support. This can happen if you try to script an object that doesn’t support IDispatch.
E_ABORT 0x80004004 whatever you were doing was terminated
E_FAIL 0x80004005 something failed and we don’t know what.

And finally, here are three that you should see only rarely from script, but script hosts may see them moving around in memory and wonder what is going on:

SCRIPT_E_RECORDED 0x86664004 this is how we internally track whether the details of an error have been recorded in the error object or not. We need a way to say “yes, there was an error, but do not attempt to record information about it again.”
SCRIPT_E_PROPAGATE 0x80020102 another internal code that we use to track the case where a recorded error is being propagated up the call stack to a waiting catch handler.
SCRIPT_E_REPORTED 0x80020101 the script engines return this to the host when there has been an unhandled error that the host has already been informed about via OnScriptError.

That’s a pretty bare-bones look at error codes, but it should at least get you started next time you have a confusing error number.

Commentary from 2020

First off: write-protected floppies were a real thing! Honest!

There were a number of good user comments on this article with advice extending mine:

FACILITY_ITF means “the interface you’re calling defines the meaning of the error you’re getting” which can be confusing
You can use Windows Calculator in scientific mode to quickly convert decimals to hex DWORDs
Look in winerror.h for more predefined error codes
The HRPLUS utility is good for HRESULT analysis
Visual Studio has an “hr” format specifier that will convert numeric values to their text equivalents. Making a watch on @EAX,hr and @ERR,hr is useful! @ERR shows the value of a call to GetLastError.
For a great explanation of how the script engines propagate errors around, see this SO question.

Constant Folding and Partial Evaluation

Posted on October 21, 2003 by ericlippert

A reader asks “is there any reason why VBScript doesn’t change

str = str & "1234567890" & "hello"

to

str = str & "1234567890hello"

since they are both constants?”

Good question. Yes, there are reasons.

The operation you’re describing is called constant folding, and it is a very common compile-time optimization. VBScript does an extremely limited kind of constant folding. In VBScript, these two programs generate exactly the same code at the call site:

const foo = "hello"
print foo

is exactly the same as

print "hello"

That is, the code generated for both says “pass the literal string "hello" to the print subroutine”. If foo had been a variable instead of a constant, the generated code would instead say “pass the contents of variable foo…”

But the VBScript code generator is smart enough to realize that foo is a constant, and so it does not generate a by-name or by-index lookup, it just slams the constant right in there so that there is no lookup indirection at all.
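
The difference between the two cases can be sketched roughly like this. The instruction names PUSH_LITERAL and LOAD_VARIABLE and the function emitPrintArgument are invented for the illustration; the real VBScript bytecode is not shown in this post.

```javascript
// Hypothetical instruction names; this is not the real VBScript bytecode.
function emitPrintArgument(name, constants) {
  if (constants.has(name)) {
    // Known constant: bake the value straight into the instruction stream.
    return { op: "PUSH_LITERAL", value: constants.get(name) };
  }
  // Ordinary variable: emit a lookup that is resolved at run time.
  return { op: "LOAD_VARIABLE", name };
}

// "const foo = "hello"" has been seen; "bar" is an ordinary variable.
const constants = new Map([["foo", "hello"]]);

console.log(emitPrintArgument("foo", constants));
console.log(emitPrintArgument("bar", constants));
```

The instruction emitted for foo carries the literal itself, so nothing has to be looked up when the call runs.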

The kind of constant folding you’re describing is compile-time evaluation of expressions which have all operands known at compile time. For short, let’s call it partial evaluation. In C++ (or C#) for example, it is legal to say

const int CallMethod = 0x1;
const int CallProperty = 0x2;
const int CallMethodOrProperty = CallMethod | CallProperty;

The C++ compiler is smart enough to realize that it can compute the third value itself. VBScript would produce an error in this situation, as the compiler is not that smart. Neither VBScript nor JScript will evaluate constant expressions at compile time.

An even more advanced form of constant folding is to determine which functions are pure functions — that is, functions which have no side effects, where the output of the function depends solely on the arguments passed in. For example, in a language that supported pure functions, this would be legal:

const Real Pi = 3.14159265358979;
const Real Sine60 = sine( Pi / 3);  // Pi / 3 radians = 60 degrees

The sine function is a pure function — there’s no reason that it could not be called at compile time to assign to this constant. However, in practice it can be very difficult to identify pure functions, and even if you can, there are issues in calling arbitrary code at compile time — like, what if the pure function takes an hour to run? That’s a long compile! What if it throws exceptions? There are many practical problems.

The JScript .NET compiler does support partial evaluation, but not pure functions. The JScript .NET compiler architecture is quite interesting. The source code is lexed into a stream of tokens, and then the tokens are parsed to form a parse tree. Each node in the parse tree is represented by an object (written in C#) which implements three methods: Evaluate, PartialEvaluate and TranslateToIL.

When you call PartialEvaluate on the root of the parse tree, it recursively descends through the tree looking for nodes representing operations where all the sub-nodes are known at compile time. Those nodes are evaluated and collapsed into simpler nodes. Once the tree has been evaluated as much as is possible at compile time, we then call TranslateToIL, which starts another recursive descent that emits the IL into the generated assembly.
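
Here is a minimal sketch of that recursive descent, written in JavaScript rather than C#. The node classes Constant, Variable and Binary are assumptions made for the illustration; the real JScript .NET node classes are not shown in this post, and TranslateToIL is left out entirely.

```javascript
// Hypothetical node shapes; not the real JScript .NET classes.
class Constant {
  constructor(value) { this.value = value; }
  partialEvaluate() { return this; } // already as simple as it gets
  toString() { return JSON.stringify(this.value); }
}

class Variable {
  constructor(name) { this.name = name; }
  partialEvaluate() { return this; } // value is unknown until run time
  toString() { return this.name; }
}

const operators = {
  "|": (a, b) => a | b,
  "+": (a, b) => a + b,
};

class Binary {
  constructor(op, left, right) {
    this.op = op;
    this.left = left;
    this.right = right;
  }
  partialEvaluate() {
    // Simplify the children first, then see whether this node can collapse.
    const left = this.left.partialEvaluate();
    const right = this.right.partialEvaluate();
    if (left instanceof Constant && right instanceof Constant) {
      return new Constant(operators[this.op](left.value, right.value));
    }
    return new Binary(this.op, left, right);
  }
  toString() { return `(${this.left} ${this.op} ${this.right})`; }
}

// flags | (0x1 | 0x2): only the inner node has all operands known.
const tree = new Binary("|", new Variable("flags"),
  new Binary("|", new Constant(0x1), new Constant(0x2)));

const folded = tree.partialEvaluate();
console.log(String(folded)); // (flags | 3)
```

Note that a simple bottom-up fold like this one would not collapse the reader's original example if the concatenation is parsed left to right as (str & "1234567890") & "hello", because the left operand of the outer node contains a variable. Handling that case would take some extra reassociation logic, which this sketch does not attempt.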

The Evaluate method is there to implement the eval function. JScript Classic (which everyone thinks is an “interpreted” language) always compiles the script to bytecode and then interprets the bytecode — even eval calls the bytecode compiler in JScript Classic. But in JScript Classic, a bytecode block is a block of memory entirely under control of the JScript engine, which can release it when the code is no longer callable.

In JScript .NET, we compile to IL which is then jitted into machine code. If JScript .NET’s implementation of eval emitted IL, then that jitted code would stay in memory until the appdomain went away! A tight loop with an eval in it would then be essentially a memory leak in JScript .NET, but not in JScript Classic. Therefore, JScript .NET actually implements a true interpreter! In JScript .NET, eval generates a parse tree and does a full recursive evaluation on it.
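
A “full recursive evaluation” of a parse tree can be sketched along these lines. The node classes and the use of a Map for the variable environment are assumptions made for the illustration, not the actual JScript .NET design.

```javascript
// Hypothetical node shapes; not the real JScript .NET classes.
class Constant {
  constructor(value) { this.value = value; }
  evaluate(env) { return this.value; }
}

class Variable {
  constructor(name) { this.name = name; }
  evaluate(env) {
    if (!env.has(this.name)) {
      throw new ReferenceError(`${this.name} is not defined`);
    }
    return env.get(this.name);
  }
}

class Add {
  constructor(left, right) {
    this.left = left;
    this.right = right;
  }
  // Evaluate both children, then combine: no code is generated anywhere.
  evaluate(env) {
    return this.left.evaluate(env) + this.right.evaluate(env);
  }
}

// Stands in for the tree that parsing the text "i + 1" would produce.
const tree = new Add(new Variable("i"), new Constant(1));

let total = 0;
for (let i = 0; i < 1000; i++) {
  total += tree.evaluate(new Map([["i", i]]));
}
console.log(total); // 500500
```

The tree is an ordinary object, so once nothing refers to it the garbage collector can reclaim it, which is the property that jitted code in an appdomain lacks.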

I’m digressing slightly. You wanted to know why the script engines don’t implement partial evaluation. Well, first of all, implementing partial evaluation would have made the script engines considerably more complicated for very little performance gain. And if the author does want this gain, then the author can easily fold the constants “by hand”.

But more important, partial evaluation makes the process of compiling the script into bytecode much, much longer as you need to do yet another complete recursive pass over the parse tree. That’s great, isn’t it? I mean, that’s trading increased compilation time for decreased run time. What could be wrong with that? Well, it depends who you ask.

From the ASP implementers’ perspective, that would indeed be great. An ASP page, as I’ve already discussed, only gets compiled once, on the first page hit, but might be run many times. Who cares if the first page hit takes a few milliseconds longer to do the compilation, if the subsequent million page hits each run a few microseconds faster? And so what if this makes the VBScript DLL larger? ASP updates are distributed to people with fast internet connections.

But from the IE implementers’ perspective, partial evaluation is a step in the wrong direction. ASP wants the compilation to go slow and the run to go fast because they are generating the code once, calling it a lot, and generating strings that must be served up as fast as possible. IE wants the compilation to be as fast as possible because they want as little delay as possible between the HTML arriving over the network and the page rendering correctly. They’re never going to run the script again after it’s generated once, so there is no amortization of compilation cost. And IE typically uses scripts to run user interface elements, not to build up huge strings as fast as possible. Every microsecond does NOT count in most UI scenarios: as long as the UI events are processed just slightly faster than we incredibly slow humans can notice the lag, everyone is happy.
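
To put rough numbers on the amortization argument, here is a small back-of-the-envelope calculation. The figures (3 milliseconds of extra compile time, 2 microseconds saved per run) are invented purely for illustration; they are not measurements of either script engine.

```javascript
// All figures are illustrative assumptions, not measurements.
function netSavingMicroseconds(extraCompileUs, savedPerRunUs, runs) {
  return savedPerRunUs * runs - extraCompileUs;
}

const extraCompileUs = 3000; // assume the extra pass costs 3 ms per compile
const savedPerRunUs = 2;     // assume each run gets 2 microseconds faster

// ASP-style: compile once, run a million times.
const aspSaving = netSavingMicroseconds(extraCompileUs, savedPerRunUs, 1000000);

// IE-style: compile once, run once.
const ieSaving = netSavingMicroseconds(extraCompileUs, savedPerRunUs, 1);

// Number of runs needed before the extra compile time pays for itself.
const breakEvenRuns = Math.ceil(extraCompileUs / savedPerRunUs);

console.log(aspSaving);     // 1997000 (about two seconds saved overall)
console.log(ieSaving);      // -2998 (a net loss)
console.log(breakEvenRuns); // 1500
```

Under these assumed numbers the optimization only pays off after 1500 runs of the same compiled script, which a server page reaches easily and a script compiled once per page load never does.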

Fabulous adventures in coding

Eric Lippert's blog

Monthly Archives: October 2003

Functions are not frames

Global State On Servers Considered Harmful

How many Microsoft employees does it take to change a lightbulb?

JScript Goes All To Pieces

The Most Boring Story Ever

Compatibility vs. Performance

Michael’s Security Blog is online

Attention passengers: Flight 0703 is also known as Flight 451

Making Sense of HRESULTS

Constant Folding and Partial Evaluation