Functions are not frames

I just realized that on my list of features missing from JScript.NET “fast mode” I forgot about the caller property of functions. In compatibility mode you can say

 

function foo(){bar();}
function bar(){print(bar.caller);}
foo();

 

In fast mode this prints null; in compatibility mode it prints function foo(){bar();}.

 

Eliminating this feature does make it possible to generate faster code — keeping track of the caller of every function at all times adds a fair amount of complexity to the code generation. But just as importantly, this feature is simply incredibly broken by its very design. The problem is that the function object is completely the wrong object to put the caller property upon in the first place. For example:

 

function foo(x){bar(x-1);}
function bar(x)
{
  if (x > 0)
    foo(x-1);
  else
  {
    print(bar.caller.toString().substring(9,12));
    print(bar.caller.caller.toString().substring(9,12));
    print(bar.caller.caller.caller.toString().substring(9,12));
    print(bar.caller.caller.caller.caller.toString().substring(9,12));
  }
}
function bla(){foo(3)}
blah();

 

This silly example is pretty straightforward — the global scope calls bla, bla calls foo(3), which calls bar(2), which calls foo(1), which calls bar(0), which prints out the call stack. So the call stack at this point should be foo, bar, foo, bla, right? So why does this print out foo, bar, foo, bar?

 

Because the caller property is a property of the function object and it returns a function object. bar.caller and bar.caller.caller.caller are the same object, so of course they have the same caller property! Clearly this is completely broken for recursive functions. What about multi-threaded programs, where there may be multiple callers on multiple threads? Do you make the caller property different on different threads?

 

These problems apply to the arguments property as well. Essentially the problem is that the notion we want to manipulate is the activation frame, not the function object, but the function object is what we’ve got. To implement this feature properly you need to access the stack of activation frames, where an activation frame consists of a function object, an array of arguments, and a caller, where the caller is another activation frame. Now the problem goes away — each activation frame in a recursive, multi-threaded program is unique. To gain access to the frame we’d need to add something like the this keyword — perhaps a frame keyword that would give you the activation frame at the top of the stack.

 

That’s how I would have designed this feature, but in the real world we’re stuck with being backwards compatible with the original Netscape design. Fortunately, the .NET reflection code lets you walk stack frames yourself if you need to. Though it doesn’t integrate perfectly smoothly with the JScript .NET notion of functions as objects, at least it manipulates frames reasonably well.
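For example, something along these lines ought to work in JScript .NET (a sketch I have not tested against every host; the exact names you see depend on how the compiler maps your functions onto .NET methods):

import System.Diagnostics;

function WhoCalledMe()
{
  // StackTrace captures the actual activation frames, so recursion and
  // multiple threads are not a problem: each frame is distinct.
  // Frame 0 is this function itself; higher frames are its callers.
  var trace : StackTrace = new StackTrace();
  for (var i = 0; i < trace.FrameCount; i++)
    print(trace.GetFrame(i).GetMethod().Name);
}

function Outer() { WhoCalledMe(); }
Outer();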

Tags: JScript, JScript .NET, Scripting

Comments (3)

  1. Peter Torr says: Ahhh, you forget “arguments.caller”, which is per-frame. You also had a typo in the call to “bla” (“blah”). Try the slightly modified program (also using a tricky anonymous function) for fun and profit!:

     function foo(x){bar(x-1);}
     function bar(x)
     {
       if (x > 0)
         foo(x-1);
       else
       {
         print(arguments.caller.callee)
         print("-----")
         print(arguments.caller.caller.callee)
         print("-----")
         print(arguments.caller.caller.caller.callee)
         print("-----")
         print(arguments.caller.caller.caller.caller.callee)
       }
     }
     (function (){foo(3)})()

     November 3, 2003 at 7:47 pm

  2. Peter Torr says: Here’s an anomaly between 5.x and .NET — run this program on both platforms and see the difference between how we alias ‘arguments’ and actual parameters in 5.x whereas we don’t in .NET. (Oh, and never write code like this — it will break!)

     function IMessWithMyCaller()
     {
       arguments.caller[0] = "mwhahahahahahaaa"
     }

     function ICantControlMyOwnData(theData)
     {
       print("Before:")
       print("theData is " + theData)
       print("arguments[0] is " + arguments[0])
       IMessWithMyCaller()
       print("---")
       print("After:")
       print("theData is " + theData)
       print("arguments[0] is " + arguments[0])
     }

     print("I am running in JScript version " + ScriptEngineMajorVersion())
     ICantControlMyOwnData("Hello world")

     November 3, 2003 at 7:56 pm

  3. Samuel Bronson says: Sigh. They *tried* to take this out of Mozilla, but put it back for legacy reasons: see bugzilla.mozilla.org/show_bug.cgi for some details. Too bad there doesn’t seem to be a *standard* way of achieving this…

     June 28, 2010 at 12:12 pm

Global State On Servers Considered Harmful

The other day I noted that extending the built-in objects in JScript .NET is no longer legal in “fast mode”. Of course, this is still legal in “compatibility mode” if you need it, but why did we take it out of fast mode?

 

As several readers have pointed out, this is actually a kind of compelling feature. It’s nice to be able to add new methods to prototypes:

 

String.prototype.frobnicate = function(){/* whatever */}
var s1 = "hello";
var s2 = s1.frobnicate();

 

It would be nice to extend the Math object, or change the implementation of toLocaleString on Date objects, or whatever.

 

Unfortunately, it also breaks ASP.NET, which is the prime reason we developed fast mode in the first place. Ironically, it is not the additional compiler optimizations that a static object model enables which motivated this change! Rather, it is the compilation model of ASP.NET.
 

I discussed earlier how ASP uses the script engines — ASP translates the marked-up page into a script, which it compiles once and runs every time the page is served up. ASP.NET’s compilation model is similar, but somewhat different. ASP.NET takes the marked-up page and translates it into a class that extends a standard page class. It compiles the derived class once, and then every time the page is served up it creates a new instance of the class and calls the Render method on the class.

 

So what’s the difference? The difference is that multiple instances of multiple page classes may be running in the same application domain. In the ASP Classic model, each script engine is an entirely independent entity. In the ASP.NET model, page classes in the same application may run in the same domain, and hence can affect each other. We don’t want them to affect each other though — the information served up by one page should not depend on stuff being served up at the same time by other pages.

 

Now I’m sure you see where this is going. Those built-in objects are shared by all instances of all JScript objects in the same application domain. Imagine the chaos if you had a page that said:

 

String.prototype.username = FetchUserName();
String.prototype.appendUserName = function() { return this + this.username; };
var greeting = "hello";
Response.Write(greeting.appendUserName());

 

Oh dear me. We’ve set up a race condition. Multiple instances of the page class running on multiple threads in the same appdomain might all try to change the prototype object at the same time, and the last one is going to win. Suddenly you’ve got pages that serve up the wrong data! That data might be highly sensitive, or the race condition may introduce logical errors in the script processing — errors which will be nigh-impossible to reproduce and debug.

 

A global writable object model in a multi-threaded appdomain where class instances should not interact is a recipe for disaster, so we made the global object model read-only in this scenario. If you need the convenience of a writable object model, there is always compatibility mode.
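For what it’s worth, the convenience in the example above is easy to approximate in fast mode without touching the shared built-in objects — pass the per-request data around rather than stashing it in a prototype. A minimal sketch, reusing the hypothetical FetchUserName and the Response object from the example:

function appendUserName(s : String, userName : String) : String
{
  // No shared, mutable global state: the per-request data is a parameter.
  return s + userName;
}

var greeting = "hello";
Response.Write(appendUserName(greeting, FetchUserName()));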

Tags: ASP, JScript .NET, Scripting

Comments (5)

  1. Dan Shappir says: Question 1: Is this the reason you also enforce the use of var in fast mode? Question 2: Doesn’t this make the term “fast mode” a bit of a misnomer? Shouldn’t it be “ASP.NET mode”? Question 3: Out of curiosity – can you say how JScript.NET is faring in the ASP.NET world vs. C# and VB.NET?

     October 30, 2003 at 3:08 am

  2. Eric Lippert says: 1) Yes — enforcing var improves clarity, improves optimizations and prevents accidental fouling of the global namespace. 2) Don’t be silly. I mean, we could have called them “incompatibility mode” and “slow mode” too, but obviously we wouldn’t. Fast mode was motivated by the requirements of ASP.NET, but the benefits go beyond ASP.NET scenarios. 3) I have not the faintest idea. Remember, I haven’t actually worked on JS.NET for over two years now, and even if I was, I’m a developer, not a market researcher. (I can tell you that as far as throughput performance goes, JScript.NET on ASP.NET performs about as well as VB.NET, and both are 5%-10% slower than the equivalent C# in common scenarios — or, at least that was the case when I last ran the numbers.)

     October 30, 2003 at 11:53 am

  3. Anonymous says: Ignore me I am testing.

     October 30, 2003 at 12:14 pm

  4. Blake says: Perhaps a compromise could be reached for a future version of the language. It seems most of the interesting reasons for modifying the prototypes of the built-in objects are all cases of one-time setup. If these prototypes were writable only at appdomain creation time and read-only thereafter, I think both sides could be happy? (Not that there’s any current way to implement a .cctor in JS.NET that I’m aware of.)

     October 30, 2003 at 2:39 pm

  5. Samuel Bronson says: “ASP.NET mode” may be ridiculously specific, but it still seems like “fast mode” doesn’t say enough: it doesn’t say anything about allowing the script engine to be longer-lived or shared… Maybe it should be called something like “static mode”?

     June 28, 2010 at 12:22 pm

How many Microsoft employees does it take to change a lightbulb?

UPDATE: This article was featured in The Best Software Writing I. Thanks Joel!

Joe Bork has written a great article explaining some of the decisions that go into whether a bug is fixed or not. This means that I can cross that one off my list of potential future entries. Thanks Joe!

But while I’m at it, I’d like to expand a little on what Joe said. His comments generalize to more than just bug fixes. A bug fix is one kind of change to the behaviour of the product, and all changes have similar costs and go through a similar process.

Back when I was actually adding features to the script engines on a regular basis, people would send me mail asking me to implement some new feature. Usually the feature was a “one-off” — a feature that solved their particular problem. Like, “I need to call ChangeLightBulbWindowHandleEx, but there is no ActiveX control that does so and you can’t call Win32 APIs directly from script, can you add a ChangeLightBulbWindowHandleEx method to the VBScript built-in functions? It would only be like five lines of code!”

I’d always tell these people the same thing — if it is only five lines of code then go write your own ActiveX object! Because yes, you are absolutely right — it would take me approximately five minutes to add that feature to the VBScript runtime library. But how many Microsoft employees does it actually take to change a lightbulb?

  • One dev to spend five minutes implementing ChangeLightBulbWindowHandleEx.
  • One program manager to write the specification.
  • One localization expert to review the specification for localizability issues.
  • One usability expert to review the specification for accessibility and usability issues.
  • At least one dev, tester and PM to brainstorm security vulnerabilities.
  • One PM to add the security model to the specification.
  • One tester to write the test plan.
  • One test lead to update the test schedule.
  • One tester to write the test cases and add them to the nightly automation.
  • Three or four testers to participate in an ad hoc bug bash.
  • One technical writer to write the documentation.
  • One technical reviewer to proofread the documentation.
  • One copy editor to proofread the documentation.
  • One documentation manager to integrate the new documentation into the existing body of text, update tables of contents, indexes, etc.
  • Twenty-five translators to translate the documentation and error messages into all the languages supported by Windows. The managers for the translators live in Ireland (European languages) and Japan (Asian languages), which are both severely time-shifted from Redmond, so dealing with them can be a fairly complex logistical problem.
  • A team of senior managers to coordinate all these people, write the cheques, and justify the costs to their Vice President.

None of these take very long individually, but they add up, and this is for a simple feature. You’ll note that I haven’t added all the things that Joe talks about, like what if there is a bug in those five lines of code? That initial five minutes of dev time translates into many person-weeks of work and enormous costs, all to save one person a few minutes of whipping up a one-off VB6 control that does what they want. Sorry, but that makes no business sense whatsoever. At Microsoft we try very, very hard to not release half-baked software. Getting software right — by, among other things, ensuring that a legally blind Catalan-speaking Spaniard can easily use the feature without worrying about introducing a new security vulnerability — is rather expensive! But we have to get it right, because when we ship a new version of the script engines, hundreds of millions of people will exercise that code, and tens of millions will program against it.

Any new feature which does not serve a large percentage of those users is essentially stealing valuable resources that could be spent implementing features, fixing bugs or looking for security vulnerabilities that DO impact the lives of millions of people.

UPDATE: KC Lemson and Raymond Chen and Chris Pratley have opinions on this as well.

Comments (51)


  1. What a fantastic argument for Open Source!

  2. Who develops the test plans for open source software? Who updates the screenshots in the user’s guide and online help? And who translates the documentation into Polish and Turkish? Who verifies that the feature doesn’t violate the Americans with Disabilities Act or German privacy laws? (Back when I worked on Linux, the answer was “Nobody. There is no test plan, there is no printed user’s guide, what little documentation there is exists only in English, and nobody cares about complying with the ADA or German privacy laws.” Maybe things have changed since then.)

  3. I’m not following you Robert.

    Thought experiment: tomorrow, Bill open-sources all Microsoft products and sets up a hundred-billion-dollar endowment fund to pay for continued development of the codebase. What makes all of the costs I mentioned suddenly go away?

    Nothing! Open source isn’t magic. There seems to be a strange belief amongst the open source community that just because you can make a change to the source code, and no one pays you to do so, the change was free. But it wasn’t free, because changes don’t cost money in the first place; changes cost EFFORT to do right, and money is just a convenient way to measure effort, not effort itself. There’s only a finite amount of effort in the world, and knowing how to apply it to greatest effect is a difficult problem.

    It doesn’t take a couple dozen people to change a lightbulb here _because_ we sell software for a living — it would take those couple dozen people even if we gave it away with the sources. It takes a couple dozen people because we deeply care about that legally blind Catalan-speaking customer. It takes a couple dozen people to change a lightbulb because software is an insanely complicated device that runs in an insanely complicated world. Managing that complexity is a lot more work than changing the code.

  4. The point is that the customer is free to fix/alter that which affects themselves, but isn’t worth fixing to you. Since the feature/problem isn’t worth it to you, it obviously follows that the feature isn’t for widely deployed software… where all of your effort is (rightly) focused. It just needs to work on a small number of boxes for a very specific purpose, where the blind Catalan-speaking customer doesn’t come into play. Or perhaps the feature is very very important to the customer (yesterday! dangit!), and they don’t have time to wait for MS to QA on Windows XP Home Arabic.

  5. Got it. But this is precisely why we built the script engines to be extensible by arbitrary third party ActiveX objects. Like I said, if the feature is cheap and easy, then implement it yourself in a VB6 object. But the script engines themselves cannot be allowed to fracture into a million slightly different mutually incompatible versions — that doesn’t serve customers well.

  6. Ian Ringrose says:

    Now what if you ‘shipped’ an “open source” set of helper ActiveX objects for the script engine? You could post it on one of the Microsoft web sites and just say that it is a demo of how to extend the script engine… Can we find a model that lets Microsoft employees get out ‘quick hacks’ without Microsoft being responsible for them for the next 100 years?

    Not all software has to be written to the same standard…

  7. Anonymous says:

    I believe MS already ships quite a significant amount of “open source” software in the form of all the samples in MSDN online, and the samples included with the various product installations, such as VB.

    And if that is not enough, you can find tons of samples online, in various trade publications, in newsgroups and obviously in blogs like this one.

    OTOH you cannot ignore the fact that Microsoft becomes somewhat liable for any piece of software it ships regardless of how unofficial it is. If you copy a sample from this site and it formats your disk, you will blame Microsoft and Eric could lose his job.

  8. Dan Shappir says:

    BTW, I wrote the above comment.

    For some reason, the “Remember Me” check box doesn’t work for me. And I got a memory exception screen when I submitted.

    More untested, buggy software from Microsoft

  9. Michael Howard says:

    You missed one step – three people to argue about whether this method is safe for scripting

  10. Anonymous says:

    Raymond wrote: “Who develops the test plans for open source software? There is no test plan, there is no printed user’s guide, what little documentation there is exists only in English, and nobody cares about complying with the ADA or German privacy laws.” Maybe things have changed since then”

    They certainly have changed. If you’re interested in whether this is true, check out the Gnome and KDE projects. Much more organised (esp. KDE) than they ever used to be. In general it’s work that most OSS developers don’t want to do – it’s not an itch to scratch to use the cheesy phrase I keep hearing. Now that there is interest in OSS and money is being thrown about, people are working on these things because they are being paid to do it.

    Eric seem to be arguing (please correct me if I am wrong) “How do we pay for peoples effort if we give everything away”? Well for a start OSS does not mean free. Just because some OSS apps are free doesn’t mean they *have* to be.

    I wonder how many Microsoft employees have actually ever been to opensource.org and can now argue about OSS *without* resorting to the usual FUD we get from Microsoft. I’m not deliberately trolling here, I like OSS and I also like Windows, but when I see the FUD and things like the underhand funding of SCO I don’t like MS. It’s like they don’t want to compete on fair terms – maybe I’m just naieve and that’s how business works. I hope not.

    I have to say about the user guide and it being printed, I have NEVER seen anyone read it. No exageration. Never. Ever. Maybe you should do a poll, you could save money by not printing it

  11. a. says:

    geeks doesn’t read it. joe user does.

  12. > Eric seem to be arguing (please correct me if I am wrong)”How do we pay for peoples effort if we give everything away”?

    No, I’m certainly not arguing that. Perhaps Raymond is, but I am not. I realize that there are companies that pay people to work on open source.

    My argument is that IF you are in the business of writing software that is to be used by millions of people around the world, THEN the primary cost of implementing that software is NOT in the implementation. The cost is in the design, the review, the documentation, the testing, the maintenance, the support calls, etc.

    Whether your business model calls for selling that software or — as we did with the script engines for the last seven years — giving it away for free, whether the source is open or closed, has not the least bearing on my point. Does the open source model work well for one-off changes that will be distributed to one person? Obviously. But that’s not the kind of change I’m talking about.

  13. Dan Isaacs says:

    Opportunity costs are hypothetical. “Stealing” refers to the act of depriving someone of actual property. As a hypothetical is not actual property, “stealing” is a poor word to use in your last paragraph. Better to let the facts stand up for themeselves instead of concluding with a false characterization.

  14. > Better to let the facts stand up for themeselves

    Actually, “facts” are abstract entities which do not have legs. So they can’t really “stand up” for themselves. That’s kind of a poor choice of words, wouldn’t you say? Fortunately, by using my advanced skills in inference I can probably figure out what you intended by your imprecise and colloquial expression.

  15. Andrew says:

    You can tell that these are only techs reading this because hey guys! (particularly Robert, etc) – End-users don’t want half-baked software!!!

    While as a dev, I would gladly rather have the software do what I wanted it to, I would also rather have it do what it can do (without causing more problems), and allow me to change its behaviour.

    Just because Tech Guy down the street can change the behaviour of an application doesn’t mean that all end-users can. We, in the tech community, are in a position where we not only “see” the final result but also see how its made and can affect change. End-users, although this attitude is sometimes changing, don’t care how it works – they just want it to.

    When I ask the time, I don’t want to know how to build a clock nor how to fix the clock – I just want the clock to tell me the time. Open Source isn’t a solution for the problem mentioned above- it’s simply another take on it.

  16. Anonymous says:

    “Just because Tech Guy down the street can change the behaviour of an application doesn’t mean that all end-users can.”

    Admittedly not, but at least they would have the opportunity to pay/hire someone to do it for them. If I want someone to add something to my house heating system, I want *ANYONE* with the skills to fix it to be able to fix it, I don’t want to be FORCED to go back to the original installer ….

  17. Rj says:

    So the Answer is “41 at a minimum”?

  18. It looks like the intricacies of higher-dimensional geometry will have to wait another week; I am incredibly…

  19. John Gruber makes an appearance in the soon to be released book The Best Software Writing I which was put together by Joel Spolsky .

  20. Whether it’s software or content, making a change can often be a much bigger deal than you’d think. I…

  21. Patrick Schmid says:

    Eric,

    just wanted to make sure you saw my response to your comment on my blog:

    well deserved feedback. I remember vividly that I made the argument that a certain bug fix for Outlook 2007 would only be a few lines of code (I stopped short of volunteering to write the code for the devs) without knowing how much other work would be associated with those few lines.

  22. David says:

    What if the user can’t create an ActiveX object?  Not becauser they can’t code, but because they don’t have access to the development environment, compiler, or sufficient authority to create and/or install the object?  After all, VBScript (and JScript, VBA, WSH etc etc etc) is available to many more users than visual studio, or some other development tools.

    You can argue that they should obtain the neccesary software/authority etc to develop ActiveX objects (and perhaps after 12 months the business case might get approved…).  But a language that requires a business case to implement some needed functionality is essentially crippled.

    I can say from personal experience that having a language not go “all the way” is as frustrating as the allusion implies.

  23. RK says:

    So much about open source talk. Forgive me for my ignorance. Who uses open-source? I have heard everyone promoting it. Haven’t seen anyone, at least in the develoment community that i have seen till date using any of those products. Let’s not be hypocritical. You want to make something that people use, it costs. Listen nothing comes FREE.

  24. Yuhong Bao says:

    Sounds so similar to bureaucracy!

  25. Blah says:

    As much as I love Open Source software, this arguement (the one made by the article-writer) holds true for Open Source software, too.  However, with Open Source software, you usually don’t have the luxury of a Program Manager, multiple QA folks, etc all collaborating.  It’s usually you and maybe a couple other dedicated folks passionately working on something.  You get done, you toss it out to users, and then tons of bugs show up that you never thougt about (who’d have thought the user’s would stick the bulb outside in zero-degree weather…OOPS!)  So, in my opinion, this is not an argument “for” Open Source software.  Open Source software suffers from the same “how many does it take” syndrome.

    Getting back to the article writer’s point, you don’t want to toss all kinds of one-off crap into your project, because then it bloats up the project.  However, more and more projects opt for a foundation and then extensible scripting (EG: some video games, firefox, etc).  Of course, if your project is nothing more than a scripting language (VBScript), then it’s meant to be light-weight.  So, yeah, suck it up and do the 5 lines of code yourself.  If VBScript got bloated with everyone’s “one-off” junk, it’d be such a cumbersome hodge-podge of stuff, nobody would use it (or complain about how complicated it’s gotten.)

    I think Bill Cosby said it best…”I don’t know the secret to success, but the secret to failure is trying to please everyone.”

  26. Blah says:

    Lots of folks use open source.  From the person who decides to use Open Office or Firefox, to the person who downloads a piddly little program or trainer to hack some game file so they can buff up their character.  Software is so ubiquitious and easy to make these days, that folks think it has to cost tons of time and money to make it still.  Open Source just means the source is viewable and (possibly) modifiable (depending on the license the author releases it under).   Open Source isn’t necessarily free in some cases…folks can let you see the source, but still charge you to use it or the compiled program.  But, software isn’t some commodity large corporations have sole entitlement to.  It’s like regular writing…anyone can do it, and lots of folks do.  Whether you choose to buy a book from the store that has the info you want, or get the info for free from an internet site is a matter of choise.  Free doesn’t necessarily mean “bad” or “poor quality”.  However, in United States especially, folks think the more something costs the better it is.  So, there’s still a frown on Open Source and Freeware.  It’s like saying only the expensive Dealer can fix your car, because the shade-tree mechanic down the street doesn’t know anything.  On the contrary, the shade-tree mechanic may know a lot.  Then again, he may not.  There’s a greater variance in quality with the shade-tree mechanic, but you can still get poor service from the Dealer.  Quality is not a given just because you pay money for a product or service.  And, inversely, poor quality is not a given just because you get something for free.

  27. It looks like the intricacies of higher-dimensional geometry will have to wait another week; I am incredibly

  28. bob says:

    I think the problem is that the development tools are not free. Writing an ActiveX extension as the author suggests shows great extensibility; however, if I have to spend $500 or $1000 to get the tools to do this then it’s a non-starter.

    Just one more reason Mac OS X with free developer tools is superior.

  29. The problem that Microsoft has is often scale!  When you have millions of people that depend on your project changes that normally are done without a second thought are now all of a sudden very difficult.  This is the huge dependency on product use.  Divide the number of people using your product by the number of people making your product and you might get a better idea of the efficiency of your team (not the best metric I know but it helps with perspective).  The second problem that is described is flow.  Most of the time described in the many week process is waste due to time waiting on other people.  If you had everyone in the same room then the task would be completed much faster.  Open Source is great, but Open Source has the same problems at larger scales.  These scale issues often cause the community to “fork” and that can make this better but can also cause fragmentation.

  30. anonimous says:

    It doesn’t take a single employee to change a lightbulb, Bill Gates just redefines ‘Darkness’ as the new industry standard…

    (it was in my englishbook bill…)

  31. Aakash says:

    Love it!

    I tried telling my managers how much time a little bug may take to fix and get into production.

  32. Matthias says:

    I can see the point you’re making in the article. In fact this is why I value stuff coming from Redmond so much. It actually is tested, some API concerns are taken into account and above all one can actually find some real-life examples of how to use a particular part of the system.

    No wonder it takes so much to implement such a trivial thing.

    A note about open-source frameworks: they are great, even fantastic! But the best ones have extensive documentation and a fat set of examples. The ones that don’t have that learning resources, even if they are best in the issues they are solving, are doomed to be used once, maybe twice.

  33. Zbyszek says:

    I totally agree with this article. People too often do not realise that a simply change can cause such a huge side effects. Our website is in 16 languages, we have accepted that some text (on some pages even majority) is not translated, because traslating it caused more grief than happines. For example translating some text caused grammatical errors (consider plural form – in English is easy, 1 photo, 2 or more photos, buut in Polish is 1 zdjecie, 2 zdjecia, 5 zdjec, 12 zdjec, but .. 22 zdjecia, however 21 zdjec, etc) and caused a flood of complains, which created an extra task of answering them, which … costs time==money

    @bob – of course you have free development tools (on MS platforms – Express editions, on Linux Eclipse, etc)

    @open source supporters – I love some opensource stuff, but you must accept that if someone makes changes which are just for him they have to accept that the WHOLE modified product cannot be then properly supported, as their small change in the code might have a knock off effect in a very unexpected place. I have developed many libraries in my life and I know from the painful experience that you will get errors in totally unexpected places.

    One the strangest erross we have had had a following scenario (code written in C, many years ago). User input was accepted as a number, including decimal comma. Then we introduced localization allowing decimal comma, routine was scanning a number changing comma to dot, like while(*ptr!=’,’) ptr++; *ptr=’.’.

    Of course once it happenned that users input did not have a comma and code (it was C!) was continuing happyly beyond array and … found value 2c on the stack and converted it to 2e. It happenned to be return address, so as the effect next line after the call of this routine was not executed, and this skipped was a short jump out of the loop, so external loop executed once more and customer name was changed to something else. It took us almost two days to find this problem.

  34. Rachel ‘Groby’ Blum says:

    It’s a classical example of how when a company gets bigger, it starts moving slower and slower (by necessity. MS ignoring any of the above points will almost guaranteed result in a lawsuit somewhere)

    Reading this, it strikes me that the software industry pretty much as a whole has no way of releasing features “along the way” to get better feedback. There’s no way to release a simple, non-localized, non-vetted LightbulbEx, collect feedback, improve it, and only finalize it later – if it’s indeed a crucial feature that everybody needs.

    And while I love OSS, it’s not exactly the answer. You still need to move from that quick-hack version to the fully localized one, you have the issue of incompatibilities between different versions of your SW (which non-geek users don’t exactly appreciate), etc.

    It’s an interesting problem that hasn’t seen a solution yet.

  35. bolthar says:

    Maybe it’s time to change the way you guys release your products – not that I know *how* you should change…

    I’m not following your line of thought. How does changing the distribution model affect the costs of design, implementation, testing, security review, documentation, translation, maintenance or management? Distribution cost is a cost that I did not even think to mention, so I don’t understand why you’re bringing it up. Can you explain? – Eric

JScript Goes All To Pieces

My entry the other day about fast mode in JScript .NET sparked a number of questions which deserve fuller explanations.  I’ll try to get to them in my next couple of blog entries.

 

For example, when I said that it was no longer legal to redefine a function, I wasn’t really clear on what I meant.  JScript .NET still has closures, anonymous functions, and prototype inheritance.  We didn’t remove any of those.  Furthermore, it is very important to emphasize that we implemented compatibility mode so that anyone who does need these features in JScript .NET can still get them – they will pay a performance penalty, but that’s their choice to make.

 

What I meant was simply that this is now illegal:

 

function foo() { return 1; }

function foo() { return 2; }

 

whereas that is perfectly legal in JScript Classic.  In JScript Classic this means “discard the first definition”.

 

Pop quiz: what does this print out?

 

function foo(){ alert(1); }

foo();

function foo(){ alert(2); }

foo();

 

Of course that prints out “2” twice, because in JScript Classic, function and variable declarations are always treated as though they came at the top of the block of code, no matter where they are found lexically in the block.
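Note that for variables only the declaration is hoisted, not the assignment; a quick illustration of the same rule (JScript Classic behaviour, using print for output as in the other examples here):

print(x);      // prints "undefined" — the declaration of x is hoisted to the top
var x = 123;
print(x);      // prints 123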

 

Obviously this is bizarre, makes debugging tricky, and is totally bug-prone.  The earlier definition is completely ignored, and yet it sits there in the source code, confusing maintenance programmers who do not see the redefinition, which might be a thousand lines later.  Thus, it is illegal in JScript .NET.

 

But we only made this kind of redefinition illegal.  Other kinds of redefinition, like

 

var foo = function() { return 1; }

print(foo());

foo = function() { return 2; }

print(foo());

 

continue to work as you’d expect.

 

So why was this ever legal?  Do language designers get some kind of perverse kick out of larding languages with “gotcha” idioms?  No, actually there was a pretty good reason for these semantics.  Two reasons actually.  The first is our old friend “muddle on through when you get an error”.  However, since this error can be caught at compilation time, this is not a very convincing point.  The more important point is this one:

<script language="JScript">
function foo(){ alert(1); }
foo();
</script>

<script language="JScript">
function foo(){ alert(2); }
foo();
</script>

Aha!  Now we see what’s going on here.  I said “function and variable declarations are always treated as though they came at the top of the block of code”, and here we have two blocks.  IE will compile and run the first block, and then compile and run the second block, so this really will display “1” and then “2”.  The IE compilation model allows for piecewise execution of scripts. This scenario requires the ability to redefine methods on the fly, so, there you go.

 

However, ASP does not have a piecewise compilation model, and neither does ASP.NET.  When we designed JScript .NET we removed this feature from fast mode because we knew that most “normal” hosts have all the source code at once and do not ever need to dynamically pull down new chunks from the internet after old chunks have already run.  By disallowing piecewise execution, we can do a lot more optimizations because we know that once you have a function, you’ve got it and no one is going to redefine it later.

The Most Boring Story Ever

The other day a reader suggested:

Make a blogentry about how you started at MS and so on!

You asked, but I’m warning you: it’s the most boring story ever.

I grew up in Waterloo, Ontario, which was a piece of luck as Waterloo has the best computer science school in Canada. I studied applied mathematics and computer science from 1991 to 1996.

Amongst its many claims to fame is this: UW has the largest cooperative education program on the planet. For my fourth, fifth and sixth work terms I was an intern on the VBA team here at Microsoft. On the strength of my internship the VBA team extended me a job offer, which I accepted. I worked full-time on the scripting technology for five years.

Then the VBA, Scripting and Microsoft Office Developer teams were reorganized into one large team (the “Trinity” team) tasked with modernizing and improving the Office developer story. I’ve been working on that for about two years now. We’ve just shipped “Microsoft Visual Studio .NET Tools For The Microsoft Office System 2003”, which I actually did very little work on — that was Peter Torr ‘s baby, so read his blog if you want details.

I’ve been working on the next version, which, of course, I can’t talk about except to say that I hope the name is shorter. Also, I do a fair amount of work still on scripting — not implementing new features of course, but ongoing work like attending security reviews, helping out our product support and sustaining engineering teams, and (obviously) writing a blog.


Commentary from 2019

It was an easy choice to go to Waterloo; I could live at home, I had family on staff, I already knew some of the professors, and it was and still is the best school for computer science and mathematics. The co-op program literally changed my life; it’s pretty unlikely that I’d be living in Seattle were it not for those work terms.

We had an all-hands Trinity team meeting the day that the official product name was announced, and people laughed. I was one of them. The team manager was known to have a sense of humour and I figured that this had to be a parody of the clunky-stream-of-nouns approach to product naming that happened at Microsoft. But no, management was serious, and this was the newest and most egregious example of bad product naming ever. “Microsoft” is in there twice for goodness’ sake!

The best product name that came out of that team was for a little helper application that did… something. Maybe it set up Office interop security policy or something like that? I don’t remember. But it was the Microsoft Office Helper for Interop Technology, or MOHIT.EXE. That it was written by my colleague Mohit Gupta was a total coincidence, I’m sure.

Compatibility vs. Performance

Earlier I mentioned that two of the design goals for JScript .NET were high performance and compatibility with JScript Classic. Unfortunately these are somewhat contradictory goals! JScript Classic has many dynamic features which make generation of efficient code difficult. Many of these features are rarely used in real-world programs. Others are programming idioms which make programs hard to follow, difficult to debug and slow.

JScript .NET therefore has two modes: compatibility mode and fast mode. In compatibility mode there should be almost no JScript program which is not a legal JScript .NET program. Fast mode restricts the use of certain seldom-used features and thereby produces faster programs.

The JSC.EXE command-line compiler and ASP.NET both use fast mode by default. To turn fast mode off in JSC use the /fast- switch.
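For example, from the command line (an illustrative invocation; the file name here is made up):

jsc /fast- myscript.js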

Fast mode puts the following restrictions on JScript .NET programs:

  • All variables must be declared with the var keyword. As I discussed earlier, in JScript Classic it is sometimes legal to use a variable without declaring it. In those situations, the JScript Classic engine automatically creates a new global variable but when in fast mode, JScript .NET does not. This is a good thing — not only is the code faster but the compiler can now catch spelling errors in variable names.
  • Functions may not be redefined. In JScript Classic it is legal to have two or more identical function definitions which do different things. Only the last definition is actually used. This is not legal in JScript .NET in fast mode. This is also goodness, as it eliminates a source of confusion and bugs.
  • Built-in objects are entirely read-only. In JScript Classic it is legal to add, modify and (if you are perverse) delete some properties on the Math object, the String prototype and the other built-in objects.
  • Attempting to write to read-only properties now produces errors. In JScript Classic writing to a read-only property fails silently, in keeping with the design principle I discussed earlier: muddle on through.
  • Functions no longer have an arguments property. The primary use of the arguments property is to create functions which take a variable number of arguments. JScript .NET has a specific syntax for creating such a function. This makes the arguments object unnecessary. To create a JScript .NET function which takes any number of arguments the syntax is:
function MyFunction(... args : Object[])
{
  // now use args.length, args[0], etc.
}
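For instance, a small usage sketch of that syntax (the Sum function is just an illustration, not something from the library):

function Sum(... args : Object[])
{
  // args is an ordinary array of whatever was passed in.
  var total = 0;
  for (var i = 0; i < args.length; i++)
    total += args[i];
  return total;
}

print(Sum(1, 2, 3, 4)); // prints 10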

Generally speaking, unclear code is slow code. If the compiler is unable to generate good code it is usually because the restrictions on the objects described in the code are so loose as to make optimization impossible. These few restrictions not only let JScript .NET generate faster code, they also enforce good programming style without overly damaging the “scripty” nature of the language. And if you must run code which has undeclared variables, redefined functions, modified built-in objects or reflection on the function arguments, then there is always compatibility mode to fall back upon.

JScript .NET also provides warnings when programming idioms could potentially produce slow code. For example, recall my earlier article on string concatenation.  Using the += operator on strings now produces a warning which suggests using a StringBuilder instead. JScript .NET also produces warnings when code is likely to be incorrect. For example, using a variable before initializing it produces a warning, as does branching out of a finally block, and so on.
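For instance, something along these lines (an illustrative sketch of mine, not the compiler’s own sample) draws the concatenation warning, and the StringBuilder version avoids it:

import System.Text;

var s = "";
for (var i = 0; i < 10000; i++)
  s += "!";                        // warning: consider using a StringBuilder

var sb : StringBuilder = new StringBuilder();
for (var j = 0; j < 10000; j++)
  sb.Append("!");                  // no warning; appends are cheap
var t = sb.ToString();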


Commentary from 2020

This post generated some good feedback from JS experts who read my blog regularly back in the day.

  • The ability to extend the standard string, number and function capabilities by messing around with the prototype chain was seen by them as a strength of JS, and they therefore suggested weakening the “do not modify the built-in objects” restriction. I do not recall if we followed this advice, but I think not.
  • These restrictions basically make JS.NET in fast mode into another syntax for C#. Well yeah! C# was designed to be fast and understandable, so if you want to make a slow, unpredictable language fast and understandable, making it act more like C# is a sensible way to do that. However, the commenter makes a great point: the only selling point of JS.NET over C# then becomes “attractive to people who know JS but not C#”. But if you know JS already you can easily pick up C#.
  • It would be nice to have a language like JS.NET in the browser. Yes, it really would; a lot of the JS.NET features eventually made it into ES6 so that wish was fulfilled a mere couple decades later.

 

 

Michael’s Security Blog is online

Michael Howard has started blogging. If you’re interested in writing secure code (and these days, who isn’t?) you could do worse than to read anything he writes.


Commentary from 2019

Michael was a lot of fun to work with over the years; he has a deep understanding of security, strong opinions, and a willingness to share both. I was particularly honoured to be asked to review the C# sections of Writing Secure Code 2, which is excellent.

I have not read his blog for years but I am delighted to discover that he is still writing it in 2019; the link above has been updated. I have many years of posts to catch up on it seems!

Attention passengers: Flight 0703 is also known as Flight 451

I hate octal.  Octal causes bugs.  I hate bugs, particularly stupid “gotcha” bugs. C programmers do things like

int arr_flight = 0703;

not realizing that this does not assign the number 703, but rather 7 * 64 + 3 = 451.

Even worse, JScript programmers do things like

var arr_flight = 0708;
var dep_flight = 0707;

not realizing that the former is a decimal literal but the latter is an octal literal.

Yes, in JScript it really is the case that if a literal begins with 0, consists of only digits and contains an 8 or a 9 then it is decimal but if it contains no 8 or 9 then it is octal!  The first version of the JScript lexer did not implement those rules, but eventually we changed it to be compatible with Netscape’s implementation.
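To see the rule in action (JScript Classic behaviour):

print(0703);  // 451: no 8 or 9 digits, so this is octal (7*64 + 0*8 + 3)
print(0708);  // 708: the 8 makes it decimal after all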

This is in keeping with the design principle that I mentioned earlier, namely “Got a problem? Muddle on through!”  However, since this problem can be caught at compile time, I think that the decision to make illegal octal literals into decimals was a poor one.

It’s just a mess. Octal literals and escape sequences have been removed from the ECMAScript specification, though of course they live on in actual implementations for backwards compatibility.

This is why I added code to JScript .NET so that any use of an integer decimal or octal literal that begins with zero yields a compiler warning, with one exception. Obviously x = 0; does not produce a warning!
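So, roughly, the intent is this (a sketch of which literals draw the warning, not the exact warning text):

var a = 0;     // no warning: the one legal literal that begins with zero
var b = 0703;  // warning: octal literal beginning with zero
var c = 0708;  // warning: decimal literal that merely looks octal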


Commentary from 2020

I still hate octal. Fortunately it seems to have fallen out of favour when designing new programming languages.

A commenter asked how warnings work in JScript .NET; I noted that in JScript Classic there was a facility to report errors back to the host but not warnings. JScript .NET’s hosting APIs supported both errors and warnings.

Making Sense of HRESULTS

Every now and then — like, say, this morning — someone sends me this mail:

I’m getting an error in my JScript program. The error number is -2147024877. No description. Help!

Making sense of those error numbers requires some delving into the depths of how COM represents errors — the HRESULT.

An HRESULT is a 32 bit unsigned integer where the high bit indicates whether it is an error or a success. The remaining bits in the high word indicate the “facility” of the error — into what broad category does this error fall? The low word indicates the specific error for that facility.

HRESULTS are therefore usually talked about in hex, as the bit structure is a lot easier to read in hex! Consider 0x80070013, for example. The high bit is set, so this is an error. The facility code is 7 and the error code is 0x0013 = 19 in decimal.

Unfortunately, JScript interprets the 32 bit error code as a signed integer and displays it in decimal. No problem — just convert that thing back to hex, right?

var x = -2147024877;
print(x.toString(16))

Whoops, not quite. JScript doesn’t know that you want this as an unsigned number, so it converts it to a signed hex number, -0x7ff8ffed. We need to convert this thing to the value it would have been had JScript interpreted it as an unsigned number in the first place. A handy fact to know is that the difference between an unsigned number interpreted as a signed number and the same number interpreted as an unsigned number is always 0x100000000 if the high bit is set, 0 otherwise.

var x = -2147024877;
print((x<0?x+0x100000000:x).toString(16))

There we go. That prints out 80070013. Or, even better, we could just write a program that takes the error apart:

function DumpHR(hr)
{
  if (hr < 0) hr += 0x100000000;
  if (hr & 0x80000000)
    print("Error code");
  else 
    print("Success code");
  var facility = (hr & 0x7FFF0000) >> 16;
  print("Facility" + facility);
  var scode = hr & 0x0000FFFF;
  print("SCode" + scode);
}
DumpHR(-2147024877);
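As an aside, the unsigned right-shift operator gives you the same unsigned reinterpretation rather more compactly; a small sketch:

var x = -2147024877;
print((x >>> 0).toString(16)); // 80070013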

The facility codes (in decimal) are as follows

FACILITY_NULL 0
FACILITY_RPC 1
FACILITY_DISPATCH 2
FACILITY_STORAGE 3
FACILITY_ITF 4
FACILITY_WIN32 7
FACILITY_WINDOWS 8
FACILITY_SECURITY 9
FACILITY_CONTROL 10
FACILITY_CERT 11
FACILITY_INTERNET 12
FACILITY_MEDIASERVER 13
FACILITY_MSMQ 14
FACILITY_SETUPAPI 15
FACILITY_SCARD 16
FACILITY_COMPLUS 17
FACILITY_AAF 18
FACILITY_URT 19
FACILITY_ACS 20
FACILITY_DPLAY 21
FACILITY_UMI 22
FACILITY_SXS 23
FACILITY_WINDOWS_CE 24
FACILITY_HTTP 25
FACILITY_BACKGROUNDCOPY 32
FACILITY_CONFIGURATION 33
FACILITY_STATE_MANAGEMENT 34
FACILITY_METADIRECTORY 35

So you can see that our example is a Windows operating system error (facility 7), and looking up error 19 we see that this is ERROR_WRITE_PROTECT — someone is trying to write to a write-protected floppy probably.

All the errors generated by the script engines — syntax errors, for example — are FACILITY_CONTROL, and the error numbers vary between script engines. VB also uses FACILITY_CONTROL, but fortunately VBScript assigns the same meanings to the errors as VB does. But in general, if you get a FACILITY_CONTROL error you need to know what control generated the error — VBScript, JScript, a third party control, what? Because each control can define their own errors, and there may be collisions.

Finally, here are some commonly encountered HRESULTs:

  • E_UNEXPECTED 0x8000FFFF “Catastrophic failure” — something completely unexpected has happened
  • E_NOTIMPL 0x80004001 “Not implemented” — the developer never got around to writing the method you just called!
  • E_OUTOFMEMORY 0x8007000E pretty obvious what happened here (remember that out of memory means you ran out of address space, not RAM!)
  • E_INVALIDARG 0x80070057 you passed a bad argument to a method
  • E_NOINTERFACE 0x80004002 COM is asking an object for an interface it does not support. This can happen if you try to script an object that doesn’t support IDispatch.
  • E_ABORT 0x80004004 whatever you were doing was terminated
  • E_FAIL 0x80004005 something failed and we don’t know what.

And finally, here are three that you should see only rarely from script, but script hosts may see them moving around in memory and wonder what is going on:

  • SCRIPT_E_RECORDED 0x86664004 this is how we internally track whether the details of an error have been recorded in the error object or not. We need a way to say “yes, there was an error, but do not attempt to record information about it again.”
  • SCRIPT_E_PROPAGATE 0x80020102 another internal code that we use to track the case where a recorded error is being propagated up the call stack to a waiting catch handler.
  • SCRIPT_E_REPORTED 0x80020101 the script engines return this to the host when there has been an unhandled error that the host has already been informed about via OnScriptError.

That’s a pretty bare-bones look at error codes, but it should at least get you started next time you have a confusing error number.


Commentary from 2020

First off: write-protected floppies were a real thing! Honest!

There were a number of good user comments on this article with advice extending mine:

  • FACILITY_ITF means “the interface you’re calling defines the meaning of the error you’re getting” which can be confusing
  • You can use Windows Calculator in scientific mode to quickly convert decimals to hex DWORDs
  • Look in winerror.h for more predefined error codes
  • The HRPLUS utility is good for HRESULT analysis
  • Visual Studio has an “hr” format specifier that will convert numeric values to their text equivalents. Making a watch on @EAX,hr and @ERR,hr is useful! @ERR shows the value of a call to GetLastError.
  • For a great explanation of how the script engines propagate errors around, see this SO question.

 

Constant Folding and Partial Evaluation

A reader asks “is there any reason why VBScript doesn’t change

str = str & "1234567890" & "hello"

to

str = str & "1234567890hello"

since they are both constants?”

Good question.  Yes, there are reasons.

The operation you’re describing is called constant folding, and it is a very common compile-time optimization.  VBScript does an extremely limited kind of constant folding.  In VBScript, these two programs generate exactly the same code at the call site:

const foo = "hello"
print foo

is exactly the same as

print "hello"

That is, the code generated for both says “pass the literal string “hello” to the print subroutine”.  If foo had been a variable instead of a constant then the code would have been generated to say “pass the contents of variable foo…”

But the VBScript code generator is smart enough to realize that foo is a constant, and so it does not generate a by-name or by-index lookup, it just slams the constant right in there so that there is no lookup indirection at all.

The kind of constant folding you’re describing is compile-time evaluation of expressions which have all operands known at compile time. For short, let’s call it partial evaluation. In C++ (or C#) for example, it is legal to say

const int CallMethod = 0x1;
const int CallProperty = 0x2;
const int CallMethodOrProperty = CallMethod | CallProperty;

The C++ compiler is smart enough to realize that it can compute the third value itself.  VBScript would produce an error in this situation, as the compiler is not that smart.  Neither VBScript nor JScript will evaluate constant expressions at compile time.

An even more advanced form of constant folding is to determine which functions are pure functions — that is, functions which have no side effects, where the output of the function depends solely on the arguments passed in.  For example, in a language that supported pure functions, this would be legal:

const Real Pi = 3.14159265358979;
const Real Sine60 = sine( Pi / 3);  // Pi / 3 radians = 60 degrees

The sine function is a pure function — there’s no reason that it could not be called at compile time to assign to this constant.  However, in practice it can be very difficult to identify pure functions, and even if you can, there are issues in calling arbitrary code at compile time — like, what if the pure function takes an hour to run?  That’s a long compile!  What if it throws exceptions?  There are many practical problems.

The JScript .NET compiler does support partial evaluation, but not pure functions.  The JScript .NET compiler architecture is quite interesting.  The source code is lexed into a stream of tokens, and then the tokens are parsed to form a parse tree.  Each node in the parse tree is represented by an object (written in C#) which implements three methods: Evaluate, PartialEvaluate and TranslateToIL.

When you call PartialEvaluate on the root of the parse tree, it recursively descends through the tree looking for nodes representing operations where all the sub-nodes are known at compile time.  Those nodes are evaluated and collapsed into simpler nodes.  Once the tree has been evaluated as much as is possible at compile time, we then call TranslateToIL, which starts another recursive descent that emits the IL into the generated assembly.

The Evaluate method is there to implement the eval function.  JScript Classic (which everyone thinks is an “interpreted” language) always compiles the script to bytecode and then interprets the bytecode — even eval calls the bytecode compiler in JScript Classic.  But in JScript Classic, a bytecode block is a block of memory entirely under control of the JScript engine, which can release it when the code is no longer callable.

In JScript .NET, we compile to IL which is then jitted into machine code.  If JScript .NET’s implementation of eval emitted IL, then that jitted code would stay in memory until the appdomain went away!  This means that a tight loop with an eval in it is essentially a memory leak in JScript .NET, but not in JScript Classic.  Therefore, JScript .NET actually implements a true interpreter!  In JScript .NET, eval generates a parse tree and does a full recursive evaluation on it.

I’m digressing slightly.  You wanted to know why the script engines don’t implement partial evaluation.  Well, first of all, implementing partial evaluation would have made the script engines considerably more complicated for very little performance gain.  And if the author does want this gain, then the author can easily fold the constants “by hand”.

But more important, partial evaluation makes the process of compiling the script into bytecode much, much longer as you need to do yet another complete recursive pass over the parse tree.  That’s great, isn’t it?  I mean, that’s trading increased compilation time for decreased run time.  What could be wrong with that?  Well, it depends who you ask.

From the ASP implementers’ perspective, that would indeed be great.  An ASP page, as I’ve already discussed, only gets compiled once, on the first page hit, but might be run many times.  Who cares if the first page hit takes a few milliseconds longer to do the compilation, if the subsequent million page hits each run a few microseconds faster?  And so what if this makes the VBScript DLL larger?  ASP updates are distributed to people with fast internet connections.

But from the IE implementers’ perspective, partial evaluation is a step in the wrong direction.  ASP wants the compilation to go slow and the run to go fast because they are generating the code once, calling it a lot, and generating strings that must be served up as fast as possible.  IE wants the compilation to be as fast as possible because they want as little delay as possible between the HTML arriving over the network and the page rendering correctly.  They’re never going to run the script again after its generated once, so there is no amortization of compilation cost.  And IE typically uses scripts to run user interface elements, not to build up huge strings as fast as possible.  Every microsecond does NOT count in most UI scenarios — as long as the UI events are processed just slightly faster than we incredibly slow humans can notice the lag, everyone is happy.