
Michael’s Security Blog is online

Michael Howard has started blogging. If you’re interested in writing secure code (and these days, who isn’t?) you could do worse than to read anything he writes.


Commentary from 2019

Michael was a lot of fun to work with over the years; he has a deep understanding of security, strong opinions, and a willingness to share both. I was particularly honoured to be asked to review the C# sections of Writing Secure Code 2, which is excellent.

I have not read his blog for years but I am delighted to discover that he is still writing it in 2019; the link above has been updated. I have many years of posts to catch up on it seems!

Attention passengers: Flight 0703 is also known as Flight 451

I hate octal.  Octal causes bugs.  I hate bugs, particularly stupid “gotcha” bugs. C programmers do things like

int arr_flight = 0703;

not realizing that this does not assign the number 703, but rather 7 * 64 + 3 = 451.

Even worse, JScript programmers do things like

var arr_flight = 0708;
var dep_flight = 0707;

not realizing that the former is a decimal literal but the latter is an octal literal.

Yes, in JScript it really is the case that if a literal begins with 0, consists of only digits and contains an 8 or a 9 then it is decimal but if it contains no 8 or 9 then it is octal!  The first version of the JScript lexer did not implement those rules, but eventually we changed it to be compatible with Netscape’s implementation.
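
To make the gotcha concrete, here is what a classic JScript engine does with those two literals (print stands in for whatever output function your host provides):

var arr_flight = 0708; // contains an 8, so it is a decimal literal: 708
var dep_flight = 0707; // only digits 0-7, so it is octal: 7 * 64 + 0 * 8 + 7 = 455
print(arr_flight);     // 708
print(dep_flight);     // 455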

This is in keeping with the design principle that I mentioned earlier, namely “Got a problem? Muddle on through!”  However, since this problem can be caught at compile time, I think that the decision to make illegal octal literals into decimals was a poor one.

It’s just a mess. Octal literals and escape sequences have been removed from the ECMAScript specification, though of course they live on in actual implementations for backwards compatibility.

This is why I added code to JScript .NET so that any use of an integer decimal or octal literal that begins with zero yields a compiler warning, with one exception. Obviously x = 0; does not produce a warning!
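
A sketch of what that rule looks like in practice; the literals are illustrative and the warning text is not the compiler’s actual wording:

var a = 0;    // no warning: the lone zero is the one exception
var b = 0123; // warning: octal literal beginning with zero (value 83)
var c = 0789; // warning: decimal literal beginning with zero (value 789)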


Commentary from 2020

I still hate octal. Fortunately it seems to have fallen out of favour when designing new programming languages.

A commenter asked how warnings work in JScript .NET; I noted that in JScript Classic there was a facility to report errors back to the host but not warnings. JScript .NET’s hosting APIs supported both errors and warnings.

Making Sense of HRESULTs

Every now and then — like, say, this morning — someone sends me this mail:

I’m getting an error in my JScript program. The error number is -2147024877. No description. Help!

Making sense of those error numbers requires some delving into the depths of how COM represents errors — the HRESULT.

An HRESULT is a 32 bit unsigned integer where the high bit indicates whether it is an error or a success. The remaining bits in the high word indicate the “facility” of the error — into what broad category does this error fall? The low word indicates the specific error for that facility.

HRESULTs are therefore usually talked about in hex, as the bit structure is a lot easier to read in hex! Consider 0x80070013, for example. The high bit is set, so this is an error. The facility code is 7 and the error code is 0x0013 = 19 in decimal.

Unfortunately, JScript interprets the 32 bit error code as a signed integer and displays it in decimal. No problem — just convert that thing back to hex, right?

var x = -2147024877;
print(x.toString(16))

Whoops, not quite. JScript doesn’t know that you want this as an unsigned number, so it converts it to a signed hex number, -0x7ff8ffed. We need to convert this thing to the value it would have been had JScript interpreted it as an unsigned number in the first place. A handy fact to know is that the difference between an unsigned number interpreted as a signed number and the same number interpreted as an unsigned number is always 0x100000000 if the high bit is set, 0 otherwise.

var x = -2147024877;
print((x<0?x+0x100000000:x).toString(16))

There we go. That prints out 80070013. Or, even better, we could just write a program that takes the error apart:

function DumpHR(hr)
{
  // JScript hands us the HRESULT as a signed 32 bit integer;
  // recover the unsigned value before taking it apart.
  if (hr < 0) hr += 0x100000000;
  // The high bit is the severity: set means error.
  if (hr & 0x80000000)
    print("Error code");
  else
    print("Success code");
  // The facility is the low fifteen bits of the high word...
  var facility = (hr & 0x7FFF0000) >> 16;
  print("Facility: " + facility);
  // ...and the specific error is the low word.
  var scode = hr & 0x0000FFFF;
  print("SCode: " + scode);
}
DumpHR(-2147024877);

The facility codes (in decimal) are as follows:

FACILITY_NULL 0
FACILITY_RPC 1
FACILITY_DISPATCH 2
FACILITY_STORAGE 3
FACILITY_ITF 4
FACILITY_WIN32 7
FACILITY_WINDOWS 8
FACILITY_SECURITY 9
FACILITY_CONTROL 10
FACILITY_CERT 11
FACILITY_INTERNET 12
FACILITY_MEDIASERVER 13
FACILITY_MSMQ 14
FACILITY_SETUPAPI 15
FACILITY_SCARD 16
FACILITY_COMPLUS 17
FACILITY_AAF 18
FACILITY_URT 19
FACILITY_ACS 20
FACILITY_DPLAY 21
FACILITY_UMI 22
FACILITY_SXS 23
FACILITY_WINDOWS_CE 24
FACILITY_HTTP 25
FACILITY_BACKGROUNDCOPY 32
FACILITY_CONFIGURATION 33
FACILITY_STATE_MANAGEMENT 34
FACILITY_METADIRECTORY 35

So you can see that our example is a Windows operating system error (facility 7), and looking up error 19 we see that this is ERROR_WRITE_PROTECT — someone is trying to write to a write-protected floppy probably.
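
If you find yourself dumping a lot of HRESULTs, it may be handy to bolt a name lookup onto DumpHR; a minimal sketch using a few entries from the table above:

// Map facility numbers to names; extend with the rest of the table as needed.
var facilityNames = { 0 : "NULL", 2 : "DISPATCH", 4 : "ITF", 7 : "WIN32", 10 : "CONTROL" };
function FacilityName(hr)
{
  if (hr < 0) hr += 0x100000000;          // recover the unsigned value
  var facility = (hr & 0x7FFF0000) >> 16; // low fifteen bits of the high word
  if (facility in facilityNames)
    return facilityNames[facility];
  return "unknown facility " + facility;
}
print(FacilityName(-2147024877)); // WIN32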

All the errors generated by the script engines — syntax errors, for example — are FACILITY_CONTROL, and the error numbers vary between script engines. VB also uses FACILITY_CONTROL, but fortunately VBScript assigns the same meanings to the errors as VB does. But in general, if you get a FACILITY_CONTROL error you need to know what control generated the error — VBScript, JScript, a third party control, what? Because each control can define its own errors, and there may be collisions.

Finally, here are some commonly encountered HRESULTs:

  • E_UNEXPECTED 0x8000FFFF “Catastrophic failure” — something completely unexpected has happened
  • E_NOTIMPL 0x80004001 “Not implemented” — the developer never got around to writing the method you just called!
  • E_OUTOFMEMORY 0x8007000E pretty obvious what happened here (remember that out of memory means you ran out of address space, not RAM!)
  • E_INVALIDARG 0x80070057 you passed a bad argument to a method
  • E_NOINTERFACE 0x80004002 COM is asking an object for an interface it does not support. This can happen if you try to script an object that doesn’t support IDispatch.
  • E_ABORT 0x80004004 whatever you were doing was terminated
  • E_FAIL 0x80004005 something failed and we don’t know what.

And finally, here are three that you should see only rarely from script, but script hosts may see them moving around in memory and wonder what is going on:

  • SCRIPT_E_RECORDED 0x86664004 this is how we internally track whether the details of an error have been recorded in the error object or not. We need a way to say “yes, there was an error, but do not attempt to record information about it again.”
  • SCRIPT_E_PROPAGATE 0x80020102 another internal code that we use to track the case where a recorded error is being propagated up the call stack to a waiting catch handler.
  • SCRIPT_E_REPORTED 0x80020101 the script engines return this to the host when there has been an unhandled error that the host has already been informed about via OnScriptError.

That’s a pretty bare-bones look at error codes, but it should at least get you started next time you have a confusing error number.


Commentary from 2020

First off: write-protected floppies were a real thing! Honest!

There were a number of good user comments on this article with advice extending mine:

  • FACILITY_ITF means “the interface you’re calling defines the meaning of the error you’re getting” which can be confusing
  • You can use Windows Calculator in scientific mode to quickly convert decimals to hex DWORDs
  • Look in winerror.h for more predefined error codes
  • The HRPLUS utility is good for HRESULT analysis
  • Visual Studio has an “hr” format specifier that will convert numeric values to their text equivalents. Making a watch on @EAX,hr and @ERR,hr is useful! @ERR shows the value of a call to GetLastError.
  • For a great explanation of how the script engines propagate errors around, see this SO question.


Constant Folding and Partial Evaluation

A reader asks “is there any reason why VBScript doesn’t change

str = str & "1234567890" & "hello"

to

str = str & "1234567890hello"

since they are both constants?”

Good question.  Yes, there are reasons.

The operation you’re describing is called constant folding, and it is a very common compile-time optimization.  VBScript does an extremely limited kind of constant folding.  In VBScript, these two programs generate exactly the same code at the call site:

const foo = "hello"
print foo

is exactly the same as

print "hello"

That is, the code generated for both says “pass the literal string “hello” to the print subroutine”.  If foo had been a variable instead of a constant then the code would have been generated to say “pass the contents of variable foo…”

But the VBScript code generator is smart enough to realize that foo is a constant, and so it does not generate a by-name or by-index lookup, it just slams the constant right in there so that there is no lookup indirection at all.

The kind of constant folding you’re describing is compile-time evaluation of expressions which have all operands known at compile time. For short, let’s call it partial evaluation. In C++ (or C#) for example, it is legal to say

const int CallMethod = 0x1;
const int CallProperty = 0x2;
const int CallMethodOrProperty = CallMethod | CallProperty;

The C++ compiler is smart enough to realize that it can compute the third value itself.  VBScript would produce an error in this situation, as the compiler is not that smart.  Neither VBScript nor JScript will evaluate constant expressions at compile time.

An even more advanced form of constant folding is to determine which functions are pure functions — that is, functions which have no side effects, where the output of the function depends solely on the arguments passed in.  For example, in a language that supported pure functions, this would be legal:

const Real Pi = 3.14159265358979;
const Real Sine60 = sine( Pi / 3);  // Pi / 3 radians = 60 degrees

The sine function is a pure function — there’s no reason that it could not be called at compile time to assign to this constant.  However, in practice it can be very difficult to identify pure functions, and even if you can, there are issues in calling arbitrary code at compile time — like, what if the pure function takes an hour to run?  That’s a long compile!  What if it throws exceptions?  There are many practical problems.

The JScript .NET compiler does support partial evaluation, but not pure functions.  The JScript .NET compiler architecture is quite interesting.  The source code is lexed into a stream of tokens, and then the tokens are parsed to form a parse tree.  Each node in the parse tree is represented by an object (written in C#) which implements three methods: Evaluate, PartialEvaluate and TranslateToIL.

When you call PartialEvaluate on the root of the parse tree, it recursively descends through the tree looking for nodes representing operations where all the sub-nodes are known at compile time.  Those nodes are evaluated and collapsed into simpler nodes.  Once the tree has been evaluated as much as is possible at compile time, we then call TranslateToIL, which starts another recursive descent that emits the IL into the generated assembly.
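
To give the flavour of how such a pass works, here is a toy constant folder written in JScript itself; the node shapes and names here are invented for illustration and bear no relation to the actual JScript .NET sources.

function PartialEvaluate(node)
{
  if (node.kind == "literal")
    return node;
  if (node.kind == "add")
  {
    // Fold the children first, then see whether this node collapses too.
    var left = PartialEvaluate(node.left);
    var right = PartialEvaluate(node.right);
    if (left.kind == "literal" && right.kind == "literal")
      return { kind : "literal", value : left.value + right.value };
    return { kind : "add", left : left, right : right };
  }
  // Variables, calls and so on are unknown at compile time; leave them be.
  return node;
}

// "1" + "2" folds to the literal "12"; a node mentioning a variable would not fold.
var tree = { kind : "add",
             left : { kind : "literal", value : "1" },
             right : { kind : "literal", value : "2" } };
print(PartialEvaluate(tree).value); // 12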

The Evaluate method is there to implement the eval function.  JScript Classic (which everyone thinks is an “interpreted” language) always compiles the script to bytecode and then interprets the bytecode — even eval calls the bytecode compiler in JScript Classic.  But in JScript Classic, a bytecode block is a block of memory entirely under control of the JScript engine, which can release it when the code is no longer callable.

In JScript .NET, we compile to IL which is then jitted into machine code.  If JScript .NET’s implementation of eval emitted IL, then that jitted code would stay in memory until the appdomain went away!  This means that a tight loop with an eval in it is essentially a memory leak in JScript .NET, but not in JScript Classic.  Therefore, JScript .NET actually implements a true interpreter!  In JScript .NET, eval generates a parse tree and does a full recursive evaluation on it.
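
So, for example, a loop like this one (invented for illustration) would pin a million freshly jitted code blocks in memory for the life of the appdomain if eval emitted IL — which is exactly why JScript .NET interprets the parse tree instead:

for (var i = 0 ; i < 1000000 ; ++i)
  eval("i * 2"); // each evaluated string would otherwise become permanent jitted code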

I’m digressing slightly.  You wanted to know why the script engines don’t implement partial evaluation.  Well, first of all, implementing partial evaluation would have made the script engines considerably more complicated for very little performance gain.  And if the author does want this gain, then the author can easily fold the constants “by hand”.

But more important, partial evaluation makes the process of compiling the script into bytecode much, much longer as you need to do yet another complete recursive pass over the parse tree.  That’s great, isn’t it?  I mean, that’s trading increased compilation time for decreased run time.  What could be wrong with that?  Well, it depends who you ask.

From the ASP implementers’ perspective, that would indeed be great.  An ASP page, as I’ve already discussed, only gets compiled once, on the first page hit, but might be run many times.  Who cares if the first page hit takes a few milliseconds longer to do the compilation, if the subsequent million page hits each run a few microseconds faster?  And so what if this makes the VBScript DLL larger?  ASP updates are distributed to people with fast internet connections.

But from the IE implementers’ perspective, partial evaluation is a step in the wrong direction.  ASP wants the compilation to go slow and the run to go fast because they are generating the code once, calling it a lot, and generating strings that must be served up as fast as possible.  IE wants the compilation to be as fast as possible because they want as little delay as possible between the HTML arriving over the network and the page rendering correctly.  They’re never going to run the script again after it’s generated once, so there is no amortization of compilation cost.  And IE typically uses scripts to run user interface elements, not to build up huge strings as fast as possible.  Every microsecond does NOT count in most UI scenarios — as long as the UI events are processed just slightly faster than we incredibly slow humans can notice the lag, everyone is happy.

I’m not stringing you along, honest

JScript and VBScript are often used to build large strings full of formatted text, particularly in ASP. Unfortunately, naïve string concatenations are a major source of performance problems.

Before I go on, I want to note that it may seem like I am contradicting my earlier post by advocating some “tips and tricks” to make string concatenation faster.  Do not blindly apply these techniques to your programs in the belief that they will magically make your programs faster!  You always need to first determine what is fast enough, next determine what is not fast enough, and THEN try to fix it!

JScript and VBScript use the aptly-named naïve concatenation algorithm when building strings. Consider this silly JScript example:

var str = "";
for (var count = 0 ; count < 100 ; ++count)
  str = "1234567890" + str;

The result string has one thousand characters so you would expect that this would copy one thousand characters into str. Unfortunately, JScript has no way of knowing ahead of time how big that string is going to get and naïvely assumes that every concatenation is the last one.

On the first loop str is zero characters long. The concatenation produces a new ten-character string and copies the ten-character string into it. So far ten characters have been copied. On the second time through the loop we produce a new twenty-character string and copy two ten-character strings into it. So far 10 + 20 = 30 characters have been copied.

You see where this is going. On the third time thirty more characters are copied for a total of sixty, on the fourth forty more for a total of one hundred. Already the string is only forty characters long and we have copied more than twice that many characters. By the time we get up to the hundredth iteration over fifty thousand characters have been copied to make that one thousand character string. Also there have been ninety-nine temporary strings allocated and immediately thrown away.
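
If you like, you can verify that claim with a little arithmetic; here is the closed form as a quick JScript sketch.

// The naive algorithm copies k + 2k + 3k + ... + nk characters to build
// a string out of n chunks of k characters each: k * n * (n + 1) / 2.
function NaiveCopyCost(k, n)
{
  return k * n * (n + 1) / 2;
}
print(NaiveCopyCost(10, 100)); // 50500 — “over fifty thousand” indeed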

Moving strings around in memory is actually pretty darn fast, though 50000 characters is rather a lot.  Worse, allocating and releasing memory is not cheap.  OLE Automation has a string caching mechanism and the NT heap is pretty good about these sorts of allocations, but still, large numbers of allocations and frees are going to tax the performance of the memory manager.

If you’re clever, there are a few ways to make it better.  (However, like I said before, always make sure you’re spending time applying cleverness in the right place.)

One technique is to ensure that you are always concatenating small strings to small strings, large strings to large strings.  Pop quiz: what’s the difference between these two programs?

for (var count = 0 ; count < 10000 ; ++count)
  str += "1234567890" + "hello";

and

For count = 1 To 10000
  str = str & "1234567890" & "hello"
Next


?

I once had to debunk a widely distributed web article which claimed that VBScript was orders of magnitude slower than JScript because the comparison that the author used was to compare the above two programs.  Though they produce the same output, the JScript program is much faster.  Why’s that?  Because this is not an apples-to-apples comparison. The VBScript program is equivalent to this JScript program:

for (var count = 0 ; count < 10000 ; ++count)
  str = (str + "1234567890") + "hello";

whereas the JScript program is equivalent to this JScript program

for (var count = 0 ; count < 10000 ; ++count)
  str = str + ("1234567890" + "hello");

See, the first program does two concatenations of a small string to a large string in one line, so the entire text of the large string gets moved twice every time through the loop.  The second program concatenates two small strings together first, so the small strings move twice but the large string only moves once per loop.  Hence, the first program runs about twice as slow.  The number of allocations remains unchanged, but the number of bytes copied is much lower in the second.

In hindsight, it might have been smart to add a multi-argument string concatenation opcode to our internal script interpreter, but the logic actually gets rather complicated both at parse time and run time.  I still wonder occasionally how much of a perf improvement we could have teased out by adding one.  Fortunately, as you’ll see below, we came up with something better for the ASP case.

The other way to make this faster is to make the number of allocations smaller, which also has the side effect of not moving the bytes around so much.

var str = "1234567890"; // 10
str = str + str;        // 20
var str4 = str + str;   // 40
str = str4 + str4;      // 80
str = str + str;        // 160
var str32 = str + str;  // 320
str = str32 + str32;    // 640
str = str + str32;      // 960
str = str + str4;       // 1000

This program produces the same result, but with 8 allocations instead of 100, and only moves 3230 characters instead of 50000+.   However, this is a rather contrived example — in the real world strings are not usually composed like this!

Those of you who have written programs in languages like C where strings are not first-class objects know how to solve this problem efficiently.  You build a buffer that is bigger than the string you want to put in, and fill it up.  That way the buffer is only allocated once and the only copies are the copies into the buffer.  If you don’t know ahead of time how big the buffer is, then a double-when-full strategy is quite optimal — pour stuff into the buffer until it’s full, and when it fills up, create a new buffer twice as big.  Copy the old buffer into the new buffer and continue.  (Incidentally, this is another example of one of the “no worse than 200% of optimal” strategies that I was discussing earlier — the amount of used memory is never more than twice the size of the memory needed, and the number of unnecessarily copied bytes is never more than twice the size of the final buffer.)

Another strategy that you C programmers probably have used for concatenating many small strings is to allocate each string a little big, and use the extra space to stash a pointer to the next string.  That way concatenating two strings together is as simple as sticking a pointer in a buffer.  When you’re done all the concatenations, you can figure out the size of the big buffer you need, and do all the allocations and copies at once.  This is very efficient, wasting very little space (for the pointers) in common scenarios.

Can you do these sorts of things in script?  Actually, yes.  Since JScript has automatically expanding arrays you can implement a quick and dirty string builder by pushing strings onto an array, and when you’re done, joining the array into one big string — see the sketch below.  In VBScript it’s not so easy because arrays are fixed-size, but you can still be pretty clever with fixed size arrays that are redimensioned with a “double when full” strategy.  But surely there is a better way than these cheap tricks.
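
Here is what that quick and dirty JScript builder might look like, rewriting the earlier ten-thousand-iteration loop (a sketch, not a tuned library):

// Push the pieces onto an automatically expanding array, then join once
// at the end: one copy per piece plus one final copy, instead of copying
// the whole accumulated string on every iteration.
var pieces = [];
for (var count = 0 ; count < 10000 ; ++count)
  pieces.push("1234567890" + "hello");
var str = pieces.join("");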

Well, in ASP there is.  You know, I used to see code like this all the time:

str = "<blah>"
str = str + blah
str = str + blahblah
str = str + whatever
' the string gets longer and longer, we have some loops, etc.
str = str + "</blah>"
Response.Write str

Oh, the pain.  The Response object is an efficient string buffer written in C++.  Don’t build up big strings, just dump ’em out into the HTML stream directly.  Let the ASP implementers worry about making it efficient.
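
In other words, something like this (the JScript flavour is shown; Response here is the ASP intrinsic from the snippet above, and the same rewrite works in VBScript):

Response.Write("<blah>");
Response.Write(blah);
Response.Write(blahblah);
Response.Write(whatever);
// ... loops and so on, writing each piece as it is produced ...
Response.Write("</blah>");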

“Hold on just a minute, Mister Smartypants,” I hear you say. “Didn’t you just tell us last week that eliminating calls to COM objects is usually a better win than micro-optimizing small stuff like string allocations?”

Yes, I did.  But in this case, that advice doesn’t apply because I know something you don’t know. We realized that all the work that the ASP implementers did to ensure that the string buffer was efficient was being overshadowed by the inefficient late-bound call to Response.Write.  So we special-cased VBScript so that it detects when it is compiling code that contains a call to Response.Write and there is a named item in the global namespace called Response that implements IResponse::Write — and in that case it generates a direct early-bound call to IResponse::Write instead of the expensive late-bound call.


Commentary from 2019

I have no memory of why I did not link to Joel’s article on the same thing; I was certainly aware of it in 2003.

There were a number of good reader questions:

  • Is the version of Mid that goes on the left side of an assignment not supported in VBScript?

That’s correct. We never got around to it and it was not a high priority request from any users. It’s super weird, and would have been a small bang compared to the bucks spent on it.

  • Does a double-when-full strategy actually waste memory? It might never be touched and therefore never committed.

In theory we could build a double-when-full strategy that allocated pages and didn’t hit the pages until they were written to, but that doesn’t buy you much. First, the pages are still reserved, so they’re still eating virtual memory, which is the scarce resource in 32 bit processes; physical memory isn’t the problem. Second, we use malloc, and the implementation we’re using calls HeapAlloc, and it commits.

  • Can strings be constant-folded in VBScript or JScript?

In theory, sure. In practice, they are not.

  • Is there any limit on the size of a string?

VBScript and JScript both used BSTRs internally to store strings, which are limited to two billion bytes, which is one billion characters. Since there is only a two billion byte user addressable space in 32 bit Windows, that means that in practice there is no limitation other than available address space.


The Malware of Ultimate Destruction

The other day Peter was talking about the ActiveX Control of Ultimate Destruction — a hostile control which, the moment it is loaded, immediately formats your hard disk. The aim of the ACoUD is to do “as much damage as possible in a short amount of time”.

Well, Peter’s not the only one who’s kept up at night worrying about this stuff.  Last night I couldn’t sleep because I was thinking about how that characterization of the ACoUD really isn’t bad enough.  If this is going to be the ULTIMATE in destruction, let’s think about just how evil we can get.

For the purposes of this discussion, let’s not care about how the evil code gets on your machine.  Perhaps you download and trust a malicious ActiveX control.  Perhaps a worm takes advantage of a buffer overrun in the operating system.  Perhaps you got an email virus, or ran a bad macro, or whatever.  Let’s just call all those things malware.  Furthermore, let’s assume that all attempts to stop the malware from running — like never logging in as administrator, and so on — have failed and that the malware has elevated its privilege to administrator.  Let’s assume that the malware author is brilliant and has unlimited time to craft something incredibly devious.  Remember, we’re going for the ultimate here.

Here’s the worst I could come up with:

  • When the malware is run, first it waits until some point when no one is using the machine.
  • When the coast is clear, it compresses and backs up the entire hard disk.
  • It then installs a minimal linux kernel on the box along with a cleverly written Windows emulator.
  • The state of the emulator is set up to exactly mimic the state of the machine as it was before the attack.
  • The linux boot sequence is rewritten to exactly resemble the Windows boot sequence, except that of course what is really happening is that linux is loading a windows emulator during the boot.

The net result: you are not even running Windows anymore so nothing is trustworthy. The emulator could be logging every keystroke, sending your files to Brazil, whatever the emulator writer wants.  The emulator could be reporting that no, there is no linux boot partition on this disk!  You don’t own that box anymore.  The chain of trust has been broken. 

How could you detect this attack?  Since you’re not running Windows, you can’t assume that the operating system will report anything about the machine correctly.  You’d have to boot off of trustworthy media and run utility programs to examine the real state of the disk.

How could you prevent this ultimate attack?  Remember, we’re assuming that all the usual good stuff has failed, like keeping up-to-date with patches, not running as administrator, maintaining firewalls, not opening suspicious email attachments, and so on.  What is the final line of defense against this sort of ultimate malware?  Really the only line of defense that remains is the hardware.  To solve this problem the chain of trust needs to be rooted in the hardware, so that when the machine boots it can tell you whether you are loading code that has been signed by a trusted authority or not.  The possibility of constructing such chips has met with considerable controversy over the last few years, and it remains to be seen whether they are technically and economically feasible.

Regardless, the point is that though this is in many ways a ridiculous “overkill” attack, it is in principle possible. This is why trustworthy computing is so incredibly important.  At the very least, you need to have confidence that when you boot up your machine, you are actually running the operating system that you installed!

I was thinking about all of this last night because of the recent successful attack against Valve, a local software company that made the popular video game “Half Life”.  I don’t know the exact details — and probably no one does except for the attackers who perpetrated the attack — but what seems likely is that attackers exploited a known vulnerability in Outlook, and a high-ranking Valve employee was vulnerable to the attack.  The malware installed a key press logger, and from that point, it was pretty much game over, so to speak.  By monitoring key presses they’d quickly learn all kinds of details such as the administrator passwords to other machines, compromise them, and eventually manage to “own” as much of the network as possible.  The attackers didn’t have to emulate all of Windows, they just had to tweak it a tiny bit by installing a key logger.

The fact that this underscores the importance of keeping up to date on patches is not particularly relevant, and I do not ever want to blame the victim for failing to patch a machine.  The important point which this illustrates is that there is a spectrum of malware out there.  Consider the Blaster worm, which simply tries to cause as much havoc as possible and spread as fast as possible — that thing wasn’t targeted against anyone in particular, and though it was very costly, it was akin to a hurricane that blows through and wrecks a lot of stuff.  But it certainly announces itself.  I mean, it might as well put up a big dialog box that says YOU ARE OWNZORD BY BLASTER — Blaster was about as subtle as a brick through a window.

The Valve attackers were far cleverer and subtler. Their attack was focused on a particular individual at a particular company and depended on slowly and carefully gathering the information needed to launch further attacks, avoiding detection until the job was finished.  You can rapidly write and disseminate a virus checker for “broad distribution” worms, viruses and Trojans, but it is very hard to write a “virus checker” for custom built malware that only attacks a single person!

This second kind of attack I suspect is far, far more common than the first and ultimately costlier.  But since the first kind is by its nature highly visible, and the second is by its nature as invisible as possible, the former gets a lot more press.

We need to solve this problem and produce a more trustworthy digital infrastructure.  It will not happen overnight, but I am very confident that we are on the right track.



Commentary from 2019

The “Peter” in question is of course my scripting partner in crime Peter Torr; unfortunately, I cannot find the original blog post either on MSDN or in the wayback machine.

A number of readers pointed out that indeed, “spear phishing” — crafting an attack against a specific, high-value target — is the truly dangerous attack, not the widespread chaos of viruses. Moreover, these attacks may be under-reported in the media; no one wants to advertise that they’ve been hacked. And once you are “rooted”, there’s not much you can do but burn the whole thing down and start over.

Looking back, it certainly betrays my bias as a Microsoft employee that the worst thing I could think of was running a non-Windows operating system and not knowing it! And unfortunately, it appears that we’ve made little progress as an industry in building a trustworthy computing platform.


How Bad Is Good Enough?

I keep talking about script performance without ever actually giving my rant about why most of the questions I get about performance are pointless at best, and usually downright harmful.

Let me give you an example of the kind of question I’ve gotten dozens of times over the last seven years.  Here’s one from the late 1990s:

We have some VBScript code that DIMs a number of variables in a well-used function.  The code never actually uses those variables and they go out of scope without ever being touched.  Are we paying a hidden price with each call?

What an interesting performance question!  In a language like C, declaring n bytes total of local variables just results in the compiler generating an instruction that moves the stack pointer n bytes.  Making n a little larger or smaller doesn’t change the cost of that instruction.  Is VBScript the same way?  Surprisingly, no!  Here are my analyses:


Bad Analysis #1

You Dim it, you get it.  VBScript has no idea whether you’re going to do this or not:

Function foo()
  Dim Bar
  Execute("Bar = 123")
End Function

In order to enable this scenario the script engine must at runtime bind all of the names of the local variables into a local binder.  That causes an added per-variable-per-call expense.

(Note that JScript .NET does attempt to detect this scenario and optimize it, but that’s another post.)

Anyway, what is the added expense?  I happened to have my machine set up for perf measuring that day, so I measured it:

On my machine, every additional variable which is dimensioned but not used adds a 50 nanosecond penalty to every call of the function.  The effect appears to scale linearly with the number of unused dimensioned variables; I did not test scenarios with extremely large numbers of unused variables, as these are not realistic scenarios.  Note also that I did not test very long variable names; though VBScript limits variable names to 256 characters, there may well be an additional cost imposed by long variable names.

My machine is a 927 MHz Pentium III, so that’s somewhere around fifty processor cycles each.  I do not have VTUNE installed right now, so I can’t give you an exact processor cycle count.

That means that if your heavily used function has, say, five unused variables then every four million calls to your function will slow your program down by an entire second, assuming of course that the target machine is my high-end dev machine.  Obviously a slower machine may exhibit considerably worse performance.

However, you do not mention whether you are doing this on a server or a client.  That is extremely important when doing performance analysis!

Since the penalty is imposed due to a heap allocation, the penalty on the server may scale differently based on the heap usage of other threads running in the server. There may be contention issues – my measurements measured only “straight” processor cost; a full analysis of the cost for, say, an 8 proc heavily loaded server doing lots of small-string allocations may well give completely different results.

Now let me take this opportunity to tell you that all the analysis I just described is almost completely worthless because it obscures a larger problem.  There’s an elephant in the room that we’re ignoring.  The fact that a user is asking me about performance of VBScript tells me that either

(a) this user is a hard-core language wonk who wants to talk shop, or, more likely,

(b) the user has a program written in script which they would like to be running faster.  The user cares deeply about the performance of their program.

Whoa!  Now we see why this perf analysis is worthless.  If the user cares so much about performance then why are they using a late-bound, unoptimized, bytecode-interpreted, weakly-typed, extremely dynamic language specifically designed for rapid development at the expense of runtime performance?


Bad Analysis #2

If you want a script to be faster then there are way more important things to be optimizing away than the 50-nanosecond items.  The key to effective performance tuning is finding the most expensive thing and starting with that.

A single call that uses an undimensioned variable, for example, is hundreds of times more expensive than that dimensioned-but-unused variable.  A single call to a host object model method is thousands of times more expensive. I could list examples all day.

Optimizing a script by trimming the 50 ns costs is like weeding your lawn by cutting the already-short grass with nail scissors and ignoring the weeds.  It takes a long time, and it makes no noticeable impact on the appearance of your lawn.  It epitomizes the difference between “active” and “productive”. Don’t do that!

But even better advice than that would be to throw away the entire script and start over in C if performance is so important.

Now, let me just take this opportunity to interrupt myself and say that yes, script performance is important.  We spent a lot of time optimizing the script engines to be pretty darn fast for late-bound, unoptimized, bytecode-interpreted, weakly-typed dynamic language engines. Eventually you come up against the fact that you have to pick the right tool for the job — VBScript is as fast as it’s going to get without turning it into a very different language or reimplementing it completely.


Unfortunately, this second analysis is hardly better than the first, because again, there is an elephant in the room.  There’s a vital piece of data which has not been supplied, and that is the key to all perf analysis:

How Bad Is Good Enough?


I was going easy on myself — I actually consider this sort of “armchair” perf analysis to be not merely worthless, but actively harmful.

I’ve read articles about the script engines that say things like “you should use And 1 to determine whether a number is even rather than Mod 2 because the chip executes the AND instruction faster”, as though VBScript compiled down to tightly optimized machine code. People who base their choice of operator on utterly nonsensical rationales are not going to write code that is maintainable or robust.  Those programs end up broken, and “broken” is the ultimate in bad performance, no matter how fast the incorrect program is.
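
For the record, here is that “tip” translated into JScript; both lines compute the same thing, and in a bytecode-interpreted engine the overhead of dispatching either operator swamps any difference between the underlying machine instructions:

var n = 42;
var isEvenClever = (n & 1) == 0; // the “clever” version
var isEvenPlain = (n % 2) == 0;  // the ordinary version — use whichever reads better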

If you want to write fast code — in script or not — then ignore every article you ever see on “tips and tricks” that tell you which operators are faster and what the cost of dimensioning a variable is.  Writing fast code does not require a collection of cheap tricks, it requires analysis of user scenarios to set goals, followed by a rigorous program of careful measurements and small changes until the goals are reached.

What should you do? Here’s what you should do:

  1. Have user-focussed goals. Know what your performance goals are, by knowing what is acceptable to your users.
  2. Know what to measure to test whether those goals are met.  Are you worried about throughput?  Time to first byte?  Time to last byte?  Scalability?
  3. Measure the whole system, not just isolated parts. Measure carefully and measure often. You have to actually do the measurements!

That’s what the MSN people do, and they know about scalable web sites.

I know that’s not what people want to hear.  People have these ideas about performance analysis which, as far as I can tell, last applied to PDP-11s.  Script running on web servers cannot be optimized through micro-optimization of individual lines of code — it’s not C, where you can know the exact cost of every statement. With script you’ve got to look at the whole thing and attack the most expensive things. Otherwise you end up doing a huge amount of work for zero noticeable gain.

You’ve got to know what your goals are.  Figure out what is important to your users.  Applications with user interfaces have to be snappy — the core processing can take five minutes or an hour, but a button press must result in a UI change in under .2 seconds to not feel broken.  Scalable web applications have to be blindingly fast — the difference between 25 ms and 50 ms is 20 pages a second.  But what’s the user’s bandwidth?  Getting the 10kb page generated 25 ms faster will make little difference to the guy with the 14000 bps modem.

Once you know what your goals are, measure where you’re at.  You’d be amazed at the number of people who come to me asking for help in making their things faster who cannot tell me how they’ll know when they’re done.  If you don’t know what’s fast enough, you could work at it forever.

And if it does turn out that you need to stick with a scripting solution, and the script is the right thing to make faster, look for the big stuff.  Remember, script is glue.  The vast majority of the time spent in a typical page is in either the objects called by the script, or in the Invoke code setting up the call to the objects.

If you had to have one rule of scripting performance, it’s this:  manipulating data is really bad, and code is even worse. Don’t worry about the Dims, worry about the calls.  Every call to a COM object that you eliminate is worth tens of thousands of micro-optimizations.

And don’t forget also that right is better than fast.  Implement the code to be extremely straightforward. Code that makes sense is code which can be analyzed and maintained, and that makes it performant. 

Consider our “unused Dim” example — the fact that an unused Dim has a 50 ns cost is irrelevant.  It’s an unused variable.  It’s worthless code. Worse than worthless: it’s a distraction to maintenance programmers.  That’s the real performance cost — it makes it harder for the devs doing the perf analysis to do their jobs well!


Reflections from 2019:

This was a popular article, but I later realized that it was too specifically targeted towards script engine performance. I later wrote a much more general rant.

Jeff Atwood mentioned this article a few times over the years, and told an interesting war story.

There were a few good comments; paraphrasing:

If you’re worried about the user with the 14K modem, it’s much better to get your page down to 6kb than to make anything on the server faster.

Absolutely; find the bottleneck! Of course it seems crazy nowadays that we ever worried about people with 14kbps modems, but back in the day we absolutely had to.

My first modem was 300 bits per second; that’s slow enough to watch the characters render one by one on the screen. Nowadays of course we think nothing of putting megabytes of images and video on any old web page.

Since invoking is one of the slowest operations in script, a good rule of thumb is to eliminate the dots. mybutton.name is better than window.document.forms[0].mybutton.name if you’re calling it in a tight loop.

Though that sounds like one of the “tips and tricks” that I say to avoid, this is reasonably good advice provided that you do everything else: make a performance goal, measure your performance, test whether making these changes improves performance.
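
As a sketch of what “eliminating the dots” buys you (the object names here come from the comment above and assume the browser object model):

// Each dot is a late-bound property lookup in script, so walk the chain
// once, outside the loop, and reuse the result.
var theButton = window.document.forms[0].mybutton;
var theName;
for (var i = 0 ; i < 100000 ; ++i)
  theName = theButton.name;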


Long jumps considered way more harmful than exceptions

Bob Congdon’s blog points out that in the dark days before exception handling you could always use setjmp and longjmp to do non-local gotos.

In fact, the script engines are compiled in C++ with exception handling turned off (for performance reasons), and the mainline loop of the bytecode interpreter uses setjmp/longjmp exception handling to implement error handling.  When you have a script that calls an object that returns an error, we longjmp back to the start of the interpreter loop and then figure out what to do next.

In VBScript of course it depends on whether On Error Resume Next is on or not, and in JScript we construct an exception object and start propagating it back up the stack until we find an interpreter frame that has a catch block.  (If there are multiple script engines on the stack then things get extremely complicated, so I won’t even go there.)

Since a long jump does not call any destructors, it was very important that we design our interpreter loop to not put anything on the system stack that required destructing.  Fortunately, since we were designing the interpreter to be an interpreter for a garbage-collected language, it was pretty easy.  Everything that the interpreter does that requires memory either takes the memory out of the area reserved for the script’s stack (which will be cleaned up when the frame goes away) or heap-allocates it and adds the memory to the garbage collector.

Not everyone has the luxury of having a longjmp-safe garbage collector already implemented, so kids, don’t try this at home!  If you must use exception handling in C++, take my advice and use real C++ exception handling.


Reflections from 2019:

As I noted in my previous article on this subject, it’s important to think about what kind of costs exception handling imposes; why did we turn off exception handling in the C++ compiler to get a performance win?

The overall cost, not the per-exception cost, killed our performance on the server. Exception handling adds additional code to every function, and that additional code has a nonzero run time. Since the purpose of the code — handling exceptions — was never going to be fulfilled because we never threw an exception, we turned it off.

A reader asked me for the numbers justifying that decision, which I had measured on a Pentium 1 seven years prior to writing this article in 2003, so I declined to speculate. But it was significant.

Some people consider the “X considered harmful” trope to be overplayed or, ironically, harmful. Obviously I don’t feel the same way, as I’ve used it many times. It’s playful, it points to history, and it takes a stand. Like anything, it can be overused.

VBScript : VB :: ping-pong : volleyball

It’s my girlfriend Leah’s 30th birthday today! Happy birthday Leah!

Leah is a tester in the Mobility division here at Microsoft, where she works on the software that moves email between servers and cellphones. Right now she and her fellow testers are boning up on their C# skills so that they can write better testing automation scripts. Surprisingly, a lot of the sync software testing is still done “by hand”, rather than writing a program to run the same test suites over and over again every day.

The other day Leah pointed out to me something that I’ve been intending to write about — that often it really doesn’t matter what language you learn.  For many programmers, the language is just the thing that stands between you and the object model.  People writing application automation need to be able to create instances of classes, call methods, listen to events, run loops, and dump strings to log files.  Learning the syntax for those operations in JScript, VBScript, VB, C#, C++, perl (shudder), Python, etc, is trivial.  The complexity lies in understanding the object model that you’re manipulating so that you can use it well to test the product.

The same goes for the vast majority of scripting applications.  People ask me “Eric, I’m a newbie programmer — should I learn VBScript or JScript?”  I tell them that script is glue and that what matters is that you glue the right things together, not that you pick the right glue.

That’s not to say that there aren’t important differences between languages.  As I mentioned the other day, some languages are designed to support programming in the large, some are designed to facilitate rapid development of small, throwaway programs. Some are for writing device drivers, some are for research, some are for science, some are for artificial intelligence, some are for games.  If you have complex structures that you wish to model, it’s a good idea to pick a language that models them well.  Prototype classes (JScript) are quite different from inheritance classes (C#), which in turn are different from simple record classes (VBScript).  But my point is that by the time you’re ready to write programs that require these more advanced features, you’ll be able to pick up new languages quickly anyway.

And this is also not to say that automation testing is just glue code. I’ve had many long conversations with the testers on my team on the subject of writing automation tools.  When you move from the low level automation code (call this method, did we get the expected response?) to the higher-level (run tests 10, 13 and 15 against the VB and C# code generators for the Word and Excel object models on 32 and 64 bit machines but omit the 64 bit flavours of test 15, we already know that it’s broken) then you start to run into much deeper problems that may require their own object models to represent.  Or even their own languages!  One of the testers I work with is kicking around some ideas for a “test run definition language” that would cogently express the kinds of massive test matrices that our testers struggle with.

But these are not newbie programmer problems.  If you’re just getting into this whole scripting thing, pick a language and use it to learn the object model inside-out.  Once you know how the OM works, doing the same thing in a different language should be pretty straightforward.

It’s kind of like table tennis.  If you know the rules of table tennis, learning the rules of real tennis is pretty easy — it’s table tennis, just with a larger board.  And you stand on the board, and the ball is bigger, as are the racquets.  But, as George Carlin said, it’s basically the same game.  And if you know tennis, volleyball is pretty easy — it’s just tennis with nine people on a side and you can hit the ball three times and it can’t hit the ground.  And there are no racquets, and the ball is bigger and the net is higher, and you play to 21 points. But it’s basically the same game.

OK, maybe that’s not such a good analogy.  But you take my point, I’m sure.  Don’t stress about choice of language, but learn the object model cold.  The question shouldn’t be “what language should I learn” but rather “what object framework solves my problem, and what languages are designed to efficiently use that framework?”


Reflections from 2019:

Leah is of course no longer my girlfriend, as we’ve been married since 2005. The mobile devices division at Microsoft has been reorganized so many times that any vestige of its 2003 state is I’m sure long gone.

It was a fascinating time to be involved in mobile devices; the iPhone was still years away, and there was real competition for who was going to be a player in this market. There was always a tension between the hardware manufacturers, the operating system vendors like Microsoft, and the cell service providers; each one had an interest in differentiating their offerings from their competitors while commoditizing the others.

That is to say, AT&T wished desperately to be seen as different than, say, Sprint, but not differentiated at all on the basis of the quality of the hardware or the software; they wanted people to differentiate based on cell service, since that is what they provided. The hardware manufacturers were in the same boat; they wanted customers to differentiate on the basis of hardware features, not what networks were supported. And the same for Microsoft, which wanted customers to think about the OS provider as the differentiating factor when making a purchasing decision, not the hardware or the cell service.

This misalignment of interests led to some considerable lack of trust between the various parties — who ought to have been collaborating for each other’s mutual success.

This article produced a number of thoughtful reader responses:

Sometimes it does matter what glue you pick, and for those times, that’s why we have http://www.thistothat.com.

No doubt.

Explaining the difference between client and server programming for VBScript, VB, VBA, JScript, JavaScript, JSP, J2EE and Java to someone who doesn’t understand is too hard.

No doubt!

Are there any applications written in JScript?

My response was that I knew of zero shrink-wrapped applications that were 100% JS, and I knew lots and lots of applications where some part of the application was written in JS. Of course, the idea of “shrink-wrapped app” is now hopelessly stone-aged.

That’s a George Carlin bit

When I originally wrote this article I had forgotten that the bit I’d alluded to was Carlin. The original was: “Tennis is nothing but a form of ping-pong. Tennis is ping-pong played while standing on the table. In fact, all net sports are just derivatives of ping-pong; even volleyball is racquetless team ping-pong played with an inflated ball and a raised net while standing on the table.”

I note that golf is just ping-pong with a huge, grassy board that you stand on, sand and trees instead of a net, you have clubs instead of paddles, the scoring system is slightly different, and you try to get the ball into a tiny hole rather than past the opponent. But basically it is the same.

If the novice programmer should focus on learning the object model, then surely a relevant question is “how much junk does the language make you put between you and the object model?” Syntax is not just trivial; it can make a difference. My experience is that Java is much easier to learn than VB, for instance.

I took five years of French in high school, and I recently took a beginner Spanish course. And you know what struck me? English is the only language in which the words come IN THE SAME ORDER THAT YOU THINK THEM! I mean, imagine the extra mental effort that French speakers have to go through — they think the thoughts for “the white dog” but then have to re-order those thoughts into “le chien blanc”.

Seriously now, I understand your point. But I came to professional programming having spent many years as a hobbyist using basic-like languages. At the time, languages like VB had a much more “natural” syntax to me; C-like languages were these godawful clunky messes with their |&^ operators and weird bracing rules. I mean, what’s clearer: “}” or “End Function”?

I agree that some languages are easier to learn than others. We designed VBScript and JScript to be easy for novices to pick up quickly, for example. But I don’t buy the argument that VB has a weird syntax. It has oddities, certainly, but in general I find it very pleasant and easy to read VB, given my long background in basic-like languages.

Your sports analogy can be extended; consider trying to explain soccer scoring to a novice compared to tennis scoring. Similarly, some languages go out of their way to be easier for novices to pick up, and some do not.

Sure, I’ll grant that.

Your sports analogy can be extended further; just because someone is an expert in one sport does not mean that their expertise will carry over into another.

Well, we can take analogies too far. I’m a much slower Java and Scala programmer than I am a C# programmer just because I constantly have to look up how to do simple things that I already know how to do in C#. But I don’t see programming in Java or Scala as in any way fundamentally different than programming in C#; the skills that I already have transfer over very well. Even if sometimes what is “the right way” in one language is the wrong way in another.


Dead Trees vs. Bits

Speaking of books, people keep telling me and Peter and Raymond that we should write books based on our blogs.

I probably am going to write another book this winter, but it will have pretty much nothing to do with the stuff in this blog.   The natures of a blog and a book are very different.  Let me list some differences:

Creative control

I can blog what I want, when I want, at what length I want, and can say whatever I want. In particular, I like to ramble on about the scripting technologies — which, though they are widely used, are clearly a 20th century technology.  .NET is the future.  A book has to be on a specific topic, finished by a specific time, at a specific length.  A book has to be about a current technology topic and have a clear beginning-middle-end structure. Books both allow editing and require editing.  Blogs resist editing.

Business model

Books work on the ink-on-dead-trees business model.  Weblogs work on the “bits are free” business model. If I went to a publisher and said “I want to write a short but rambling book about random, arcane, trivial details of the history and internals of a 1996 technology that is presently being made increasingly irrelevant and outmoded by .NET” then the publisher would say “thanks, but no thanks”.   People buy computer books because they have a problem that needs solving, not because they enjoy learning my opinions about proper Hungarian usage.

Books must make money to exist.  My aim for this blog isn’t to make money, it is to dump my vast collection of arcane knowledge into some searchable location.

Scope of readership

My blog is available to everyone in the world with a web browser, and given the subject matter, that’s everyone I want to reach.  Books are available to only the very small number of people who actually buy the book.  If you like my book and you want your friend in Europe to read it, you can’t just send them a link.  Again, books cost money and that limits the potential readership.

Permanence

My book is no longer available because of circumstances beyond my control.  Now, Microsoft isn’t going to go out of business, but if they did, I could just move the blog file to another machine in about five minutes and be back up and running. This blog will be archived and therefore part of the permanent searchable record of knowledge on the internet. The copies of my book in the Library of Congress (and whatever the British equivalent is) aren’t going to help a whole lot of devs.

And finally, apropos of nothing in particular, this is hilarious:  http://mama.indstate.edu/users/bones/WhyIHateWebLogs.html, mostly because it is so self-referential.  One wonders what category the author himself falls into.  Thank goodness my blog falls under one of his acceptable uses of blogs!  I don’t know how I could continue to face myself in the mirror every day without this guy’s approval.


Commentary from 2019:

The most obvious thing I missed in this rant was the rise of electronic books as a viable business model, which mitigates many of the anti-book factors I mentioned here.

Raymond Chen did of course write a book based on his blog. Peter Torr I believe never did.

The book I mentioned that I was going to be working on was my first VSTO book.

I still edit other people’s books, but I am down to mostly my two favourites: Essential C#, and C# In Depth.

I’m still not super bullish on writing more programming books; I feel like in a world where we’re connected to the internet all the time, that writing a book about learning to program is no longer the best approach. Online interactive tutorials seem like a much better way to go; the question is, how to monetize them? It is an enormous amount of work to develop such a curriculum, and that should be compensated.