A Whole Lot Of Nothing

Occasionally I get questions from people who are confused over the semantics of data that are not even there. Usually they’ve written code something like

If Blah = Nothing Then

or

If Blah = Empty Then

or

If Blah = Null Then

all three of which almost certainly do not correctly express the actual intention of the
programmer. Why does VBScript have Null, Nothing and Empty, and what are the differences between them?

Let’s start with Empty. When you declare a variable in C, the variable’s value before the first assignment is undefined:

int index;
printf("%d", index); /* undefined behaviour: could print anything */

In C, the declaration reserves space for the variable, but does not clear the contents of that space. After all, why would it need to? You’re just going to initialize it to some value yourself, right? Why should the compiler waste time by initializing it only to have that initialization overwritten?

That might seem like a sensible attitude if you are one of those people who prefers that your program be twenty or even thirty nanoseconds faster in exchange for causing any accidental use of uninitialized memory to make your program’s behaviour completely random.

The designers of VB knew that their users were not hard core bit twiddling performance wonks, but rather line-of-business developers who prefer a predictable programming environment. Thus, VB initializes variables as they are declared and eats a few processor cycles here and there. When you declare an integer in VB, it’s initialized to zero, strings are initialized to empty strings, and so on.

But what about variants? Should an uninitialized variant be initialized to zero? That seems bogus; why should an uninitialized variant automatically become a number?

Really what we need is a special of-no-particular-type “I’m an uninitialized variant” value, and that’s Empty. And since in VBScript, all variables are variants, all variables are initialized to Empty.

What if in VB you compare an uninitialized variant to an uninitialized integer? It seems sensible that the comparison would return True, and it does. Empty compares as equal to 0 and the empty string, which might cause false positives in our example above. If you need to detect whether a variable actually is an empty variant and not a string or a number, you can use IsEmpty. (Alternatively, you could use TypeName or VarType, but I prefer IsEmpty.)
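The false-positive trap can be sketched in JavaScript. Here Empty is a stand-in for VBScript's Empty value, and vbEq and isEmpty are invented helper names modeling the behaviour of VBScript's = operator and IsEmpty function; this is an illustration, not how the engine is implemented:

```javascript
// Sketch of why "If Blah = Empty" gives false positives. VBScript's =
// treats Empty as 0 when compared to numbers and as "" when compared
// to strings.
const Empty = Symbol("Empty");

function vbEq(a, b) {
  if (a === Empty) a = typeof b === "string" ? "" : 0;
  if (b === Empty) b = typeof a === "string" ? "" : 0;
  return a === b;
}

const isEmpty = (v) => v === Empty; // the reliable test, like IsEmpty

console.log(vbEq(0, Empty));  // true: a genuine zero "equals" Empty!
console.log(vbEq("", Empty)); // true: so does an empty string
console.log(isEmpty(0));      // false: IsEmpty tells them apart
```

A variable that was assigned a real zero or a real empty string passes the equality test, which is almost never what the programmer intended.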

Nothing is similar to Empty but subtly different. Empty says “I am an uninitialized variant”; Nothing says “I am an object reference that refers to no object”. Since the equality operator on objects checks for equality on the default property of an object, any attempt to say

If Blah = Nothing Then

is doomed to failure — Nothing does not have a default property, so this will produce a run-time error. To check to see if an object reference is invalid, use

If Blah Is Nothing Then

Null is weirder still. The semantics of Null are very poorly understood, particularly amongst people who have little experience with relational databases. Empty says “I’m an uninitialized variant”, Nothing says “I’m an invalid object” and Null says “I represent a value which is not known.”

Let me give an example. Suppose you have a database of sales reports, and you ask the database “what was the total of all sales in August?” but one of the sales staff has not reported their sales for August yet. What’s the correct answer? You could design the database to ignore the fact that data is missing and give the sum of the known sales, but that would be answering a different question. The question was not “what was the total of all known sales in August, excluding any missing data?” The question was “what was the total of all sales in August?” The answer to that question is “I don’t know — there is data missing”, so the database returns Null.
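The semantics described above can be sketched in a few lines of JavaScript. The totalSales helper and the sample data are invented for illustration; a real database engine does this internally:

```javascript
// Sketch of "missing data poisons the aggregate": if any report is
// missing (null), the true total is unknown, so the answer is null.
function totalSales(reports) {
  let total = 0;
  for (const amount of reports) {
    if (amount === null) {
      return null; // missing data: "I don't know"
    }
    total += amount;
  }
  return total;
}

const august = [1200, 450, null, 900]; // one rep has not reported yet
console.log(totalSales(august));       // null
console.log(totalSales([1200, 450, 900])); // 2550
```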

What happens when you say

If Blah = Null Then

?

Let’s try printing out the value of the comparison and see:

Sales = 123
WScript.Echo Sales = Null

You get not True, not False, but Null! Why’s that? Well, think about the semantics
of it. You’re saying “is the unknown quantity equal to 123?” The answer to that is not “yes”, it’s not “no”, it’s “I don’t know what the unknown quantity is, so, uh, maybe?”

Nulls propagate themselves. Any time you numerically manipulate a Null,
you get a Null right back. Any sum containing an unknown addend has an unknown sum, obviously! The correct way to check for Null is much as you’d do for Empty:
use IsNull (or TypeName or VarType.)

The sharp-eyed among you will have noticed that I never actually answered the question. What happens when you say

If Blah = Null Then

does VBScript run the consequence block or the alternative (“else”) block? Obviously it has to do one of the two. When it comes right down to it, VBScript assumes falsity: the comparison yields Null, Null is not True, and so the else block runs.
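VBScript's three-valued comparison can be modeled in JavaScript with a sentinel value. NULL, eq and ifTrue are invented names for this sketch; this is not how the engine works internally:

```javascript
// A sketch of Null-propagating equality: comparing anything to the
// unknown value yields the unknown value.
const NULL = Symbol("Null");

function eq(a, b) {
  if (a === NULL || b === NULL) return NULL; // unknown vs. anything: unknown
  return a === b;
}

// Model of how VBScript's If statement treats the condition: only a
// genuine True runs the consequence block, so a Null condition falls
// through to the else block.
function ifTrue(condition) {
  return condition === true;
}

const sales = 123;
console.log(eq(sales, NULL));         // NULL, not true or false
console.log(ifTrue(eq(sales, NULL))); // false: the else block runs
```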

The way JScript and JScript .NET handle nulls is a little bit weird; I’ll talk about that in my next entry.


Commentary from 2019:

Most of the comments on this posting were a discussion with a reader who used to use the information in this blog post as interview questions; I pushed back on that, as I want interview questions to elicit signal about skills, not specific trivia. Sure, you can tell to what level a developer actually knows their tools by that sort of probing, but I am much more interested in a developer’s ability to analyze problems, find solutions, and write solid code. The reader was mostly interviewing for contract positions, where there was an expectation of expertise that could lead to immediate productivity; I tend to interview not for immediate productivity, but for ability to learn.

Peter Torr pointed out that I had missed one on my list of “missing data” types. VBScript represents a call to a function with an omitted optional argument as a variant of type VT_ERROR with value DISP_E_PARAMNOTFOUND.


In, Out, In-Out, Make Up Your Mind Already

I was talking about reference types vs. by-reference variables a while back. Recall that both JavaScript and VBScript have reference types (“objects”) but JavaScript does not have by-reference variables.

COM supports passing variable references around, but unfortunately the intersection of early-bound COM and late-bound IDispatch is a little bit goofy. There are a few problems that you need to be aware of when you’re trying to get VBScript code to talk to COM objects that expect to be passed references to variables. (Obviously attempting to do so from JavaScript is just a non-starter, as JavaScript will not pass variable references at all.)
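The JavaScript side of that parenthetical is easy to demonstrate (the variable and function names here are invented for illustration):

```javascript
// JavaScript has reference types but no by-reference variables:
// reassigning a parameter rebinds the local only, never the caller's
// variable.
function tryToChange(s) {
  s = "bar"; // rebinds the parameter; the caller never sees this
}

let myString = "foo";
tryToChange(myString);
console.log(myString); // still "foo"

// Objects are reference types, so mutation through a reference is
// visible to the caller -- but the caller's variable itself still
// cannot be rebound by the callee.
function mutate(obj) {
  obj.value = "bar";
}
const box = { value: "foo" };
mutate(box);
console.log(box.value); // "bar"
```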

The most common problem happens when you are trying to call a method on a dispinterface where the vtable entry for the method looks something like this

HRESULT MyFunction([in, out] BSTR * pbstrBlah );

Calling this method via VBScript like this produces a type mismatch error:

Dim MyString
MyString = "foo"
MyObj.MyFunction MyString

And calling it like this does not produce an error but also does not actually pass the value by reference:

Dim MyString
MyString = "foo"
MyObj.MyFunction CStr(MyString)

The latter behaviour is completely expected — what you’re passing is the output of a function, not a reference to a variable. There is no reference available, so no value is filled in. But why does the former fail?

Well, VBScript does not know what the callee is expecting as far as types go. That’s what “late bound” means — the callee has to do the work of determining how to suck the relevant data out of the variants passed to Invoke, and the callee has to somehow call the underlying vtable with the correct types. So VBScript sees

MyObj.MyFunction MyString

and passes a reference to a variant. All variables are variants in VBScript.

Why does VBScript produce a type mismatch error here? VBScript doesn’t! The object produces the type mismatch error, which VBScript dutifully reports. The object’s implementation of Invoke calls the default implementation of Invoke provided for you by the type library implementation. That thing says “I’ve got a reference to a variant, and that variant is a string. I need a reference to a string. That’s a type mismatch.”

This seems like a missed trick; if I were designing such a system, I’d put some additional smarts into it that would handle this case. Clearly from a reference to a variant that contains a string one can obtain a reference to a string — just take the address of the string field of the variant! However, for reasons which are lost to the mists of time, the default implementation does not do that.

My advice to you would therefore be

  • If you want to write an object that can be easily used from script, do not have any [in, out] parameters, because JScript can’t pass references.
  • If you must have in/out parameters, make them variants, because VBScript can’t pass any other kind of reference.
  • If you must have nonvariant in/out parameters, write some fixer-upper code for your IDispatch implementation which transforms byref variants pointing to strings into byref strings. (Or whatever byref type you require.) But if you do that, make sure you get it right.
  • Do not attempt to write your own IDispatch implementation, as there are many pitfalls (which I may discuss at another time).

That’s the most common problem I see. The other common problem involves out parameters which are not in/out or out/retval parameters. Just-plain-out parameters cause memory leaks. Consider our earlier example:

Dim MyString
MyString = "foo"
MyObj.MyFunction MyString

Suppose MyFunction takes an in/out variant, and fills in the byref variant with the string “bar”. The implementor expects that something will come in and something will go out, and the rule is that the callee frees the coming-in memory before replacing it with the going-out memory. The caller is then responsible for freeing the going-out value.

But if MyFunction takes a just-plain-out variant then the callee does not free the incoming memory. It assumes that the incoming memory is garbage because it has specifically been told that nothing is coming in.

How does VBScript know whether the callee is in-out or out? VBScript doesn’t! Knowing that requires either compile time knowledge or a very expensive run-time lookup (the results of which are difficult to cache for the same reasons that dispatch ids are difficult to cache.)

The practical result is that if you pass an object or string to a callee expecting an out variant, the string or object pointer will be overwritten without first being freed, and the memory will leak. You should ensure that you always pass empty variants if you must call an object method that takes an out parameter. Yes, this is a violation of the design principle that late-bound calls must have the same semantics as early bound calls, but unfortunately, that’s just the way OLE Automation is and there’s nothing we can do about it now.


Commentary from 2019

Most of the responses to this article were of the form “I wish this had been documented earlier”. Indeed, that is the primary reason why I started this blog! I felt that a lot of this stuff was under-documented in the 1990s, and I belatedly wanted to fix that for posterity.

Now that posterity has arrived, we can see that the design of IDispatch had a number of flaws. An interface specifically designed for “late binding” does not need to provide a mechanism for callers to interrogate callees about the types they expect, though it sure would be nice to have. But the fact that the caller needs to know the calling convention in order to avoid memory leaks, and there is no cheap way for the caller to interrogate the object to find out what it expects, is terrible! It forces the script engine developer to trade off between bad performance on the one hand and a memory leak on the other.

This also illustrates one of the costs of runtimes that lack garbage collection; think about how much effort we have put into designing and implementing memory management protocols in COM APIs. None of that work had to be done for C# APIs, freeing the developer to concentrate on the business semantics rather than the storage mechanisms.

What could numeric rounding possibly have to do with MS-DOS?

After the previous episode, a reader pointed out to me that FormatNumber uses yet another rounding algorithm: it rounds numbers ending in .5 away from zero, not towards evens. The reader also asked whether FormatNumber, FormatCurrency, and so on actually call into the VBA Format$ code.

They do not. The VBA runtime is freely redistributable, but it is also large. Back in 1996 we did not want to force people to download the large VBA runtime; the IE team was on our case for every kilobyte we added to the IE download.

However, FormatNumber and the other related functions were written by the same developer who implemented Format$ in VBA: the legendary Tim Paterson, who you may recall wrote a little program called QDOS that was eventually purchased by Microsoft and renamed MS-DOS.

And let me tell you, when I was an intern in 1993 I had to debug Format$. There are very good reasons why we decided to go with a stripped-down, let’s-solve-the-90%-use-case version for VBScript! Format$ is one of the most enormously complicated hard-core bit-twiddling functions that I’ve ever seen. That’s what happens to these “everything but the kitchen sink” functions that try to be all things to all callers — they get enormously bloated and complex. Porting that thing to a new codebase and maintaining it independently would have been nightmarish.

Here’s a comment that Tim wrote into the FormatNumber code on September 16th, 1996 which confirms the reader’s observation:

// We have more digits that we're not using. Round if needed.
// This is simplistic rounding that does not use IEEE
// rules (rounding even if exactly x.5). This simple
// way is how VBA Format$ works, so it's good enough for us.

When I was first an intern who barely knew how to program real-world code, Tim Paterson was very intimidating to me, and he did not suffer foolish questions gladly. But once I got to know him, it turned out that he’s a great guy. I had many enjoyable conversations with him about the early days of Microsoft. I particularly remember walking with him to the launch event for Windows 95, reminiscing about how much operating systems had changed since the QDOS days.

Incidentally, Format$ was about 5500 lines of C++ code (counting blank lines and comments). About 1300 of those lines are the parser that turns the “format picture” into an internal representation. It’s an entire programming language of its own, practically.

I cordially hate the “format picture” approach to formatting strings for this reason; we’ve created a programming language that follows none of the standards, rules or idioms of the “host” language. Much of the work of Format$ was parsing a language that never should have existed in the first place; instead we should have had an object model of combinators that could be combined programmatically. But we’re stuck with this 1970s era idiom now.

 

Bankers’ Rounding

A number of people have pointed out to me over the years that VBScript’s Round function is a bit weird. It seems like it should be pretty straightforward — you pick the integer closest to the number you’ve got, end of story. But what about, say, 1.5? There are two closest integers. Do you go up or down?

The Round function goes to the nearest integer, and if there are two nearest integers then it goes to the even one. 1.5 rounds to 2, 0.5 rounds to 0.

Why’s that? Why not just arbitrarily say that we always round down in this situation? Why round down sometimes and up some other times? There actually is a good reason!

This algorithm is called the Bankers’ Rounding algorithm because, unsurprisingly, it’s (allegedly) used by bankers. Suppose a data source provides data which is often in exactly split quantities — half dollars, half cents, half shares, whatever — but they wish to provide rounded-off quantities. Suppose further that a data consumer is going to derive summary statistics from the rounded data — an average, say.

Ideally when you are taking an average you want to take an average of the raw data with as much precision as you can get. But in the real world we often have to take averages of data which has lost some precision. In such a situation the Bankers’ Rounding algorithm produces better results because it does not bias half-quantities consistently down or consistently up. It assumes that on average, an equal number of half-quantities will be rounded up as down, and the errors will cancel out.

If you don’t believe me, try it. Generate a random list of numbers that end in 0.5, round them off, and average them. You’ll find that Bankers’ Rounding gives you closer results to the real average than “always round down” averaging.

The Round, CInt and CLng functions in VBScript all use the Bankers’ Rounding algorithm.
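Round-half-to-even is simple to implement. Here is a sketch in JavaScript; bankersRound is an invented name, and this floating-point version is just an illustration of the algorithm, not VBScript's actual implementation:

```javascript
// Round to the nearest integer; break exact ties toward the even
// neighbour, as VBScript's Round, CInt and CLng do.
function bankersRound(x) {
  const down = Math.floor(x);
  const frac = x - down;
  if (frac < 0.5) return down;     // clearly closer to the lower integer
  if (frac > 0.5) return down + 1; // clearly closer to the upper integer
  return down % 2 === 0 ? down : down + 1; // exact tie: pick the even one
}

// The averaging property: half the ties round up and half round down,
// so the errors cancel. True average of these values is 2.
const halves = [0.5, 1.5, 2.5, 3.5];
const rounded = halves.map(bankersRound); // [0, 2, 2, 4]
const avg = rounded.reduce((a, b) => a + b) / rounded.length; // 2
```

Rounding the same list half-up would give [1, 2, 3, 4], whose average of 2.5 is biased high by half a unit.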

There are two other VBScript functions which turn floats into integers, and neither of them rounds to the nearest integer. The Int function gives you the largest integer less than or equal to its input (it rounds toward negative infinity), while the Fix function simply truncates the fractional part (it rounds toward zero). They agree on positive numbers but differ on negative ones: Int(-1.5) is -2, but Fix(-1.5) is -1.
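In JavaScript terms the pair corresponds to Math.floor and Math.trunc; Int and Fix below are just stand-in names for the VBScript functions:

```javascript
// VBScript's Int rounds toward negative infinity; Fix rounds toward
// zero. They only disagree on negative numbers with fractional parts.
const Int = Math.floor;
const Fix = Math.trunc;

console.log(Int(2.7), Fix(2.7));   // 2 2
console.log(Int(-2.7), Fix(-2.7)); // -3 -2
```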


This article generated a lot of feedback pointing out:

  • Format and FormatNumber do not use Bankers’ Rounding.
  • A great resource for people trying to understand how pre-.NET Visual Basic proper does rounding is here
  • Bankers’ Rounding has an additional nice property that I did not mention. Suppose we have a principal payment of $123.755 and an interest payment of $1.245.  The true sum is $125.00. If we round off the quantities first, then we get $123.76 and $1.24, and the sum is still $125.00. Other rounding algorithms would produce an error of a penny.
  • Bankers’ Rounding also has nice properties when dealing with the sums of quantities that have been computed by a round-then-subtract operation.
  • There was some debate over whether Bankers’ Rounding is “fair” because there are five possible odd digits: 1, 3, 5, 7 and 9, but only four possible even digits, 2, 4, 6 and 8.  I leave it as an exercise to the reader to determine whether this objection has merit. (The commenter later returned to say that what they meant was that the distribution of odd and even digits was not necessarily uniform.)
  • The ROUND function in Excel may not use Bankers’ Rounding; there was some confusion on this point and I have never bothered to actually check.
  • Bankers do not actually use Bankers’ Rounding, apparently; rounding ties “upwards” is the standard method. I do not know if “upwards” means “towards positive infinity” or “away from zero”, since those are different for negative numbers.
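The penny example in the third bullet can be checked with integer arithmetic, which avoids floating-point noise. Amounts here are in tenths of a cent (so $123.755 is 123755), and both function names are invented for this sketch:

```javascript
// Round a nonnegative amount in tenths of a cent to whole cents,
// breaking exact ties toward the even cent (Bankers' Rounding).
function toCentsHalfEven(tenths) {
  const q = Math.trunc(tenths / 10);
  const r = tenths - q * 10;
  if (r < 5) return q;
  if (r > 5) return q + 1;
  return q % 2 === 0 ? q : q + 1; // tie: pick the even cent
}

// Round half away from zero, for comparison.
function toCentsHalfUp(tenths) {
  return Math.trunc((tenths + 5) / 10);
}

const principal = 123755; // $123.755 -> 12376 cents either way
const interest = 1245;    // $1.245   -> 124 (half-even) vs 125 (half-up)

// Half-even: 12376 + 124 = 12500 cents, exactly $125.00.
// Half-up:   12376 + 125 = 12501 cents, a penny of error.
```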

There was also an irritating back-and-forth with readers who believed they had found “flaws” in the tie-breaking algorithm; in every case, the algorithm was not tie-breaking because there was a clearly “closer” value, and that was the value chosen.

Thank goodness, the .NET Round method takes an argument so you can say what kind of rounding you want, rather than hoping for the best.

More on Certificates and Trust Decisions

I said earlier that certificates could be used to establish identity and hence the right
to run trustworthy software on your machine. I want to emphasize that this is not all that certificates are used for.

A lot of people are confused by this, including a lot of very smart, technically savvy developers. Cryptographic security is complex and poorly understood.

Let me give you an example. One time I went to a friend’s web site. My friend is a smart and highly experienced developer. His web site happens to use HTTPS, the “secure protocol”, and I was very surprised when I got an error message from IE saying that the certificate associated with the site could not be trusted because it was self-signed. In other words, the certificate says “The bearer’s name is Sven Svenson. The bearer’s identity has been confirmed by Sven Svenson.”

Obviously that is not any kind of identity-establishing evidence! To trust the cert you have to trust the certifying authority, but the certifying authority is the person whose identity is in question! This chain of trust is not rooted in an explicitly trusted authority, so it is worthless. What is stopping, say, Bjorn Bjornson from typing up the same certificate? Nothing!

I asked my friend why he didn’t use a Verisign certificate and his answer surprised me: “Nobody’s told me that they won’t use the service unless I got a Verisign cert. And if they did, I’d probably tell them that they shouldn’t use my system if they don’t trust me.”

My friend was confusing code signing with secure HTTP. HTTPS does not use certs to establish a trust relationship between the client and the server by establishing the identity of the author! HTTPS uses certs to establish secure communication over an insecure network by establishing identity of the web site.

Those sure sound similar but they are in fact completely different uses of certs. Why? Because in the first case, it is the code author who is (potentially) untrusted. In the second case, it is the network that is (definitely) untrusted.

Before I continue let me briefly explain what the goal of HTTPS is, and how it works.

The goal of HTTPS is to ensure that when you send information over the internet, two things happen. First, the information is encrypted so that if it is intercepted, the eavesdropper cannot read it. Second, the server that you’re talking to really is the server that you think you’re talking to.

We have these two goals because there are two threats: first, someone might be listening in on the traffic between the client and the server, so you need to ensure that they can’t make sense of what they overhear. Second, someone may have set up a “fake” server that fools your client into believing that it is talking to the “real” server, so you need to ensure that you are talking to the “real” server.

When you go to a secure web site — like your bank, or Amazon or some other web site involving transmission of credit card numbers over the internet — the first thing the web site does is send you a certificate containing the name of the web site. The certificate has been signed by a trusted root — Verisign, for example. Your browser can then compare the certificate’s web site name with the web site you actually went to, and because it is vouched for by Verisign, you know that you really are talking to the right server. The certificate also contains information about how to encrypt messages so that only the server can decrypt them. That is then enough information to mitigate both threats and establish a secure channel. (The details of how that channel is established are not particularly germane; I may discuss this in a future post.)

I’ll say it again: the certificate is not there to establish a trust relationship between the client and the server. The certificate is there so that you know that you are talking to the real server, and no one can eavesdrop. This is the internet we’re talking about: for all you know, evil geniuses have cut every line between you and the entire internet and have inserted their own evil routers and servers that look just like the real internet. (For all you know, this is someone else’s blog!) It’s Verisign’s job to ensure that those evil geniuses never get their hands on a Foo.com certificate vouched for by Verisign, because that is the only evidence you have that you are actually talking to the real Foo.com.

So when you go to a web site with a self-signed HTTPS certificate, IE pops up an “abort
or continue?” dialog box. What that box is trying to communicate is:

“You are trying to communicate securely with FooCorp’s web server but because the certificate does not establish a chain of trust, IE is unable to determine if the insecure internet between you and FooCorp has been taken over by evil genius hackers who are presently spoofing you into believing that you are on the real internet. Do you want to go ahead and possibly be spoofed, or stop right now?”

Unfortunately, the dialog box is not particularly clear on this point, and besides, everyone clicks “OK” without reading dialog boxes. But that’s another story.

To sum up: whether you trust FooCorp or not is not the issue — if you’re going to send them your credit card number, presumably you already trust them! What you don’t trust is the wiring between you and them. There might be hackers at their ISP, or your ISP, or at any node on the network. There might be people listening with packet sniffers and packet replacers. There might be people listening to your wireless network traffic. HTTPS is designed to protect users against untrustworthy, insecure or hostile network providers by ensuring identity and encrypting traffic.


There were a lot of good responses to this article, mostly on the topic of the correct uses of self-signed certificates, and the business model and trustworthiness of VeriSign.

There are of course valid uses of self-signed certificates; trust chains typically end in a self-signed certificate, after all. The point is that you must have some tamperproof way of obtaining a self-signed root and some way of verifying the identity associated with that certificate; obviously neither the transport nor the verification can itself rely on checking the signature of that certificate! But if you have some other way to do those things, then a self-signed cert is fine; put it in your root store and you’re all set.

Self-signed certs also solve the problem of “I don’t care about the identity of who I’m talking to, but I do care that this identity remains constant over time”. Whether that is a realistic scenario or not, I can’t say.

VeriSign at the time charged $1000 for a certificate, which many people noted was high for individuals, though perhaps not for corporations. These days of course there are free certifying authorities.

The suggestion was made that “loop signed” certificates could be a thing; you could have a group of certifying authorities get together and X signs Y’s certificate, who signs Z’s certificate, who signs X’s certificate, and this is in a sense stronger than a self-signed certificate. I don’t know; that sounds complicated and like it doesn’t actually solve the problem, but I have not done a detailed analysis.

People also pointed out that VeriSign had made some high-profile mistakes: granting a Microsoft certificate to someone who didn’t work for Microsoft, messing up DNS, and so on.

 

Evil Security Twin Powers… Activate!

One day Peter Torr and I walked into Herman Venter’s office (Herman was the architect and primary implementer of JScript.NET). We were grinning. Herman knew what that meant. He took one look at us and said “Uh oh, here come the Evil Security Twins.” And indeed, we had found another potential vulnerability in JScript .NET — fortunately, before we shipped.

My Evil Security Twin has an interesting post today about the need to trust both code
and the code container. I thought I might discuss the semantics of some of the words Peter uses. Security experts often use word pairs such as safe/dangerous, trusted/untrusted, and benign/hostile without clearly distinguishing them from each other. These distinctions are both important and frequently misunderstood. Essentially the differences are these:

A particular assembly is safe if it is unable to harm you: modify your disk, steal your data, deny service, and so on. The safety or dangerousness of an assembly is fundamentally a technical question: what does the assembly attempt to do? For instance, does it read the disk? Then it might steal secrets. Does it create dialog boxes? Then it might create hundreds of them and make it hard to use your machine.

The level of trust you assign to an assembly is essentially the set of permissions you are willing to grant it based on evidence. For instance, if you believe that if an assembly from the Internet is granted permission to access your disk then it might misuse that ability to do you harm, you should not grant the permission.

Administrators express their opinions about trust by configuring their policies appropriately. “Do not trust code from the Internet to access the disk” is an example of such a policy.

Whether an assembly is hostile or benign depends on the mindset and goals of the assembly’s author. An assembly that creates a file might be perfectly benign, or it might be creating the file in order to consume all the hard disk space on your machine as a denial-of-service attack. An assembly that creates a dialog box that looks just like the Windows password dialog might be attempting to steal your password.

Unfortunately, there is no way to detect the intent of the programmer. If there were then security would be easy: use your magical ESP powers to see what the programmer was thinking and then just prevent code written by hostile people who hate you from running! In the real, non-magical world all that the .NET security system can do is restrict the abilities of an assembly based on the available evidence.

Notice that essentially we use these words technically in the same way we do in everyday speech. A tennis ball is inherently quite safe and an axe is inherently quite dangerous. You trust that the axe maker made a trustworthy product that accurately advertises its dangerous nature (rather than masquerading as a safe object which you could then accidentally misuse).

Whether someone carrying an axe is hostile or benign has nothing to do with the axe, but everything to do with their intentions. And whether you trust that person to do yard work is a question of what evidence you have that they’re trustworthy. A policy that codifies this trust decision might be “Don’t give axes to unidentified strangers.”

When we get into cryptographic evidence, things get even more confusing. What are all of these certificates for, and who signed them, and why do we trust them? We can continue our analogy.

Some guy shows up at your door to do the yard work. You’re considering handing him an axe. What evidence do you have that this is a wise thing to do?

You could start by asking for some ID. The guy hands you a card that says “Hi, my name is Sven the Lumberjack”. Does this establish identity? Obviously not. Does it establish trustworthiness? No! So what’s missing?

What’s missing is the imprimatur of a trusted authority. If instead Sven hands you a driver’s license with a photo on it that matches him and a name that matches the name he claims is his, then that’s pretty good evidence of identity. Why? Because you have a trust relationship with the entity that issued the document! You know that the department of transportation doesn’t give a driver’s license out to just anyone, and that they ensure that the picture and the name match each other correctly.

So far we’ve established identity but have we established trustworthiness? Maybe, maybe not. Identity is a crisp, technical question. Trustworthiness is a squishy, human question. Maybe you’re willing to say that anyone who is willing to identify themselves is trustworthy. Maybe you demand extra credentials, like a lumberjack union membership card — credentials that certify relevant abilities, not just identity. Maybe you’re worried about fake ID’s.

All that is up to you — you’re the one handing this guy an axe. How paranoid are you? Set your policies appropriately. Some of us are more paranoid than others, like Peter “I only read ASCII email” Torr. That’s hard-core, dude.

But I digress. My point is that we construct chains of trust. Every link in that chain must hold up, and the chain has to end in some authority which you explicitly trust.
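The chain-walking idea can be sketched with a toy model. Everything here is invented for illustration (real X.509 path validation also checks signatures, validity dates, revocation, key usage, and more):

```javascript
// Toy model of a trust chain: every link must be issued by the next
// authority up, and the walk must end at a self-signed root that you
// have explicitly placed in your trusted store.
function chainIsTrusted(chain, trustedRoots) {
  for (let i = 0; i < chain.length - 1; i++) {
    if (chain[i].issuer !== chain[i + 1].subject) return false; // broken link
  }
  const root = chain[chain.length - 1];
  // A self-signed cert is only acceptable if it is already in the store.
  return root.issuer === root.subject && trustedRoots.includes(root.subject);
}

const trustedRoots = ["Verisign"];

const good = [
  { subject: "foo.example", issuer: "Verisign" },
  { subject: "Verisign", issuer: "Verisign" },
];
const selfSigned = [{ subject: "sven.example", issuer: "sven.example" }];

console.log(chainIsTrusted(good, trustedRoots));       // true
console.log(chainIsTrusted(selfSigned, trustedRoots)); // false: Sven vouching for Sven
```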

In the computer world, your machine has a certificate store containing a list of “trusted roots”, like Verisign. By putting Verisign in the trusted root store, you are saying “I explicitly trust Verisign to do their job of issuing and revoking credentials that establish identity and trustworthiness.”

If Sven goes to Verisign and asks for a “code signing” cert, Verisign will confirm his identity and issue a certificate that says “this guy is known to Verisign and this certificate can be used to sign executable code.” It’s a driver’s license for coding.

When you download some code off the internet, the security system checks to see if you trust it — does it come from someone with an identity verified by Verisign? Do you trust that individual? And does the certificate they’re showing you say that they have the right to sign executable code? If so, then the code runs. If not, then the chain of trust is broken somewhere, and it doesn’t run.
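The chain-walking logic just described can be sketched in a few lines. This is a toy model in Python — the certificate names and the dictionary-of-issuers representation are mine for illustration, and real X.509 certificate verification is far more involved — but it captures the two requirements: every link must hold, and the chain must end in an authority you explicitly trust.

```python
# Toy model of a chain of trust. Each certificate records who issued it;
# verification walks issuer by issuer until it reaches a trusted root.
# All names here are made up for illustration.
ISSUED_BY = {
    "Sven's code-signing cert": "Verisign Intermediate CA",
    "Verisign Intermediate CA": "Verisign Root",
}

# The roots you have explicitly chosen to trust (your "trusted root store").
TRUSTED_ROOTS = {"Verisign Root"}

def chain_of_trust_holds(cert):
    """Every link in the chain must hold up, and the chain must end
    in an explicitly trusted root; otherwise the chain is broken."""
    seen = set()
    while cert not in TRUSTED_ROOTS:
        if cert in seen or cert not in ISSUED_BY:
            return False          # broken link (or a cycle): never reaches a root
        seen.add(cert)
        cert = ISSUED_BY[cert]    # follow the chain up one link
    return True

print(chain_of_trust_holds("Sven's code-signing cert"))   # True
print(chain_of_trust_holds("Mallory's self-made cert"))   # False
```

Note that trust here is deliberately not transitive in the human sense — the chain only works because each issuer in it is vouched for, link by link, back to a root you chose.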


This article had some good reader feedback. Two highlights:

First off, a reader noted that good security systems get the transitivity relationships correct. If I trust Alice, and Alice trusts Bob, that does not necessarily mean that I trust Bob.

Second, Peter pointed out that reading HTML-rendered email is a “horrible” user experience regardless of any security concerns; remember, this was 2003, when reading HTML email still posed real security risks!

As I mentioned before, Peter is still at Microsoft as of this writing. Oddly enough, Herman sits just a few desks away from me, and is once again working on JavaScript optimizers. Plus ça change.


I’m a traveling man, don’t tie me down

While I was waiting for a shuttle the other day (Microsoft has a fleet of shuttle buses that take people all over the campuses) I was thinking about optimization heuristics.

Often we want to find an algorithm that determines the best way to do something. For example, the traveling salesman problem: given a set of cities and the cost of driving between each pair of them, what is the cheapest route that visits every city at least once? Such problems can be extremely difficult to solve! But we often overlook the fact that we don’t need to find the optimal solution; we need to find the “good enough” solution, or a solution with bounded suboptimality. Finding the best solution is often impractical, but it would be nice to know that you can find a solution that doesn’t suck too badly.

For the traveling salesman problem, for instance, you can easily construct the minimum spanning tree of the graph. Then the route becomes trivial: pick any node as the root, do a depth-first traversal of the spanning tree, and go back to the root. You’ll visit every edge twice and end up where you started from, so the total cost is twice the cost of traversing the spanning tree once. And the optimal solution can’t possibly cost less than the minimum spanning tree, because deleting any one edge from the optimal route leaves a connected subgraph spanning every city, which must cost at least as much as the minimum spanning tree.

This doesn’t give you an optimal solution (unless your graph is already a tree, of course) but it does give you a solution that is guaranteed to be between 100% and 200% of the optimal solution, and that might be good enough.
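The heuristic above can be sketched in a short Python program: build the minimum spanning tree (Prim’s algorithm here), do the depth-first walk, and check that the walk costs exactly twice the tree. The four-city graph and its costs are made up for illustration.

```python
import heapq

def minimum_spanning_tree(graph, root):
    """Prim's algorithm. graph: {node: {neighbor: cost}}.
    Returns (tree as {node: [children]}, total cost of the tree)."""
    tree = {node: [] for node in graph}
    visited = {root}
    total = 0
    frontier = [(cost, root, nbr) for nbr, cost in graph[root].items()]
    heapq.heapify(frontier)
    while frontier:
        cost, parent, node = heapq.heappop(frontier)
        if node in visited:
            continue
        visited.add(node)
        tree[parent].append(node)
        total += cost
        for nbr, c in graph[node].items():
            if nbr not in visited:
                heapq.heappush(frontier, (c, node, nbr))
    return tree, total

def tour(tree, root):
    """Depth-first walk of the tree that returns to the root;
    each tree edge is traversed exactly twice (down, then back up)."""
    route = [root]
    def visit(node):
        for child in tree[node]:
            route.append(child)
            visit(child)
            route.append(node)   # walk back up the edge we came down
    visit(root)
    return route

cities = {
    "A": {"B": 2, "C": 3, "D": 7},
    "B": {"A": 2, "C": 4, "D": 5},
    "C": {"A": 3, "B": 4, "D": 6},
    "D": {"A": 7, "B": 5, "C": 6},
}
tree, mst_cost = minimum_spanning_tree(cities, "A")
route = tour(tree, "A")
route_cost = sum(cities[a][b] for a, b in zip(route, route[1:]))
assert route_cost == 2 * mst_cost   # exactly twice the spanning tree
```

The route visits every city, ends where it started, and costs exactly twice the spanning tree — so it is guaranteed to be within a factor of two of optimal, as claimed.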

This sort of 200% heuristic crops up all the time. Do you rent DVDs or buy them? Suppose a DVD costs $24 to buy and $6 to rent. Obviously the optimal solution is to buy it if you are going to watch it four or more times, rent otherwise — but that requires perfect knowledge of the future.

Can we come up with an algorithm that minimizes the maximum suckiness, given no information about the future? Well, the worst possible situation is buying a DVD and watching it once — you just overpaid by a factor of four. Actually, no, a worse situation still is renting the DVD so many times that you end up paying way more than the purchase price.

The solution? Rent the DVD the first four times, and then the fifth time, buy it. No matter how many times you watch it, the worst you can do is pay 200% of the optimal price and typically you’ll do a lot better. (Particularly if you can bring to bear more information, such as being able to forecast the number of future viewings.)
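The rent-versus-buy strategy is easy to model. This little Python sketch uses the article’s prices ($6 to rent, $24 to buy); the hundred-viewing horizon is an arbitrary cutoff of mine for the worst-case search.

```python
# Prices from the article: $6 to rent, $24 to buy.
RENT, BUY = 6, 24

def strategy_cost(viewings, buy_on):
    """Cost of renting until viewing number `buy_on`, then buying."""
    if viewings < buy_on:
        return RENT * viewings            # never reached the buy point
    return RENT * (buy_on - 1) + BUY      # rented buy_on - 1 times, then bought

def optimal_cost(viewings):
    """Cost with perfect knowledge of the future: rent or buy, whichever wins."""
    return min(RENT * viewings, BUY)

def worst_ratio(buy_on, horizon=100):
    """Worst case of strategy cost over optimal cost, across all futures."""
    return max(strategy_cost(k, buy_on) / optimal_cost(k)
               for k in range(1, horizon + 1))

# "Rent the first four times, buy on the fifth": worst case is exactly 200%.
print(worst_ratio(5))   # 2.0
# Buying on the fourth viewing does a bit better, as the correction below notes.
print(worst_ratio(4))   # 1.75
```

This is the classic “ski rental” shape of problem: commit to the cheap option until its cumulative cost approaches the expensive one, then commit to the expensive one.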

This same heuristic applies to waiting for shuttles, which is why I was thinking about this in the first place. How long should you wait before you give up and walk? If the shuttle is going to be right there 30 seconds after you call, it’s foolish to walk. But it is also foolish to wait ten times as long as it would have taken you to walk.

I waited as long as it would have taken me to walk, and then I walked. Fortunately it was a nice day yesterday.


I noticed immediately after posting this in 2003 that I got the algorithm wrong! It would be better to buy the DVD on the fourth viewing. Then the worst case is 4 viewings for $42, which is less than twice the optimal case. But whatever; you take my point, I’m sure.

Don’t forget people, my degree is in mathematics, not arithmetic.

As I write this update in 2019 it has been many years since I last rented a DVD, so the scenario is no longer topical, but I hope the general point is still of interest. And I really hope that Scarecrow Video, the greatest video store in the world, does not go out of business — but I just have no incentive to go there anymore.

Readers pointed out that in many scenarios the parameters of the problem are known at compile time, and that they are willing to solve the problem once and bake the solution into the program. For example, route-finding for AI characters in games doesn’t typically depend on details that change during play, and so routes can be pre-computed with long-running, more accurate algorithms rather than approximated at runtime with faster ones.

Another reader pointed out that attempts to predict the future are best modeled as probabilistic algorithms, and that those algorithms can consume evidence (What do reviewers think? How correlated are my opinions with reviewers’? What did people who bought this also buy? And so on) to compute the posterior probability that I’ll watch it again. As I port this article over I am in the middle of my “Fixing Random” series on probabilistic programming in C#, so that’s been on my mind.