Eval is evil, part two

As I promised, more information on why eval is evil. (We once considered having T-shirts printed up that said “Eval is evil!” on one side and “Script happens!” on the other, but the PM’s never managed to tear themselves away from their web browsing long enough to order them.)

Incidentally, a buddy of mine who is one of those senior web developer kinda guys back in Waterloo sent me an email yesterday saying “Hello, my name is Robert and I am an evalaholic”. People, it wasn’t my intention to start a twelve step program, but hey, whatever works!

As I discussed the other day, eval on the client is evil because it leads to sloppy, hard-to-debug-and-maintain programs that consume huge amounts of memory and run unnecessarily slowly even when performing simple tasks. But like I said in my performance rant, if it’s good enough, then hey, it’s good enough.  Maybe you don’t need to write maintainable, efficient code. Seriously! Script is often used to write programs that are used a couple of times and then thrown away, so who cares if they’re slow and inelegant?

But eval on the server is an entirely different beast. First off, server scenarios are generally a lot more performance sensitive than client scenarios.  On a client, once your code runs faster than a human being can notice the lag, there’s usually not much point in making it faster.  But  as I mentioned earlier, ASP goes to a lot of work to ensure that for a given page, the compiler only runs once. An eval defeats this optimization by making the compiler run every time the page runs! On a server, going from 25 ms to 40 ms to serve a page means going from 40 pages a second to 25 pages a second, and that can be expensive in real dollar terms.

But that’s not the most important reason to eschew eval on the server.  Any use of eval (or its VBScript cousins Eval, Execute and ExecuteGlobal) is a potentially enormous security hole:

<%
  var Processor_ProductList;
  var Software_ProductList;
  var HardDisk_ProductList;
  // ...
  CategoryName = Request.QueryString("category");
  ProductList = eval(CategoryName & "_ProductList");
  // ...

What’s wrong with this picture?  The server assumes that the client is not hostile.  Is that a warranted assumption?  Probably not!  You know nothing about the client that sent the request.  Maybe your client page only sends strings like “Processor” and “HardDisk” to the server, but anyone can write their own web page that sends

((new ActiveXObject('Scripting.FileSystemObject')).
DeleteFile('C:*.*',true)); 
Processor

which will cause eval to evaluate

((new ActiveXObject('Scripting.FileSystemObject')).
DeleteFile('C:*.*',true)); 
Processor_ProductList

Obviously that’s a pretty unsophisticated attack.  The attacker can put any code in there that they want, and it will run in the context of the server process.  Hopefully the server process is not a highly privileged one, but still, there’s vast potential for massive harm here just by screwing up the logic on your server.

Never trust the input to a server, and try to never use eval on a server.  Eval injection makes SQL injection look tame!

To try and mitigate these sorts of problems, JScript .NET has some restrictions on its implementation of eval, but that’s a topic for another entry.


Notes from 2020

The attack I’m briefly describing here is of course only one of a great many “hostile client gets bad string onto the server” attack patterns. During my time at Coverity I got to take a deep look at the tools which attempt to detect code paths where a string goes from an untrusted source to a dangerous sink. In terms of the complexity of the analyzed control flows, these were probably the most sophisticated checkers we had, and among the most prone to false positives.

Defeating injection attacks is a hard problem, and I wish we at Microsoft had solved it in the type system, rather than creating a market for expensive third-party solutions that use symbolic execution to effectively do the work of imposing a type system post hoc on an existing program.

A parable

Once upon a time I was in high school. Ah, the halcyon days of my youth. One day I was sitting in class, minding my own business when the teacher said: “Does anyone have a thin metal ruler?”

No answer. Apparently no one had a thin metal ruler.

“No? How about a nail file?”

No answer. Now, I cannot imagine that of all the girls in the class, not one of them had a nail file. But I can well imagine that none of them wanted to share it with a teacher.

“No? Hmm.”

So I piped up: “What do you need a nail file for?”

“I have this big staple in this document that I need to remove.”

Upon which point one of my classmates mentioned that he had a staple remover. Problem solved.

Over and over again I find that script customers (both internal consumers here at Microsoft and third-party developers) frequently ask questions like my teacher. That is, they have a preconceived notion of how the problem is going to be solved, and then ask the necessary questions to implement their preconceived solution. And in many cases this is a pretty good technique! Had someone actually brought a thin metal ruler to class, the problem would have been solved. But by choosing a question that emphasizes the solution over the problem, the questioner loses all ability to leverage the full knowledge of the questionees.

When someone asks me a question about the script technologies I almost always turn right around and ask them why they want to know. I might be able to point them at some tool that better solves their problem. And I might also learn something about what problems people are trying to solve with my tools.

Joel Spolsky once said that people don’t want drills, they want holes. As a drill provider, I’m fascinated to learn what kinds of holes people want to put in what kinds of materials, so to speak. Sometimes people think they want a drill when in fact they want a rotary cutter.


Commentary from 2019:

First off, I misattributed that quotation. “People don’t want to buy a quarter-inch drill, they want a quarter-inch hole.” is a quote from the economist Theodore Levitt. At the time I wrote this, I was sure that I had read about this idea in a Joel On Software article, but if I did, I cannot find it now. Apologies for the error.

Second, I did not know at the time that we have a name for this pattern of “have a problem, get a crazy idea about a solution, ask baffling questions about the crazy idea, rather than stating the problem directly” that we see so often on StackOverflow. It is an “XY problem“, which strikes me as a terrible name.

Third, I am reminded of a story about the time I was helping Morton Twillingate put a roof on his shed. “Hand me the screw driver there b’y,” he said so I handed him a Philips head screwdriver. “Sweet t’underin’ Jaysus b’y, give me the screw driver!” he said, pointing at the hammer in my other hand, “If I’d wanted the screw remover I’d have said so!”

 

Eval is evil, part one

The eval method — which takes a string containing JScript code, compiles it and runs it — is probably the most powerful and most misused method in JScript. There are a few scenarios in which eval is invaluable. For example, when you are building up complex mathematical expressions based on user input, or when you are serializing object state to a string so that it can be stored or transmitted, and reconstituted later.

However, these worthy scenarios make up a tiny percentage of the actual usage of eval. In the majority of cases, eval is used like a sledgehammer swatting a fly — it gets the job done, but with too much power. It’s slow, it’s unwieldy, and tends to magnify the damage when you make a mistake. Please spread the word far and wide: if you are considering using eval then there is probably a better way. Think hard before you use eval.

Let me give you an example of a typical usage.

<span id="myspan1"></span>
<span id="myspan2"></span>
<span id="myspan3"></span>
function setspan(num, text)
{
  eval("myspan" + num + ".innerText = '" + text + "'");
}




Somehow the program is getting its hands on a number, and it wants to map that to a particular span. What’s wrong with this picture?

Well, pretty much everything. This is a horrid way to implement these simple semantics. First off, what if the text contains an apostrophe? Then we’ll generate

myspan1.innerText = 'it ain't what you do, it's the way thacha do it';

Which isn’t legal JScript. Similarly, what if it contains stuff interpretable as escape sequences? OK, let’s fix that up.

eval("myspan" + num).innerText = text;

If you have to use eval, eval as little of the expression as possible, and only do it once. I’ve seen code like this in real live web sites:

if (eval(foo) != null && eval(foo).blah == 123)
  eval(foo).baz = "hello";


Yikes! That calls the compiler three times to compile up the same code! People, eval starts a compiler. Before you use it, ask yourself whether there is a better way to solve this problem than starting up a compiler!

Anyway, our modified solution is much better but still awful. What if num is out of range? What if it isn’t even a number? We could put in checks, but why bother? We need to take a step back here and ask what problem we are trying to solve.

We have a number. We would like to map that number onto an object. How would you solve this problem if you didn’t have eval? This is not a difficult programming problem! Obviously an array is a far better solution:

var spans = new Array(null, myspan1, myspan2, myspan3);
function setspan(num, text)
{
  if (spans[num] != null)
    spans[num].innertext = text;
}

Since JScript has string-indexed associative arrays, this generalizes to far more than just numeric scenarios. Build any map you want. JScript even provides a convenient syntax for maps!

var spans = { 1 : mySpan1, 2 : mySpan2, 12 : mySpan12 };

Let’s compare these two solutions on a number of axes:

Debugability: what is easier to debug, a program that dynamically generates new code at runtime, or a program with a static body of code? What is easier to debug, a program that uses arrays as arrays, or a program that every time it needs to map a number to an object it compiles up a small new program?

Maintainability: What’s easier to maintain, a table or a program that dynamically spits new code?

Speed: which do you think is faster, a program that dereferences an array, or a program that starts a compiler?

Memory: which uses more memory, a program that dereferences an array, or a program that starts a compiler and compiles a new chunk of code every time you need to access an array?

There is absolutely no reason to use eval to solve problems like mapping strings or numbers onto objects. Doing so dramatically lowers the quality of the code on pretty much every imaginable axis.

It gets even worse when you use eval on the server, but that’s another post.


Notes from 2020

This was my first deliberately-multi-episode topic.

There were many great comments on this article on the original blog site; to summarize a few of them:

  • There are a number of scenarios where you want to dynamically create a new function, but “new Function” is the appropriate choice rather than “eval” most of the time.
  • However, the scoping rules for “new Function” and “eval” are different — thanks, JavaScript — and so sometimes there are scenarios where you are forced to eval a new function.
  • I was not then and am not now an expert on the browser’s object model. I have many times noted the irony that as a developer of the JS compiler, I was an expert on the inner workings of the JS compiler, and not on how it was used in practice. A reader pointed out that none of my solutions were good practice compared with the expediency of:
var span = document.all("myspan" + num);
if (span != null) span.innertext = text;

or, equivalently, getElementById, on browsers which supported it at the time.