While I was waiting for a shuttle the other day (Microsoft has a fleet of shuttle buses that take people all over the campuses) I was thinking about optimization heuristics.
Often we want to find an algorithm that determines the best way to do something. For example, the traveling salesman problem: given a set of cities and the cost of driving between each pair of them, what is the cheapest route that visits every city at least once? Such problems can be extremely difficult to solve! But often we overlook the fact that we don’t need to find the optimal solution; we need to find the “good enough” solution, or a solution with bounded suboptimality. Finding the best solution is often impractical, but it would be nice to know that you can find a solution that doesn’t suck too bad.
For the traveling salesman problem, for instance, you can easily construct the minimum spanning tree of the graph. Then the route becomes trivial: pick any node as the root, do a depth-first traversal of the spanning tree, and return to the root. You’ll traverse every tree edge exactly twice and end up where you started. The total cost is therefore twice the cost of the minimum spanning tree. And the optimal solution can’t possibly cost less than the minimum spanning tree: delete any edge from an optimal route and what remains still connects every city, so it costs at least as much as the minimum spanning tree does.
This doesn’t give you an optimal solution (unless your graph is already a tree, of course) but it does give you a solution that is guaranteed to be between 100% and 200% of the optimal solution, and that might be good enough.
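That spanning-tree walk is easy to turn into code. Here is a sketch in Python; the function name, the distance-matrix representation, and the choice of Prim’s algorithm for the tree are my own, not anything from the original:

```python
import heapq

def mst_double_tour(dist):
    """Approximate a cheapest visit-all-cities route as described above:
    build a minimum spanning tree (Prim's algorithm), then walk it
    depth-first, retracing each edge on the way back, so every tree edge
    is traversed exactly twice.  `dist` is a symmetric cost matrix."""
    n = len(dist)
    in_tree = [False] * n
    adj = [[] for _ in range(n)]      # adjacency lists of the tree
    heap = [(0, 0, 0)]                # (edge cost, city, parent)
    mst_cost = 0
    while heap:
        cost, city, parent = heapq.heappop(heap)
        if in_tree[city]:
            continue
        in_tree[city] = True
        if city != parent:            # skip the artificial root edge
            adj[parent].append(city)
            adj[city].append(parent)
            mst_cost += cost
        for other in range(n):
            if not in_tree[other]:
                heapq.heappush(heap, (dist[city][other], other, city))
    # Depth-first walk from the root, going back along each edge.
    route = [0]
    def walk(city, parent):
        for child in adj[city]:
            if child != parent:
                route.append(child)
                walk(child, city)
                route.append(city)    # retrace the edge to the parent
    walk(0, 0)
    tour_cost = sum(dist[a][b] for a, b in zip(route, route[1:]))
    return route, tour_cost, mst_cost
```

The returned tour cost is always exactly twice the tree cost, which is the 200% guarantee: the optimal route costs at least `mst_cost`, and this route costs `2 * mst_cost`.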
This sort of 200% heuristic crops up all the time. Do you rent DVDs or buy them? Suppose a DVD costs $24 to buy and $6 to rent. Obviously the optimal solution is to buy it if you are going to watch it four or more times, rent otherwise — but that requires perfect knowledge of the future.
Can we come up with an algorithm that minimizes the maximum suckiness, given no information about the future? Well, the worst possible situation is buying a DVD and watching it once — you just overpaid by a factor of four. Actually, no, a worse situation still is renting the DVD so many times that you end up paying way more than the purchase price.
The solution? Rent the DVD the first four times, and then the fifth time, buy it. No matter how many times you watch it, the worst you can do is pay 200% of the optimal price and typically you’ll do a lot better. (Particularly if you can bring to bear more information, such as being able to forecast the number of future viewings.)
This same heuristic applies to waiting for shuttles, which is why I was thinking about this in the first place. How long should you wait before you give up and walk? If the shuttle is going to be right there 30 seconds after you call, it’s foolish to walk. But it is also foolish to wait ten times as long as it would have taken you to walk.
I waited as long as it would have taken me to walk, and then I walked. Fortunately it was a nice day yesterday.
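The shuttle version of the trade-off can be checked numerically under a deliberately simplified model of my own devising (boarding and riding take negligible time, so the cost is just minutes waited plus minutes walked); the function name is mine as well:

```python
def worst_case_wait_ratio(wait_threshold, walk_time):
    """Worst-case cost ratio for the strategy 'wait up to wait_threshold
    minutes for the shuttle, then give up and walk', compared with a
    rider who knows the shuttle's arrival time in advance.

    The adversary's best move is to send the shuttle just after you give
    up (or never), so in the worst case you pay wait_threshold +
    walk_time, while hindsight would have paid the smaller of waiting for
    that shuttle and walking immediately."""
    return (wait_threshold + walk_time) / min(wait_threshold, walk_time)
```

The ratio is minimized at `wait_threshold == walk_time`, where it is exactly 2; waiting either more or less than the walk time makes the worst case strictly worse, which is why waiting exactly as long as the walk is the right move under this model.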
I noticed immediately after posting this in 2003 that I got the algorithm wrong! It would be better to buy the DVD on the fourth viewing. Then the worst case is four or more viewings for $42 (three rentals plus the purchase) against an optimal $24, which is less than twice the optimal cost. But whatever, you take my point I’m sure.
Don’t forget people, my degree is in mathematics, not arithmetic.
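The corrected arithmetic is easy to verify with a short script. This sketch assumes the $6 rental and $24 purchase prices from above; the function name is mine:

```python
def worst_case_rent_ratio(rentals_before_buying, rent=6, buy=24):
    """Worst-case ratio of what you pay to the hindsight-optimal price,
    over every possible number of viewings, for the strategy 'rent the
    first k times, then buy on viewing k + 1'."""
    k = rentals_before_buying
    worst = 1.0
    # Checking a couple of viewings past k is enough: beyond that the
    # amount paid is fixed at k * rent + buy and the optimum is buy.
    for viewings in range(1, k + 3):
        paid = viewings * rent if viewings <= k else k * rent + buy
        optimal = min(viewings * rent, buy)
        worst = max(worst, paid / optimal)
    return worst
```

Renting four times before buying gives the original worst case of exactly 2.0, renting three times gives the corrected 1.75, and buying immediately gives the factor-of-four overpayment mentioned earlier.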
As I write this update in 2019 it has been many years since I last rented a DVD, so the scenario is no longer topical, but I hope the general point is still of interest. And I really hope that Scarecrow Video, the greatest video store in the world, does not go out of business — but I just have no incentive to go there anymore.
Readers pointed out that in many scenarios the parameters of the problem are known at compile time, and that they are willing to solve the problem once and bake the solution into the program. For example, route-finding for AI characters in games doesn’t typically depend on details that change during play, and so routes can be pre-computed with long-running, more accurate algorithms rather than approximated at runtime with faster ones.
Another reader pointed out that attempts to predict the future are best modeled as probabilistic algorithms, and those algorithms can consume evidence (what do reviewers think? how correlated are my opinions with reviewers’ opinions? what did people who bought this also buy, and did I buy those too? and so on) to compute the posterior probability that I’ll watch it again. At the time I am porting this article over I am in the middle of my “Fixing Random” series on probabilistic programming in C#, so that’s been on my mind.