Which is faster?

Posted on December 17, 2012 by ericlippert

Which is faster, QueryLightBulbFrobStatusEx() or __WGetBulbFrobberState2()?

Hold it right there, buddy. Before answering that question I must give you my standard six-part rant about why I probably cannot sensibly answer questions that begin “which is faster“.

Part the first: Why are you even asking me?

If you have two horses and you want to know which of the two is the faster then race your horses. Don’t write short descriptions of the horses, post them on the Internet, and ask random strangers to guess which is faster! Even if by sheer chance you got an accurate answer, how would you have any confidence in its accuracy? You can easily and accurately discover which of two programs is faster by running both yourself and measuring them with a stopwatch.

(Caveat: writing performance benchmarks that provide high-quality, meaningful results can be somewhat tricky in a garbage-collected, JIT-compiled runtime. I see this done wrong a lot. I will return to this topic in a future blog post.)

Moreover: performance is highly sensitive to things like what hardware you are using, what software you have installed, what other processes are doing, and a host of other factors. You’re the only person who knows what environment the code is going to be run in, so you’re the only person who can do realistic performance testing in that environment.

Part the second: Do you really need to answer that question?

The question presupposes that there actually is a performance problem to be solved. If the code as it stands — the working, debugged, tested code — is already fast enough for your customer then knowing which of two ways to write the code is the faster is just trivia. Spend your valuable time worrying about something else, like testing, robustness, security, and so on. Unnecessary code changes are expensive and dangerous; don’t make performance-based changes unless you’ve identified a performance problem.

Part the third: Is that really the bottleneck?

Suppose that you really do have an identified performance problem: asking which of two things is faster is premature if neither thing is the cause of the problem! The most important thing that I’ve learned about performance analysis is that my highly educated and experienced guess about the root cause of a performance problem is dead wrong probably more than a third of the time. Use a profiler or other analysis tool to determine empirically where the bottleneck is before you start investigating alternatives.

Part the fourth: Is the difference relevant?

Suppose that you have actually identified a bottleneck: now the relevant question is not actually “which horse is faster?” Rather, the relevant question is actually “are either of these horses fast enough to meet my customer’s needs?” If neither horse is fast enough for your purposes then knowing which is faster is irrelevant. And if both are fast enough then you can base your decision on important factors other than performance, as discussed in part the second.

(The question also presupposes that the two alternatives proposed are semantic equivalents. If one of those is a horse and the other is a refrigerator, asking which one runs faster is maybe a non-starter.)

Part the fifth: What is this “faster” you speak of?

There are lots of kinds of speed, and optimizing for one kind can deoptimize for another. I’m sure you’ve all encountered situations where normal-case-scenario performance is acceptable but worst-case-scenario performance is terrible. Anyone who has been unable to get a long-distance phone call placed on Mother’s Day knows what I’m talking about; the telephone network was designed to handle slightly-higher-than-average load, not highest-likely load. Unfortunately, implementing code that ensures an upper bound on your worst-possible-scenario behaviour often ends up making the typical-case-scenario unacceptably slower, and vice versa.

Leaving the difference between best, worst and typical aside, there are all kinds of speed metrics. When I was working on Visual Studio Tools For Office we did comparatively little making the customization framework code run faster because our tests showed that it typically ran fast enough to satisfy customers. But we did an enormous amount of work making the framework code load faster, because our research showed that Office power users were highly irritated by noticable-by-humans delays when loading customized documents for the first time. Similarly, in the web services realm there is a big difference between optimizing for time-to-first-byte, time-to-last-byte and throughput. You have to know what kind of speed is really important to the customer.

Part the sixth: Are you looking at the big picture?

Almost all performance analysis questions I see are solely about improving speed. There are lots of non-speed metrics like memory usage, disk usage, network usage, processor usage, and so on, that might be more relevant than raw speed to your customer. Making code faster often involves trading less time for more memory, which might be a bad trade. Don’t over-focus on speed metrics; look at the big picture.

Well, that rant used up this whole episode. Next time on FAIC: Which is faster, Nullable<T>.Value or Nullable<T>.GetValueOrDefault()?

The photo credits are on Wikimedia Commons.

83 thoughts on “Which is faster?”

Anthony Pegram on December 17, 2012 at 8:04 am said:

Which blog is faster? MSDN or WordPress?

Reply ↓
- Eric Lippert on December 17, 2012 at 8:08 am said:
  
  LOL
  
  I must say, I am very pleased with WordPress so far.
  
  Reply ↓
  - Sam on December 17, 2012 at 2:01 pm said:
    
    You really need a favicon Eric.
    
    Reply ↓
    - Kevin Pilch-Bisson on December 17, 2012 at 6:26 pm said:
      
      How come you don’t put the full article in the RSS though?
      
      Reply ↓
      - Jef Claes on December 19, 2012 at 7:42 am said:
        
        I second this!
      - Roel on August 7, 2023 at 3:45 am said:
        
        I third this!
Andrew on December 17, 2012 at 8:12 am said:

I might suggest an addendum question of “Are these things even comparable?” I had a guy once ask me which was faster: DSL or a Pentium 4. I didn’t know how to respond…

Reply ↓
- Svick on December 17, 2012 at 8:17 am said:
  
  Eric mentioned that in Part the fourth:
  
  > If one of those is a horse and the other is a refrigerator, asking which one runs faster is maybe a non-starter.
  
  Reply ↓
  - Pavel Minaev on December 17, 2012 at 10:10 am said:
    
    Then again, maybe not. Assuming the same starting velocity (ignoring the means by which it is imparted) and Earth atmosphere, horse is probably faster because it’s more aerodynamic. A straightforward experiment would be dropping both the horse and the refrigerator from a sufficiently tall building, and timing the moment they hit the ground.
    
    Reply ↓
    - Eric Lippert on December 17, 2012 at 10:21 am said:
      
      Then the question would be which one falls faster, not which one runs faster.
      
      Reply ↓
      - Mike Edenfield on December 17, 2012 at 6:13 pm said:
        
        my refrigerator has a 3/4 horsepower motor so, clearly, the horse runs faster.
      - Hugh on February 28, 2013 at 8:25 am said:
        
        If you hang around long enough, the horse will prolly end up IN the fridge ~ problem solved.
    - Random832 on December 21, 2012 at 6:08 am said:
      
      Except generally when you ask how fast a horse is, you want its running speed, not its falling speed. And, while people don’t normally talk about it as “speed”, the obvious meaningful ‘speed’ is how fast it can chill a given amount of air or other material to a given temperature. This quantity is normally thought of as a power quantity, since that’s what dimensional analysis gives you if you factor out the need to specify particular materials and temperatures.
      
      Reply ↓
    - Mario Dupuis on January 15, 2013 at 5:00 pm said:
      
      A falling horse will fall at the same speed as of a falling refigerator, since they are both submited to the same gravity. But the heavier will hit the ground with more force then the other. But in the end, it will not matter because you will loose both in devastating terms.
      
      Reply ↓
      - Mario Dupuis on January 15, 2013 at 5:05 pm said:
        
        Every object that falls on the ground, no matter from how high, will fall at an average of 10m/sec^2, depending on where you are on the earth. And considering the 2 objects we are talking about, the athmosphere influence will be unsignificant.
Steven Jeuris on December 17, 2012 at 8:26 am said:

Well, that rant used up this whole episode. Next time on FAIC: Which is faster, Nullable.Value or Nullable.GetValueOrDefault()?

Haha, brilliant. 🙂 That ending remark sums up exactly one big point you left out. It’s still interesting to ask about such things just to increase your general understanding. Curiosity is a powerful motivator.

For minor optimizations I tend to prefer readability over performance, following the mantra “Write code for humans, not machines”. Where possible optimization should be handled by the compiler anyhow.

Reply ↓
Jesse C. Slicer on December 17, 2012 at 9:19 am said:

“Next time on FAIC: Which is faster, Nullable.Value or Nullable.GetValueOrDefault()?”

It depends.

Reply ↓
- Jesse C. Slicer on December 17, 2012 at 9:22 am said:
  
  Dangit. I really want a button to not eat my plain text as HTML markup. Is there a WordPress plugin that makes the editor like StackOverflow?
  
  Reply ↓
  - Eric Lippert on December 17, 2012 at 9:27 am said:
    
    I don’t know. I’ve used WordPress for only a few weeks. I’ll investigate.
    
    Reply ↓
    - Jesse C. Slicer on December 17, 2012 at 12:38 pm said:
      
      You are a gentleman and a scholar, sir.
      
      Reply ↓
      - James Manning on December 17, 2012 at 3:06 pm said:
        
        Every comment should be persisted as a GitHub gist allowing GitHub-flavored markdown and forks/branches/etc 🙂
- Eric Lippert on December 17, 2012 at 9:26 am said:
  
  The funny thing is that of course I am going to give the lie to this article in my next article: GetValueOrDefault() is clearly going to be faster. 🙂
  
  Reply ↓
  - CodesInChaos on December 17, 2012 at 2:03 pm said:
    
    I’d guess `GetValueOrDefault()` simply returns a field, `Value` needs a `if(HasValue)` check. While branch prediction will obviously work well, that’s certainly a bit more expensive.
    
    Reply ↓
    - Jesse C. Slicer on December 18, 2012 at 7:51 am said:
      
      Winner winner chicken dinner! That’s precisely what the implementation does. So when I say “it depends”, I mean it’s comparing apples and oranges. If your requirements are such that you need an exception/conditional if there’s no value, then only `Value` will do. If your requirements are set to deal with `default(TValue)`, then `GetValueOrDefault()` is your huckleberry. And yes, the latter will be “faster” because of its complete lack of branching logic.
      
      Reply ↓
      - John Payson on December 18, 2012 at 4:06 pm said:
        
        If `GetValueOrDefault` simply returns the value field, does that imply that a `Nullable` may simultaneously be `null` and yet return an arbitrary value from `GetValueOrDefault` (even if properly written code would never generate such a value, such values could be passed in from outside code)?
      - Eric Lippert on December 18, 2012 at 4:18 pm said:
        
        Can you demonstrate a program that has the behaviour you describe?
      - John Payson on December 19, 2012 at 3:51 pm said:
        
        I expect that copying/pasting could would produce a mess, but the behavior can be demonstrated by having one thread read a Nullable(of somethingBig) will another thread is setting it to null. The fact that copying a struct in one thread while another thread is mutating may yield a struct with a mix of old and new field values it is hardly surprising; the slightly “surprising” aspect is that GetValueOrDefault doesn’t ignore the backing field for Value if the backing field for HasValue is false. Of course, if one grabs a Nullable(of T) at an inopportune time, one should expect that the field values may be garbage even if `HasValue` is true; that doesn’t mean that one should expect that a “null” instance won’t report a default value for GetValueOrDefault().
Igor Dvorkin on December 17, 2012 at 9:47 am said:

The answer can be found trivially using: http://measureitdotnet.codeplex.com/

Reply ↓
Michael Perrenoud on December 17, 2012 at 10:04 am said:

Eric, I value the fact that you think through the motive before trying to address the problem. I am working very hard at learning to do the same, not only in my career, but in my life in general. Reading things like this enhance my life as a whole.

Thanks Eric!

Reply ↓
John Payson on December 17, 2012 at 10:20 am said:

In many cases, someone who asks “Which is faster–XX or YY” isn’t really interested in knowing, “On my own system, under some nominal set of conditions, which is faster–XX or YY”, but rather “Are there likely to be any systems or conditions where XX is apt to be significantly faster than YY, or vice versa”? If, for example, it turns out that method XX runs 5% faster than YY on Intel processors, but 50 times slower than YY on AMD processors, and if the overall time taken by XX is significant in the latter situation but not the former, someone who is writing a program for the general market should probably use YY. I’m not sure how someone should expect to determine that, though, other than by change, or other than by asking whether anyone knows of any situations where the performance of XX and YY are drastically different.

Reply ↓
- Michael Perrenoud on December 17, 2012 at 11:46 am said:
  
  I think what Eric is trying to say is that the “what if” scenario is so difficult to pin down that the conversation becomes academic and much less practical and the solutions generally have holes in them because they haven’t been implemented.
  
  Performance is affected by so many variables, and generally a significant number of them out of your control, that you need to know “what is wrong” and “what is my motive for optimizing” before even concerning yourself with it.
  
  Reply ↓
  - John Payson on December 19, 2012 at 11:23 am said:
    
    It’s true that one can’t take all possible scenarios into account, but there are a number of factors which will often affect relative performance of two different methods. For example, in some programs Gen0 collections will tend to be relatively cheap, while in others they will tend to be relatively expensive. If a program’s heap usage patterns are such that Gen0 collections will be expensive but rare, methods which are fast but generate and abandon multiple heap objects may perform much worse than those which take longer but don’t generate any garbage. If a benchmark loop doesn’t touch any complex objects on the heap, any gen0 collections after the first are going to be fast, but that won’t necessarily be the case when code is running in the real world.
    
    Reply ↓
Andreas Kromann on December 17, 2012 at 11:34 am said:

What leads me to start thinking about performance is usually not the question “which is faster?”. It is “why the hell is this thing so slow?”. And what I find after profiling is something I had no idea was there.

Reply ↓
Dan M on December 17, 2012 at 1:06 pm said:

I generally agree with all your “part the X’s”, but I do find myself contemplating “Big O” types of speed questions every now and then, particularly when dealing with LINQ. Things like, should I use ToArray() on my query or just leave it as an enumerable? If you get in the habit of giving a little thought to these things, and maybe doing some quick and dirty stopwatch tests when you’re not sure, it seems like you would save yourself a good number of future bottlenecks, particularly if your app scales beyond your initial expectations.

Reply ↓
Charlie on December 17, 2012 at 2:32 pm said:

Thanks as always for the great tech writings, and I hope the new work adventure goes great! Totally unrelated to this post, the new blog platform is somewhat less friendly to RSS readers, or at least the Newsify reader on iOS. The RSS content only contains the first few sentences and a link to the main site, so we lose RSS reader capabilities like offline reading, dyslexia-friendly fonts, and other such things. Is there a WordPress setting that might change the feed to include the full article text? Worst case I can use Instapaper, but if it was easy to change it seems worth it.

Reply ↓
- BC_Programming on December 17, 2012 at 6:28 pm said:
  
  The setting would be in the wordpress control panel, Settings->Reading, then change the radio button for “For each article in a feed, show ” from “summary” to “full text”.
  Assuming of course Eric want’s to appease you at all 😛
  
  related to the Post itself, I had somebody ask me what the fastest sort algorithm was. I was going to say “Quick sort, usually”, but then stopped myself and said that the quickest way to sort is to not have to sort at all.
  
  Reply ↓
Nivky Numpson on December 18, 2012 at 3:50 am said:

Which is *better*?
The *faster* program that contains bugs, or the *slower* program that doesn’t? 🙂

Reply ↓
- Rick C on December 18, 2012 at 9:50 am said:
  
  Well now, that depends on whether the bugs affect you, doesn’t it?
  
  Reply ↓
  - Nivky Numpson on December 19, 2012 at 5:03 am said:
    
    Ok, let’s assume the bugs do affect you, but that they can be worked around, albeit inconveniently.
    
    Now which is better?
    
    Reply ↓
    - Rick C on December 19, 2012 at 9:50 am said:
      
      I don’t really mean to be pedantic but it kind of depends on how you mean “better.” If the workaround still take half the time of the bugfree version I might prefer the faster one.
      
      Obviously if they slow the process down to the point where the faster program is no longer faster, there’s no point in using it unless there is some mitigating advantage.
      
      Reply ↓
    - imfrancisd on December 19, 2012 at 12:58 pm said:
      
      If the program is a part of a much bigger process, and that process has a lot of constraints outside the world of computers, then I would choose the program that has a more predictable running time and a more predictable quality of results. At the very least, I’ll know for sure whether or not the program is good enough for a specific project, and I’ll know what must be done in order to compensate for the program’s deficiencies.
      
      Reply ↓
- Brian on December 18, 2012 at 10:00 am said:
  
  I forget where I heard this; some details may be inaccurate: One day, a group of students was given a task: Find the 10 most frequent words within an input document. They were then told to optimize for speed. Programmers used various clever solutions (e.g., hashing). The fastest program in the class ignored all but the first 100 words, but still passed all of the test cases.
  
  Reply ↓
Rob Kent on December 18, 2012 at 4:30 am said:

I know that for complex Windows programs, classic VB was a lot faster than C++ – by about three months (dev time).

Reply ↓
Daniel Bryars on December 18, 2012 at 12:37 pm said:

And, don’t always assume faster is better….

We worked really hard on the performance of an insurance search web site and were super pleased and very proud when it presented the results within milliseconds of clicking the search button.

Our client made us put in an artificial delay and a spinning progress indicator so that the “customer would really think the site was working hard for him!”

Reply ↓
Ian on December 18, 2012 at 3:50 pm said:

In part 5 you mention deoptimizing other areas and in part 6 you mention looking at the big picture. What’s even more insidious (and harder for a novice to spot) is the combination of these two: when the area you are deoptimizing is somewhere else in the application. One common mistake I’ve seen is where someone saves a calculated result in a private member variable to ‘save having to recalculate it’. Net result the whole system has less memory for the cache and runs slower even though their code appears to run faster.

Reply ↓
Rico Mariani on December 18, 2012 at 6:50 pm said:

Your post is just full of great advice 🙂

Reply ↓
- Eric Lippert on December 18, 2012 at 9:50 pm said:
  
  Well to hear that coming from you Rico I am honoured!
  
  (For readers unfamiliar with Rico: he is one of the world’s great experts on performance tuning.)
  
  Reply ↓
Yves Daoust on December 20, 2012 at 12:35 am said:

Perfect article, all parts are indeed relevant.

In face of this, my metaphysical interrogation is : why is Windows so damn slow ?

Reply ↓
Harold on December 20, 2012 at 3:07 am said:

Then again, sometimes it’s pretty clear-cut which is faster. That happens sometimes on, say, stackoverflow, and then most of the time, the poor sod who dared ask a question about performance gets utterly lambasted.
You’re certainly not wrong, but “optimization is hard and should be done carefully and analytically” has somehow morphed into “optimization is bad and you should feel bad” for a lot of people. They’ll throw around Knuth’s famous quote about premature optimization (often implicitly assuming that all optimization is premature), and forget that the article in which Knuth wrote that was actually about the importance of optimization, not the unimportance.

Reply ↓
- John Atten on December 20, 2012 at 4:51 am said:
  
  Right? And often, the same guy who would rattle the poor sod’s cage about asking such a question will be the first to chime in post-facto about what a poor choice a poster made of the options available. Can be a no win for people still learning the ropes.
  
  Reply ↓
John Atten on December 20, 2012 at 4:48 am said:

I think that often what someone means when they ask ” which is faster?” is actually “which is more efficient?”

More specifically, often there are multiple reasonable-sounding (by name) choices within a language or framework. Prime example might be the .net string vs. stringbuilder class. While string will tend to feel a vinous to a newcomer for performing concatenations and such, the reality is that the stringbuilder class was purpose-built to overcome limitations related to multiple concatenations using plain old string. Fortunately, this specific case is well-documented.

Not-so clear, in both java and c#, are the differences between various array and list-type structures, some of which are more efficient for specific use-cases than others. Further, the names of these are not always helpful in figuring out which one to use for a given situation.

I think most of the time people realize you can’t possibly know the profiling conditions for their system when they pose this type of question. Instead they are seeking to understand if there is something they should know about the implementation options they have. This is especially true among newcomers (I am one of those, although I try to do my research first!).

Reply ↓
Steve Sadler on December 20, 2012 at 7:41 am said:

I guess it depends upon the world you live in. I’m an embedded systems engineer so which is faster or which is smaller turns out to be important a lot of the time. Due to processor (cheap as possible to do the job) and memory constraints, I’m always asking these kinds of questions. Things like, should I use a sin table to generate tones or should I use a 2nd order digital oscillator? I’m almost always fighting the low memory, low performance processor battle. I don’t do premature optimization but on most of the projects I’ll usually end up have to either optimize for space or speed. This is invariable due to feature creep after the 1.0 release.

Reply ↓
- Charles A. Wolfe on December 22, 2012 at 6:49 pm said:
  
  The fight between space and time is “classic”. When working on software for space craft and ground data systems, one could find oneself writing “spaghetti” code or equally obscure code on purpose because some compiled (assembled) instruction’s numeric value is the constant you need for some arithmetic operation, or you can’t quite get the 5 needed instructions in because there is only room for 3 or 4.
  
  Reply ↓
Jesus Caro V on December 20, 2012 at 8:50 am said:

Good one!

Reply ↓
Dan Sutton on December 20, 2012 at 10:13 am said:

Interesting article. I’d think the best approach (and the one I usually take) would be to make everything as fast as it can be: any remaining bottlenecks, then, will be in the infrastructure, not the code.

Erratum: in Part The Fourth, the relevant question is actually, “Is either of these horses fast enough to meet my customer’s needs?”

Reply ↓
- Eric Lippert on December 20, 2012 at 10:29 am said:
  
  I guarantee you that you do not make anything as fast as it can be, and you certainly do not make everything as fast as it possibly can be. Rather, since you are a sensible person with finite resources, you make things faster until making them faster still would cost more than it is worth the effort to do so.
  
  Making code faster costs effort, and there is only a finite amount of effort available in the world. Effort costs time and money.
  
  Reply ↓
Alois Kraus on December 20, 2012 at 10:45 am said:

Great and often ignored advice. I have seen architects pushing redesigns into teams based on flawed measurements which did have no effect on an real installed machine.

You should mention which profilers you can use. The Windows Performance Toolkit is a great piece of work (http://geekswithblogs.net/akraus1/archive/2012/12/19/151604.aspx). The only strange this is that the call stacks seem to be broken with .NET 4.0 (http://social.msdn.microsoft.com/Forums/en-US/wptk_v4/thread/b7098a78-6600-456a-813e-79a84b994f32).

Reply ↓
Pingback: Liens de la semaine #9 « frenchcoding
Pingback: How to time code « Jim's Random Notes
Pingback: How to Create Early Bound Entity Classes with the Code Generation Tool? | Zsolt Zombik's Professional Blog
Pingback: How to Create Early Bound Entity Classes with the Code Generation Tool? - Microsoft Dynamics CRM Community
Pingback: How to: Ternary operator is twice as slow as an if-else block? | SevenNet
Pingback: Fixed Ternary operator is twice as slow as an if-else block? #dev #it #asnwer | Good Answer
Pingback: Solution: Ternary operator is twice as slow as an if-else block? #answer #programming #dev | IT Info
borealissw on March 25, 2015 at 9:08 am said:

Great read and many, many valid points. Thanks

Re point 6 though and the relative importance of the various performance measurements, I think your “bigger picture” is too small.

Its worth pointing out that as far as the end-user who’s using your (or your customer’s) system, raw “speed” is invariably the *only* thing that matters to end-users. They don’t care one bit about your carefully balanced server load vs cost so I think you also need to expand your big picture to include the end-users needs – that may well bump up speed (with any potential costs) in your priorities.

Reply ↓
Pingback: » Python:memoization library for python 2.7
Pingback: Which is faster? | Tim's Dev Blog
Pingback: How I Found CUDA, Or: Rewriting the Tag Engine – Teknoids News
Pingback: How I found CUDA, or: Rewriting the Tag Engine–part 2 | 神刀安全网
Pingback: SUT Double | 神刀安全网
Pingback: SUT Double | Tech News
Pingback: Nullable micro-optimizations, part one | Fabulous adventures in coding
Pingback: 三元运算符的速度是其他块的两倍吗？ – CodingBlog
Pingback: System.Math.Round – Round to zero digits and divide vs. Round to one or more digits – program faq
Pingback: Microbenchmarking and the Importance of Being Thorough – The Developing Coder
Pingback: Avoiding async/await if I know most of the time the result will be cached2019 Community Moderator ElectionUsing async/await for multiple tasksHow and when to use ‘async’ and ‘await’Do you have to put Task.Run in a method to make it async?.NET asyn
Pingback: How Bad Is Good Enough? | Fabulous adventures in coding
Pingback: boolean - La forma más rápida de generar una aleatorias booleanas
Pingback: boolean - Modo più veloce per generare casualmente un booleano
David Wallace on July 17, 2022 at 11:16 pm said:

You have delighted me with the sentence containing “which of two ways … is just trivia”. From an etymological point of view, you’re 2/3 correct.

Reply ↓
Pingback: [C#] Ternary operator is twice as slow as an if-else block? - Pixorix
Pingback: 在领域层结构中使用 Int64 而不是 DateTime / TimeSpan 是否合理？ | 开发者交流平台
Pingback: JPA TypedQuery#getResultStream vs Hibernate Query#setResultListTransformer in terms of Performance? ❤️ {UPDATED 2023} » LoginVast.Com