Benchmarking mistakes, part two

Part two of my Tech.pro beginner-level series on how to write bad benchmarking code can be found here.

Advertisements

13 thoughts on “Benchmarking mistakes, part two

  1. (I was going to comment on the techpro article itself, but it looks like they’re still the Dark Ages of having to create an account specifically for their site just to leave a single comment.)

    There looks to be an (unintentional) error in your Mistake 5 example: you have stopwatch.Stop(); while the context of the example suggests it should be var stopTime = DateTime.Now;.

  2. While I don’t disagree with the general theme, there is in fact DateTime.UtcNow which addresses your complaint about a DST change occurring during timing.

    I would also disagree with the statement that one should use the System.Diagnostics namespace objects only for non-production code. That namespace includes things such as interacting with the event log and other logging mechanisms, retrieving information about file versions, and interacting with other processes on the system, all of which can be perfectly valid activities for production code.

    Beware generalizations. They are always wrong. 🙂

    • Whereas DateTime.UtcNow will solve the problem for the specific case of DST changes, it doesn’t address NTP updates or the user changing the clock. My experience is that using the system clock (anything associated with DateTime, in this case) to measure elapsed time will get you into trouble.

  3. Unless I’ve missed something obvious, both DateTime and Stopwatch will show you the change in “real-world” time between the two statements. Because of thread/process scheduling is this really what you want?

    I guess this depends on your definition of “execution time” (hopefully this isn’t the obvious thing I missed), but my understanding is that what most benchmarks want to know how long your task (and it alone) kept a processing unit busy. It seems easy at first because {amount_of_cpu_cycles} is directly related to {real_time_elapsed}.

    But in a multi-threaded task this becomes more difficult to judge. Do you add up all the cores execution times (which is what Process.TotalProcessorTime does)? Probably not. Use just the longest-running thread that’s a part of the task? That seems better, but probably has it’s own problems.

    An older (and likely now out-of-date) article that talks about some of this is here: http://kristofverbiest.blogspot.com/2008/10/beware-of-stopwatch.html

    Sorry for my naive ramblings… I’m looking forward to the rest of this series 🙂

  4. A bit off-topic, but that article got me thinking about what is the best class to use when implementing an ID generation algorithm similar to what Instagram has done (http://instagram-engineering.tumblr.com/post/10853187575/sharding-ids-at-instagram). I coded a similar algorithm in .NET using DateTime.UtcNow and since a server ID and sequence is used in addition to a timestamp in the ID, I suspect the actual precision of DateTime matters less, but still, would you advise using another class in that case ?

  5. Interestingly, when a high frequency timer is now available, StopWatch falls back to DateTime.UtcNow.

    That means, if you somehow found a machine without a high frequency timer, you could change the result of your benchmark by changing your system date as the benchmark ran. The operating system might also change the result if it syncs the current time over NTP while your benchmark runs.

  6. I would disagree that System.Diagnostics is well-named and not for production use; Process is essential for production code and firing off new processes, and has got nothing to do with diagnostics. Some things like Debug asserts and DebuggableAttribute are useful to put in production code when you’re debugging an issue in it.

  7. The real reason to use DateTime.UtcNow in a benchmark (vs. DateTime.Now) is that it doesn’t have the thousands-of-cycles overhead of computing the timezone offset.

    The other thing to keep in mind is that, while Stopwatch is likely to be far more precise, DateTime is likely to be more accurate. In other words, if your computer’s clock runs slow, the mechanism DateTime uses may kept accurate via NTP, while the Stopwatch mechanism won’t be.

    • NTP updates won’t make things more accurate unless the thing you’re timing takes a really long time. The default polling interval for NTP is usually quite long. Windows default, for example, is 7 days. Even if you use “constant synchronization,” updates aren’t going to occur more than once per minute, and usually much longer than that.

      If NTP updates can change the clock, then there is some bit of uncertainty in your timings. Given two start/end timestamp pairs, you can’t compare them reliably because one pair might include adjustments to the clock and the other doesn’t. I can’t reliably compare two measurements obtained by taking the difference of DateTime.UtcNow values because NTP updates or other changes to the system clock might have occurred.

      The advantage of using Stopwatch (the high frequency timer) is that even though it might be inaccurate, it’s consistent. I know that its report of 100 milliseconds, for example, represents the same time period every time. It might be only 99.987 milliseconds, but it’s *always* 99.987 milliseconds. I can reliably compare two Stopwatch values obtained from the same machine.

  8. “DateTime.Now pays attention to Daylight Savings Time changes.”
    First of all, that seems absurd. Also, looking at the .Net DateTime documentation (http://msdn.microsoft.com/en-us/library/system.datetime.aspx) it appears to be false, given the existence of the DateTime.Ticks. It would not be feasible to support a Ticks property if a DateTime was inextricably tied to the current timezone and daylight savings state.

    If “elapsed = stopTime.Ticks – startTime.Ticks” would give a consistent answer across DST boundaries, then surely it is the operation of subtracting times from one another which pays attention to DST changes, not DateTime.Now.

    I would also argue that the behaviour is wrong, given that “A TimeSpan object represents a time interval (duration of time or elapsed time)”, and I cannot think of any logical argument that would support someone reasonably claiming that the duration of time or elapsed time across a DST boundary can be negative.

  9. Hello there! I know this is kinda off topic but I was wondering if you knew where I could locate a captcha plugin for my
    comment form? I’m using the same blog platform as yours and I’m having difficulty
    finding one? Thanks a lot!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s