Monday, July 2, 2007

Timer Troubles

Last updated Mar 10, 2006.

Conventional wisdom is that Windows and .NET applications should use the QueryPerformanceCounter Windows API function for any high-resolution timing they require. Other Windows timing services do not provide resolution better than about 12 milliseconds (if that). It’s possible to use hardware-based timing devices, but those methods are non-standard and the devices can’t be guaranteed to exist on all machines.
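As a rough illustration of how a counter-based timer is used, the sketch below turns two raw counter readings into seconds by dividing their difference by the counter frequency. That is the same arithmetic applied to QueryPerformanceCounter and QueryPerformanceFrequency values. It is written in Python for brevity; the names ticks_to_seconds and time_call are made up for this example, and time.perf_counter_ns stands in for the raw counter (CPython implements it on top of QueryPerformanceCounter on Windows).

```python
import time


def ticks_to_seconds(start_ticks: int, end_ticks: int, ticks_per_second: int) -> float:
    """Convert two raw counter readings into elapsed seconds."""
    if ticks_per_second <= 0:
        raise ValueError("ticks_per_second must be positive")
    return (end_ticks - start_ticks) / ticks_per_second


def time_call(func) -> float:
    """Time one call to func with a nanosecond counter (1e9 ticks per second)."""
    start = time.perf_counter_ns()
    func()
    end = time.perf_counter_ns()
    return ticks_to_seconds(start, end, 1_000_000_000)


if __name__ == "__main__":
    # Hypothetical workload, only here to have something to time.
    elapsed = time_call(lambda: sum(range(100_000)))
    print(f"elapsed: {elapsed:.6f} s")
```

The division is kept in its own function so the conversion can be checked with known tick counts, independently of whatever counter supplies them.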

One hardware-based timing facility that is on all machines is the Pentium’s Time Stamp Counter (TSC), which is read with the RDTSC (Read Time Stamp Counter) instruction. The TSC is a 64-bit value that increments at each clock cycle. It is possible, using RDTSC and some assembly language code that determines the CPU’s clock frequency, to provide very accurate high-resolution timing services. This was common practice among game developers, even though Microsoft’s recommendation was to use QueryPerformanceCounter.

We found out a couple of years ago why Microsoft recommends using QueryPerformanceCounter rather than RDTSC. The reason, in short, is Intel’s SpeedStep technology, which reduces the CPU’s clock frequency during idle times in order to save power. Unless the timing code knows about the clock frequency change, timing loops stop working correctly. If the code thinks that the processor is running at 4 gigahertz and then SpeedStep drops the speed, each cycle takes longer than the code assumes, so elapsed times computed from the cycle count come out too short and loops that wait for a fixed number of cycles run too long.
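To put numbers on that, here is a small sketch of the arithmetic. The 2 gigahertz reduced speed is a hypothetical figure chosen for the example, and tsc_elapsed_seconds is an illustrative name, not part of any API.

```python
def tsc_elapsed_seconds(cycles: int, assumed_hz: float) -> float:
    """Elapsed time as computed by code that assumes a fixed clock frequency."""
    return cycles / assumed_hz


ASSUMED_HZ = 4_000_000_000  # frequency the timing code measured at startup
ACTUAL_HZ = 2_000_000_000   # hypothetical speed after the CPU throttles down

real_seconds = 1.0
# Cycles the counter actually advances during one real second at the lower speed.
cycles_counted = int(real_seconds * ACTUAL_HZ)

reported = tsc_elapsed_seconds(cycles_counted, ASSUMED_HZ)
print(reported)  # 0.5: one real second is reported as half a second
```

In general the reported time is off by the ratio of the actual frequency to the assumed frequency, so the error grows the further the CPU throttles down.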

There are ways to detect the clock frequency changes caused by SpeedStep and other similar technologies, but existing code that depends on a steady clock frequency won’t work if you replace your system with one that has a variable-frequency CPU. Code that uses QueryPerformanceCounter, on the other hand, would work fine because QueryPerformanceCounter understands SpeedStep and the Power Management settings in Windows.

What QueryPerformanceCounter doesn’t understand, we’re finding out, is multi-core processors. On an Athlon64 X2 processor, QueryPerformanceCounter will sometimes report negative elapsed time, for reasons that are as yet unclear. One theory was that the timing code was being shuttled between the processor cores and that the cores’ time bases weren’t always in sync. That might indeed be the problem, but just setting the timer thread’s affinity to a single core doesn’t appear to be the solution.
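One defensive measure, which hides the symptom rather than fixing the cause, is to wrap the counter so that a reading never goes backward. The sketch below is a minimal version of that idea in Python. The class name NonDecreasingClock and its methods are invented for this example, and time.perf_counter stands in for whatever raw counter the application uses.

```python
import time
from typing import Callable


class NonDecreasingClock:
    """Wraps a raw time source so reported readings never go backward."""

    def __init__(self, read_raw: Callable[[], float] = time.perf_counter) -> None:
        self._read_raw = read_raw
        self._last = read_raw()  # first reading becomes the initial floor

    def now(self) -> float:
        raw = self._read_raw()
        # If the raw counter jumped backward, keep returning the last good value.
        if raw > self._last:
            self._last = raw
        return self._last

    def elapsed_since(self, earlier: float) -> float:
        # Clamp at zero so callers never see a negative interval.
        return max(0.0, self.now() - earlier)


if __name__ == "__main__":
    clock = NonDecreasingClock()
    start = clock.now()
    print(clock.elapsed_since(start) >= 0.0)  # True
```

With this wrapper a backward jump shows up as a stretch of zero elapsed time instead of a negative interval. The readings are still wrong during that stretch, so this is a way to keep dependent code from misbehaving, not a substitute for fixing the underlying counter problem.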

Microsoft documented this problem in their Knowledge Base as article ID 896256, with the unwieldy title "Computers that are running Windows XP Service Pack 2 and that are equipped with multiple processors that support processor power management features may experience decreased performance." That article describes the problem in some detail and includes a link to a hotfix that provides a solution. However, the hotfix is intended to be used only on systems that exhibit the problem, and is not currently (as of March 2006) being pushed out to all users through Windows Update. If you have users who are experiencing this problem with your code, you will need to tell them to download and apply the hotfix.

If you’re writing games that require high-resolution timing services, you should also check Microsoft’s Game Timing and Multicore Processors article for tips and techniques that will help you make the most of the timing services provided by Windows.
