A Performance Issue in Windows Timer Management?

Windows Research Kernel @ HPI

I was just crawling through the Windows sources to find out some details about Windows timer management when I figured out the following comment on lines 621 thru 624 in file base\ntos\inc\ntosdef.h:

00618 // 
00619 // Define timer table size. 
00620 // 
00621 // N.B. The size of the timer table must be less than or equal to 256 and a 
00622 // power of 2 in size. 
00623
00624 #define TIMER_TABLE_SIZE 512

Obviously, the definition of TIMER_TABLE_SIZE does not follow the comment. But, as the code evolved over the years, I thought it was just another comment inconsistency.

Diving deeper into the Windows timer management, I found the reason why TIMER_TABLE_SIZE has to be a less than or equal to 256. Let me first tell you some principles of the Windows timer management.

Windows maintains TIMER_TABLE_SIZE lists of timers where each list corresponds to a residue class of the current tick count modulus TIMER_TABLE_SIZE. That is, the first timer list contains timers that expire at a tick count that is equivalent 0 modulus TIMER_TABLE_SIZE. (Tick count values like 0, 512, 1024, etc.) The second timer list contains timers that expire at a tick count value that is equivalent 1 modulus TIMER_TABLE_SIZE, like 1, 513, 1024, etc. These lists are aggregated in KiTimerTableListHead which is actually an array with TIMER_TABLE_SIZE entries of type KTIMER_TABLE_ENTRY. The structure is shown below.

01936 typedef struct _KTIMER_TABLE_ENTRY {
01937       LIST_ENTRY Entry;
01938       ULARGE_INTEGER Time;
01939 } KTIMER_TABLE_ENTRY, *PKTIMER_TABLE_ENTRY;

The Time member represents the absolute due time in interrupt time of the first timer in the doubly linked list pointed to by Entry. When a timer object is set (KeSetTimer() in ntos\inc\ke.h), Windows inserts the corresponding timer object in an appropriate timer list. To find the appropriate timer list, it uses the internal helper function KiComputeDueTime() (see base\ntos\ke\ki.h) that computes the absolute due time from a given relative due time. This function has an output parameter Hand, that identifies the appropriate index of the timer in the timer table. Hand is computed by the function KiComputeTimerTableIndex() with the following formula:

Hand = (Due Time / Maximum time) % TIMER_TABLE_SIZE,

where Maximum time is equivalent to the system's tick count period length.

Now, refer to the following lines of KeComputeDueTime():

01219        *Hand = KiComputeTimerTableIndex(DueTime);
01220      Timer->Header.Hand = (UCHAR)*Hand;
01221      Timer->Header.Inserted = TRUE;
01222      return TRUE;

Here it is worth noting that KiComputeTimerTableIndex() returns a ULONG value. As you can see, *Hand is cast into a UCHAR. That is no problem as long as TIMER_TABLE_SIZE is less than or equal to 256. But remember the code in the first listing, it is 512! That is, the Hand member of the timer's DISPATCHER_HEADER structure is likely to overflow.

Now I tried to figure out where this member is referenced and whether any overflow check is done.

When a timer expires, Windows sets up a DPC that handles this timer. In that DPC Windows scans thru the current list(s) of timers to find all expired ones. When an expired timer is found, KiRemoveEntryTimer() (see base\ntos\ke\ki.h) is called:

01968 ULONG Hand;
01969 PKTIMER_TABLE_ENTRY TableEntry;
01970
01971 // 
01972 // Remove the timer from the timer table. If the timer table list is 
01973 // empty, then set the respective timer table due time to an infinite 
01974 // absolute time. 
01975 // 
01976 // N.B. It is the responsibility of the caller to set the timer inserted 
01977 // state. 
01978 // 
01979
01980 Hand = Timer->Header.Hand;
01981 if (RemoveEntryList(&Timer->TimerListEntry) != FALSE) {
01982       TableEntry = &KiTimerTableListHead[Hand];
01983       if (&TableEntry->Entry == TableEntry->Entry.Flink) {
01984               TableEntry->Time.HighPart = 0xffffffff;
01985       }
01986 }

What this routine needs to do is the following. It removes the timer from its timer list (line 1981) and if that list becomes empty after removing the timer, the Time.HighPart member of the timer's list head must be set to 0xffffffff (line 1984). Unfortunately, the routine uses the Header.Hand member of the timer (line 1982) to calculate the timer's list head. This may indeed lead to a wrong list head, if Hand was overflown at timer insertion! The developers of Windows probably figured out that they modified the wrong timer list, but they did not figured out why. This assumption is supported by line 1983. This line is actually redundant code, because RemoveEntryList() returns TRUE if the list is empty after the entry was removed. So if we are in the body of the if statement, we actually know that the list is empty. But what happened is that if the timer to be removed is in list 257, Hand contains 1 - a completely different timer list. So line 1983 prevents the routine from setting the Time.HighPart of a completely different timer list that might not even be empty to an expiration time that is nearly infinite. But it also prevents the routine from indicating the actual timer list to be empty. So we have an inconsistency in the timer table, that is, we have an empty timer list, but the list's header still contains a past due time.

Of course, this will not lead to a show stopping bug in Windows; however, we figured out the consequences of this inconsistency.

The Windows timing is managed by the function KeUpdateSystemTime() (see base\ntos\ke\architecture\clockint.asm) which is actually an Interrupt Service Routine (ISR) for the clock interrupt. For the sake of simplicity, we concentrate on the lines shown in the following listing.

00212 ; 
00213 ; Check to determine if a timer has expired. 
00214 ; (edi:esi) = KiInterruptTime 
00215 ; (eax) = bucket 
00216 ; (ebx) = KeTickCount.LowPart 
00217 ; 
00218
00219 kust10:        and eax,TIMER_TABLE_SIZE-1 ; isolate current hand value 
00220             shl eax, 4 ; compute timer entry offset 
00221             cmp esi,[eax]+_KiTimerTableListHead+TtTime+4 ; compare high due time 
00222             jb kustxx ; if below, timer has not expired 
00223             ja short kust15 ; if above, timer has expired 
00224             cmp edi,[eax]+_KiTimerTableListHead+TtTime ; compare low due time 
00225             jb kustxx ; if below, timer has not expired 
00226 kust15: ; 
00227
00228 ; 
00229 ; Timer has expired, put timer expiration DPC in the current processor's DPC 
00230 ; queue. 
00231 ; 
00232 ; (ebx) = KeTickCount.LowPart 
00233 ; 
00234
00235             mov ecx,PCR[PcPrcb] ; get processor control block address 
00236             cmp dword ptr [ecx]+PbTimerRequest, 0 ; check if expiration active 
00237             jne short kustxx ; if ne, expiration already active 
00238             mov [ecx]+PbTimerRequest, esp ; set timer request 
00239             mov [ecx]+PbTimerHand, ebx ; set timer hand value 
00240             mov ecx, DISPATCH_LEVEL ; request dispatch interrupt 
00241             fstCall HalRequestSoftwareInterrupt ;

To determine if a timer has expired Windows first computes the current Hand value. This is done in line 219, where and eax, TIMER_TABLE_SIZE-1 equals eax = eax % TIMER_TABLE_SIZE. First, the HighPart member of the current table index is examined, afterwards the LowPart member. If the timer has expired or expires at the current tick count, we fall through or jump to the code beginning in line 235. These few lines update the PRCB (processor control block) structure of the current processor and request a DPC (line 241).

The DPC handling is done by KiRetireDpcList() (see base\ntos\ke\dpcsup.c). Before any DPC enqueued in a processor's DPC queue is executed, the function checks whether a timer has expired and if so calls the function KiTimerExpiration() (see base\ntos\ke\dpcsup.c). This function is responsible for scanning through a set of KTIMER_TABLE_ENTRY structures in the global timer list that might contain timers having expired since the last timer interrupt. (This is because DPCs are scheduled asynchronously when all ISRs executing at IRQL > DPC_LEVEL have completed. So, when KiTimerExpiration() is finally executed, the current timer table index might be different from that when the DPC originally was enqueued.)

Remember the inconsistency that might occur in timer table entries with an index greater than 255. If the Time member of such a timer table entry keeps beeing set to a time that already has past, KeUpdateSystemTime() will always indicate timer expiration although the timer has already been removed from the timer list and the timer list is empty. This will always lead to KiTimerExpiration() which then figures out that there is actually no timer in the list (surprise, surprise). To make things even worse, KiTimerExpiration() only determines an empty list, but does not update Time.HighPart of that list as KiRemoveEntryTimer() is responsible for that. So the inconsistency remains in the timer table until a new timer is inserted in this particular timer list!

Also, the timer expiration check is not only executed at every tick count but also at every occurence of the timer interrupt! So, if there are many of these inconsistent timer table entries, a quiet significant amount of redundant DPCs might be scheduled.

We are currently measuring the amount of these "senseless" issued DPCs and provide further information in one of our next articles.

Comments

#comments 15 Responses to "A Performance Issue in Windows Timer Management?"

  1. Windows Research Kernel @ HPI - news, howtos and labdescriptions » Getting OS Information - The KUSER_SHARED_DATA Structure on August 9th, 2007 10:40

    [...] A Performance Issue in Windows Timer Management? [...]

  2. max on December 18th, 2007 15:00

    Also another error in Timer logic is exist:
    00222 jb kustxx ; if below, timer has not expired
    00223 ja short kust15 ; if above, timer has expired
    00224 cmp edi,[eax]+_KiTimerTableListHead+TtTime ; compare low due time
    00225 jb kustxx ; if below, timer has not expired
    eror in compare type (jb,ja - unsigned compare) but must be (jl,jg - signed compare).
    we compare ULARGE_INTEGER Time; from timer with InterruptTime;(time since the system was booted).
    but when we chage system time on dT, system update time in all timers : Time -= dT;as result
    Time can state negative(Time

  3. Alexander Schmidt on December 30th, 2007 14:43

    Max,

    you are right. In lines 222 and 223, they use the wrong version of compare instructions, i.e., the unsigned compare operation while high due time is actually a signed value (KSYSTEM_TIME).

    However, this will not lead to some faulty behavior of the OS! The due time of a timer is compared with the Windows interrupt time. The interrupt time counts the time in units of 100 ns since your computer was switched on and Windows was booted. That means, the interrupt time will remain positive appr. 30,000 years from the moment on you booted Windows.

    Regards,
    Alexander

  4. max on January 2nd, 2008 12:11

    i test one program wich set timer(to absolute time) at start, and stop work when time is expired.
    then i set clock forward to 1 week, but timer not fire.i begin discover what happens.at this time i
    found your “A Performance Issue in Windows Timer Management?”.then i convert timer handle to
    object pointer to _KTIMER, and look for this under debbuger.and i see that DueTime become negative
    (higest bits is fff…).but becouse time is compared as unsigned the timer is not fire.
    also i see that you right about Timer->Header.Hand, really timer inserted in
    &KiTimerTableListHead[Timer->Header.Hand & 0xff] entry, but Timer->Header.Hand can be lage than 0xff.
    p.s. sorry for bad english

  5. Alexander Schmidt on January 2nd, 2008 22:20

    Max,

    I have scanned through the code responsible for setting the system time (function KeSetSystemTime in base\ntos\ke\miscc.c). It is true that the due time of an absolute timer is recalculated, if the system time is modified. (This is because absolute timers are set in system time while the kernel timer object uses interrupt time.)

    However, I did not find any evidence that the system modifies the timer in such a way that the most significant bit (MSB) of the due time is set. (At least not, if you only advance the system time by one week.) So, maybe you can provide some code snippets or send me the whole sourc code of your test program by email that we can further discuss what might going on with your timer?

    Alexander

  6. max on January 3rd, 2008 13:37

    i not look for src(simply not have), i look for kernel code under debbuger. the exaple code is very simply:
    void CALLBACK TimerProc(HANDLE hTimer)
    {
    ZwWaitForSingleObject(hTimer, TRUE, 0);
    MessageBox(0, L"Time is expired", L"", MB_ICONINFORMATION);
    RtlExitUserThread(__LINE__);
    }

    void Entry()
    {
    HANDLE hTimer;
    if (0

  7. max on January 3rd, 2008 14:13

    if (0

  8. Alexander Schmidt on January 3rd, 2008 14:29

    Can you provide the code where your timer is set? Also, I would like to know when your thread TimerProc is started. Or is it just a callback function that is invoked when the timer was signaled?

    Alex

  9. max on January 3rd, 2008 14:30

    void Entry()
    {
    HANDLE hTimer;
    if (0 .. ZwCreateTimer(&hTimer, TIMER_ALL_ACCESS, 0, SynchronizationTimer))
    {
    LARGE_INTEGER li;
    GetSystemTimeAsFileTime((LPFILETIME)&li);
    li.QuadPart += ((DWORD64)10000000 * 15);
    ZwSetTimer(hTimer, &li, 0, 0, FALSE, 0, 0);
    if (0 .. RtlCreateUserThread(NtCurrentProcess(), 0, 0, 0, 0, 0, TimerProc, hTimer, 0, 0))
    {
    MessageBox(0, L"Wait 15 second", L"", MB_ICONWARNING);
    }
    ZwClose(hTimer);
    }
    }
    if you run this,normaly you see second message box ~15 sec later, but if during this period you
    increment system time on sevral days(the increment value must be grether than time value from
    last boot) the timer is not expired.if you convert timer handle to _KTIMER pointer and see this,
    you can see the negative value in DueTime member.also this value changed each time when you change
    system time.when you set back system time , timer is fire

  10. max on January 3rd, 2008 14:32

    sorry symbol LGE is break all, after self

  11. Alexander Schmidt on January 3rd, 2008 16:28

    Is there a reason why you are using undocumented API functions?

  12. max on January 3rd, 2008 17:16

    i very often use in my user mode programs functions from ntdll,more recent than from kernel32.this is simply my style, in example ofcouse can be used related funcs from kernel32

  13. Windows Research Kernel @ HPI - news, howtos and labdescriptions » Windows Timers: A Due Time Wrap-Around on January 4th, 2008 18:53

    [...] one of our readers, who commented on my previous post on an anomaly in Windows timer management (read the article). So, thanks again [...]

  14. Alexander Schmidt on January 4th, 2008 19:01

    Max,

    I successfully could reproduce the problem you stated above. I ran the test program on Vista and Server 2003. Only on Server 2003, the timer did not fire. So, I think the bug has been fixed.

    I wrote another article summarizing the problem. In my opinion, this problem is not the wrong comparison instruction as due time is unsigned by design. However, this fact has not been considered everywhere, so the due time will really wrap-around. Please have a look to what I figured out and thanks again for your comment.

  15. Windows Research Kernel @ HPI - news, howtos and labdescriptions » Resolved: A Performance Issue in Windows Timer Management on October 26th, 2009 08:07

    [...] Issue in Windows Timer Management By Alexander Schmidt Two years ago, I wrote an article about spurious timer table entries. Those entries occurred in the global timer table and resulted in unnecessarily scheduled DPCs. [...]