NtCreateThread - memory allocations in kernel mode

Windows Research Kernel @ HPI

In this post we try to determine how much kernel memory is required to create a new thread. This amount matters because it limits the number of threads a system can hold, a topic Mark Russinovich has investigated in detail.

As a starting point, we looked at the implementation of the NtCreateThread system service and followed every possible code path down to memory allocation functions such as ExAllocatePoolWithTag.

The following picture shows a flow chart of NtCreateThread.

Vertical connections are call relations - e.g. NtCreateThread calls the function PspCreateThread. Horizontal connections are call sequences - PspCreateThread first calls ObCreateObject and afterwards ExCreateHandle, and so on. The graph is far from complete. Only functions leading to memory allocations are shown.

As you can see, there are four places where memory is allocated while creating a new thread:

(1) ObCreateObject: A new thread object is created which will be managed by the object manager. The allocated memory contains the thread data structure ETHREAD and object metainformation.

(2) ExCreateHandle: A thread handle has to be stored in the process handle table. If free entries are available in the table, an entry can be used directly for the new thread. If the handle table is full, ExpAllocateHandleTableEntrySlow allocates memory and extends the handle table of the process.

(3) MmCreateTeb: Every (user mode) thread gets a thread environment block (TEB), which contains, for example, information about thread-local storage (see sdk/inc/pebteb.h for further details). The function MiCreatePebOrTeb allocates a virtual address descriptor and reserves memory for the TEB data structures.

(4) KeInitThread: A thread requires kernel stack space for activities in kernel mode. Such a stack is created via MmCreateKernelStack.
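The two cases described for ExCreateHandle in (2) can be illustrated with a toy model. This is only a sketch and not the WRK implementation: the class `HandleTable`, its methods, and the `entries_per_block` growth size are hypothetical names and values chosen for illustration. It shows that a free entry is reused without any new allocation, while a full table triggers a slow path that allocates another block of entries.

```python
class HandleTable:
    """Toy model of a per-process handle table (hypothetical, not the WRK structure)."""

    def __init__(self, entries_per_block=4):
        # entries_per_block is an arbitrary illustrative value.
        self.entries_per_block = entries_per_block
        self.entries = []      # slot index -> stored object, or None if unused
        self.free_slots = []   # indices of unused slots
        self.grow_count = 0    # number of times the slow path allocated memory

    def _grow(self):
        # Slow path: models the extra allocation made when the table is full.
        start = len(self.entries)
        self.entries.extend([None] * self.entries_per_block)
        # Push indices in reverse so the lowest new index is handed out first.
        self.free_slots.extend(
            range(start + self.entries_per_block - 1, start - 1, -1)
        )
        self.grow_count += 1

    def create_handle(self, obj):
        # Fast path: reuse a free entry. Grow only if none is available.
        if not self.free_slots:
            self._grow()
        slot = self.free_slots.pop()
        self.entries[slot] = obj
        return slot

    def close_handle(self, slot):
        # Closing a handle returns its entry to the free list.
        self.entries[slot] = None
        self.free_slots.append(slot)


table = HandleTable(entries_per_block=2)
handles = [table.create_handle(f"thread-{i}") for i in range(3)]
print(handles, table.grow_count)
```

In this model the third handle forces a second growth step, whereas a handle created right after a `close_handle` call would reuse the freed entry and allocate nothing.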

The amount of memory used by a new thread depends on the platform. 64-bit Windows differs from 32-bit Windows in memory page size and in the size of its data structures. Again, Mark Russinovich covers different aspects in his 'Pushing the Limits of Windows' blog post series.

The following table shows the results of our source code analysis:

In short, creating a thread allocates around 20 KB of memory on a 32-bit system and around 40 KB on a 64-bit system.
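To see how these per-thread figures translate into an upper bound on the number of threads, here is a rough back-of-the-envelope sketch. The function `max_threads` and the 256 MiB budget are hypothetical. The budget is chosen only for illustration and is not a measured value for any Windows configuration. Real limits also depend on factors this calculation ignores, such as user-mode stack space and the alignment effects mentioned in the disclaimer.

```python
KIB = 1024
MIB = 1024 * KIB

# Approximate per-thread memory taken from the analysis in this post.
PER_THREAD_BYTES = {"32-bit": 20 * KIB, "64-bit": 40 * KIB}


def max_threads(budget_bytes, platform):
    """Upper bound on thread count if budget_bytes were available for threads only."""
    return budget_bytes // PER_THREAD_BYTES[platform]


# Hypothetical budget, chosen only to make the arithmetic concrete.
budget = 256 * MIB
for platform in ("32-bit", "64-bit"):
    print(platform, max_threads(budget, platform))
```

With this assumed budget, the bound is 13107 threads at 20 KB each and 6553 threads at 40 KB each. Doubling the per-thread cost halves the bound.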

Disclaimer: There might be certain memory alignment/padding effects which are not considered in the presented calculation. Furthermore, we might just have missed memory allocations. 🙂


3 Responses to "NtCreateThread - memory allocations in kernel mode"

  1. Aram Hăvărneanu on October 1st, 2009 10:48

    How did you generate that graph? Did you do it manually or did you use something like doxygen?

  2. Michael Schöbel on October 1st, 2009 11:29

    The graph was drawn manually (Visio 2003).

  3. Scalability Issues in CSRSS on July 19th, 2010 13:31

    [...] the address space of a process. As each thread requires a certain number of memory in kernel space(NtCreateThread - memory allocations in kernel mode) and user space, you can easily calculate the number of threads you can create until the address [...]