NtCreateThread - memory allocations in kernel mode
Windows Research Kernel @ HPI
In this post we try to determine how much kernel memory is required to create a new thread. This amount matters because it contributes to the upper bound on the number of threads the system can hold, which Mark Russinovich has investigated in detail.
As a starting point we looked at the implementation of the NtCreateThread system service call and followed every possible code path down to memory allocation functions such as ExAllocatePoolWithTag.
The following picture shows a flow chart of NtCreateThread.
Vertical connections are call relations: for example, NtCreateThread calls the function PspCreateThread. Horizontal connections are call sequences: PspCreateThread first calls ObCreateObject and afterwards ExCreateHandle, and so on. The graph is far from complete. Only functions leading to memory allocations are shown.
As you can see, there are four places where memory is allocated while creating a new thread:
(1) ObCreateObject: A new thread object is created, which will be managed by the object manager. The allocated memory contains the thread data structure ETHREAD and object metadata.
(2) ExCreateHandle: A thread handle has to be stored in the process handle table. If the table has free entries, one of them is used directly for the new thread. If the handle table is full, ExpAllocateHandleTableEntrySlow allocates memory and extends the handle table of the process.
(3) MmCreateTeb: Every (user mode) thread gets a thread environment block (TEB), which contains, among other things, information about thread local storage memory (see sdk/inc/pebteb.h for further details). The function MiCreatePebOrTeb allocates a virtual address descriptor and reserves memory for the TEB data structures.
(4) KeInitThread: A thread requires kernel stack space for activities in kernel mode. Such a stack is created via MmCreateKernelStack.
The amount of memory a new thread actually uses depends on the platform. 64-bit Windows differs from 32-bit Windows with regard to memory page size and data structures. Mark Russinovich covers several of these aspects in his 'Pushing the Limits of Windows' blog post series.
The following table shows the results of our source code analysis:

In short, creating a thread uses around 20 KB of memory on a 32-bit system and around 40 KB on a 64-bit system.
Disclaimer: Memory alignment and padding effects may exist that are not considered in this calculation. Furthermore, we might simply have missed some memory allocations. 🙂
Comments
How did you generate that graph? Did you do it manually or did you use something like doxygen?
The graph was drawn manually (Visio 2003).