NtCreateThread - memory allocations in kernel mode
Windows Research Kernel @ HPI
In this post we try to determine how much kernel memory is required when creating a new thread. This amount of memory is relevant for the upper bound of the number of possible threads in the system as investigated in detail by Mark Russinovich.
As a starting point we looked at the system service call NtCreateThread and followed every possible code path down to the memory allocation functions. The following picture shows the resulting call graph.
Vertical connections are call relations, e.g. NtCreateThread calls the function PspCreateThread. Horizontal connections give the call order, e.g. PspCreateThread first calls ObCreateObject and afterwards ExCreateHandle, and so on. The graph is far from complete: only functions leading to memory allocations are shown.
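The call relations described above can be written down as a small data structure. The following is only a sketch: it contains just the calls named in the text, and the names `CALLS` and `call_paths` are made up for this illustration.

```python
# Call relations taken from the text; the graph is deliberately incomplete.
# Callees are listed in call order (the "horizontal" direction in the picture).
CALLS = {
    "NtCreateThread": ["PspCreateThread"],
    "PspCreateThread": ["ObCreateObject", "ExCreateHandle"],
}


def call_paths(root, calls):
    """Return every root-to-leaf call path, visiting callees in call order."""
    callees = calls.get(root, [])
    if not callees:
        # No known callees: this function ends a path.
        return [[root]]
    paths = []
    for callee in callees:
        for tail in call_paths(callee, calls):
            paths.append([root] + tail)
    return paths


for path in call_paths("NtCreateThread", CALLS):
    print(" -> ".join(path))
```

Each printed line corresponds to one vertical path through the picture, and the order of the lines follows the horizontal call order.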
As you can see, there are four different places where memory is allocated while creating a new thread:
ObCreateObject: A new thread object is created which will be managed by the object manager. The allocated memory contains the thread data structure ETHREAD and the object header.
ExCreateHandle: A thread handle has to be stored in the process handle table. If free entries are available in the table, an entry can be used directly for the new thread. If the handle table is full, additional memory is allocated and the handle table of the process is extended.
MmCreateTeb: Every (user mode) thread gets a thread environment block (TEB) which contains, for example, information about thread local storage memory. A virtual address descriptor is allocated and memory is reserved for the TEB data structures.
KeInitThread: A thread requires kernel stack space for activities in kernel mode. This stack is created while the thread is initialized.
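The two cases described for ExCreateHandle (reuse a free entry, or extend a full table) can be illustrated with a toy model. This is not the kernel's handle table layout: the class `HandleTable`, its methods, and the doubling growth policy are assumptions made purely for illustration.

```python
class HandleTable:
    """Toy model of a per-process handle table (not the real kernel layout)."""

    def __init__(self, capacity=4):
        self.entries = [None] * capacity  # None marks a free entry
        self.extensions = 0               # how often the table had to grow

    def create_handle(self, obj):
        # Case 1: a free entry exists, so no new memory is needed.
        for index, entry in enumerate(self.entries):
            if entry is None:
                self.entries[index] = obj
                return index
        # Case 2: the table is full, so allocate more entries and extend it.
        # Doubling is an assumed policy, chosen only to keep the sketch short.
        index = len(self.entries)
        self.entries.extend([None] * len(self.entries))
        self.extensions += 1
        self.entries[index] = obj
        return index

    def close_handle(self, index):
        # Freed entries can be reused by later create_handle calls.
        self.entries[index] = None


table = HandleTable(capacity=2)
handles = [table.create_handle(f"thread-{i}") for i in range(3)]
print(handles, table.extensions)
```

In this model only the third handle triggers an allocation, which is why the memory cost of this step depends on the current state of the process handle table.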
The amount of memory used by a new thread depends on the platform. 64-bit Windows differs from 32-bit Windows with regard to memory page size and data structures. Again, Mark Russinovich covers different aspects in his 'Pushing the Limits of Windows' blog post series.
The following table shows the results of our source code analysis:
In short, thread creation uses around 20 KB of memory on a 32-bit system and around 40 KB on a 64-bit system.
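To connect these figures to the upper bound on the number of threads, here is a small back-of-the-envelope calculation. The 256 MiB budget is a hypothetical value picked only to show the arithmetic, and one kbyte is assumed to mean 1024 bytes. Real limits depend on many other resources.

```python
KIB = 1024

# Approximate per-thread memory from the analysis above.
PER_THREAD_BYTES = {"32-bit": 20 * KIB, "64-bit": 40 * KIB}


def max_threads(budget_bytes, platform):
    """Upper bound on threads if this memory budget were the only limit."""
    return budget_bytes // PER_THREAD_BYTES[platform]


# Hypothetical budget, not a measured value for any Windows version.
budget = 256 * KIB * KIB
for platform in PER_THREAD_BYTES:
    print(platform, max_threads(budget, platform))
```

With this assumed budget the bound is roughly 13,000 threads on 32-bit and half of that on 64-bit, since the per-thread cost doubles.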
Disclaimer: There might be certain memory alignment/padding effects which are not considered in the presented calculation. Furthermore, we might just have missed memory allocations. 🙂