DLL Injection and Function Hooking
For my Pandora’s Box reverse engineering project, I’m exploring methods of re-implementing the game. An interesting strategy employed by projects like OpenRCT2 (Rollercoaster Tycoon 2), re3 (GTA III, unfortunately taken down by Take-Two), and ZeldaRET (various Zelda games) involves decompiling and gradually replacing the original game’s code. Unfortunately, I couldn’t find any detailed write-ups on what the initial stages of these projects looked like. Last week, however, I stumbled upon Thyme, a re-implementation of Command & Conquer Generals: Zero Hour. Their approach is quite clear from the initial commit on their GitHub, so I decided to write it up myself to determine whether it could be useful for my Pandora’s Box project.
Please note that I’m far from an expert on reverse engineering, C/C++, assembly, the Win32 API, DLLs, PE and so forth, so take this post with a grain of salt, check with the original material and if you find any mistakes, please let me know 😁
Analyzing Thyme’s initial commit
My analysis began by examining the initial commit of Thyme. Their setup consists of two parts:
- A DLL, in which the game’s functionality is gradually implemented;
- An executable that starts the original game EXE and injects the aforementioned DLL.
The flow of the injector is as follows (information based on the initial commit of Thyme and this post on DLL injection on process start):
- The injector starts the original game using CreateProcessA. The CREATE_SUSPENDED flag is passed to immediately suspend the new process until we call ResumeThread. This is necessary so we can inject the DLL and the DLL can set up its hooks before the game actually starts running. Unfortunately, we cannot load our DLL immediately, as the process needs to do some initialization of its environment first.
- The entry point of the process is replaced with a jump to itself (JMP $-2, encoded as the two bytes EB FE, i.e. an infinite loop). The original bytes at the entry point are read first, so we can undo the patch when we’re ready to resume the game. This is done by first marking the process’s memory as writeable (by calling VirtualProtectEx with the PAGE_EXECUTE_READWRITE protection), reading the original bytes with ReadProcessMemory, replacing them using WriteProcessMemory, and finally restoring the original memory protection using VirtualProtectEx again.
- The process is resumed using ResumeThread, so it can do the required initialization. Once done, it will enter the infinite loop we patched in.
- The injector periodically checks whether the process has reached the infinite loop by checking if the instruction pointer (IP) points to the address of the JMP instruction.
- When the IP has reached the loop, we know the process has been initialized and we can inject the DLL:
- Using VirtualAllocEx we allocate some memory to hold the full path to the DLL, and through WriteProcessMemory we write the path to the allocated memory.
- Through CreateRemoteThread, LoadLibraryA is called from within the game process, with the path to the DLL (in the allocated memory) passed as a parameter. Using WaitForSingleObject, we then wait until the library is loaded.
- When the DLL has been successfully loaded into the process, the memory from where the path was loaded is freed (by calling VirtualFreeEx).
- After loading the DLL, the process is suspended again (using SuspendThread), the original bytes from before we patched in the infinite loop are restored, and finally the process is resumed again through ResumeThread.
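The save, patch and restore steps from the list above can be sketched without touching a real process. This is a minimal illustration that operates on a local byte buffer: the helper names (PatchSpinLoop, RestoreBytes, ShortJumpTarget) are hypothetical and not taken from Thyme, and in the actual injector the two copies would be ReadProcessMemory and WriteProcessMemory calls against the suspended game process.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// EB is the x86 short jump (JMP rel8); FE is the displacement -2.
// The displacement is relative to the end of the 2-byte instruction,
// so the jump lands on its own first byte: an infinite loop.
constexpr std::array<std::uint8_t, 2> kSpinLoop = {0xEB, 0xFE};

// Hypothetical helper: writes the spin loop into `code` at `offset`
// and returns the two bytes it displaced so they can be restored later.
std::array<std::uint8_t, 2> PatchSpinLoop(std::vector<std::uint8_t>& code,
                                          std::size_t offset) {
    std::array<std::uint8_t, 2> saved{};
    // In the injector: ReadProcessMemory on the game process.
    std::memcpy(saved.data(), code.data() + offset, saved.size());
    // In the injector: WriteProcessMemory on the game process.
    std::memcpy(code.data() + offset, kSpinLoop.data(), kSpinLoop.size());
    return saved;
}

// Hypothetical helper: puts the saved bytes back once the DLL is loaded.
void RestoreBytes(std::vector<std::uint8_t>& code, std::size_t offset,
                  const std::array<std::uint8_t, 2>& saved) {
    std::memcpy(code.data() + offset, saved.data(), saved.size());
}

// Address a short jump located at `address` transfers control to.
constexpr std::uint32_t ShortJumpTarget(std::uint32_t address,
                                        std::int8_t rel8) {
    return address + 2 + rel8;
}
```

Calling ShortJumpTarget with any address and a displacement of -2 returns that same address, which is why the injector can detect the loop by comparing the instruction pointer against the patched address.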
Next, let’s take a look at the DLL:
- The DLL contains a DllMain entry point. This entry point is called when the library is loaded through LoadLibrary and when it is unloaded through FreeLibrary. As the injector calls LoadLibrary from the game process, we can set up our hooks from DllMain.
- First, the process memory is made writeable through VirtualProtectEx (exactly like it was done when injecting the DLL).
- Next, we can start hooking functions. For this we need the address of the original function, which we can find through tools like Ghidra, IDA or x64dbg, and the address of our re-implementation of the function, which we can get by casting the function pointer to an integer.
- The actual hooking is then quite simple:
- Calculate the offset between the original and replacement function (replacement address – original address – 5, because the jump is relative to the end of the 5-byte jump instruction).
- Replace the original function with a relative jump to the calculated offset (using WriteProcessMemory we can overwrite the original function with 0xE9 – the x86 32-bit relative jump instruction – followed by the offset).
Thyme’s hooker DLL implements additional (convenient) features, like the ability to call an original function, and a way to access global variables from the game if the address of the variable is known, but these are out of scope for now.
Minimal implementation of the injector and DLL
To better understand the theory above we will implement an injector and DLL for a demo application.
The demo application prints ‘Hello World!’ to the screen, calculates the sum of two integers, and then prints this result as well:
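A minimal sketch of such a demo application could look like the following. The function names PrintHello and Add are the ones used later in this post; the operands 2 and 3 are assumptions, and RunDemo is a hypothetical wrapper that exists only so the sketch can be exercised from a test. In the real program this logic sits directly in main.

```cpp
#include <cstdio>

// Prints the greeting. This is one of the two functions hooked later on.
void PrintHello() {
    std::puts("Hello World!");
}

// Returns the sum of two integers. This is the other hooked function.
int Add(int a, int b) {
    return a + b;
}

// Hypothetical wrapper around what main() does in the demo:
// greet, add two (assumed) numbers, print the result.
int RunDemo() {
    PrintHello();
    const int sum = Add(2, 3);
    std::printf("%d\n", sum);
    return sum;
}
```

Keeping PrintHello and Add as separate, non-inlined functions is what gives the DLL a fixed address to patch later.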
If you’re following along, do not forget to build the application (and later the injector and DLL) in x86/32-bit mode (as we’re going to patch in a 32-bit relative jump, which might not be enough in a 64-bit address space), with optimizations disabled (to prevent the compiler from inlining our very simple functions) and with ASLR disabled (so the addresses we look up in Ghidra match the addresses at runtime).
After building the application we can analyze it in Ghidra to find the address of the entry point. Because our application is built with debugging symbols and without optimizations this is a lot easier than when analyzing ‘real’ software 😊 From the Symbol Tree window we can immediately browse to the main function. In the listing window we can then see that the function starts at address 0x00412510:
Next, let’s implement the injector based on our research on Thyme’s initial commit. Again, this is all heavily based on the work of the Thyme developers, so all credit goes to them!
Finally, create an empty DLL project. Before implementing the hooking itself we can build the application, injector and DLL and verify that the application still works as intended after injecting the DLL:
Now, on to the DLL. We’re going to make the following modifications to the program:
- The Add function will be replaced with a function that subtracts instead.
- The PrintHello function will print ‘Hello DLL!’ instead of ‘Hello World!’.
Let’s start by marking the .text segment as writeable so we can set up our hooks. In Ghidra, we can find the start of the .text segment from the Program Trees window:
We can see the segment starts at 0x00411000. While the .text segment is still selected, scroll all the way down to find the end of the .text segment at 0x004185ff. We can then calculate the size of the segment as 0x004185ff – 0x00411000 = 0x000075ff. We will pass the start address and size of the segment as parameters to VirtualProtectEx when the DLL is loaded (reason DLL_PROCESS_ATTACH when DllMain is called):
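The size calculation can be double-checked in a few lines. Note that if 0x004185ff is the address of the last byte of the segment, the size including that byte is one larger (0x7600). As far as I understand, VirtualProtectEx changes the protection of every page that contains at least one byte of the requested range, so either value covers the same pages. The 4 KiB page size and the helper name PagesTouched are assumptions made for this sketch.

```cpp
#include <cstdint>

// Addresses read from Ghidra for the demo application.
constexpr std::uint32_t kTextStart = 0x00411000;
constexpr std::uint32_t kTextLast = 0x004185ff;

// Size as calculated in the text (last address minus start address).
constexpr std::uint32_t kTextSize = kTextLast - kTextStart;

// Assumption: the usual 4 KiB page size on x86 Windows.
constexpr std::uint32_t kPageSize = 0x1000;

// Hypothetical helper: number of pages that contain at least one byte
// of the range [start, start + size).
constexpr std::uint32_t PagesTouched(std::uint32_t start, std::uint32_t size) {
    return (start + size - 1) / kPageSize - start / kPageSize + 1;
}

static_assert(kTextSize == 0x75ff, "matches the value in the text");
static_assert(PagesTouched(kTextStart, kTextSize) ==
                  PagesTouched(kTextStart, kTextSize + 1),
              "the off-by-one does not change which pages are affected");
```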
Next we will write our replacement Add and PrintHello functions:
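As a rough sketch, the replacements only need to match the signatures of the originals. The names HookedAdd and HookedPrintHello are hypothetical; any name works, as long as the parameter list, return type and calling convention are the same as those of the function being replaced, because the callers in the game still set up the stack for the original.

```cpp
#include <cstdio>

// Replacement for Add: same signature, but subtracts instead.
int HookedAdd(int a, int b) {
    return a - b;
}

// Replacement for PrintHello: same signature, different greeting.
void HookedPrintHello() {
    std::puts("Hello DLL!");
}
```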
In Ghidra, we will then look up the addresses of the original functions by finding them through the Symbol Tree window. The PrintHello function starts at 0x00412250, and the Add function starts at 0x00412200.
We can now set up our hook by replacing the bytes at the start of the original function with 0xE9 (the 32-bit relative jump instruction) followed by the relative offset to the re-implemented function (re-implementation function address – original function address – 5 (the size of the jump instruction and the offset itself)):
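The byte sequence itself can be built and inspected separately from the memory write. The sketch below only constructs the 5-byte patch; MakeJumpPatch is a hypothetical helper name, and in the DLL the resulting bytes would then be written over the start of the original function (for example with WriteProcessMemory, as described earlier).

```cpp
#include <array>
#include <cstdint>

// Builds the 5-byte hook: opcode E9 followed by the 32-bit relative
// offset in little-endian byte order. Both addresses are 32-bit, as the
// demo is built for x86.
std::array<std::uint8_t, 5> MakeJumpPatch(std::uint32_t original,
                                          std::uint32_t replacement) {
    // The offset is measured from the end of the 5-byte jump. Unsigned
    // arithmetic wraps modulo 2^32, so backward jumps work as well.
    const std::uint32_t rel = replacement - original - 5u;

    std::array<std::uint8_t, 5> patch{};
    patch[0] = 0xE9;
    for (int i = 0; i < 4; ++i) {
        patch[1 + i] = static_cast<std::uint8_t>(rel >> (8 * i));
    }
    return patch;
}
```

For example, with the original Add at 0x00412200 and a replacement that happens to be loaded at 0x10001000 (an assumed address, the real one depends on where the DLL is loaded), the offset is 0x0FBEEDFB and the patch bytes are E9 FB ED BE 0F.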
Finally, build the DLL and run the injector:
The full source code for the Demo Application, Injector and DLL is available on GitHub here.