- Practice WinDbg for Inspecting Kernel Data Structure
- Trace and Modify Control Flow Using IMM
- Understand the techniques employed at Max++ for hiding library loading
- Understand image loading notifier, asynchronous procedure call, kernel and normal routines.
- Operating Systems
- Assembly Language
- Operating System Security
This tutorial analyzes the malicious DLL file \\.\C2CAD....\max++.00.x86. We will see a lot of interesting techniques for hiding its malicious behaviors.
2. Lab Configuration
The first step is to take out max++.00.x86 from the hidden drive. You can follow the similar techniques presented in Tutorial 28, The basic idea is to invoke zwCopyFileA after the file is available in the hidden drive.
Then, we have to get into the entry of max++.00.x86, which is located at offset +2DFF. This information is available in IMM, when you load max++.00.x86 as a DLL file in IMM. Then View -> Executable Modules, you can find out the +2DFF DLL entry and its base is 35670000.
Notice that, you do can execute max++.00.x86 step by step in IMM, however, it will not exhibit the "correct" malicious behaviors, because it's not running in the environment where Max++ has pre-hooked certain driver entries and exception handlers as needed. We have to, instead, use WinDbg to get to the DLL entry, following the "proper" execution flow of Max++.
In the following, we show how to step into the entry of max++.00.x86.
The configuration is similar to the setting of the WinDbg instance of Tutorial 20. In the win_debug instance, you need to set a breakpoint at 0x3C230F. This is the place right before the call of CALL WS2_32.WSASocketW, as shown in Figure 1 below.
Figure 1. Something that we do not know that we do not know |
This is where Max++ is trying to access http://74.117.114.86\max++.x86.dll. Now press F8 in the IMM, you will stop in WinDbg twice on loading images. Now type the following to stop at the entry of max++.00.x86:
> bp 35672dff
Somehow WinDbg is not able to capture the breakpoint, but instead, the INT 3 is captured by the IMM instance as shown in Figure 2:
Figure 2. Entry of max++.00.x86 |
If you look at the instruction at 35672DFF (in Figure 2), you will notice that there is an 'INT 3' (single-step breakpoint) placed by WinDbg. We have to recover it to the original instruction PUSH EBP (we can learn about this information in the previous session that loads max++.00.x86 as a DLL in user mode IMM). The resetting of "PUSH EBP" has to be done, otherwise, the instruction at 0x35672E05 (MOV EAX, [EBP+C]) would not function correctly, because EBP is not set right. Please follow the instructions below in IMM:
(1) in the code pane, right click on the INT 3 instruction and select assemble, and then type "PUSH EBP".
(2) click the 2nd button (Python) on the tool bar to bring up the Python window and and run the following commands in the python window: imm.setReg("EIP", 0x35672DFF); This is to reset the EIP register. From no on, you can execute max++.00.x86 step by step.
3. Hiding DLL Loading
We now can explain why the remote DLL loading is not captured by IMM earlier in Tutorial 29. Look at figure 3 below.
Figure 3. Hide DLL Loading |
The first call (shown in Figure 3), zwTestAlert clears the APC queue, and the second call DisableThreadLibraryCalls detaches DLL_LOAD signal so that the later DLL loading event will not be captured. Then the program plays a trick, it reads out the function addresses stored from 356761E0 to 35676200 one by one, and call each of them (see the high lighted area in the memory dump in Figure 3) one by one. These calls are 3567595c, 35675979, and etc..
Challenge 1. Find out all the functions (and their semantics) being called described in the above.
The logic is roughly described below:
(1) max++ first creates an AVL tree.
(2) then it copied the entry address of several ntdll functions such as ntWriteFile into global memory.
It's basically to prepare the data section (hard code it) in global memory. Then it calls 356752D7 (at 0x35672E5E).
(3) then it saves several dll names such as "kernel32.dll", "kernel32base.dll" into its stack.
4. Manipulation of System Kernel DLLs
Now we start the analysis from 0x356762D7. Figure 4 shows its function body.
Figure 4. Processing Kernel Modules |
Function 0x356752D7 processes the related kernel DLLs one by one using a loop. The collection of DLL names is preset by Max++ (see the comments after challenge 1). The function tries to get the handle of each DLL, it proceeds to call 0x3567525C to process the header information of each DLL (only when the handle is successfully retrieved, i.e., the DLL is currently already alive in the system space).
We now look at the details of funciton 0x3657525C, as shown in Figure 5.
Figure 5. Function 0x3567525c |
The major part, as shown in Figure 5, is a call of RtlImageDirectoryEntryToData. The undocumented system call (if you search it on ReactOS, you can find the source code) takes four parameters: (1) the image base of the DLL file, (2) whether it is regarded as an image file, (3) the ID of the entry to retrieve, and (4) the entry size. In this case, our ID is "1", reading the MSDN documentation of IMAGE_DIRECTORY of PE header, we can find that ID "1" stands for import table address and size. Now it's your job to find out what is the use of this information, i.e., what does Max++ plan to do with them?
Challenge 2. Find out the use of NT header information retrieved by code at 0x35675277 (hint: use data breakpoints to find out when and where the information is used).
Following function 0x356752D7 there are many technical details related to NT header information retrieval and resetting, whereas the analysis is similar. In the following, we directly jump to the other interesting parts of max++.00.x86.
5. Resetting Exception Handler (SEH)
We now look at an interesting behavior of max++.00.x86 that resets the SEH handler. Before max++.00.x86 does the trick, the current SEH chain is shown in Figure 6 (click View -> SEH chain in IMM).
Figure 6. SEH Chain before max++.00.x86 does the trick |
As shown in Figure 6, when an exception occurs, it is first passed to ntdll.7C90E900 (3 times) and then kernel32.7C839AC0 (if 7C90E900 cannot handle it right). In fact, SEH chain is a very simple singly-listed data structure. Each SEH record has two elements as shown below:
struct SEH_RECORD{
SEH_RECORD *next;
void* handler_entry
};
For example, the first SEH record is locatd at 0x0012BF1C (this is usually at the bottom of the current stack frame of the current thread), as shown in Figure 7.
Figure 7. Memory dump of the first SEH record |
Now, in Figure 8, we present the malicious code in max++.00.x86 which resets SEH. The basic idea is to construct a new SEH record and push it into the stack, and reset the value at FS:[0] (which is part of the kernel data that points to SEH chain). The important instructions are already highlighted.
Figure 8. Manipulates SEH Chain |
Challenge 3. Explain in details (based on the instructions given in Figure 8, starting from 0x356754BC): why is 35675510 the new exception handler?
Challenge 4. Explain the logic of handler at 35675510 and investigate when it is called.
5. Loading Remote DLL
The next interesting part is located at 0x35672E6D. Figure 9 shows a call of zwCreateThread and its parameters. The thread's start routine (i.e., the function to execute when the thread is born) is 0x356718D2.
Figure 9. Create a new thread which invokes 0x356718D2 |
Let's set a software breakpoint at 0x356718D2 and trace into it, as shown in Figure 10. There are two major functions: 0x35671797 and ole32.CoInitialize.
Figure 10. Thread Start Routine |
Figure 11. First Part of 0x0035671797 |
As shown in Figure 11, The first important call in 0x35671797 is LdrFindEntryForAddress(0x0035670000, 0x009FFFA8). Search the documentation of LdrFindEntryForAddress, we could soon learn that the function finds the module entry (LDR_DATA_TABLE_ENTRY) for address 0x0035670000, and the pointer to LDR_DATA_TABLE_ENTRY is stored in 0x009FFFA8 (which contains value 0x00252A50 after the call. Dump the data structure using WinDbg (start the WinDbg inside the WIN_DEBUG VBox instance and attach the process max++ noninvasively, we have the following. Note that the FullDllName's type is _UNICODE string, while the real char array is stored at 0x002529c0.
:000> dt _LDR_DATA_TABLE_ENTRY 0x00252a50 -r1
ntdll!_LDR_DATA_TABLE_ENTRY
+0x000 InLoadOrderLinks : _LIST_ENTRY [ 0x252b10 - 0x252cd0 ]
+0x000 Flink : 0x00252b10 _LIST_ENTRY [ 0x252bd8 - 0x252a50 ]
+0x004 Blink : 0x00252cd0 _LIST_ENTRY [ 0x252a50 - 0x252900 ]
+0x008 InMemoryOrderLinks : _LIST_ENTRY [ 0x252b18 - 0x252cd8 ]
+0x000 Flink : 0x00252b18 _LIST_ENTRY [ 0x252be0 - 0x252a58 ]
+0x004 Blink : 0x00252cd8 _LIST_ENTRY [ 0x252a58 - 0x252908 ]
+0x010 InInitializationOrderLinks : _LIST_ENTRY [ 0x252d48 - 0x252be8 ]
+0x000 Flink : 0x00252d48 _LIST_ENTRY [ 0x252e08 - 0x252a60 ]
+0x004 Blink : 0x00252be8 _LIST_ENTRY [ 0x252a60 - 0x252b20 ]
+0x018 DllBase : 0x35670000 Void
+0x01c EntryPoint : 0x35672dff Void
+0x020 SizeOfImage : 0xc000
+0x024 FullDllName : _UNICODE_STRING "\\.\C2CAD972#4079#4fd3#A68D#AD34CC121074\L\max++.00.x86"
+0x000 Length : 0x6e
+0x002 MaximumLength : 0x72
+0x004 Buffer : 0x002529c0 "\\.\C2CAD972#4079#4fd3#A68D#AD34CC121074\L\max++.00.x86"
...
The next call is OLE32.LoadTypeLibrary("\\.\C2CAD...max++.00.x86", REGKIND_NONE, 0x009FFFAC), where a pointer to ITypeLib is stored at 0x009FFFAC. By dumping the RAM, we can find that the ITypeLib pointer is 0x00180F10.
Next comes the interesting part. It reads ECX = [EAX], as shown in Figure 11, then it calls [ECX+18], passing three parameters: 0x00180F10 (the "this" pointer for the ITypeLib interface itself), 0x003567696C, and 0x00356782A0 (and there might be other parameters, i.e., not necessarily 3). None of the 0x003567696c and 0x00356792a0 looks like inside code region. But which function is it calling?
We need to read the documentation of MSDN about ITypeLib carefully. According to MSDN, ITypeLib supports a number of functions such as FindName, GetDocumentation, etc.If we examine the memory contents of 0x00356782A0, before and after the call of [ECX+18], we might notice that its contents is changed. Thus, checking the list of functions available in ITypeLib, the only function that has the second parameter as output is ITypeLib::getTypeInfoOfGUID(int GUID, ITypeLib **).
Challenge 5. Verify if the above conjecture about getTypeInfoOfGUID is correct.
Now let us study the last interesting part of function 0x35671797, as shown in Figure 12.
Figure 12. Overwrite Module Information |
Now, when the control flow returns from 0x35671797 (look at Figure 10, it calls Ole32.CoInitialize. Because the current module is registered as a type library, and Ole32 modules will try to reload it again somehow, which enforces the system to load the remote DLL \\74.117.114.86\max++.x86.dll.
The file \\74.117.114.86\max++.x86.dll is not available on the Internet anymore. The call of CoInitialize fails. We have to figure out a way to explore the rest of the logic. We will continue our exploration in our next tutorial.