Thursday, January 26, 2012

Malware Analysis Tutorial 13: Tracing DLL Entry Point

Learning Goals:
  1. Understand C calling convention
  2. Practice reverse engineering
Applicable to:
  1. Operating Systems
  2. Assembly Language
1. Introduction
In Tutorial 11, we showed the trick Max++ plays to load its own malicious executable using the "corpse" of another DLL called "lz32.dll". Beginning with this tutorial, we will analyze the functionality of the malicious DLL. In the following, we use "lz32.dll" to refer to this malicious code starting at 0x003C24FB. (In your VBox instance, this entry address might vary. Check Tutorial 11 for how to find the correct entry address of lz32.dll.)

Today, we will discuss some basic background information about the DLL entry point and analyze the first part of lz32.dll (it's not the real "lz32.dll", but the malicious code of Max++ planted into it).

2. Lab Configuration
(1) Clear all breakpoints and hardware breakpoints in IMM (see View->Breakpoints and View->Hardware Breakpoints).
(2) Go to 0x4012DC and set a hardware breakpoint there. (Why not a software BP? Because that region will be self-extracted and overwritten, and the software BP would be lost.) Once you are at 0x4012DC, right-click directly on the line to set the hardware BP (at this point the code there is still gibberish).
(3) Press SHIFT+F9 to run to 0x4012DC. Figure 1 shows the code that you should be able to see. As you can see, this is right before the call of RtlAddVectoredExceptionHandler, where a hardware BP is set to break the LdrLoadDll call (see Tutorial 11 for details). At this point, the code at 0x3C24FB has not been extracted. If you go to 0x3C24FB at this moment, IMM will complain that this address is not accessible.
Figure 1: code at 0x4012DC
(4) Now scroll down about 2 pages and set a SOFTWARE BREAKPOINT at 0x401417. This is right after the call of LdrLoadDll("lz32.dll"), where Max++ finishes the loading of lz32.dll.

Figure 2: code at 0x401407

(6) Now we will set a breakpoint at 0x3C24FB. Follow the instructions below:
Press SHIFT+F9 several times until you hit 0x7C90D500 (this is somewhere inside ntdll.ZwMapViewOfSection, which is being called by LdrLoadDll). Go to 0x3C24FB and set a SOFTWARE BREAKPOINT there. (You will see a warning that your BP is out of range. This is because the malware author did not do a good job of resetting the binary PE information: the size of the executable code section is messed up, see Tutorial 12 for details. It should be fine, just click OK.)

(7) If you hit SHIFT+F9 (probably twice), you will hit 0x3C24FB. If you hit 0x401417 directly, something is wrong with IMM (strangely, I cannot explain why). You have to RESTART (Debug->Restart) and repeat steps (1) to (6) [yes, clear all BPs and hardware BPs]. The correct sequence is that you hit 0x7C90D500 twice, and then hit 0x3C24FB. This is because LdrLoadDll will try to call the entry point of the DLL.

(8) Figure 3 shows the code that you should be able to see. The first instruction at 0x3C24FB should be CMP DWORD PTR SS:[ESP+8], -2. If you execute several steps, you might notice that it soon returns, because the value at [ESP+8] is 1.

Figure 3: code at 0x3C24FB

(9) Press SHIFT+F9 again and you will hit 0x401417; press SHIFT+F9 once more and you will hit 0x3C24FB again! You might notice that [ESP+8] now has the value -2, and if you press F7, you will trace into a lot of details of the malicious logic.

If you reach this point, you are doing it right. If anything gets messed up, you have to restore the snapshot, because Max++ automatically removes its binary executable from the disk drive so that you will not be able to find it again.

2. Background Information of DLL Entry

DLLs, like .exe files, can have an entry point. This entry function is executed when the DLL is loaded by system calls such as LdrLoadDll(). MSDN has tons of excellent articles on it and you can read [1] for details. The following sample declaration is from [1]; a DLL entry function takes three parameters:

  BOOL WINAPI DllMain(
  HINSTANCE hinstDLL, // handle to DLL module
  DWORD fdwReason, // reason for calling function
  LPVOID lpReserved ) // reserved
  {...}

Challenge 1: Note the code at 0x3C24FB (Figure 3) is checking the value of [ESP+8]. Which parameter is stored at ESP+8?

We are particularly interested in fdwReason. Where is it defined? Reading [1], you can find that there are some macros defined for fdwReason, such as DLL_PROCESS_ATTACH (when a process first loads the DLL), DLL_THREAD_ATTACH (when the process creates a new thread), etc. [1] does not provide the integer values of these macros, but a simple google search of "#define DLL_PROCESS_ATTACH" yields them [again! do the google search in your VM. Many sites hosting MS sources can be harmful!]. The documented values range from 0 to 3. For example, value 1 denotes DLL_PROCESS_ATTACH. This is the value of [ESP+8] when 0x3C24FB is hit the first time (when it is called by LdrLoadDll).
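To make the relation between [ESP+8] and the entry function's parameters concrete, here is a minimal sketch in plain C that models the stack at the first instruction of a stdcall function on 32-bit x86. The four DLL_* values are the ones found in the Windows SDK header winnt.h, redefined locally so the sketch compiles without Windows headers. The names entry_stack, read_dword and reason_name are hypothetical helpers made up for illustration, not part of any Windows API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Values as defined in winnt.h, repeated here so that no Windows
   header is needed to compile this sketch. */
#define DLL_PROCESS_DETACH 0u
#define DLL_PROCESS_ATTACH 1u
#define DLL_THREAD_ATTACH  2u
#define DLL_THREAD_DETACH  3u

/* Hypothetical model of the stack at the first instruction of a
   stdcall entry function on 32-bit x86. Arguments are pushed right
   to left, then CALL pushes the return address on top of them. */
struct entry_stack {
    uint32_t return_address; /* [ESP]     */
    uint32_t hinstDLL;       /* [ESP+4]   */
    uint32_t fdwReason;      /* [ESP+8]   */
    uint32_t lpReserved;     /* [ESP+0xC] */
};

/* Read the dword at ESP+offset from the modeled stack. */
static uint32_t read_dword(const struct entry_stack *s, size_t offset)
{
    uint32_t value;
    assert(offset % 4 == 0 && offset + sizeof value <= sizeof *s);
    memcpy(&value, (const unsigned char *)s + offset, sizeof value);
    return value;
}

/* Map an fdwReason value to the name of its macro. */
static const char *reason_name(uint32_t reason)
{
    switch (reason) {
    case DLL_PROCESS_DETACH: return "DLL_PROCESS_DETACH";
    case DLL_PROCESS_ATTACH: return "DLL_PROCESS_ATTACH";
    case DLL_THREAD_ATTACH:  return "DLL_THREAD_ATTACH";
    case DLL_THREAD_DETACH:  return "DLL_THREAD_DETACH";
    default:                 return "not a documented reason";
    }
}
```

Filling an entry_stack with the dwords shown in IMM's stack pane and calling reason_name(read_dword(&s, 8)) gives DLL_PROCESS_ATTACH for the first hit at 0x3C24FB, where [ESP+8] is 1.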

3. Analysis of Max++

If you pay attention to the first couple of instructions at 0x3C24FB, one natural question is:
Challenge 2: Why does the malware compare [ESP+8] with -2? What is the motivation for doing this?

The motivation is that, when it's a legitimate invocation (e.g., one placed by LdrLoadDll), the code at 0x3C24FB will return immediately (without doing any harm). Why? Recall from Tutorial 12 that Max++ cut in on LdrLoadDll, so LdrLoadDll did not finish gracefully (some kernel structure information is not set up correctly). This information has to be properly set up; until it is, the code can NOT call external functions (e.g., those provided by ntdll). Max++ therefore has to set up all this information by itself and then manually call the entry function at 0x3C24FB. Before calling it, Max++ sets the second parameter (fdwReason) to -2 (a value that will NEVER be used by a normal call of a DLL entry point), so that the code knows that the call comes from Max++.
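The gate described above can be sketched in plain C as follows. This is only an illustration of the idea, not the decompiled Max++ code: the names MAXPP_MAGIC_REASON, run_payload, payload_runs and entry_point are hypothetical, the parameters are modeled as 32-bit integers instead of the Windows types, and the return value 1 is an assumption.

```c
#include <stdint.h>

/* -2 as a 32-bit value is 0xFFFFFFFE; this is the operand of the
   CMP instruction at 0x3C24FB. The macro name is made up here. */
#define MAXPP_MAGIC_REASON ((uint32_t)-2)

/* Hypothetical stand-in for the malicious logic; it only counts how
   often it would have been executed. */
static int payload_runs = 0;

static void run_payload(void)
{
    payload_runs++;
}

/* Sketch of the gate at the DLL entry. */
static int entry_point(uint32_t hinstDLL, uint32_t fdwReason,
                       uint32_t lpReserved)
{
    (void)hinstDLL;
    (void)lpReserved;

    /* Call placed by the loader (fdwReason is 1 on the first hit):
       return at once and do nothing. */
    if (fdwReason != MAXPP_MAGIC_REASON)
        return 1; /* assumed return value */

    /* Call placed by Max++ itself with fdwReason == -2. */
    run_payload();
    return 1;
}
```

With this gate, entry_point(0, 1, 0) leaves payload_runs at 0, while entry_point(0, (uint32_t)-2, 0) increases it to 1. This mirrors the two hits of 0x3C24FB observed in the lab steps.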

Last challenge of the day:
Challenge 3: Can you find out which instruction calls 0x3C24FB the second time (the call that provides -2 for fdwReason)? [hint: check out the stack contents] Look at how 0x3C24FB is called. Can a static analysis tool find out that 0x3C24FB is called by the Max++ code?

1. Microsoft, "Dynamic-Link Library Entry-Point Function",
available at