Wednesday, August 31, 2011

Malware Analysis Tutorial 2 - Ring3 Debugging

Learning Objectives:
  • Efficiently master a Ring3 debugger such as Immunity Debugger
  • Can control program execution (step in, over, breakpoints)
  • Can monitor/change program state (registers, memory)
  • Comments annotation in Immunity Debugger
This tutorial can be used as a lab module in
  • Computer architecture
  • Operating systems
  • Discrete Maths (number system)
1. Introduction

To reverse engineer a malware, a quality debugger is essential. There are two types of debuggers: user level debuggers (such as OllyDbg, Immunity Debugger, and IDA Pro), and kernel debugger (such as WinDbg, SoftIce, and Syser). The difference between user/kernel level debuggers is that kernel debuggers run with higher privilege and hence can debug kernel device drivers and devices, while user level debuggers cannot. It is well known that modern OS such as Windows relies on the processor (e.g., Intel CPU) to provide a layered collection of protection domains. For example, on a typical Intel CPU, programs can run in four modes, from ring0 (kernel mode) to ring3 (user level). In this case, we also call user level debuggers "ring3 debuggers".

A natural question you might have is: Since ring0 debuggers are more powerful than ring3 debuggers, why not use ring0 debuggers directly? Well, there is no free lunch as always. Ring3 debuggers usually come with a nice GUI which can greatly improve the productivity of a reverse engineer. Only when necessary, we'll use the command-line ring0 debuggers (such as WinDbg). There is one exception though - recently, IDA Pro has introduced a GUI module which can drive WinDbg for kernel debugging. This is a nice feature and you'll have to pay for it.

In this tutorial, we assume that you would like to use open-source/free software tools. The following is a combination of debuggers we'll use throughout the tutorial: Immunity Debugger (IMM for short) and WinDbg.

2. Brief Tour of IMM

Now we will have a brief introduction of IMM. If you have not installed your Virtual Machine test bed, check out the first tutorial Malware Analysis Tutorial - A Reverse Engineering Approach (Lesson 1: VM Based Analysis Platform) for setting up the experimental platform.



Figure 1. Screenshot of IMM

As shown in Figure 1, from left-top anti-clockwise, we have the CPU pane (which shows the sequence of machine instructions and user comments), the Register pane (which you can watch and modify the values of registers), the Stack pane and the Memory dump. Before we try to reverse engineer the first section of Max++, it is beneficial to learn how to use the short-cut keys in the debugger efficiently.

In general, to use a debugger efficiently, you need to know  the following :
  1. How to control the execution flow? (F8 - step over, F7 - step in, F9 - continue, Shift+F9 - continue and intercept exceptions)
  2. How to examine data? (In Memroy pane: right click -> binary -> edit, in Register pane: right click -> edit)
  3. How to set breakpoints? (F2 for toggle soft-breakpoint, F4 - run to the cursor, right click on instruction -> Breakpoint -> for hardware and memory access point)
  4. Annotation (; for placing a comment)
Most of the above can be found in the Debug menu of the IMM debugger, however, it's always beneficial to remember the shortcut keys. We now briefly explain some of the functions that are very useful in the analysis.

2.1 Execution Flow Control

The difference between step over/step in is similar to all other debuggers. Step in (F7) gets into the function body of a Call instruction. Step over (F8) executes the whole function and then stops at the next immediate instruction. Notice that F8 may not always get you the result you desire -- many malware employ anti-debugging techniques and use return-oriented programming technique to redirect program control flow (and the execution will never hit the next instruction). We will later see an example in Max++.

F9 (continue) is often used to continue from a breakpoint. Notice that the debugger automatically handles a lot of exceptions for you. If you want to intercept all exceptions, you should use SHIFT+F9. Later, we will see an example that Max++ re-writes the SEH (structured exception handler) to detect the existence of debuggers. To circumvent that anti-debug trick, you will use SHIFT+F9 to manually inspect SEH code.

2.2 Data Manipulation

In general, you have three types of data to manage: (1) registers, (2) stack, and (3) all other segments (code, data, and heap).

To change the value of a register, you can right click on the register and select Edit to change its value. Notice that when a register contains a memory pointer (the address of a memory slot), it is very convenient to right click on it and select "Follow in Dump" or "Follow in Stack" to watch its value.

The design of IMM does not allow you to directly change the value of EIP register in the Register pane. However, it is possible to change EIP using the Python shell window. We leave it as a homework question for you to figure out how to change EIP.

In the Memory Dump pane, select and right click on any data, and then select Binary->Edit. You can enter data conveniently (either as a string or binary number).

You are able to reset the code (as data). In CPU pane, right click and select "Assemble", you can directly modify the code segment by typing assembly instructions! You can even modify a program using this nice feature. For example, after modifying the code segment, you can save the modified program using the following approach:

   (1) Right click in CPU pane
   (2) Copy to Executable
   (3) Copy All
   (4) Close the dialog window (list of instructions that are modified)
   (5) Then a dialog asking for "save the file" pops. Select "yes" and save it as a new executable file.
  
2.3 Breakpoints

Software breakpoints (F2) are the post popular breakpoints. It is similar to the breakpoints available in your high-level language debuggers. You can have an unlimited soft breakpoints (BPs) and you can set conditions on a soft BP (e.g., to specify that the BP should stop the program only when the value of a register is equal to a certain number).

Soft BPs are implemented using the INT 3 instruction. Basically, whenever you set a breakpoint at a location, the debugger replaces the FIRST byte of that instruction with INT 3 (a one-byte instruction), and saves the old byte. Whenever the program executes to that location, an interrupt is generated and the debugger is called to handle that exception. So the debugger can then perform the condition check on the breakpoint and stop the program.

Hareware breakpoints can be set by right click in the CPU pane and then select Breakpoints -> Hardware, on execution. Notice that there are two other types hard BPs availalbe (memory read, memory access). As its name suggests, hard BPs are accomplished by taking advantage of hardware. On a Intel CPU, there are four hardware BP registers which records the location of hard BPs. This means that at any time, you can have up to 4 hard BPs.

Hardware BPs are very useful if you need to find out which part of the code modifies a variable. Just set a memory access BP on it and you don't have to look over the entire source code to find it out.

2.4 User Annotation

Although seemingly trivial, user comments and annotation is a very important function during a reverse engineering effort. In the CPU pane, pressing ";" allows you to add a comment to a machine instruction and pressing ":" allows you to label a location. Later when the location is referred to as a variable or a function, its label will be displayed. This will greatly ease the process of analysis.

3. Challenges of the Day

It's time to roll-up your sleeves and put all we have learned into practice! The objective today is to analyze the code snippet from 0x413BC8 to 0x413BD8. Answer the following questions. We will post the solution to these questions in the comments area.

Q1. What is the value of EAX at 0x413BD5 (right before int 2d is executed)?
Q2. Is the instruction "RET" at 0x413BD7 ever executed?
Q3. If you change the value of EAX to 0 (at 0x413BD5), does it make any difference for Q2?
Q4. Can you change the value of EIP at 0x413BD5 so that the int 2d instruction is skipped?
Q5. Modify the int 2d instruction at (0x413BD7) to "NOP" (no operations) and save the file as "max2.exe". Execute max2.exe. Do you observe any difference of the malware behavior? (check tutorial 1 Malware Analysis Tutorial - A Reverse Engineering Approach (Lesson 1: VM Based Analysis Platform) for monitoring the network traffic)






14 comments:

  1. This comment has been removed by the author.

    ReplyDelete
  2. Q1. The value of the EAX register is 1.

    Q2. The ret is never executed, it is skipped.

    Q3. Yes, the ret will be executed.

    Q4. Yes, with a the python shell window you can change the EIP.

    Q5. Yes, the malware does not run. Without the int 2d instruction, the ret is executed and the program does not call the MAX++ function.

    ReplyDelete
  3. at Q5 you have put wrong address.

    ReplyDelete
  4. The address in Q5 is correct. Check Figure 1 again.

    ReplyDelete
  5. Outstanding stuff here Dr Fu. Great work and great contribution to the malware analysis community.
    Thank you,
    David aka IndiGenus

    ReplyDelete
  6. It would be great if you could post how to change the EIP here in comments.

    ReplyDelete
  7. From a Python Shell

    >>> imm = immlib.Debugger()
    >>> imm.setReg('EIP',int("hex value",16))

    ReplyDelete
  8. Thank you Dr.Fu !
    I am eager to learn moar!

    ReplyDelete
  9. I think I've downloaded another sample, I mean, the entry point is totally different from yours in Figure 1. Could someone please upload the sample again?...

    ReplyDelete
  10. Hello Dr. Fu, first thanks for this great tutorial.

    As you said "However, it is possible to change EIP using the Python shell window".

    However, I found another simple solution for this, similar like on Ollydbg, by right click the new EIP destination address on CPU window and select "New origin here". Is this a different?

    ReplyDelete
  11. Hi,
    I am having a bit of trouble with the Q3.
    If I change the value of EAx to zero, still the RETN is NOT executed.
    However, if I put a soft-breakpoint and also change the value of EAX then RETN is executed.

    Please let me know if this is the right thing to do or am I missing something? As stated in the question there is no breakpoint mentioned.

    Thank you in advance

    ReplyDelete
  12. http://cryptogranarchy.blogspot.com/
    http://cryptogranarchy.my1.ru/

    ReplyDelete
  13. Thank you Dr. Fu, I really enjoyed this tutorial. I've been learning Malware Analysis for the past 3 or so months, and I learned a few things about the debugger! :)

    ReplyDelete
  14. Q1. The value of the EAX register 1.

    Q2. The ret is never executed, it is skipped.

    Q3. No, stil is not executed.

    Q4. Yes, execution goes to retn and program is terminated.

    Q5. Yes, the malware does not run. Without the int 2d instruction, the retn is executed and program is terminated.

    ReplyDelete