Wednesday, December 14, 2011

Malware Analysis Tutorial 7: Exploring Kernel Data Structure

Learning Goals:

  1. Explore kernel data structures effectively, e.g., using WinDbg.
  2. Understand the important kernel structures that Windows uses to maintain live information about processes and threads.
  3. Know the difference between hard and soft breakpoints and use them effectively during debugging.
  4. Practice code reverse engineering to understand assembly code.
Applicable to:
  1. Computer Architecture
  2. Operating Systems Security
  3. Assembly Language
  4. Operating Systems
1. Introduction

     This tutorial shows you how to explore kernel data structures of Windows using WinDbg. This is very helpful for understanding the infection techniques employed by Max++. We will look at some interesting data structures such as the TIB (Thread Information Block), the PEB (Process Environment Block), and the loaded modules/DLLs of a process. We will examine what Max++ did to some important kernel DLL files.

1.1 Lab Setup
If you have not installed WinDbg on your host machine (note: not the VM instance), please follow Tutorial 1 first to install the VirtualBox platform (a small LAN consisting of one Linux gateway and one Windows instance infected with Max++). Then follow Tutorial 4 to install WinDbg on the host machine (again, not the VM instances) and configure the piped COM port for the VM instance to be debugged. The following are the steps for launching the VM instance and WinDbg:

  1. Launch the Windows guest OS in VirtualBox first. Boot it in the "Debugged" mode. (Follow Tutorial 4 for how to include the "Debugged" boot option).
  2. On your Host machine, start a command window and change directory to "c:\Program Files\Debugging Tools for Windows(x86)" and type the following.
    windbg -b -k com:pipe,port=\\.\pipe\com_11
  3. You should see the message "Breakpoint on INT 3" in the WinDbg window. It means that execution is currently stopped at a software breakpoint (INT 3). Type "g" (meaning "go") to let it continue. If necessary, type "g" a second time.
  4. Occasionally you might find that your Windows guest OS is frozen. Simply type "g" in the WinDbg window (on the host).
  5. Now start the Immunity Debugger in the Windows guest OS and load Max++ (see Tutorial 1 for where to get the Max++ binary).
  6. In the Code Pane of IMM, right click to go to "0x401018" and then set a HARD BREAKPOINT at it (right click and select "Breakpoint->Hardware, on Execution"). This is where we stopped in Tutorial 6. As you can see right now, the instruction at 0x401018 is "DEC DWORD [EAX+20]". Later, when we stop at this address, the instruction will have been overwritten, due to the self-extracting feature of Max++ (see details in Tutorial 6). Then press F9 (continue) to run to 0x00401018.

 1.1.1 Why Hardware Breakpoint?
  Notice that you have to use a hardware breakpoint in Step 6. Why not a software breakpoint? Think about how a software breakpoint is implemented. When you set a software breakpoint in a debugger, the debugger actually replaces the first byte of the instruction at that location with "INT 3" (opcode 0xCC). When execution reaches the "INT 3", the Windows kernel calls the debugger to handle the interrupt. The debugger then stops and highlights the location in its window, and when you resume execution or cancel the breakpoint, it writes the original opcode back.

 Recall that the malware does self-extraction (see Tutorial 6). When it rewrites its own code, it overwrites the "INT 3" as well, and you will never be able to stop at the desired location 0x401018! That's the reason we use a hardware breakpoint. When a hardware breakpoint is set, the address is recorded in one of the four hardware breakpoint registers provided by an Intel CPU. The CPU checks these registers every time an instruction is executed and stops when the instruction's address matches, no matter what bytes are stored there. The only drawback is that you can set at most 4 hardware breakpoints at any time.
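
 The difference can be illustrated with a small toy model in Python. This is only a sketch, not how a real debugger or CPU is implemented: the class ToyDebuggee and its method names are hypothetical, addresses are simply offsets into a byte buffer, and the two byte strings are the encodings of the "before" and "after" instructions at 0x401018.

```python
INT3 = 0xCC  # one-byte opcode of the INT 3 instruction


class ToyDebuggee:
    """Toy model (hypothetical): code bytes plus four address slots that
    stand in for the CPU's hardware breakpoint registers."""

    def __init__(self, code: bytes):
        self.memory = bytearray(code)
        self.hw_slots = []  # holds at most four addresses

    def set_soft_breakpoint(self, addr: int) -> int:
        """Patch INT 3 over the first byte and return the saved original byte."""
        original = self.memory[addr]
        self.memory[addr] = INT3
        return original

    def set_hard_breakpoint(self, addr: int) -> None:
        """Record the address in a free slot; the code bytes stay untouched."""
        if len(self.hw_slots) >= 4:
            raise RuntimeError("all four hardware breakpoint slots are in use")
        self.hw_slots.append(addr)

    def self_extract(self, addr: int, new_code: bytes) -> None:
        """Model the malware overwriting its own code with unpacked bytes."""
        self.memory[addr:addr + len(new_code)] = new_code

    def stops_at(self, addr: int) -> bool:
        """Would execution reaching addr hand control to the debugger?"""
        return self.memory[addr] == INT3 or addr in self.hw_slots


PACKED = bytes.fromhex("ff4820")          # DEC DWORD PTR [EAX+20]
UNPACKED = bytes.fromhex("64a118000000")  # MOV EAX, DWORD PTR FS:[18]

soft = ToyDebuggee(PACKED + bytes(3))
soft.set_soft_breakpoint(0)
soft.self_extract(0, UNPACKED)   # the INT 3 byte is overwritten
print("soft breakpoint still fires:", soft.stops_at(0))

hard = ToyDebuggee(PACKED + bytes(3))
hard.set_hard_breakpoint(0)
hard.self_extract(0, UNPACKED)   # the recorded address is unaffected
print("hard breakpoint still fires:", hard.stops_at(0))
```

 In this model the software breakpoint disappears as soon as the code is rewritten, while the hardware breakpoint keeps firing because it is tied to the address rather than to the bytes stored there.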

1.2 Analysis Objective
We will analyze around 20 instructions, from 0x00401018 to 0x0040105B. The assembly code is shown in Figure 1.


Figure 1. Code Segment to Analyze (0x401018 to 0x40105B)



2. FS Register, TIB, and PEB


As shown in Figure 1, the instruction at 0x00401018 (MOV EAX, DWORD FS:[18]) performs an important trick. It reads the memory word located at FS:[18] into EAX. Here FS, like SS and DS, is one of the segment registers provided in the Intel x86 register file. FS:[18] is an address specified using the displacement addressing mode. The address is calculated as [base address of the segment selected by FS] + 0x18.

Whenever you see some code accessing the FS register, you should pay special attention! FS points to the most important Windows kernel data structure related to the current process/thread. Check out reference [1] for details and you will see that FS:[18] stores the entry address of TIB (Thread Information Block) - also called TEB.

Then the instruction at 0x40101E (MOV EAX, [EAX+30]) takes the word located at EAX+0x30. What does this mean? Since EAX now holds the entry address of the TIB, the instruction reads the data field located 0x30 bytes from the beginning of the TIB record.

We need to figure out the internal data structure of the TIB. There are two ways: (1) the MSDN documentation, and (2) the WinDbg kernel debugger. For the most well-known data structures like the TIB, people have already done the address calculation for you. For example, by reading [1], you would know that offset 0x30 stores the entry address of the PEB (Process Environment Block). But in most cases, for a kernel data structure, you'll have to calculate the offset manually (i.e., figure out the sizes of all the preceding attributes in the structure and sum them up).
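
The manual offset calculation can be sketched with Python's ctypes module, which lays out the fields of a structure and reports each field's offset. The field list below is an assumption: it follows the commonly published 32-bit layout of the first TEB fields, with every pointer modeled as a 4-byte integer, and the class name TebPrefix32 is made up for this example. Always verify the layout against the dt output for your own target.

```python
import ctypes


class TebPrefix32(ctypes.Structure):
    """First fields of a 32-bit TEB (assumed layout, pointers as 4-byte values)."""
    _fields_ = [
        ("ExceptionList", ctypes.c_uint32),
        ("StackBase", ctypes.c_uint32),
        ("StackLimit", ctypes.c_uint32),
        ("SubSystemTib", ctypes.c_uint32),
        ("FiberData", ctypes.c_uint32),
        ("ArbitraryUserPointer", ctypes.c_uint32),
        ("Self", ctypes.c_uint32),                      # read by FS:[18]
        ("EnvironmentPointer", ctypes.c_uint32),
        ("ClientId_UniqueProcess", ctypes.c_uint32),
        ("ClientId_UniqueThread", ctypes.c_uint32),
        ("ActiveRpcHandle", ctypes.c_uint32),
        ("ThreadLocalStoragePointer", ctypes.c_uint32),
        ("ProcessEnvironmentBlock", ctypes.c_uint32),   # read by [EAX+30]
    ]


def field_offsets(struct_type):
    """Return {field name: byte offset}, i.e. the sum of all preceding field sizes."""
    return {name: getattr(struct_type, name).offset
            for name, _ in struct_type._fields_}


for name, offset in field_offsets(TebPrefix32).items():
    print(f"+0x{offset:03x} {name}")
```

The printed offsets for Self and ProcessEnvironmentBlock should match the 0x18 and 0x30 displacements used by the two instructions discussed above.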

The most convenient way is to use WinDbg. Come back to the WinDbg window on the host machine and press Ctrl+Break. This interrupts the guest Windows and returns control to WinDbg. Then type the following:

dt nt!_TEB

This is to say, display the data type of "_TEB" located in the nt module. If you need information of the "nt" module, you can type

lm

This displays the loaded modules, and you can see that "nt" is the module name for the Windows kernel image (ntoskrnl.exe).

WinDbg is actually very powerful. By appending "-r n" to the dt command, you can display data types recursively, i.e., when a data field is itself a complex data type, its contents are displayed as well. For example, dt nt!_TEB -r 2 displays the contents recursively to a depth of 2.

From the WinDbg dt dump, you can immediately see that offset 0x30 of the TEB holds the entry address of the PEB.

3. Loaded Module List
We now proceed to the next few instructions. Using the technique introduced in Section 2, we can infer that the instruction at 0x401021 (MOV ECX, [EAX+C]) loads into ECX the pointer to LDR (the _PEB_LDR_DATA structure that holds the loaded module lists). Information on the PEB structure can be found on MSDN [2]; however, you will find that WinDbg provides more detailed information, including many undocumented attributes.

Now we need to look at the structure of LDR and of the _LIST_ENTRY fields it contains. Executing dt _PEB_LDR_DATA and dt _LIST_ENTRY in WinDbg, we have the following dump:

kd> dt _PEB_LDR_DATA
nt!_PEB_LDR_DATA
   +0x000 Length           : Uint4B
   +0x004 Initialized      : UChar
   +0x008 SsHandle         : Ptr32 Void
   +0x00c InLoadOrderModuleList : _LIST_ENTRY
   +0x014 InMemoryOrderModuleList : _LIST_ENTRY
   +0x01c InInitializationOrderModuleList : _LIST_ENTRY
   +0x024 EntryInProgress  : Ptr32 Void
kd> dt _LIST_ENTRY
nt!_LIST_ENTRY
   +0x000 Flink            : Ptr32 _LIST_ENTRY
   +0x004 Blink            : Ptr32 _LIST_ENTRY


 The instruction at 0x401026 (ADD ECX, 1C) then advances ECX, so ECX now contains the address of offset 0x1C of the _PEB_LDR_DATA, i.e., the InInitializationOrderModuleList field. Starting at this address is a _LIST_ENTRY structure, which contains two computer words (each word is 4 bytes long). The first four bytes are the Flink, which points to the next _LIST_ENTRY, and the next four bytes are the Blink, which points to the previous _LIST_ENTRY. So this is exactly a doubly linked list structure! More details of the PEB_LDR_DATA structure can be found in the MSDN document [3]. However, again, notice that the documentation in [3] is not complete and is NOT accurate! The most authoritative information comes from WinDbg.


Now let us proceed to the instruction at 0x00401029 (MOV EAX, DWORD [ECX]). This moves the contents of the Flink into EAX. According to [3], EAX now has the entry address of the _LDR_DATA_TABLE_ENTRY for the next module. However, this is WRONG! A Flink points at the link field embedded in the next entry, not at the start of the entry. Since the list being walked is InInitializationOrderModuleList, EAX now contains the address of offset 0x10 of the _LDR_DATA_TABLE_ENTRY (i.e., the address of the data field "InInitializationOrderLinks").
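
The Flink/Blink traversal can be sketched in a few lines of Python. This is a toy model under stated assumptions: memory is a plain dictionary mapping an address to the 4-byte word stored there, the addresses are made up, and the helper names make_circular_list and walk are hypothetical. The list head plays the role of the _LIST_ENTRY inside _PEB_LDR_DATA, and the other entries play the role of the link fields embedded in each module's record.

```python
def make_circular_list(memory, entries):
    """Link the _LIST_ENTRY structures at the given addresses into a ring.
    Flink lives at offset +0 and Blink at offset +4 of each entry."""
    count = len(entries)
    for i, addr in enumerate(entries):
        memory[addr] = entries[(i + 1) % count]      # Flink -> next entry
        memory[addr + 4] = entries[(i - 1) % count]  # Blink -> previous entry


def walk(memory, head, link_offset=0):
    """Follow Flink (offset 0) or Blink (offset 4) from the list head until
    the walk returns to the head; return the addresses visited."""
    visited = []
    current = memory[head + link_offset]
    while current != head:
        visited.append(current)
        current = memory[current + link_offset]
    return visited


HEAD = 0x1000                       # hypothetical address of the list head
MODULES = [0x2000, 0x3000, 0x4000]  # hypothetical link-field addresses
memory = {}
make_circular_list(memory, [HEAD] + MODULES)

print([hex(a) for a in walk(memory, HEAD)])                 # forward via Flink
print([hex(a) for a in walk(memory, HEAD, link_offset=4)])  # backward via Blink
```

The first step of walk, reading the word stored at the head address, corresponds to MOV EAX, DWORD [ECX]. Note that every address visited is the address of a link field, not the start of the record that contains it.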

Now comes the interesting part. Look at the instruction at 0x0040102D (MOV EDX, DWORD [EAX+20]). What does this mean? Let's examine the data structure _LDR_DATA_TABLE_ENTRY first.

kd> dt _LDR_DATA_TABLE_ENTRY -r2
nt!_LDR_DATA_TABLE_ENTRY
   +0x000 InLoadOrderLinks : _LIST_ENTRY
      +0x000 Flink            : Ptr32 _LIST_ENTRY
         +0x000 Flink            : Ptr32 _LIST_ENTRY
         +0x004 Blink            : Ptr32 _LIST_ENTRY
      +0x004 Blink            : Ptr32 _LIST_ENTRY
         +0x000 Flink            : Ptr32 _LIST_ENTRY
         +0x004 Blink            : Ptr32 _LIST_ENTRY
   +0x008 InMemoryOrderLinks : _LIST_ENTRY
      +0x000 Flink            : Ptr32 _LIST_ENTRY
         +0x000 Flink            : Ptr32 _LIST_ENTRY
         +0x004 Blink            : Ptr32 _LIST_ENTRY
      +0x004 Blink            : Ptr32 _LIST_ENTRY
         +0x000 Flink            : Ptr32 _LIST_ENTRY
         +0x004 Blink            : Ptr32 _LIST_ENTRY
   +0x010 InInitializationOrderLinks : _LIST_ENTRY
      +0x000 Flink            : Ptr32 _LIST_ENTRY
         +0x000 Flink            : Ptr32 _LIST_ENTRY
         +0x004 Blink            : Ptr32 _LIST_ENTRY
      +0x004 Blink            : Ptr32 _LIST_ENTRY
         +0x000 Flink            : Ptr32 _LIST_ENTRY
         +0x004 Blink            : Ptr32 _LIST_ENTRY
   +0x018 DllBase          : Ptr32 Void
   +0x01c EntryPoint       : Ptr32 Void
   +0x020 SizeOfImage      : Uint4B
   +0x024 FullDllName      : _UNICODE_STRING
      +0x000 Length           : Uint2B
      +0x002 MaximumLength    : Uint2B
      +0x004 Buffer           : Ptr32 Uint2B
   +0x02c BaseDllName      : _UNICODE_STRING
      +0x000 Length           : Uint2B
      +0x002 MaximumLength    : Uint2B
      +0x004 Buffer           : Ptr32 Uint2B
   +0x034 Flags            : Uint4B
   +0x038 LoadCount        : Uint2B
   +0x03a TlsIndex         : Uint2B
   +0x03c HashLinks        : _LIST_ENTRY
      +0x000 Flink            : Ptr32 _LIST_ENTRY
         +0x000 Flink            : Ptr32 _LIST_ENTRY
         +0x004 Blink            : Ptr32 _LIST_ENTRY
      +0x004 Blink            : Ptr32 _LIST_ENTRY
         +0x000 Flink            : Ptr32 _LIST_ENTRY
         +0x004 Blink            : Ptr32 _LIST_ENTRY
   +0x03c SectionPointer   : Ptr32 Void
   +0x040 CheckSum         : Uint4B
   +0x044 TimeDateStamp    : Uint4B
   +0x044 LoadedImports    : Ptr32 Void
   +0x048 EntryPointActivationContext : Ptr32 Void
   +0x04c PatchInformation : Ptr32 Void

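We know that the instruction MOV EDX, DWORD [EAX+20] loads the contents of the word located at EAX+0x20. But where is EAX pointing? ECX was advanced by 0x1C at 0x401026 (ADD ECX, 1C), so the list being walked is InInitializationOrderModuleList, and EAX points at offset 0x10 (InInitializationOrderLinks) of the _LDR_DATA_TABLE_ENTRY. Thus EAX+0x20 points at offset 0x30 in the dump above, which is the "Buffer" field of BaseDllName. BaseDllName holds only the file name of the module, while FullDllName (whose Buffer is at offset 0x28) holds its full path.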
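
Because a Flink points at a link field inside the entry rather than at the start of the entry, the field reached by [EAX+20] depends on which of the three lists is being walked. The short Python sketch below does this arithmetic for all three cases. The offsets are copied from the dt _LDR_DATA_TABLE_ENTRY dump above; the names LDR_ENTRY_FIELDS, LINK_OFFSETS, and field_reached are made up for this example.

```python
# Field offsets taken from the dt _LDR_DATA_TABLE_ENTRY dump (32-bit).
LDR_ENTRY_FIELDS = {
    0x00: "InLoadOrderLinks.Flink",
    0x04: "InLoadOrderLinks.Blink",
    0x08: "InMemoryOrderLinks.Flink",
    0x0C: "InMemoryOrderLinks.Blink",
    0x10: "InInitializationOrderLinks.Flink",
    0x14: "InInitializationOrderLinks.Blink",
    0x18: "DllBase",
    0x1C: "EntryPoint",
    0x20: "SizeOfImage",
    0x24: "FullDllName.Length",
    0x26: "FullDllName.MaximumLength",
    0x28: "FullDllName.Buffer",
    0x2C: "BaseDllName.Length",
    0x2E: "BaseDllName.MaximumLength",
    0x30: "BaseDllName.Buffer",
    0x34: "Flags",
}

# Offset of each embedded link field from the start of the entry.
LINK_OFFSETS = {
    "InLoadOrderLinks": 0x00,
    "InMemoryOrderLinks": 0x08,
    "InInitializationOrderLinks": 0x10,
}


def field_reached(link_name, displacement):
    """Name the entry field at (address of the link field + displacement)."""
    return LDR_ENTRY_FIELDS[LINK_OFFSETS[link_name] + displacement]


for link_name in LINK_OFFSETS:
    print(f"{link_name} + 0x20 -> {field_reached(link_name, 0x20)}")
```

The same displacement of 0x20 lands on SizeOfImage, FullDllName.Buffer, or BaseDllName.Buffer depending on the list, which is why it matters to track exactly which link field EAX points at.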

In Windows, _UNICODE_STRING is Microsoft's effort to cope with the multi-cultural/language needs for localization of Windows in different parts of the world. As the dump shows, it consists of three fields: (1) Length, the length of the string in bytes, (2) MaximumLength, the size of the allocated buffer in bytes, and (3) Buffer, a pointer to the raw data of the string, stored as 16-bit characters. So the "Buffer" field points to the DLL name in Unicode!
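
Reading such a string can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the "memory" is a small byte buffer built by hand, the addresses 0x10 and 0x40 are arbitrary, the helper name read_unicode_string is hypothetical, and the layout follows the 32-bit dump above (two 2-byte lengths followed by a 4-byte pointer).

```python
import struct


def read_unicode_string(memory, addr):
    """Decode the 32-bit _UNICODE_STRING located at addr in a byte buffer."""
    # Length (Uint2B), MaximumLength (Uint2B), Buffer (Ptr32), little-endian.
    length, maximum_length, buffer = struct.unpack_from("<HHI", memory, addr)
    raw = bytes(memory[buffer:buffer + length])  # Length is counted in bytes
    return raw.decode("utf-16-le")


# Build a toy memory image holding one string.
memory = bytearray(0x100)
name = "ntdll.dll".encode("utf-16-le")  # 9 characters -> 18 bytes
STRING_ADDR, BUFFER_ADDR = 0x10, 0x40   # arbitrary toy addresses
memory[BUFFER_ADDR:BUFFER_ADDR + len(name)] = name
struct.pack_into("<HHI", memory, STRING_ADDR,
                 len(name), len(name) + 2, BUFFER_ADDR)

print(read_unicode_string(memory, STRING_ADDR))
```

The key detail is that Length counts bytes, not characters, so a nine-character name occupies 18 bytes in the buffer.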

What it essentially means is that the code at 0x0040102D is starting to process/read the DLL name! To verify our conjecture, look at the register EDX in the Immunity Debugger (Figure 3). You can see that the first module name we are looking at is "ntdll.dll".


Figure 3: EDX points to DLL Name

4. Challenges of the Day
Now let us try to get the whole picture of the code from 0x00401018 to 0x00401054. You might notice that we actually have a two-level nested loop here.

The outer loop runs from 0x40102E to 0x401054; it is essentially a do-while loop. The inner loop runs from 0x401036 to 0x401046. Our challenges today are:
(1) What does the inner loop from 0x401036 to 0x401046 do?
(2) What does the outer loop do?

A hint here: the code we discussed today searches for a module and does some bad things to that module (these malicious operations start at 0x40105C). Use Immunity Debugger to find out. We will show you these malicious operations in the next tutorial.

References
1. Wiki, "Windows Thread Information Block", Available at http://en.wikipedia.org/wiki/Win32_Thread_Information_Block
2. Microsoft, "PEB Structure", Available at http://msdn.microsoft.com/en-us/library/windows/desktop/aa813706(v=vs.85).aspx
3. Microsoft, "PEB_LDR_DATA structure", Available at http://msdn.microsoft.com/en-us/library/windows/desktop/aa813708(v=vs.85).aspx


32 comments:

  1. Hello, thanks for the nice tutorial. Still one thing I don't get:
    If after execution of
    401021 mov ecx, [eax+c]
    It contains the address of PebLoaderData, how can it later (see that there is add ecx, 1c) be pointing at InLoadOrder list? Shouldn't it be InInitializationOrder list?

  2. Check the "kd> dt _PEB_LDR_DATA" 2 paragraphs below - the dump generated by WinDbg, you'll see why.

  3. Hi Dr. Fu. Thank you for this great tutorial !
    On figure 1, line 00401026, comment should say : "now ECX has InInitializationOrderModuleList" and NOT InLoadOrderModuleList

  4. i am can't understand where _LDR_DATA_TABLE_ENTRY is coming from? i can see it using WinDbg but i can't go to it logically using the structures of _LIST_ENTRY -> Flink ?

  5. _LDR_DATA_TABLE_ENTRY is from internal MS documentation. Search it online and you can find its data structure definition.

  6. For kernel structures good to use this tool http://ntinfo.biz/index.php/xntsv

  7. PEB (process information block) = pEb = Process Environment Block.

  8. What you actually got was the BaseDllName since you got the address of the InInitializationOrderModuleList you should be at InInitializationOrderLinks [+0x10] then the basename should be in [+0x10 + 0x20]

  11. Hello,
    Firstly thanks you for this fantastic course.
    I saw that I wasn't the only one to see that you missed a little bit. I write it for the next people who will continue to use this course.

    At "0x401021 (MOV ECX, [EAX+C])" instruction, EAX get the address of the PEB_LDR structure (EAX+30 is the PEB address, and EAX+30+C leads to the address of the PEB_LDR). Then at address 0x00401026, there is the following instruction "ADD ECX,1C". Referring to "dt _PEB_LDR_DATA", ECX has now the address of InInitializationOrderLinks, then especially its Flink. So we can see that there is a gap of 0x10 between what I am currently explaining and this tutorial. Your example is really great : the instruction at address 0x401032E "MOV EDX, DWORD [EAX+20]". Referring to "dt _LDR_DATA_TABLE_ENTRY -r2", you found that was the buffer of the FullNameDll, as I said previously there is a gap of 0x10. Therefore, EDX will have the value of the buffer of the BaseNameDll. In this link (https://docs.microsoft.com/en-us/windows/win32/devnotes/ldrdllnotification), it is cleary written that FullNameDll gives also the path of the dll.

    I will continue to follow the enxt steps of this course. It is a pleasure, thanks.
