Wednesday, August 15, 2012

Malware Analysis Tutorial 32: Exploration of Botnet Client


Learning Goals:
  1. Practice WinDbg for Inspecting Kernel Data Structure
  2. Use Packet Sniffer to Monitor Malware Network Activities
  3. Understand Frequently Used Network Activities by Malware
  4. Expose Hidden/Unreachable Control Flow of Malware
Applicable to:
  1. Operating Systems
  2. Assembly Language
  3. Operating System Security
1. Introduction

This tutorial explores the botnet client part of Max++. We assume that you have completed Tutorial 31, because the lab setting used here follows directly from its analysis. Before starting, you should have reached the state shown in Figure 1.

Figure 1. Process Network Message

As shown in Figure 1, there is a big loop that calls ZwReplyWaitReceivePortEx to wait for a message from the port. EAX holds the return code of the function, e.g., 0x0 means OK and 0x102 means time-out. On time-out, Max++ calls function 0x35671D03. Otherwise, it proceeds to process the message.

Max++ first reads the 2nd word in the message, which is supposed to be an integer between 1 and 3. For case 3, it calls function 0x3567162D. For case 1, it calls function 0x3567204F. Note that all four parameters of 0x3567204F come from the message (its 5th, 7th, 8th, and 10th words, respectively).
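The dispatch logic described above can be summarized in a short Python sketch. It is only an illustration under stated assumptions: a "word" is taken to be a 32-bit little-endian value, word positions are counted from 1, and the function name dispatch and the returned labels are hypothetical. Only the status codes and function addresses come from the analysis.

```python
import struct

STATUS_SUCCESS = 0x0      # return code in EAX: a message was received
STATUS_TIMEOUT = 0x102    # return code in EAX: the wait timed out

def dispatch(status, msg):
    """Return (label, handler_address, params) for one loop iteration.

    Hypothetical helper: it mirrors the branch structure seen in Figure 1,
    not the actual code of Max++.
    """
    if status == STATUS_TIMEOUT:
        return ("timeout", 0x35671D03, ())
    if status != STATUS_SUCCESS:
        # Other return codes are not covered by the analysis (assumption).
        return ("error", None, ())
    count = len(msg) // 4
    words = struct.unpack("<%dI" % count, msg[:count * 4])
    if count < 2:
        return ("unhandled", None, ())
    command = words[1]                      # the 2nd word selects the action
    if command == 3:
        return ("init", 0x3567162D, ())
    if command == 1 and count >= 10:
        # The 5th, 7th, 8th and 10th words become the four parameters.
        params = (words[4], words[6], words[7], words[9])
        return ("execute", 0x3567204F, params)
    return ("unhandled", None, ())

# A made-up 10-word message whose 2nd word is 1.
sample = struct.pack("<10I", 0, 1, 0, 0, 0xAA, 0, 0xBB, 0xCC, 0, 0xDD)
result = dispatch(STATUS_SUCCESS, sample)
```

Case 2 is left as "unhandled" here because the analysis does not describe it.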

2. Execute Remote Command (command #1)
Now let's delve into function 0x3567204F. Again, notice that this function gets executed ONLY WHEN the second word of the remote network message is 0x1. Figure 2 shows its function body.

Figure 2. Mysterious Function 0x3567204F

As shown in Figure 2, one of the inputs of the function is passed in the ESI register (pointing to 0x00181580). This is a mysterious data structure. We suspect that it is the COM interface of the fake JScript COM object (provided by the remote max++.x86.dll). The COM interface wraps a C++ object so that the program (Max++) can call the functions available in the DLL.

Next, the function clears the memory region at 0x009FFEE8 and copies the first four parameters passed on the stack into this region. Notice that these four parameters are the 5th, 7th, 8th, and 10th words of the remote message. Then it saves 0x009FFEE8 to EBP-10.

Max++ then proceeds to call a function of the remote DLL (see the second highlighted area in Figure 2). It appears to fetch the virtual function table of the object and invoke the function located at offset 0x18. This function takes at least 7 parameters: [ESI+18], [ESI+20], 0x356782A4, 0, 1, [EBP-20], and [EBP-10]. Here [ESI+18] and [ESI+20] must be data attributes of the COM interface object, and [EBP-20] is a pointer that receives the result. From the VariantClear call (see the last highlighted area of Figure 2), we can infer that [EBP-20] is a pointer to a VARIANTARG. Lastly, notice that [EBP-10] stores 0x009FFEE8, which contains the four words from the remote message.
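To relate the offset 0x18 to a position in the virtual function table, the small sketch below converts a byte offset into a slot index, assuming 4-byte pointers on 32-bit x86. The list of names is the standard IDispatch method order. Whether the remote object really exposes IDispatch is an assumption that cannot be verified without the DLL, so the result is a hint rather than a finding.

```python
POINTER_SIZE = 4  # 32-bit process: each vtable entry is 4 bytes

# Standard vtable order of IUnknown followed by IDispatch.
IDISPATCH_SLOTS = [
    "QueryInterface", "AddRef", "Release",            # IUnknown
    "GetTypeInfoCount", "GetTypeInfo",
    "GetIDsOfNames", "Invoke",                        # IDispatch
]

def vtable_slot(offset, pointer_size=POINTER_SIZE):
    """Convert a byte offset into a zero-based vtable slot index."""
    if offset % pointer_size != 0:
        raise ValueError("offset is not aligned to a vtable entry")
    return offset // pointer_size

slot = vtable_slot(0x18)
# Only meaningful IF the object is IDispatch-based (assumption).
candidate = IDISPATCH_SLOTS[slot]
```

Under that assumption, slot 6 would correspond to Invoke, which fits a call that takes an argument block and a VARIANT result, but this remains a guess.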

Although we have no access to the remote DLL, we can infer that this part of the malware takes a command and its parameters from the remote message and executes the corresponding function. It looks like a botnet client.

3. Initialization (command #3)
We now look at case #3 (where the 2nd word of the remote message is 3). In this case, function 0x3567162D is invoked. To force the control flow into this case, we simply change the register values accordingly. Figure 3 displays the body of function 0x3567162D.

Figure 3. Function Body of 0x3567162D

As shown in Figure 3, function 0x3567162D extracts a file from the hidden drive into a newly allocated heap area. The contents of the file seem to be some special JavaScript tags and HTTP request headers. The file name is read from the absolute location 0x3567806C. Unless that location is rewritten, the function always reads the same file.

Then Max++ jumps to function 0x356719E4, as shown in Figure 4. Again, it's invoking the COM interface of the remote object which is not available. As shown in Figure 4, it loads the virtual function table of the COM object multiple times and calls several functions of that COM interface.

Figure 4. Calling COM Object

4. Conclusion
Since the remote object is not available, we are not able to drill deeper into Max++. The bottom line of the analysis in this tutorial is that the malware is able to listen on a port and receive commands from a remote server. The remote command has a fairly simple structure: the second word of the remote message decides the action to take on the Max++ client side. Command #3 seems to initialize the service, and command #1 seems to execute some action, with four words from the remote message serving as its parameters.

In the next tutorial, we will submit Max++ to various malware analysis engines and match their discovery with our manual analysis.


Saturday, August 4, 2012

Malware Analysis Tutorial 31: Exposing Hidden Control Flow


Learning Goals:
  1. Practice WinDbg for Inspecting Kernel Data Structure
  2. Use Packet Sniffer to Monitor Malware Network Activities
  3. Understand Frequently Used Network Activities by Malware
  4. Expose Hidden/Unreachable Control Flow of Malware
Applicable to:
  1. Operating Systems
  2. Assembly Language
  3. Operating System Security
1. Introduction

This tutorial analyzes the network activity performed by max++.00.x86 when its efforts to load 147.47.xx.xx\max++.x86.dll fail. We show how to use a network sniffer to assist the analysis, and how to use a debugger to expose and analyze the hidden/unreachable control flow of malware.

2. Lab Configuration

We assume that you have finished Tutorial 30 and that max++.00.x86 is already resident on the system. Now set a breakpoint at 0x35671797 (this is where the malware tries to modify the kernel data structure that holds the library path of max++; later it will call Ole32.CoInitialize to load the remote DLL). On the Ubuntu server, start the Wireshark packet sniffer and listen on the local area network (use ifconfig to find out which adapter to listen on).

Now press F9 until you hit 0x35671797. At this moment, in the Wireshark window, no packets should be intercepted yet. Execute the program step by step until we reach 0x35671D8B. This is right before the call of ole32.CoInitialize.

Figure 1. The Code Which Tries to Load Remote DLL


3. Wireshark Assisted Analysis
Now the interesting part: with just one more step in the WinDbg instance, Ole32.CoInitialize is called. You can then notice a lot of communication between 169.253.236.201 (our WinDbg instance) and 74.117.114.86. From Figure 2, you can tell that it is using the special HTTP method PROPFIND to retrieve max.x86.dll (note that PROPFIND is a method provided by the WebDAV protocol, which is an extension of HTTP).
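When reading a capture like the one in Figure 2, it can help to flag WebDAV methods automatically. The following sketch classifies an HTTP request line. The method set comes from the WebDAV specification (RFC 4918); the helper name classify_request_line and the sample request line (including its path) are hypothetical and only illustrate the shape of such a request.

```python
# Methods that WebDAV (RFC 4918) adds on top of plain HTTP.
WEBDAV_METHODS = {"PROPFIND", "PROPPATCH", "MKCOL", "COPY", "MOVE", "LOCK", "UNLOCK"}

def classify_request_line(line):
    """Split an HTTP request line and report whether it uses a WebDAV method."""
    method, target, version = line.strip().split(" ", 2)
    return {
        "method": method,
        "target": target,
        "version": version,
        "is_webdav": method.upper() in WEBDAV_METHODS,
    }

# Hypothetical request line modeled on the trace described above.
info = classify_request_line("PROPFIND /max.x86.dll HTTP/1.1")
plain = classify_request_line("GET /index.html HTTP/1.1")
```

In Wireshark itself, filtering on the request method achieves the same effect interactively.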

Figure 2. Network Trace of Ole32.CoInitialize
Interestingly, however, if we press F9 directly from 0x35671797, we get the trace shown in Figure 3. Notice the difference!

Figure 3. A slightly different network trace
Clearly, the malware is trying to invoke /install/setup.ppc.php at 108.61.4.52! Now the question is: who is sending this request, and why didn't we capture it when we stepped through the execution?

Challenge 1. Find a way to trace back to the sender of the packet to 108.61.4.52.

4. Run Malware without Remote DLL
We are interested in the rest of the malware logic and would like a rough idea of how max++.00.x86 would behave if 74.117.114.86/max++.x86.dll were loaded. This requires tweaking the control flow a little to observe the behavior. We need to perform the following lab configuration:

(1) Set a breakpoint at 0x35671D8D and run to it. See Figure 4. This is right before the ole32.CoInitialize() call, which tries to load the remote 74.117.114.86/max++.x86.dll. However, the file is no longer available, so the call will fail and terminate the entire process. We need to skip this call so that we can examine the rest of the malware logic.
Figure 4. Breakpoint to Divert Control Flow When Remote DLL Loading Fails


(2) Click the 2nd button on the toolbar (the Python window) and then type
  imm.setReg("EIP", 0x35671D93)
 This skips the call to ole32.CoInitialize and jumps to the next instruction.

(3) Now in the register window, change the value of EAX to 0 (to indicate that the call is a success).
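The arithmetic behind step (2) is small but worth checking: moving EIP from 0x35671D8D to 0x35671D93 skips exactly one instruction. The sketch below computes the number of skipped bytes and compares it with the length of an indirect call through an import table entry (opcode bytes FF 15 followed by a 32-bit address). That encoding is an assumption; Figure 4 remains the authority for the actual instruction bytes.

```python
CALL_ADDRESS = 0x35671D8D     # breakpoint: the call to ole32.CoInitialize
RESUME_ADDRESS = 0x35671D93   # new EIP value set through imm.setReg

# Number of bytes jumped over by rewriting EIP.
skipped = RESUME_ADDRESS - CALL_ADDRESS

# Assumed encoding: "call dword ptr [address]" = FF 15 + 4-byte address.
INDIRECT_CALL_LENGTH = 2 + 4

# True when the skip covers exactly one instruction of that form.
matches_indirect_call = skipped == INDIRECT_CALL_LENGTH
```

If the skipped length did not match a whole instruction, execution would resume in the middle of an instruction and most likely crash.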

After the control flow diverting is successful, max++.00.x86 jumps to function 0x35674737, whose function body is shown in Figure 5.
Figure 5. Function 0x35674737 - Allocate Memory in Heap
Then the malware calls function 0x35671E37 [note at this moment 0x00182130 is the beginning address of allocated heap memory]. As shown in Figure 6, it is constructing some data structure at 0x00182130 (size: 0x24 bytes).

Challenge 2. Use data breakpoints to find out what is the type of the data structure constructed by 0x35671E37.

Figure 6. Function 0x35671E37 constructs some data structure
Control soon flows to 0x35671C4A. This function has several interesting calls, as shown in Figure 7. It seems to be creating a port and listening on it. To figure out the logic, we have to carefully handle the execution of function 0x35671E61 (because it invokes functions in the remote max++.x86.dll, which is not loaded due to the network failure).

Figure 7. Function body of 0x35671C4A
Let's delve into function 0x35671E61 first. Figure 8 shows its first part: a call to ole32.CLSIDFromProgID("JavaScript"). The function locates a registry entry based on the program ID. However, this call triggers the loading of the remote DLL, so we need to look at the details.
Figure 8. A Call That Triggers remote max++.x86.dll

By tracing into the ole32.CLSIDFromProgID("JavaScript") call, we notice that at the ole32.CoGetComCatalog call it gets stuck loading 74.117.114.86/max++.x86.dll, as shown in Figure 9. It seems that CoGetComCatalog visits the loaded module again, reads the manipulated information of the current module, and thus tries to load the remote module. This is similar to the CoInitialize call discussed in Tutorial 30.
Figure 9. CLSIDFromProgID Stuck on CoGetComCatalog
Since the remote module name causes the problem, we could try to reset it back. From Tutorial 30, we know that the module name/path information is located at 0x002529c0. We could change it back to the original name "\\.\C2CAD972#4079#4fd3#A68D#AD34CC121074\L\max++.00.x86". (Right click on address 002529c0 in memory dump, select Binary Edit, then enter the path string in the UNICODE box).
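If you prefer to patch the bytes by hand rather than use the UNICODE box, the path has to be written as 16-bit little-endian characters. The sketch below shows one way to produce those bytes in Python. It assumes that the UNICODE box stores UTF-16LE, which is how wide strings are laid out in Windows memory; the variable names are hypothetical.

```python
# Original module path from Tutorial 30 (raw string keeps the backslashes).
MODULE_PATH = r"\\.\C2CAD972#4079#4fd3#A68D#AD34CC121074\L\max++.00.x86"

# Windows wide strings use two bytes per character, low byte first.
encoded = MODULE_PATH.encode("utf-16-le")

# Space-separated hex, convenient for comparing against the memory dump.
hex_dump = " ".join("%02X" % b for b in encoded)

# A wide string in memory is normally followed by a two-byte terminator.
with_terminator = encoded + b"\x00\x00"
```

Each ASCII character of the path therefore appears in the dump as its byte value followed by 00.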
Figure 10. Modify the Module Name - Convert it Back

Now let's observe the second parameter of CLSIDFromProgID in Figure 8. Via a simple analysis we can identify that the second parameter is located at 0x009FFF48, as shown in Figure 11.

Figure 11. Successful Completion of CLSIDFromProgID

As shown in Figure 11, address 0x009FFF48 stores the class ID. Pay attention to the byte order: the first field is stored little-endian, so you should read its 4 bytes in reverse order. For example, the first 4 bytes (60 C2 14 F4) should be read as 0xf414c260. Searching for f414c260 in regedit, we find CLSID {f414c260-6ac0-11cf...}, as shown in Figure 11. You can verify that it matches the highlighted area in the IMM memory dump pane. Reading more details about CLSID {f414c260-6ac0-11cf...}, we find that the CLSID is mapped to jscript.dll in the system directory. This is as expected (i.e., CLSIDFromProgID works correctly, given that the broken remote library link did not crash the CoGetComCatalog call in Figure 10).
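The byte-order rule can be checked with a few lines of Python. In a GUID, the first field is a 32-bit value and the next two fields are 16-bit values, all stored little-endian. Only the first four bytes (60 C2 14 F4) are quoted from the dump; the following four bytes are inferred here from the CLSID prefix found in regedit, so treat them as an assumption to be compared against Figure 11.

```python
import struct

# First 8 bytes of the CLSID as they would appear in the memory dump.
# "60C214F4" is quoted in the text; "C06ACF11" is inferred from "6ac0-11cf".
dump = bytes.fromhex("60C214F4" "C06ACF11")

# "<IHH": little-endian 32-bit field followed by two 16-bit fields.
data1, data2, data3 = struct.unpack("<IHH", dump)

# Format the fields the way regedit displays the start of a CLSID.
prefix = "%08x-%04x-%04x" % (data1, data2, data3)
```

The remaining eight bytes of a GUID are displayed in the same order as they are stored, so no reversal is needed for them.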

However, notice that there is a possibility that the remote library, when loaded, rewrites the registry entry so that later uses of the JScript object actually refer to functions of the remote library. As we do not have the 74.117.114.86/max++.x86.dll binary, we have no way to tell.

4.1 Rest of Logic of Function 0x35671E61
We now continue from the call of CLSIDFromProgID.  Again, notice that the CLSID is stored at 0x009FFF48.

Figure 12 shows the rest of the logic of function 0x35671E61. The major part is a call to CoCreateInstance, which constructs an instance of the JScript COM object. Note that its second-to-last parameter, riid, is the ID of the interface used to communicate with JScript. However, because the CoInitialize call was skipped, CoCreateInstance() returns the error code 0x800401F0 (CO_E_NOTINITIALIZED, meaning that COM has not been initialized). In this case, we have to modify the EAX register at 0x35671E90 to force the logic through.
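An HRESULT such as the one returned here packs a failure bit, a facility, and a code into 32 bits, which is why overwriting EAX is enough to make the caller believe the call succeeded. The sketch below decodes those fields. The constant is the documented value of CO_E_NOTINITIALIZED; the helper names failed and decode are hypothetical, and using 0 (S_OK) as the forced value is an assumption about what the check at 0x35671E90 expects.

```python
S_OK = 0x00000000
CO_E_NOTINITIALIZED = 0x800401F0  # documented value: CoInitialize has not been called

def failed(hr):
    """Equivalent of the FAILED() macro: the top bit marks an error."""
    return bool(hr & 0x80000000)

def decode(hr):
    """Split an HRESULT into its severity, facility and code fields."""
    return {
        "severity": (hr >> 31) & 0x1,
        "facility": (hr >> 16) & 0x7FF,
        "code": hr & 0xFFFF,
    }

fields = decode(CO_E_NOTINITIALIZED)
# Writing S_OK into EAX makes a FAILED()-style check take the success path.
forced_ok = not failed(S_OK)
```

Forcing the return value only changes the branch taken; any interface pointer the call was supposed to produce is still missing, which is why the later COM calls cannot be followed.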

Figure 12 shows three further calls related to the JScript COM object. However, because COM initialization failed, we have no way to learn the details of these three functions. Lastly, function 0x35671E61 returns.

Figure 12. Interacting with COM Object

4.2 Function 0x3567162D
Using a similar technique, we can force the logic into function 0x3567162D. Figure 13 shows its function body. As shown in Figure 13, Max++ reads from \??\C2CAD...6cc2, allocates 0x15b bytes at 0x003E0000, and extracts the contents of the file into 0x003E0000.

Figure 13. Loading New Malicious Logic

The rest of function 0x3567162D is shown in Figure 14. It applies 2 layers of decryption to extract the contents at 0x003E0000. As shown in Figure 14, the data at 0x003E0000 looks like an XML spec. At this moment, we do not know the meaning of the "<jst>" tag. But if you look at the contents, they look like a URL to download from intensivedive.com, and the rest looks like an HTTP request header.



Figure 14. Extraction of Encrypted Contents
Challenge 2. Analyze the logic of function 0x35671ECF. Notice that you have to carefully rewire the logic when it tries to invoke functions in remote DLL.

4.3 Function 0x356713AC
At the end of function 0x3567162D, it calls function 0x356713AC, which is shown below. Its behavior is pretty similar to that of 0x3567162D: it reads from another hidden file, resolves the IP of intensivedive.com, and constructs the request payload.
Figure 15. Function 0x356713AC
At the end of function 0x356713AC, it calls 0x356712D8, whose function body is shown below. It prepares the necessary resources (UUIDs) for socket communication in 0x35674237, which in turn calls function 0x3567417C.
Figure 16. Function 0x356712D8 First Half

The function body of 0x3567417C is shown in Figure 17. Note that the first call of ws32_socket will fail. The most interesting part (see highlighted) is the call to BindIoCompletionCallback. It sets 0x356740D4 as the handler for any I/O completion on the handle used for network communication. Let's set a breakpoint there and see if it gets called. Under the current setting, this breakpoint will never be hit because the WSASocket call fails. However, analyzing its binary code is still possible. We leave it as homework for readers.

Figure 17. Function Body of 0x3567417C
Challenge 3. Analyze the function of 0x356740D4.

The rest of 0x356712D8 deals with sending out packets (mainly to intensivedive.com/install.ppc), and there are too many errors because the network initialization by WSASocketW fails. Let's go back to 0x35671C6F and see what the logic is there.

Figure 18. Port Service Open
As shown in Figure 18, Max++ is clearly opening a port (using ZwCreatePort). It then enters a big loop, and during each iteration it calls ZwReplyWaitReceivePortEx to listen on the port.

Challenge 4. Find out the port number that Max++ is using. Notice that since the TCP/IP stack service is hijacked by Max++, the netstat command won't give you any interesting information!

In the next tutorial, we will tweak the control flow of Max++ to get into each switch case that follows the ZwReplyWaitReceivePortEx call and check whether Max++ is serving as a bot client of a botnet.