Introduction

I’ve assembled this guide from the things I’ve learned over the last couple of years. I hope it can be helpful.

Malware Handling

Transporting

Every malware sample should be contained inside a password-protected zip archive that uses the industry-standard password “infected”. This is not for security or privacy; it tells the analyst that the archive contains malware and to tread carefully. It also helps avoid accidental execution.

Defanging

All samples should stay defanged until you’re ready for detonation, to avoid accidental execution. This is done by either appending a new file extension or removing the existing one. I like to append “.mal” to the existing filename so that the original file extension is preserved.
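The “.mal” approach above can be sketched in a few lines of Python. This is only a minimal sketch: the function names defang and arm are my own, and it assumes the sample sits on a normal filesystem where a rename is enough.

```python
from pathlib import Path

DEFANG_SUFFIX = ".mal"  # the extra extension suggested above


def defang(path: Path) -> Path:
    """Append .mal so a double-click no longer runs the sample."""
    if path.name.endswith(DEFANG_SUFFIX):
        return path  # already defanged, nothing to do
    target = path.with_name(path.name + DEFANG_SUFFIX)
    path.rename(target)
    return target


def arm(path: Path) -> Path:
    """Strip the trailing .mal to restore the original name before detonation."""
    if not path.name.endswith(DEFANG_SUFFIX):
        return path  # not defanged, leave it alone
    target = path.with_name(path.name[: -len(DEFANG_SUFFIX)])
    path.rename(target)
    return target
```

Because the suffix is appended rather than swapped in, invoice.exe becomes invoice.exe.mal and the original extension survives the round trip.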

Classification/Naming Convention

There is no industry standard for naming conventions when it comes to malware classification. This means most analysts come up with their own version for both known and unknown samples. I use this: [TypeOfMalware].[MainRole].[VariantNameIfExists].exe Examples for each category in the naming convention are below:

Type of malware:

Trojan
Worm
Spyware
Adware
PUP

Main Role:

Backdoor
Botnet
Downloader
Launcher

Variant Name:

Emotet
Neshta
Dridex
Zloader

For example, if you have a sample and you know that it’s disguising itself as a regular program, then you know it’s a Trojan. Furthermore, if after analysis you figure out that its entire purpose is to download a file, then the final name would be Trojan.Downloader.exe. If you know the malware family or variant, you can include that too, like: Trojan.FileInfector.Neshta.exe. If you have no idea what it is, you can use something like Virus.Unknown.exe and fill in the blanks as you learn more. You might even come across a sample that hasn’t been seen in the wild yet and you’ll get to name it.
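If you want to keep names consistent across a sample collection, the convention is easy to script. A minimal sketch; the function name sample_name and its defaults are just my illustration of the convention above:

```python
def sample_name(malware_type: str = "Virus", main_role: str = "Unknown",
                variant: str = "", extension: str = "exe") -> str:
    """Build [TypeOfMalware].[MainRole].[VariantNameIfExists].exe."""
    parts = [malware_type, main_role]
    if variant:
        parts.append(variant)  # only included when the family/variant is known
    parts.append(extension)
    return ".".join(parts)
```

Calling sample_name("Trojan", "Downloader") gives Trojan.Downloader.exe, and calling it with no arguments gives the Virus.Unknown.exe placeholder.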

Lab Environment

PLEASE TAKE A BASE INSTALL SNAPSHOT!

Networking

Set up a new host-only or, better yet, internal-network-only adapter. Enable DHCP so that both VMs get assigned IP addresses. Set the IP addressing range to something entirely different from your host’s network so that it’s easily identifiable. For example, if your home/work machine uses a 192.168 address, make your lab network a 10.0.0 address or a 172.16 address. This newly created adapter will be used on both VMs.
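A quick sanity check for the addressing advice can be done with Python’s ipaddress module. This is a sketch under assumptions: the CIDR ranges are examples, and the function name lab_range_is_distinct is made up. It only confirms the two ranges don’t overlap, which is the bare minimum for telling them apart.

```python
import ipaddress


def lab_range_is_distinct(host_cidr: str, lab_cidr: str) -> bool:
    """Return True when the lab network shares no addresses with the host network."""
    host_net = ipaddress.ip_network(host_cidr, strict=False)
    lab_net = ipaddress.ip_network(lab_cidr, strict=False)
    return not host_net.overlaps(lab_net)
```

For example, lab_range_is_distinct("192.168.1.0/24", "10.0.0.0/24") is True, while a lab range carved out of the host’s own /24 would return False.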

Windows VM

A Windows 10 VM should be used at this point unless you’re analyzing a much older piece of malware. The Windows 10 VM should have VMware Tools or VirtualBox Guest Additions installed. We’ll ignore some valid warnings that malware can detect sandboxes/VMs and then choose not to run. With the guest tools installed, it’s much easier to transfer malicious files with drag-and-drop, you’ll have more screen real estate for running multiple tools, and the performance will be better. You’ll also be able to evade those anti-sandbox checks via debuggers or binary patching anyway.

After the base Windows 10 is installed, I recommend installing updates and fully disabling Windows Defender using dControl by Sordum. I also recommend changing the Appearance and Performance settings to “Best Performance” and debloating Windows with https://github.com/Sycnex/Windows10Debloater. Might I also recommend activating your Windows installation with WinActivate?

After this, Mandiant’s FLARE kit or Mentebinaria’s RE Toolkit should be installed. The install for FLARE will take a long time but it includes almost every tool you’ll ever need for malware analysis. It should be noted that the FLARE setup will need an internet connection. Other tools that you might want to install are ScyllaHide and xAnalyzer for x32dbg/x64dbg. Any additional tools or plugins you might need/think of that aren’t included should be installed now as well. If you like Ghidra, go searching on Github for some helpful Ghidra plugins/extensions. There are some very helpful ones out there.

After you’ve installed all your tools, change the VM adapter to the host-only or internal-network-only adapter that was set up earlier. Then navigate to the adapter’s IPv4 settings, change DNS to the IP address of your Remnux VM, and disable the loopback adapter (idk why but this adapter tends to cause issues with connecting to the Remnux VM when it’s enabled). After all of this is finished, definitely take a snapshot.

Remnux VM

Remnux is very easy to set up because it’s an .ova file. However, there are some changes to be made. Make sure the networking adapter is also set to the host-only or internal-network-only adapter that you set up earlier. Once you have your Remnux VM up and running, you’re gonna want to change some inetsim configuration settings.

-> sudo nano /etc/inetsim/inetsim.conf

-> Uncomment the DNS service

-> Change the service bind address (the service_bind_address line) from the 10. address to 0.0.0.0

-> Change the DNS Service IP address (the dns_default_ip line) from 127.0.0.1 to the IP address of the Remnux machine.

-> Save changes

The Remnux VM should now be working as a fake network for your Windows VM. You should test to make sure that the Windows and Remnux VMs can talk to each other but not to your host machine. You should also start inetsim and make sure that it’s working properly. You’ll see an Inetsim webpage for any website you visit on your Windows VM. If you append an “anyProgramName.exe” to the end of the URL, you’ll see that Inetsim will actually serve up a fake executable to download. If all of this works then you know your Remnux VM is ready to go.

Kali VM/Commando VM

Not really necessary for malware analysis but having a “hack box” can help at times. If a sample serves up a Meterpreter payload, you’ll be able to answer back. I also use a hack box to assist in writing malware to better understand certain techniques like EarlyBird APC Injection. I usually do this by creating calc.exe or MessageBox shellcode with MSFVenom and including that shellcode in a binary I write in Visual Studio that uses the particular process injection technique that I’m learning more about.

I’ve always been an advocate of “to defend you need to know how to attack” so I used the same method for malware analysis, where I’ve tried to actually write my own malware to better understand the techniques as well as the transition from source code to assembly code. Learning to write your own shellcode will also assist you in learning more about Assembly because the goal of shellcoding is to make it as efficient as possible. You want to avoid null bytes and keep the end result super small in order to fit in a small memory buffer. In order to do this correctly, you’ll have to pull out all the Assembly tricks and it’ll give you a much greater understanding of the Assembly language.

I’ve also begun collecting source code of various malware or cracked versions of builders to better create detections and aid in analysis. I use my Commando VM for this purpose.

Malware Triage (Basic Static Analysis)

Think of this stage like the recon stage from the hacker’s methodology. This is the stage where you try to find out more information that can help you through the rest of the stages. The goal here is not to find any crazy revelations. It’s simply to find things that warrant further investigation. For example, if you find a PowerShell command in strings, you would take a note of that command and then look for it during dynamic analysis for more information and context. This stage is also used to make an educated guess as to what the sample might do or the “goal” of the sample, which is then confirmed or denied at a later stage. It’s also where I try to spend a lot of time attempting to figure out what malware family the sample is a part of, as this can be a huge help with the rest of the analysis.

Hashing

Everyone here knows what hashes are and how they work so I won’t go into that detail. But here are some resources you can search the hashes in:

Don’t forget about fuzzy hashes, imphashes, and imp fuzzy hashing (you’ll need an Enterprise account to use these on VirusTotal). These, in my opinion, are much better for malware analysis purposes because they provide insight into code similarity and thus a potentially easy classification. Knowing the malware family/variant is not only super helpful for the rest of the analysis, because you can look up previous reversing reports as guides, but it is also one of the main goals for an unknown sample. Using these weirder hashes can sometimes help you answer the question of “what is this?” immediately.

If the hashes generated a positive result and you think you might know the malware family, take a look at the reference docs for that family on Malpedia. This can help you confirm/deny as well as give you write-ups that can assist you in the additional stages.
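For the plain cryptographic hashes, Python’s hashlib is enough to produce values you can paste into a search. A minimal sketch (file_hashes is my own name for it); it does not cover fuzzy hashes or imphashes, which need extra tooling.

```python
import hashlib
from pathlib import Path


def file_hashes(path: Path, chunk_size: int = 1 << 20) -> dict[str, str]:
    """Return the MD5, SHA-1 and SHA-256 of a file, reading it in chunks."""
    digests = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    with path.open("rb") as handle:
        # Chunked reads keep memory use flat even for large samples.
        while chunk := handle.read(chunk_size):
            for digest in digests.values():
                digest.update(chunk)
    return {name: digest.hexdigest() for name, digest in digests.items()}
```

Hashing is read-only, so this is safe to run against a defanged sample without arming it first.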

Packed vs Not Packed

Packing is obnoxious but that’s also why malware authors do it. It’s a way to compress and obfuscate the executable so an analyst can’t easily take a look into the underlying code. One of the first steps for samples should be unpacking. Sometimes it requires a command-line utility and other times it requires firing up the debugger and setting breakpoints. It’s entirely dependent on the type of packer used.

I do want to note the terminology differences between Crypters and Packers. Packers compress the data in an executable while crypters encrypt the data in an executable. Whilst the common term used is “packing” for both types, these are the real definitions and differences of the two.

With that being said, the first step is figuring out if something is packed. Start with automated tools like Detect It Easy (DiE) or PEiD. These will scan the file and try to find out if a PE is packed and, if so, what it is packed with. These tools have a number of signatures that will help. Once you know the packer, you can do some online research on how to unpack it. If these tools don’t work, you can also scan the PE with YARA packer rules in case they find something the other tools didn’t.

If none of these things worked and you still don’t know, there are a couple of other things to look for. Look at the section names. You can search Google for standard PE file section names to verify, but most commonly you should only see .text, .data, .rdata, .reloc, or .rsrc section names. If you see UPX0 or XXetgfSW or something else weird as a section name, then you can assume it’s probably packed. Be careful with .00cfg, which looks weird but is a legitimate Control Flow Guard section that newer compilers emit. One more thing to note here: executables compiled in Golang or Nim will have very strange section names but it doesn’t mean they’re packed. In fact, if you see the weird Golang or Nim sections then the sample probably isn’t packed but will instead use string obfuscation and dynamic API loading techniques. If you see the .symtab or .gopclntab sections then it’s Golang. If you see a few sections with a slash and numbers like /24 then it’s probably Nim.

You can also look at the entropy. Packing is obfuscation/compression/encryption and messes with the underlying bytes. It’ll try its best to make the bytes random so the file looks nothing like it did originally. This is by design in order to protect software from reverse engineers. However, the more random the packer makes the bytes, the higher the entropy goes (the maximum is 8). If you open the file with DiE and the entropy is listed at 7.5 then it’s pretty clear it’s packed. I’d say with anything higher than 7 you can be pretty certain. Entropy lower than 6.5 most likely means not packed. Something in the 6.5-7 range might be packed but might not, and could just have some encrypted data stored in the binary which is making that entropy number go up. I’ve seen a Dropper sample with an entropy of 7.1 before but it wasn’t packed. Why? Because it had an encrypted payload in the resources that took up the majority of the file space for the entire executable (the encrypted payload was a large executable but the dropper itself that I was looking at contained very little code). This made the entropy for the entire dropper high, so it looked packed when in reality the high entropy was just caused by the encrypted payload in the resources. Long story short, entropy is a good indicator but don’t rely on it alone.
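If you’re curious what that entropy number actually is, it’s Shannon entropy over the byte values, measured in bits per byte. Here is a minimal sketch; the thresholds in entropy_verdict are just the rough rules of thumb from the paragraph above, not anything official, and the tool’s own calculation may differ in details such as per-section scoring.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 (all identical) up to 8.0 (uniform)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def entropy_verdict(entropy: float) -> str:
    """Apply the rough thresholds from the text above."""
    if entropy > 7.0:
        return "likely packed"
    if entropy < 6.5:
        return "likely not packed"
    return "inconclusive"
```

Running shannon_entropy over each section separately, rather than the whole file, is one way to spot cases like the dropper above where a single encrypted resource drags the overall number up.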

Other things to look at are section sizes, particularly the relationship between raw section size and virtual section size. This can be viewed in PEView or PE-Bear (my personal favorite). The sizes won’t always be the same but should be similar. For example, if the .text section of a binary has a virtual size of 5600 bytes and a raw size of 5300 bytes, it wouldn’t throw up a red flag for me. However, if the virtual section size is 12,352 bytes and the raw size is 0 bytes, then there’s an issue and it’s probably packed. The difference between the virtual and raw section sizes is as follows. The virtual size is how much space that section will take up in memory during execution. The raw size is how much space that section takes up on disk when not executing. If you think about how packers work, this makes sense: after compression, a section can be left empty on disk because its bytes are now stored in another section (e.g. think of how the sections are changed with the UPX packer). This makes the raw size equal zero. However, once the executable is running in memory and goes through its unpacking routine, that section is filled with data and takes up space in memory.
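The raw and virtual sizes live in the PE section table, so you can pull them out yourself with nothing but the struct module. This is a minimal sketch that assumes a well-formed PE and does no bounds checking; read_sections and suspicious_sections are names I made up. Keep in mind that a legitimate uninitialized-data section can also have a raw size of 0, so treat a hit as one data point.

```python
import struct
from typing import NamedTuple


class Section(NamedTuple):
    name: str
    virtual_size: int
    raw_size: int


def read_sections(pe: bytes) -> list[Section]:
    """Parse the section table of a PE file held in memory."""
    if pe[:2] != b"MZ":
        raise ValueError("missing MZ header")
    (pe_offset,) = struct.unpack_from("<I", pe, 0x3C)  # e_lfanew
    if pe[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    coff = pe_offset + 4
    (num_sections,) = struct.unpack_from("<H", pe, coff + 2)
    (optional_size,) = struct.unpack_from("<H", pe, coff + 16)
    table = coff + 20 + optional_size  # section headers follow the optional header
    sections = []
    for index in range(num_sections):
        # Each header is 40 bytes: Name, VirtualSize, VirtualAddress, SizeOfRawData, ...
        name, virtual_size, _address, raw_size = struct.unpack_from(
            "<8sIII", pe, table + index * 40)
        sections.append(Section(name.rstrip(b"\x00").decode("ascii", "replace"),
                                virtual_size, raw_size))
    return sections


def suspicious_sections(sections: list[Section]) -> list[Section]:
    """Sections that are empty on disk but take up space in memory."""
    return [s for s in sections if s.raw_size == 0 and s.virtual_size > 0]
```

Feed it the bytes of a defanged sample (for example Path("sample.exe.mal").read_bytes(), where the filename is only a placeholder) and compare the output with what PE-Bear shows.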

Next up is imports/exports. When talking about Windows executables, there should almost always be some sort of imports because the executable will want to do things and actually run. This is achieved through Windows API calls. The API calls that it’ll be using will be present in the Import Address Table (IAT), which can be viewed through pestudio under the Imports section. We’ll talk more about imports, exports, and Windows API calls later. However, in regards to packing, there should be more than a few imports for an executable. Even a Hello World program will have a decent number of API calls just from the Visual Studio compiler. If you see an executable with something like 3 or fewer imports, then something is fishy. Either it’s packed or it uses dynamic API loading (discussed later).

Finally, the strings can give you some insight. Normally you’ll see something readable when running a strings utility (I recommend FLOSS). If it’s just gobbledygook, the sample could be packed or could be using some other sort of string obfuscation.

The key point to remember is that using all of these different techniques is important. You can’t just look at strings and go “oh it’s packed.” You want to get as many data points as you can and then determine if it’s packed based on more than one data point.
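To make the “more than one data point” idea concrete, here’s a rough sketch that collects the indicators discussed in this section and only calls a sample probably packed when at least two of them fire. The function names, the threshold of two, and the short list of standard section names are my own simplifications. Golang and Nim binaries will trip the section-name check without being packed, as noted above.

```python
STANDARD_SECTIONS = {".text", ".data", ".rdata", ".reloc", ".rsrc"}


def packing_indicators(entropy: float, section_names: list[str],
                       import_count: int, has_empty_raw_section: bool) -> list[str]:
    """Return the names of the packing indicators that fired."""
    hits = []
    if entropy > 7.0:
        hits.append("high entropy")
    if any(name not in STANDARD_SECTIONS for name in section_names):
        hits.append("unusual section names")
    if import_count <= 3:
        hits.append("very few imports")
    if has_empty_raw_section:
        hits.append("section with raw size 0")
    return hits


def probably_packed(hits: list[str], minimum: int = 2) -> bool:
    """Require several independent indicators before calling it packed."""
    return len(hits) >= minimum
```

With this, the dropper from the entropy example (entropy 7.1, normal sections, plenty of imports) only fires one indicator and is not flagged.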

Strings

As mentioned earlier, strings can be useful for determining if a file is packed. They can also provide some insight into what the program might do. If you see “cmd.exe /c” in the strings, there might be a point in time where it spawns a new cmd.exe process to execute a command. This gives you a hint to maybe set breakpoints on ShellExecute, CreateProcess, or CreateProcessInternalW even if you don’t necessarily see those APIs in the Import Table.

Sometimes you’ll see a URL that it either downloads a file from or exfils data to. It should be noted that any strings that might give you hints as to what the malware does, could be there to trick you instead. You have no definitive answers yet. That’s what the later stages of analysis are for. However, it’s also important to note these potential indicators for now as they could help at later stages or you can report that it’s a false flag which could assist other analysts in the future.
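If you’ve never looked at how a strings utility works, the basic version is just a search for runs of printable characters. Below is a minimal sketch that finds both ASCII and UTF-16LE (“wide”) strings; it’s nowhere near what FLOSS does with obfuscated strings, and extract_strings is my own name for it.

```python
import re


def extract_strings(data: bytes, min_length: int = 4) -> list[str]:
    """Return printable ASCII strings first, then UTF-16LE strings."""
    ascii_pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    # Wide strings store each printable character followed by a null byte.
    wide_pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_length)
    found = [m.group().decode("ascii") for m in ascii_pattern.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in wide_pattern.finditer(data)]
    return found
```

From there it’s a one-liner to pick out anything worth a note, for example [s for s in extract_strings(data) if "cmd.exe" in s or "http" in s].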

Imports/Exports

Imports and exports can give you a sense of what the executable might do. For example, if you see CreateToolhelp32Snapshot, Process32First, and Process32Next in the imports, you know that the sample will be iterating through all of the processes. You don’t yet know why. It could be for anti-analysis or process injection or a number of other reasons but you at least know one of the functionalities. If you see VirtualAllocEx, WriteProcessMemory, and CreateRemoteThread, you can guess that it’ll probably perform process injection. Things like LoadResource, SizeofResource, LockResource, should give you the idea to take a look into the resources of the sample because clearly it wants to do some resource manipulation and most commonly will be launching a payload from the resources. If you see GetProcAddress, GetModuleHandle, LoadLibrary, and that’s it, then you know you’re gonna be dealing with dynamic API loading. Long story short, looking at the imports can be very helpful during the recon stage.
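The import groupings above can be turned into a small lookup so you get the hints automatically. This is only a sketch built from the examples in this section; IMPORT_HINTS and import_hints are made-up names and the table is far from complete. It strips the A/W suffix so that GetModuleHandleA and LoadLibraryW still match. The dynamic API loading hint matters most when those three are close to the only imports present.

```python
IMPORT_HINTS = {
    "process enumeration": {"CreateToolhelp32Snapshot", "Process32First", "Process32Next"},
    "process injection": {"VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread"},
    "resource payload": {"LoadResource", "SizeofResource", "LockResource"},
    "dynamic API loading": {"GetProcAddress", "GetModuleHandle", "LoadLibrary"},
}
KNOWN_APIS = set().union(*IMPORT_HINTS.values())


def base_name(api: str) -> str:
    """Drop a trailing A/W (ANSI/Unicode variant) when the stem is a known API."""
    stem = api[:-1]
    if api.endswith(("A", "W")) and stem in KNOWN_APIS:
        return stem
    return api


def import_hints(imports: list[str]) -> list[str]:
    """Return a hint for every group whose APIs are all imported."""
    names = {base_name(name) for name in imports}
    return [hint for hint, apis in IMPORT_HINTS.items() if apis <= names]
```

The output is a guess to confirm later, same as everything else in the recon stage.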

Exports are a bit different. They aren’t as helpful as the imports, but if you have a DLL sample, they can give you some insight on how to run it. For example, if the DLL has one export and it’s called DllRegisterServer then you know you can run the DLL by using regsvr32.exe sample.dll, since regsvr32 calls DllRegisterServer by itself (rundll32.exe sample.dll,DllRegisterServer also works). If it has an export called “thicc” then you can try using rundll32.exe sample.dll,thicc. Exports won’t often give you an idea of what the sample will do unless it has multiple exports or is trying to impersonate a real DLL. If it has multiple exports, they might have identifying names like GetDesktop or something. I see it more often if there’s a pair of samples where the main sample is an executable but will also download and use a DLL for more functionality. However, this isn’t always the case. I tend to focus way more on the imports than the exports but I always take a look at the exports just in case something catches my eye or to see how the DLL would like to be executed.

A helpful resource for understanding malware techniques based on imports is the website: malapi.io. You can also use the preinstalled CAPA tool from the Windows FLARE VM. There’s also a lovely tool called CAPA Explorer for Ghidra that you can grab on Github that will assist in code analysis from the results that the CAPA tool acquired.

Other Stuff

I wanted to talk about a couple of other things that can potentially be helpful at least for intelligence purposes. The original filename can give you an idea of what it was compiled to be. Often times the original filename will be what the sample is trying to impersonate.

Another thing to look at is the PDB file information located in the executable (can be seen in pestudio). The bad thing is that this can be totally faked but usually the attackers won’t care and it’ll have some of the folder structure from the computer it was compiled on. Could be helpful for identifying campaigns or help to tie a subject to the sample.

YARA

YARA is great. You should have a bank of well-made YARA rules grabbed from reputable sources to avoid as many false positives as possible. The YARA-Rules repository is a great place to start since some big names in the business like Florian Roth include their rules in this repository and add to it. YARA works based off of signatures so well-crafted YARA rules can tell you not only the malware family it thinks it is but it can also tell you what encryption algorithms it thinks it’s found, what packer it’s using, etc. You can get a great deal of information from this. But keep in mind, you might also get false positives too.

SignSrch

SignSrch is another signature detection utility except this time it focuses on obfuscation and encryption. Use it to identify any encryption algorithms and record their offsets for later analysis.

PEiD

Can identify packers and encryption algorithms with its KANAL plugin.

Automating Triage

You can do all of this manually or you can automate these steps. It’s up to you. I recommend creating a checklist for this step so you can use automated tools until there’s a question on the checklist that they were unable to answer. Then use manual tools to try and answer it. I’ve included my entire malware analysis workflow/checklist in this notebook. You can modify it to your liking.

If you don’t want to create your own automation tool I have some for you:

Basic Dynamic Analysis

Tools

You’ll mostly be using Procmon, ProcessHacker, tcpview, Wireshark, regshot, and inetsim. But I wanted to add to this list a bit.

Sysmon - This is a great tool if you have the right configuration. There’s plenty of configurations for sysmon from Github that you can download and install. Sysmon will basically act as procmon but will record any juicy activities into its own Windows Event Logs. This is great because you won’t have to deal with the eye-hurting procmon logs or filtering rules. However, it won’t catch everything. Only what the configuration tells it to catch so it can miss things.

Chainsaw - This is a wonderful command-line utility for parsing Windows Event Logs. One of the main features is that it helps to carve out sysmon logs. It then uses Sigma Rules to give you more of an explanation on what the sample is doing as far as TTPs are concerned.

Hollows Hunter - Sometimes I get the feeling that maybe an executable is injecting into other processes or messing with other process memory but for whatever reason, I can’t see it. That’s when Hollows Hunter comes in. This tool scans all running processes and does a good job of detecting malicious implants. I don’t use it very often but it has come in handy.

SpyStudio - This is basically like a fancier looking version of procmon. It’s the same idea. It hooks into the sample and records what it does. However, sometimes this tool has found more interesting data for me than procmon since it is actually hooking into the sample. Again, this is not something I use often but sometimes it helps. I will warn you though that when running, it can make the performance of your VM very poo-poo.

API Monitor - I’ve started to like this better than procmon (only if I know exactly what APIs are being called) because it shows more information on arguments passed into specific API calls and malware authors don’t usually look for it to be running. The con to API monitor is that it doesn’t just record all API calls for a process. You have to specify what API calls you would like to monitor. So I suggest running procmon first to get a general idea of what’s going on, then use this tool for more information on particular API calls.

Un{i}packer - If a sample is packed and you’re unable to use a free command-line utility like UPX and you’re not comfortable with manually unpacking in a debugger, try this. It’ll use the Unicorn engine for emulation and try to automatically unpack the sample for you based on common unpacking techniques.

Basic Workflow

The basic workflow is quite simple. Before anything, make sure you have a pre-detonation snapshot. You’ll most likely have to run the sample many times, so having a snapshot that lets you quickly get back to a clean state and run it again is great.

1. Ensure your VMs can talk to each other but not to your host.
2. Ensure your Windows VM’s DNS IP is set to your Remnux VM’s IP.
3. Arm the sample by giving back the proper file extension.
4. Launch procmon, process hacker, and regshot on the Windows VM.
5. Pause and delete capture data on Procmon to start with a clean slate.
6. Set a filter on procmon for your sample’s name.
7. Launch inetsim and Wireshark on your Remnux VM.
8. Take first shot with regshot.
9. Start capture on procmon.
10. Launch executable and see what happens.
11. After some time (I usually give it about 2 minutes) you can stop the capture on Procmon and Wireshark.
12. Close any of the remaining malware processes.
13. Take 2nd shot with regshot and compare.
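The first shot/second shot idea isn’t limited to the registry. As an illustration only, here is the same before-and-after comparison applied to a folder of files. It’s not a replacement for regshot, and the function names are made up. You’d call snapshot once before detonation and once after, then compare the two.

```python
import hashlib
from pathlib import Path


def snapshot(root: Path) -> dict[str, str]:
    """Map every file under root (by relative path) to its SHA-256."""
    result = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            result[str(path.relative_to(root))] = digest
    return result


def compare(before: dict[str, str], after: dict[str, str]) -> dict[str, list[str]]:
    """Report what was created, deleted, or modified between two snapshots."""
    shared = before.keys() & after.keys()
    return {
        "created": sorted(after.keys() - before.keys()),
        "deleted": sorted(before.keys() - after.keys()),
        "modified": sorted(key for key in shared if before[key] != after[key]),
    }
```

Anything under “created” is a candidate host-based IOC to chase down in the procmon logs.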

Now comes the actual analysis part. Look at all the logs for all of the tools you just used. Find as many IOCs as you can, both host-based and network-based. This process will take some forensics knowledge as you’ll have to learn what’s normal and what isn’t.

Here’s some tips:

Use the procmon filtering
1. There will be a ton of data in the Procmon windows. This is why we did the initial recon to help us out.
2. Filter for anything that you noted earlier.
  1. If you saw cmd.exe /c in the strings then add a Procmon filter of Details->Contains->cmd.exe /c and see if anything pops up
Use the procmon historical process tree
1. It’s very helpful to see what processes started what other processes and it can navigate you to the specific procmon logs of the creation of the processes too
Use tcpview to see any open ports/connections for the malicious process
After that data is acquired you can filter wireshark using information you grabbed from tcpview
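If you’d rather do the filtering outside of the Procmon window, you can save the capture as CSV and filter it with a script. A minimal sketch under assumptions: I’m assuming the export has “Process Name” and “Detail” column headers, so check the first line of your own export, and filter_procmon is my own name. When reading the export from disk, opening it with encoding="utf-8-sig" avoids a stray byte-order mark in the first column name.

```python
import csv
import io


def filter_procmon(csv_text: str, process_name: str,
                   detail_contains: str = "") -> list[dict[str, str]]:
    """Keep rows for one process whose Detail column contains a substring."""
    reader = csv.DictReader(io.StringIO(csv_text))
    wanted = process_name.lower()
    needle = detail_contains.lower()
    return [
        row for row in reader
        if (row.get("Process Name") or "").lower() == wanted
        and needle in (row.get("Detail") or "").lower()
    ]
```

This mirrors the Details->Contains filter from the tips above, just applied after the fact to something you noted during recon, such as cmd.exe /c.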

Make sure to not just do this process once. Run the sample multiple times restoring from snapshot each time. Try new things for each execution. Run as admin, run as user, run it with networking to Remnux, run it without any networking, try adding command-line switches, etc. Try new things and write down any differences you find along the way. Also mark anything you’d like to understand more such as “I see it’s creating and deleting this file but what are the contents of that file?” After this stage you should have a good idea of what the malware is doing/its main purpose. You should only have a few questions left that should be able to be answered through advanced static and dynamic analysis.

Basic Dynamic Analysis Tips

In order to perform this task successfully, there’s some things you should do to prepare. First, make sure that your dynamic VM is set up like a real machine: 6-8GB or more of RAM, 2 cores, and 250GB of disk space. It’s a lot of resources but it makes your VM look like at least a weak modern laptop by today’s standards.

Also if you’re using VMware or VirtualBox, use the Pafish (Paranoid Fish) tool from Github to see which anti-VM checks detect your VM so you can try to bypass them. You won’t be able to fix all of them without breaking the VM, but take note of the checks that still detect the VM so that you can see if the malware is trying those same checks during runtime. You can also try using QEMU instead of the usual hypervisors. QEMU allows you to completely customize your VM, which makes it more difficult to use but makes it easier to avoid more anti-VM checks.

Change the names of your tools! Malware authors know the tools that analysts use and they will actually read malware analysis reports. So change your tools to be weird names when they’re running as a process. Call procmon “GodMode.exe” or something for example.

Please don’t give your VM a hostname like Analysis-VM. Keep it at the standard DESKTOP- or something. Also don’t set the username in your VM to Analyst or Victim. Make it look like a legitimate machine.

Don’t rename the samples to sample.exe either. Just stick to either your naming convention or better yet, rename it to its original compiled name. You can find it using PEStudio.

Try using the sysinternals tool called “DebugView” which will show you the output of any OutputDebugString API or DbgPrint Kernel-mode API if the malware uses them for any reason. When it works, it can actually give you some decent hints as to what the malware is doing if you’re lacking on information as some malware will have debug prints for the author that weren’t removed when compiled. It can be something as simple as “Success!” or it can be something helpful like “Injection into explorer.exe succeeded!”

If normal executions aren’t working and you suspect that the sample might be performing some anti-analysis checks, try using SysAnalyzer, API Monitor, or SpyStudio. They have built-in protections since they hook into the process. They will attempt to hide themselves and other analysis indicators during runtime. In SysAnalyzer, if CreateToolhelp32Snapshot gets called, you’ll actually see “hiding sysanalyzer. Hiding procmon. Hiding wireshark.” in the API log. This can also be useful for identifying some of the anti-analysis techniques that the sample is using. Keep in mind though that this won’t protect against lower-level checks commonly used for anti-analysis, like the CPUID instruction, which isn’t an API call and so can’t be hooked this way.

Sometimes the anti-analysis checks are just too good for basic dynamic analysis so you’ll have to use advanced static/dynamic analysis to patch the binary to ignore these checks. If you’re able to find them and patch them out, come back to this stage.

Some samples target only hosts that are connected to a domain. If you have the computing resources, you can set up another VM to be a basic domain controller and create a domain. Most malware won’t have this check. However, some of the bigger APT groups’ samples might have a check to see if there’s a domain because they only want to target enterprise environments, or maybe they’re looking for a specific domain if they have a specific target. Unfortunately, you’ll most likely need to dive deeper into advanced analysis to find out the specifics of this check. But once you find it out, it’s very easy to set up a small internal domain.

Advanced Analysis

Now we’re getting into the reversing side of malware analysis. Advanced static is taking a look at the underlying assembly code. It’s code analysis but without the source code so it’s less fun. That being said, if you get good at advanced static analysis, you can find way more interesting aspects of the malware such as killswitches, vulnerabilities, or decryption keys. It can also help you to write configuration extractors so you can easily extract the C2 servers every time you come across the same malware family.

Then there’s advanced dynamic analysis aka debugging. This is by far my favorite stage. You have pretty much full control over the sample during runtime by modifying buffers and assembly instructions. It’s easily the most enjoyable stage of the process. The thing you have to worry about in this stage is dodging and weaving through a variety of anti-analysis/anti-debugging techniques.

Assembly

There’s some stuff that you’ll need to know for both advanced static and advanced dynamic analysis. First is the difference between the x86 (32-bit) and x64 (64-bit) architectures. The good news is that x86 is the most common architecture that you’ll see malware in. This is because 32-bit x86 code can run on both 32-bit and 64-bit Windows systems, so the malware can run on more machines. So let’s focus on x86 assembly:

Registers:

EAX - Used to store information or return values
ECX - Counter register in loop instructions
EDX - Holds the remainder after a division (and the upper half of a multiplication result)
ESP - Points to the top of the stack
EBP - Points to the base of the current function’s stack frame
ESI - Source address when copying a group of bytes in memory
EDI - Destination address when copying a group of bytes in memory
EIP - Points to the next instruction to be executed

IIRC technically you can use whatever registers you want as long as you always perform register cleanups if you’re coding purely in assembly. However, the above are the main uses of the registers and compilers will try their best to follow these guidelines.

Flags:

Carry Flag (CF) - Set when an unsigned arithmetic operation goes out of bounds (like when you “carry” the 1)
Sign Flag (SF) - Indicates that the result of the operation is negative when read as a signed number. Negative numbers are stored in two’s complement, so -1 sits in a register as 0xFFFFFFFF, and the sign flag is a copy of the top bit of the result
Overflow Flag (OF) - Indicates that a signed overflow occurred in an operation, leading to a wrong change of the sign
Zero Flag (ZF) - Set when the arithmetic or logical operation’s result is zero
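The flags are easier to remember once you’ve seen them computed. Here’s a small Python model of a 32-bit ADD that returns the result along with the four flags above. It’s a teaching sketch, not an emulator, and add32 is just a name I picked.

```python
MASK32 = 0xFFFFFFFF
SIGN_BIT = 0x80000000


def add32(a: int, b: int) -> tuple[int, dict[str, int]]:
    """Model ADD on 32-bit registers and report CF, ZF, SF and OF."""
    a &= MASK32
    b &= MASK32
    full = a + b
    result = full & MASK32  # what actually fits in the register
    flags = {
        "CF": int(full > MASK32),            # unsigned result did not fit
        "ZF": int(result == 0),              # result is zero
        "SF": int(bool(result & SIGN_BIT)),  # top bit of the result
        # Signed overflow: operands share a sign but the result's sign differs.
        "OF": int(bool(~(a ^ b) & (a ^ result) & SIGN_BIT)),
    }
    return result, flags
```

Two cases worth trying: add32(0xFFFFFFFF, 1) wraps to 0 and sets CF and ZF, while add32(0x7FFFFFFF, 1) flips the sign and sets SF and OF without touching CF.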

Stack:

First In Last Out
1. Like stacking plates in your cabinet. The first plate you put in will end up being the last plate you use.
2. The first value pushed onto the stack will be the last one popped off
Most often used for arguments in x86 architecture
1. x64 architecture tends to perform more of a mixture of both stack and register based argument storage for functions

Example Instructions:

MOV A, B - Copies data from B into A.
CALL sub100 - Calls the function at sub100.
XOR A, B - Performs an XOR function on A using the value of B. The result of the XOR is stored in A.
XOR A, A - Performs an XOR on itself. Anything XOR’d by itself always equals 0. It’s a more efficient way of emptying a register than by going MOV A, 0.
JMP 0x8000 - Jump to the address 0x8000
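LEA A, [0x25000] - Load Effective Address. Stores the address 0x25000 itself in A without reading the memory at that address. Compilers also use it for quick math like LEA A, [B+B*2]
MUL B - Unsigned multiply. On x86 it only takes one operand: it multiplies EAX by B and stores the result across EDX:EAX (IMUL A, B is the signed version that stores the result in A)
DIV B - Unsigned divide. Also only takes one operand: it divides EDX:EAX by B, then stores the quotient in EAX and the remainder in EDX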
NOP - No operation. Does literally nothing.
- Used often in exploits to achieve a smooth transition into shellcode. This is called a NOP-slide. If you see a large number of consecutive NOPs leading into what looks like shellcode, you might actually be analyzing an exploit.
PUSHAD - Saves all general-purpose register values to the stack
POPAD - Restores all general-purpose register values from the stack
PUSH - Stores a value on the top of the stack
POP - Pulls the value off the top of the stack
RET - Pops the return address off the top of the stack and jumps to it. If a value is included (like RET 8), it also removes that many extra bytes from the stack after popping the return address.
- THIS IS AN IMPORTANT ONE TO UNDERSTAND
- Malware will sometimes be tricky especially with shellcode or manually written assembly
- An author can push a value to the stack right before the return to jump to a different function without using an actual JMP or CALL instruction
  - Normally the CALL instruction pushes the return address onto the stack and then the function starts with a couple of its own pushes like PUSH EBP
  - That return address is how the function knows where to go next once it’s complete
  - However, malware authors sometimes will have this normal functionality but right at the end of the function, instead of just returning, they’ll put a PUSH 0x40000 to create a new return address
  - This works because RET just trusts whatever value is on top of the stack
- This makes analysis more difficult for analysts
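CMP A, 0x23 - Compares the value stored in A with 0x23 by subtracting 0x23 from A. The result is thrown away and A is left untouched. Only the flags (like ZF and CF) are updated so that a conditional jump can check them.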
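
To make the PUSH/RET trick concrete, here’s a rough Python model of a stack with CALL, PUSH, and RET. TinyCpu and the addresses used are made up for illustration. It’s a toy that only tracks ESP, EIP, and 32-bit stack slots, not a real emulator.

```python
class TinyCpu:
    """Toy model of the x86 stack: just ESP, EIP, and 4-byte stack slots."""

    def __init__(self, esp: int = 0x1000):
        self.esp = esp
        self.eip = 0
        self.mem = {}  # address -> 32-bit value

    def push(self, value: int) -> None:
        self.esp -= 4                       # the stack grows downward
        self.mem[self.esp] = value & 0xFFFFFFFF

    def pop(self) -> int:
        value = self.mem[self.esp]
        self.esp += 4
        return value

    def call(self, target: int, return_address: int) -> None:
        self.push(return_address)           # CALL is what stores the return address
        self.eip = target

    def ret(self, extra_bytes: int = 0) -> None:
        self.eip = self.pop()               # RET trusts whatever is on top
        self.esp += extra_bytes             # RET n also drops n bytes of arguments


cpu = TinyCpu()
cpu.call(0x401000, return_address=0x400105)  # normal call into a function
cpu.push(0x40000)                            # the sneaky PUSH right before RET
cpu.ret()
print(hex(cpu.eip))                          # 0x40000, not the real return address
```

The real return address is still sitting on the stack underneath, which is why a disassembler following normal CALL/RET pairs gets the control flow wrong here.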

Conditional Instructions:

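JNZ - Jump if not zero (ZF is not set)
JNE - Jump if not equal. Same instruction as JNZ, just a different name
JGE - Jump if greater than or equal to (signed comparison)
JLE - Jump if less than or equal to (signed comparison)
LOOP 0x8000 - Decrements ECX and jumps back to address 0x8000 if ECX hasn’t reached 0 yet. Once ECX reaches 0, execution falls through to the next instruction.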

Finding Malware Configurations

Malware will often have some sort of config “file” for lack of a better term. The sample wants to talk to a C2 and therefore needs to know where to talk to. It’s pretty silly to just store that in plaintext within the binary for us to find, so usually they obfuscate/encrypt it somewhere. My suggestion is to look in the .data and .rdata sections. So why .data? Well, any initialized global variable will be stored in the .data section once the binary is compiled (constants and string literals usually end up in .rdata).

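#include <Windows.h>
#include <stdio.h>

// Global variable: the pointer itself lives in .data
LPCSTR C2Server = "162.153.22.19";

int main(void){
    // Use a fixed format string instead of passing the data as the format
    printf("%s\n", C2Server);
    return 0;
}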

Seen in the example above, C2Server is a global variable. This means once it’s compiled, the variable lives in the .data section of the binary, and the string it points to will typically sit in .data or .rdata depending on the compiler. If the same program was written like this:

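#include <Windows.h>
#include <stdio.h>

int main(void){
    // Local variable: the pointer only exists on the stack while main() runs
    LPCSTR C2Server = "162.153.22.19";
    printf("%s\n", C2Server);
    return 0;
}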

Here C2Server is a local variable, so it only exists on the stack while main() is running and is set up by code in the .text section. The string literal itself still has to live somewhere in the binary, usually .rdata. It makes the most sense that the configuration would be stored as a global variable because then every function and subfunction would be able to access it for information. It just makes things easier. So knowing this, I always look at the .data and .rdata sections first when I open up a disassembler like Ghidra or Binary Ninja.

You’ll probably end up seeing a lot of random strings in there but you might see a big blob of data that looks like gibberish but isn’t empty. When you see this, I highly suggest looking at the cross-references to this blob of data to see if you can spot some kind of encryption algorithm. Sometimes you’ll get lucky and see from the assembly that they’re just taking this data and performing a simple XOR or base64 decode. Those are the best because you can easily deobfuscate/decrypt it and you’ll have the config file. RC4 is also pretty simple once you spot the algorithm because there’s usually a key pretty close to the blob in the .data section, and if there isn’t, you can usually see it getting passed into the RC4 function. There’s also no IV so it’s easier than something like AES. If you can’t figure it out right away then don’t worry because you can come back to it when debugging. But I still suggest spending some time to look for it because it might be an easy win.

A cheat way of doing this relies on you having hopefully figured out what malware family your sample is from at this point in the analysis. With that info you can do some open-source research to see how others have extracted the config if it has one. If you’re in a rush, just do this to grab the IOCs quickly. However, I would recommend trying to do this yourself for training purposes. If you want to try it yourself, Zloader and IcedID are both malware families that contain their own configurations. Most malware families will contain configurations but these were the two families that introduced me to the world of configuration decryption and creating custom configuration extraction scripts in Python.
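
As a starting point for that kind of extraction script, here’s a minimal sketch of the two easy cases mentioned above: a repeating-key XOR and RC4. The function names are made up, and the key and blob in the usage lines are placeholders rather than values from any real family. In a real sample you’d carve both out of the .data section at the offsets you found in the disassembler.

```python
def xor_decode(key: bytes, data: bytes) -> bytes:
    """Repeating-key XOR. Running it twice with the same key gives the input back."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def rc4(key: bytes, data: bytes) -> bytes:
    """Plain RC4. Encryption and decryption are the same operation."""
    # Key scheduling: fill a 256-entry table, then shuffle it using the key.
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]

    # Keystream generation: keep shuffling and XOR each byte of the data.
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)


# Placeholder key and blob, purely for illustration.
blob = rc4(b"example-key", b"162.153.22.19")
print(rc4(b"example-key", blob))
```

The two loops over 256 entries in the key scheduling part are also what tends to give RC4 away in the disassembly.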

Manual Unpacking

Sometimes malware will be packed with UPX, which is awesome because you can either use the UPX utility itself to unpack it or use the built-in UPX utility from CFF Explorer. However, there are a ton of other packers that exist and some are way more complicated than others. VMProtect I have no idea how to unpack, so I can’t help you. That’s probably the most hardcore packer that exists right now (besides Themida) because it actually creates a mini virtual machine for the binary to run in. OALabs I think has some resources on how they work with VMProtect but I don’t think they are able to fully unpack the binary. The good news is that I haven’t come across many binaries with VMProtect yet as it’s a commercial and expensive packing tool. Other packers like ASPack, UPX, customized UPX, MPRESS, Obsidium, FSG, MEW, etc. are used way more. The other good news is that most of the time, you can use the free/trial utilities to unpack these or use the following process to manually unpack them with a debugger.

Set breakpoints on VirtualAlloc, CreateProcessInternalW, IsDebuggerPresent, WriteProcessMemory, and VirtualProtect
Press the run button in the debugger
If the IsDebuggerPresent is hit, Execute Until Return and then change the value in EAX from 1 to 0 and run again
1. IsDebuggerPresent looks at the value located in the BeingDebugged flag portion of the Process Environment Block (PEB) and if you don’t want to keep using this trick, I have another tip for you:
  1. When the breakpoint on IsDebuggerPresent is hit step over and you’ll be brought to a couple of MOV instructions
  2. Keep stepping over until you reach the last MOV, which will actually be a MOVZX instruction, and right-click->follow in dump on the memory address being read in that instruction
  3. You should see an “01” highlighted in the memory dump
  4. Modify this value to be “00”
  5. You have successfully manually edited the PEB so all further actions taken to read that flag either through the malware calling IsDebuggerPresent or manually parsing the PEB will see an answer of “no this is not being debugged” even though it is ;)
  6. Or if you want to be boring just type “dbh” in the command-line console in x32dbg
When VirtualAlloc is hit, Execute Until Return and then follow the EAX value in dump and run again
You might have multiple VirtualAlloc’s to deal with. Follow the same step as above but just use different memory dump tabs so you can keep your eye on all of them after every run. X32dbg has 5 memory dump tabs for you to use. Use them.
You should see an MZ header pop up in one of the memory dumps by the time it gets to a VirtualProtect. If you end up hitting a CreateProcessInternalW or it just exits then you have some more recon to do because you’ve gone too far in the binary. If this happens, ensure that the sample process is terminated before restarting.
1. Buuuut if you hit CreateProcessInternalW then it might be using a newly spawned process to unpack
2. If this is the case, pop open Hollows Hunter and it very well might find the unpacked executable for you :)
When you see an MZ header in the memory dump, verify that it looks different and it isn’t just the original binary. Have the original binary hex open and compare it to what you see in the memory dump. If it’s the same then you might have to move forward a little bit more. Perhaps one or two more VirtualAlloc’s. If it’s different then you can dump it by either highlighting all of the bytes in memory dump or dumping from the memory map tab.
Once it’s dumped, you might have to fix the section headers, and with them the IAT, if the unpacking routine had the executable mapped into memory. If so, open up the dumped exe in PE-Bear and head over to the section headers tab. You’ll want to change the raw address values to the corresponding virtual address values. Then for the raw size values, calculate the difference between each section’s address and the next one. For the last section you’ll basically just have to guess the size in order to bring the colored bar to the border line, but it’ll be similar to the virtual size of the last section so you can use that as a guesstimate. Then you can verify if you can see the imports/exports now. If you can, then you’ve done it correctly. The last step is to change the “Image Base” value to the address where you pulled it from memory. So if the MZ header you found was located at 0x400000 then change the Image Base value to that if it isn’t already set to that. Now you’re officially done and you can save the modified binary.
Profit
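
Two parts of the steps above are easy to script once you’ve saved a memory region to disk. This is a minimal sketch with made-up function names: find_pe_headers looks for an MZ header whose e_lfanew field (at offset 0x3C) points at a real “PE” signature, and unmap_sections does the same raw address/raw size arithmetic described for PE-Bear. It only computes the values. It doesn’t write them back into the file.

```python
import struct


def find_pe_headers(dump: bytes) -> list:
    """Return offsets where an MZ header points at a valid PE signature."""
    hits = []
    pos = dump.find(b"MZ")
    while pos != -1:
        if pos + 0x40 <= len(dump):
            # e_lfanew: 4-byte little-endian offset of the PE header
            (e_lfanew,) = struct.unpack_from("<I", dump, pos + 0x3C)
            sig_at = pos + e_lfanew
            if dump[sig_at:sig_at + 4] == b"PE\x00\x00":
                hits.append(pos)
        pos = dump.find(b"MZ", pos + 1)
    return hits


def unmap_sections(virtual_addresses: list, last_size: int) -> list:
    """For a dump in memory layout, return (raw_address, raw_size) per section.

    Raw address becomes the virtual address, and raw size becomes the gap to
    the next section. The last section has no next one, so its size is a guess.
    """
    fixed = []
    for index, va in enumerate(virtual_addresses):
        if index + 1 < len(virtual_addresses):
            raw_size = virtual_addresses[index + 1] - va
        else:
            raw_size = last_size
        fixed.append((va, raw_size))
    return fixed


# Example section layout (illustrative values only).
for raw_address, raw_size in unmap_sections([0x1000, 0x5000, 0x8000], 0x2000):
    print(hex(raw_address), hex(raw_size))
```

A plain search for “MZ” hits a lot of random data, which is why the sketch also checks the PE signature before calling it a header.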

ASPack is one that is a little bit different but it’s not so bad. It’s just weird.

When you open the binary in the debugger and reach the entry point, you’ll see a PUSHAD as the first instruction or one of the first instructions. This will push all register values onto the stack and is essentially storing them for later. The easiest way to deal with this? Find the POPAD.
Scroll down until you see the first POPAD. It should have a JMP instruction within the next like 3 instructions after the POPAD. Set a breakpoint on the JMP.
Press the Run button so that you reach the JMP and then do a Step-Into or Step-Over to follow the JMP.
You should now see instructions that look like a regular EntryPoint like PUSH EBP being the first instruction. This should now be the EntryPoint for the unpacked version of the executable.
Open Scylla -> Find IAT -> Dump -> Save to file -> Fix Dump of file you just made.
Open it up in PE-Bear to verify that things are all good. You can look at the dumped binary in PEStudio to verify it’s been unpacked. You might still have a leftover .aspack section in the binary after this process, but if you can see more imports and strings than before then it worked and that’s just a leftover artifact from the process.
Profit

Handles

Being able to understand API calls and arguments on the stack is great but sometimes APIs use handles instead of full filenames so if you’re looking at the stack, you won’t actually see what it’s messing with. This is where learning handles helped me a lot. When I’m debugging a sample, I always have Process Hacker open. That way I can keep an eye out to make sure it doesn’t randomly spawn a new process under the nose of the debugger and so I can take a look at strings in memory and the handles for the process I’m debugging. Whenever you see a handle that’s passed as an argument for an API, open up the process in Process Hacker by double-clicking on the process and navigate to the handles tab. Then just match the handles and it’ll give you the answer. If the handle argument is 0x12C then just find the handle value of 0x12C in Process Hacker and it’ll have details of the file or other artifact that it’s currently trying to do something with. GG EZ.

Advanced Debugging Tips

I want to first emphasize that you shouldn’t need to go line by line in the debugger at this point. Based on all the other stages, this stage should really only be used to answer very specific questions that have yet to be answered. It shouldn’t be used to figure out absolutely everything. Think of it more like a last resort to answer a question that all other techniques have failed to answer. Technically you can analyze an entire binary using debugging but you’re very quickly going to lose your mind by staring at assembly code all day and pressing “Step-Over” 50 million times. Identify the specific functions that you’re curious about from the other methods of malware analysis and then use the debugger to first figure out how to get there without breaking something and then use it to figure out exactly what those specific functions are doing.

For example, let’s say you have a function that you believe is reading the config file in memory where all the C2s are stored. You want to make sure that you extracted all the C2s from basic dynamic analysis because it might only use the others if it can’t connect to the first server. Or maybe it rotates through the list of C2s round-robin style, or picks one at random. However, in all other stages of analysis, you have failed to find the decrypted config file in memory and can’t quite figure out the decryption algorithm to extract it from the code. What do you do? You use debugging and let the malware do all the heavy lifting. Set a breakpoint on the decryption function and run until it hits that breakpoint. From there you can read more line-by-line to figure out where it’s reading values from and where the resulting decrypted values are stored. There’s a good chance that you’ll find the structure in memory that holds the now decrypted config and can extract it for IOCs. It might also help in identifying the encryption algorithm used.

But seriously, don’t go line-by-line through the entire executable. PE files have so much junk code in them just to do basic operations that it’s extremely inefficient and is more likely to confuse you or get you lost under like three layers of CALL instructions.
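
Once the malware has decrypted the config for you and you’ve dumped that memory region from the debugger, pulling the IOCs out can be scripted. Here’s a minimal sketch that assumes the C2s are stored as plain ASCII IPv4 strings, which won’t be true for every family. The function name and the sample buffer are made up for illustration.

```python
import re

# Four 0-255 octets separated by dots, not glued to other digits or dots.
IPV4 = re.compile(
    rb"(?<![\d.])"
    rb"(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}"
    rb"(?:25[0-5]|2[0-4]\d|1?\d?\d)"
    rb"(?![\d.])"
)


def extract_ipv4(buffer: bytes) -> list:
    """Return the unique IPv4-looking strings in a memory dump, in order found."""
    found = []
    for match in IPV4.finditer(buffer):
        ip = match.group().decode("ascii")
        if ip not in found:
            found.append(ip)
    return found


# Stand-in for a dumped, already-decrypted config structure.
dumped = b"\x00\x01cfg\x00162.153.22.19\x00\x0010.0.0.5:443\x00162.153.22.19\x00"
print(extract_ipv4(dumped))
```

Compare the list against what you saw during basic dynamic analysis. Anything extra is probably a fallback server that the sample never got around to contacting.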

If you are approaching a sample where all of the other stages didn’t help and you need the debugger to just figure out anything, then I have some additional tips so that you’re still not just going instruction by instruction. Please don’t be scared to set breakpoints and press play. For some reason I’ve seen people scared to just press the “Go” button. This is why you’re doing it in a VM and creating a variety of snapshots. If you have a list of imports but have no idea what’s happening, set breakpoints on all of the imports and just press the play button. The Windows API is very particular about the data it accepts, which means that the arguments provided to the API calls are going to be decrypted data every time. The Windows APIs weren’t programmed to perform their own decryption from within the calls, so the arguments MUST be decrypted before they’re used for an API. Let’s say you saw a CreateFileA import in the binary but all other tools showed you nothing and there’s string encryption so you can’t see any plaintext filenames. Set a breakpoint on CreateFileA and press play to see the values on the stack.

TLS Callbacks are annoying pieces of garbo. That’s why they’re used to do malicious things. TLS Callbacks are executed before the EntryPoint. This means that before you can even start stepping through with the debugger, it has already performed some actions because most debuggers will pause execution specifically starting at the EntryPoint. There are a couple of ways to get around this if you suspect that this is happening. The most reliable way is first use something like PEView to grab information regarding the TLS Table Address and TLS Table Size. Then set a breakpoint on every callback function registered inside the table. The easy way to do it is just select “break on TLS Callbacks” in x32dbg preferences but sometimes this doesn’t work and is less reliable.

There are also Windows Event Callbacks that are called for specific events like a mouse click. So if you’re single-stepping and clicking the mouse to do so, the callbacks would be executing and you wouldn’t even notice. Also, more annoyingly, if you set breakpoints, these callbacks will still bypass your breakpoints. Unfortunately, the only solution I found for this is to set breakpoints on all APIs that have the ability to register callbacks and callback functions.

Another tip is to not be afraid to change the EIP. Remember the introductory Assembly notes above? The EIP is the Instruction Pointer and it tells the computer what instruction to execute next. In a debugger we can change it to start executing anywhere we want in the binary. Think of it like a teleport. I’ve used this for multiple things. If a function was called and I missed it because I wasn’t paying attention or I was pressing “Step Over” too fast and wanted to know what the function did, I’ll just teleport back to it by changing the EIP to the address of that function. Or if there’s a function that is constantly being skipped over or never reached I’ll change the EIP to it just to see what it’ll do and then work backwards to try to figure out why it isn’t ever being called. If there’s a function that I really want to know about quickly, and I’ve identified it earlier, I’ll just try skipping directly to it. There’s a ton of uses and it’s one of my favorite tricks. However, the functions you’re looking to teleport to might be reliant on data structures setup by previous functions so this trick doesn’t always work. But when it does, it’s great and you should use it.

Advanced Static Analysis Tips

One of the biggest tricks for this is to learn to identify commonly used structures. Remember that malware authors are just like any other devs. They’re lazy and will copy-paste when given the opportunity. So a lot of the time code structures or code flows will be the same between two completely different malware families. Learn to recognize these by researching commonly used TTPs like process injection, process enumeration, commonly used encryption algorithms, etc. Learning to actually write malware using common techniques can help as well because it will show the translation from source code into assembly.

Once you can identify the common structures, you can reverse most aspects unless the authors have implemented something like a custom encryption algorithm. However, even if it’s custom and you have no idea how to reverse it, you’ll still be able to identify that it’s an encryption/decryption algorithm with a bit of practice. Identifying it and where it’s located, will allow you to see the before and after of each string once you’re debugging the sample.

You can certainly figure out some very important information by advanced static analysis but it’s an art in itself. Those who are really good at it can do amazing things like crafting C2 emulators in order to send commands to the sample as if they had the actual admin panel in front of them. However, even those of us who aren’t smart enough to do wild things like that are still able to get information.

I use advanced static like a mini-map when I’m debugging. I feel much more comfortable debugging so that’s usually what I stick to. However, as I mentioned earlier, it’s easy to get lost while debugging. So before I do any debugging, I’ll first uncheck the “DLL can move” option in CFF Explorer to disable ASLR for that sample. If you don’t know, ASLR stands for Address Space Layout Randomization. It’s a mitigation that was developed to defend against exploit developers, although exploit devs have still been able to get around it. Basically what ASLR does is load the binary at a randomized base address in memory instead of always using the one it was compiled for. This makes the addresses for jumps and functions different between your static analysis tool and the debugger, which is why I’ll disable it.
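
For reference, the “DLL can move” checkbox corresponds to the IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE bit (0x0040) in the DllCharacteristics field of the optional header. Here’s a minimal sketch of what clearing it looks like, plus the address math you need if you leave ASLR on and have to translate between the disassembler and the debugger by hand. Both function names and the example values are made up for illustration.

```python
IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE = 0x0040


def clear_dynamic_base(dll_characteristics: int) -> int:
    """Return the 16-bit DllCharacteristics value with the ASLR bit cleared."""
    return dll_characteristics & ~IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE & 0xFFFF


def rebase(address: int, from_base: int, to_base: int) -> int:
    """Translate an address from one image base to another.

    The offset of a function from the start of the image never changes,
    only the base it gets loaded at.
    """
    return address - from_base + to_base


print(hex(clear_dynamic_base(0x8140)))               # 0x8100
print(hex(rebase(0x401A30, 0x400000, 0x00B70000)))   # 0xb71a30
```

With the bit cleared and the image loaded at its preferred base, both tools agree on addresses and you can skip the rebase step entirely.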

Once ASLR is disabled, I’ll open it up in Binary Ninja or Ghidra. It’s here that I treat reverse engineering like a timed test. Using code analysis, I’ll find the cross-references from the imports to quickly reverse some easy functions that I can easily identify and know what they’re doing. Basically I’m answering the easy questions first. Then I’ll start identifying the functions that I sort of know what they’re doing but I’m not skilled enough to completely reverse it or determine how the task is being done (e.g. custom encrypt/decrypt) but since I know what their goal is, I’ll mark it down as a function that I’d like to know more about. For any of these types of functions, I record the entry point address for that particular function. Then I’ll head over to my debugger and try to find it, label it as what I think it’s doing and set a breakpoint on it so I can watch what it does. That way even if I don’t know how a string is being decrypted, I can see the encrypted string being passed in as an argument and the resulting decrypted string once it’s finished. There are a number of tools that can help with this. For example, there are tools that allow you to take all of the labels from Ghidra, Binary Ninja, or IDA and transfer them to your x32dbg session. This way you’ll have all your labels and won’t have to navigate purely with offsets. Another tool that exists is ret-sync which allows you to link the views between x32dbg and Ghidra or IDA. It will link the disassembler and debugging views together so you always know where you are.

Here’s a big tip that helped me. Look up all constants! The signature matching tools can’t find absolutely everything. Find the constants in the binary and search for them online. You’ll most likely find documentation on the algorithm in question or possibly a link to a malware family/technique.
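
You can automate a first pass of this with a small lookup table. The sketch below scans a buffer for a handful of well-known 32-bit constants stored little-endian. The table is nowhere near complete and the function name is made up, so treat a miss as “search it online” and not as “nothing there”.

```python
import struct

KNOWN_CONSTANTS = {
    0x67452301: "MD5 / SHA-1 initial state",
    0x6A09E667: "SHA-256 initial state",
    0x9E3779B9: "TEA / XTEA delta",
    0xEDB88320: "CRC-32 polynomial (reflected)",
    0x61707865: "ChaCha / Salsa20 'expa' constant",
}


def find_known_constants(blob: bytes) -> list:
    """Return sorted (offset, hex value, label) for each known constant found."""
    hits = []
    for value, label in KNOWN_CONSTANTS.items():
        needle = struct.pack("<I", value)   # x86 stores immediates little-endian
        offset = blob.find(needle)
        while offset != -1:
            hits.append((offset, hex(value), label))
            offset = blob.find(needle, offset + 1)
    return sorted(hits)


# Made-up bytes standing in for a chunk of a function.
sample = b"\x90" * 4 + struct.pack("<I", 0x9E3779B9) + b"\x00" * 3 + struct.pack("<I", 0xEDB88320)
for hit in find_known_constants(sample):
    print(hit)
```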

Patching Binaries

This is an extremely powerful technique especially for bypassing any identified anti-analysis techniques. As an example, let’s take the very standard CreateToolhelp32Snapshot, Process32First, and Process32Next technique to iterate through all of the processes. One of the anti-analysis techniques is to do this and compare the names of all the processes one by one to a blacklist the malware sample has stored in order to try to identify any programs that would indicate it’s being analyzed (like procmon or wireshark). So if we don’t patch it, then on every execution we’ll have to set breakpoints around this area and manually modify the process names or change resulting flags/register results so nothing matches the items in the blacklist. This is very annoying and time consuming so instead we can patch some of the surrounding instructions.

If this whole process of looking for analysis programs is in its own little function and that’s all the function does, the easiest fix would be to replace all CALLs to this function with NOPs. That way this particular anti-analysis function won’t ever be used during execution. However, this easy situation isn’t always the case. The check could just be included in the Main function so we can’t just NOP it all out. Maybe it’s part of a function that does anti-analysis but then also performs key malware actions, so we have to be more precise about what we patch.

So what else can we do? Well, in order to figure out if strings match, there needs to be some sort of compare happening. It can be a call to strncmp() or a custom written version of strncmp() (yes, some malware authors will manually write out the strncmp() function instead of calling it because they’re annoying people and like to make analysis difficult). The key point is to find that function and step over it. The resulting value (usually a 1 or 0; for the real strncmp(), 0 means the strings matched) will tell the malware if the strings matched or not. This value will then be checked with a CMP or TEST instruction. From there it most likely will have a decision branch of something like “if there was no match then continue execution but if there was a match then call ExitProcess().” The key is to patch it so it always takes the branch that we want. We could change the CMP instruction so that every time there’s a match, it’ll continue the execution, but that’ll mean that every time there’s a process that doesn’t match (which after patching the CMP would mean any non-analysis program) it’ll call ExitProcess(). We don’t want that.

So what’s the trick? The trick is to patch the jump. If there are conditional branches, the jump instructions will be things like JNZ, JZ, JNE, JE, etc. These basically translate to if-then statements. All we have to do is change the conditional jump to a regular JMP to the address of the branch that continues execution, so it’ll never take the ExitProcess() branch.

Key things to remember though about patching binaries. You might need to pull out some assembly optimization tricks because you want to do your very best to keep the size of the instructions the same. In other words if you’re patching out 8 bytes worth of instructions, you want to try to replace those instructions with 8 bytes worth of your own instructions. If you increase/decrease the size, you can cause issues and actually break the executable. In x32dbg when patching with just NOPs, you can select a checkbox that makes sure it’ll keep the same size with NOPs. The NOP instruction is only one byte so when you replace an instruction and check that box, you’ll see that it might make 4 NOPs for you in order to keep the size of the binary the same.
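
Here’s what those same-size patches look like at the byte level, as a minimal Python sketch. The function names and the example bytes are made up. It only handles the short form of conditional jumps (opcodes 0x70 to 0x7F followed by a one-byte offset), where swapping the opcode for 0xEB (JMP short) keeps the instruction at exactly two bytes. The near form (0x0F 0x8x with a four-byte offset) needs more care and isn’t covered here.

```python
SHORT_JCC = range(0x70, 0x80)  # one-byte opcodes for the short conditional jumps


def force_short_jump(code: bytes, offset: int) -> bytes:
    """Turn a 2-byte conditional short jump into an unconditional JMP short."""
    if code[offset] not in SHORT_JCC:
        raise ValueError("not a short conditional jump")
    patched = bytearray(code)
    patched[offset] = 0xEB     # JMP rel8; the offset byte after it is reused as-is
    return bytes(patched)


def nop_out(code: bytes, offset: int, length: int) -> bytes:
    """Overwrite `length` bytes with NOPs so the overall size never changes."""
    patched = bytearray(code)
    patched[offset:offset + length] = b"\x90" * length
    return bytes(patched)


# test eax, eax / jnz +5 / call <rel32> / ret   (made-up bytes for illustration)
code = bytes.fromhex("85c0" "7505" "e811223344" "c3")
print(force_short_jump(code, 2).hex())  # the jnz becomes jmp, always taken
print(nop_out(code, 4, 5).hex())        # the 5-byte call becomes five NOPs
```

If the branch you want is the fall-through one rather than the jump target, NOP out the two bytes of the conditional jump instead of forcing it.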

Other Malware Formats

Powershell Scripts

More often than not, malicious Powershell scripts are used as Stage 1 loaders. They pretty much always reach out to a C2 to download the next payload although I’ve seen them contain a stored .NET executable before. Due to their plaintext/uncompiled nature and increase of focus on Powershell in AVs and EDRs, attackers have been getting pretty annoying with obfuscation. The good news is that if you can get around that obfuscation, it’s an easy analysis.

Usually a powershell script will be obfuscated in layers. First layer is very often a small script that is like: Powershell.exe

This is usually used for layer 1 for size purposes. After decoding the base64 you’ll probably run into more annoying things like

IeX([DoiNgAllKiNdSoFStrInGthinGs]::["C" + "on" + "Vert" + "FromBasE64"]ASDWERgdvdaerawtHdbasVBDF==)

But when you start to practice with powershell deobfuscation a bit more, it’s not really that bad. You can go in and deobfuscate piece by piece but we’re lazy computer nerds. So by far the easiest way is to let Powershell do all of the work for us. A neat trick is to remove the IEX and its corresponding brackets. IEX is Powershell shorthand for “Invoke-Expression” aka “run this command that’s inside these two brackets.” So if we remove the IEX and its brackets and instead put the stuff that was contained within the IEX brackets into a variable instead, we can just print out the variable.

So using the above example, we would just put this into powershell:

$deobfuscateScript = [DoiNgAllKiNdSoFStrInGthinGs]::["C" + "on" + "Vert" + "FromBasE64"]ASDWERgdvdaerawtHdbasVBDF==

Then you can just print out $deobfuscateScript to see the result. You can repeat this for however deep the layers of obfuscation go until you have all of the answers. If you mess up and the Powershell runs then that’s why you’re doing this in a sandbox. Don’t be afraid of just letting the malware do the work for you. The malware can’t hurt you.
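
If the first layer is a base64 blob handed to PowerShell as an encoded command, you can also peel it off outside of PowerShell entirely. This minimal sketch assumes that format, where the decoded bytes are UTF-16LE text, which is what PowerShell’s -EncodedCommand parameter expects. The function name and the sample command are made up for illustration.

```python
import base64


def decode_encoded_command(b64_text: str) -> str:
    """Decode a PowerShell encoded command: base64 over UTF-16LE text."""
    return base64.b64decode(b64_text).decode("utf-16-le")


# Build a harmless sample the same way an attacker would build a real one.
sample = base64.b64encode("Write-Output 'layer two'".encode("utf-16-le")).decode("ascii")
print(decode_encoded_command(sample))
```

If the output is full of stray NULL characters or looks garbled, the blob probably wasn’t UTF-16LE and you’ll want to look at the raw bytes instead.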

Shellcode Analysis

If you’ve identified a block of shellcode during any point of your analysis, I highly recommend that you take a look. The reason for this is that adding shellcode into a program and using the program to execute it is actually very straight forward and the first thing that you learn how to do when learning how to write malware and therefore is used a lot.

Identifying

Shellcode is just more assembly. The difference is that it’s usually stored in a malicious binary as raw bytes without any sort of file structure. Since it has no file structure or loader to rely on, it’s written as Position Independent Code (PIC), which means it can run from wherever it lands in memory (sometimes it’ll require a specific DLL to be present in memory to run properly though). Shellcode is designed to be optimized for size to fit inside limited buffer space. When generating shellcode there are a couple of common bad bytes that you don’t want to include. The worst one is 0x00: a NULL byte ends a string, so any string-copy function that the shellcode passes through will cut it off right there. Depending on how it’s delivered, bytes like 0x0A and 0x0D (newline and carriage return) are commonly avoided for the same kind of reason. So if you see a memory buffer that contains a decent amount of data that looks like a bunch of nonsense bytes and doesn’t have any NULL bytes in it, then you might have identified some shellcode.
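
That heuristic is simple enough to script against a dumped memory region. The sketch below just reports long runs of bytes that contain no NULLs. The function name and the 64-byte default threshold are arbitrary choices for illustration, and a hit only means “worth a look in the disassembler”, since compressed or encrypted data can look exactly the same.

```python
def null_free_runs(buffer: bytes, min_length: int = 64) -> list:
    """Return (offset, length) for each run of non-zero bytes of at least min_length."""
    runs = []
    start = None
    for index, byte in enumerate(buffer):
        if byte != 0:
            if start is None:
                start = index            # a new run begins here
        elif start is not None:
            if index - start >= min_length:
                runs.append((start, index - start))
            start = None                 # the NULL byte ends the run
    # Handle a run that goes all the way to the end of the buffer.
    if start is not None and len(buffer) - start >= min_length:
        runs.append((start, len(buffer) - start))
    return runs


# Made-up buffer: padding, a long NULL-free run, a short one, and another long one.
region = b"\x00" * 8 + b"\x41" * 100 + b"\x00" + b"\x42" * 10 + b"\x00" * 4 + b"\x43" * 64
print(null_free_runs(region))
```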

If you’re debugging a sample and you think you might have identified shellcode in a memory dump, you can get more insight by right clicking and selecting “view in disassembly.” If it doesn’t have any valid assembly instructions then it might just be a data structure but if it has valid assembly instructions then there’s your shellcode that you can dump.

YARA of course. If you think that something might be shellcode, might I recommend trying some YARA rules such as the Meterpreter or Cobalt Strike Beacon YARA rules? Often when there’s shellcode, there’s a payload.

Analyzing

So you’ve dumped the shellcode, now what? Despite the binary not having a proper file structure, you can still open it up in a binary analysis tool like IDA, Ghidra, etc to take a look at the assembly. However, you usually don’t need to go this deep with shellcode since it’s usually very short and used to perform a basic task. Maybe use it to determine what sort of obfuscation it uses (if any). What I usually do with shellcode is use some sort of CPU emulation framework to either symbolically execute or actually execute the shellcode to see what it does. Tools like Unicorn, Angr, Speakeasy, Scdbg all help with this. Even Un{i}packer could help with this despite it being designed for unpacking (since it emulates code). Speakeasy is certainly my favorite.

Speakeasy was developed by Mandiant’s FLARE team and it emulates the instructions to give you a report on what the shellcode does. Take a look at its documentation for more info.

Angr is useful if you prefer emulating with a Python library. It’s also great for conditional branches. This is true not only for shellcode but for regular executables as well. You’re able to specify the entry point, the bad branch, the success branch, and then it will simulate the execution over and over and give you the information on how to reach the success branch. This framework is commonly used for reverse engineering CTFs so there are plenty of write-ups on it to learn some of the functionality.

Scdbg is built around the libemu emulation library, so it emulates the shellcode rather than running it natively, and it records the API calls that the shellcode makes. I haven’t messed with this too much but it’s included in the FLARE toolkit.

Why do we care about shellcode?

The easy answer is that shellcode is very often generated by a hacking framework like Metasploit, Cobalt Strike, Empire, etc. The automatically generated payloads from these attack frameworks can easily be turned into shellcode and included as part of the main executable that will later launch it. This provides the attacker with an easy backdoor without having to write an entirely custom RAT. From analyzing these payloads, we’ll get some network-based IOCs like C2 addresses. On top of that we’ll get some threat intel because maybe this particular threat actor only makes custom launchers for Meterpreter payloads or something.

The most fun part of it is so you can control it. If you’ve identified a sample as having shellcode in it and you’ve identified what kind of payload it is, you can control it by redirecting the traffic to an attacker VM like a Kali box.

Dynamic API Loading

Introduction

Dynamic API loading is an extremely common technique so I wanted to go a little bit more in-depth with it. Dynamic loading is utilizing the Windows API to load additional APIs needed without directly linking them to the executable during compilation time. Normal programs can use this technique to save space while still able to perform actions in the event of a lesser traveled decision tree in the code. Malware authors use this to hide APIs utilized from the IAT.

Source Code

Below is some code showing how this technique is actually written. I'll use a simple MessageBox example. Note: I don't have Visual Studio on my work machine so this is from memory. There might be a compilation error but the ingredients and the formula are correct.

#include <Windows.h>

// Set the type definition for the function you're going to be calling dynamically
// (MessageBoxA returns an int, so the function pointer type does too)
typedef int (WINAPI *PMessageBoxA)(HWND, LPCSTR, LPCSTR, UINT);

int main(void)
{
    // Load the User32 library since it's the one that contains the MessageBox function
    HMODULE hLoadUser32 = LoadLibraryA("User32.dll");
    if (hLoadUser32 == NULL)
        return 1;

    // Get a handle to the newly loaded User32 library
    // (LoadLibraryA already returned one, so this second lookup is redundant;
    // it's kept because the Reversing section refers to it)
    HMODULE hUser32 = GetModuleHandleW(L"User32.dll");
    if (hUser32 == NULL)
        return 1;

    // Resolve the real function's address with GetProcAddress and cast it to the typedef created earlier
    PMessageBoxA funcMessageBoxA = (PMessageBoxA)GetProcAddress(hUser32, "MessageBoxA");
    if (funcMessageBoxA == NULL)
        return 1;

    // Use the dynamically loaded function
    funcMessageBoxA(NULL, "Success!", "Dynamic API Loading", MB_OK);
    return 0;
}

Reversing

Once this is compiled and stripped of all debug information, the only Windows APIs from this code that you'll see in the imports are LoadLibraryA, GetModuleHandleW, and GetProcAddress. MessageBoxA won't be anywhere in the imports. However, you will see it in the strings, along with User32.dll. This is a big hint for dynamic API loading. We'll get to string obfuscation in a bit.
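To make the strings point concrete, here's a minimal sketch in portable C of what a strings-style scan does over a raw byte buffer. The helper names (dump_strings, buffer_contains) are my own inventions for illustration and not part of any real tool, and the sketch only looks for plain ASCII, not wide strings.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: prints every run of at least min_len printable ASCII
   bytes in buf (min_len should be 1 or more). Returns how many runs it printed. */
size_t dump_strings(const unsigned char *buf, size_t len, size_t min_len)
{
    size_t count = 0;
    size_t start = 0; /* index where the current printable run began */

    for (size_t i = 0; i <= len; i++) {
        /* Treat the end of the buffer like a non-printable byte so the last run is flushed */
        int printable = (i < len) && buf[i] >= 0x20 && buf[i] <= 0x7E;
        if (!printable) {
            if (i - start >= min_len) {
                printf("%.*s\n", (int)(i - start), (const char *)(buf + start));
                count++;
            }
            start = i + 1;
        }
    }
    return count;
}

/* Hypothetical helper: returns 1 if the ASCII string needle occurs anywhere
   in the raw bytes of buf, otherwise 0. */
int buffer_contains(const unsigned char *buf, size_t len, const char *needle)
{
    size_t n = strlen(needle);
    if (n == 0 || n > len)
        return 0;
    for (size_t i = 0; i + n <= len; i++) {
        if (memcmp(buf + i, needle, n) == 0)
            return 1;
    }
    return 0;
}
```

Run dump_strings over a buffer that holds User32.dll and MessageBoxA surrounded by non-printable bytes and both names get printed, which is the same hint a strings tool gives you even though MessageBoxA never appears in the imports.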

But first, how does one figure out what's being loaded and when? If the strings aren't encrypted, then you'll be able to see all the arguments to GetProcAddress from static code analysis. Easy answers there. You can also do it while debugging if you'd prefer. You just need to set a breakpoint on GetProcAddress to see the arguments being passed into it. As we can see above, the only arguments for GetProcAddress are the handle to the library (in this case User32.dll) and a string containing the name of the API to load (in this case "MessageBoxA"). This will give you all the answers. After stepping over the GetProcAddress function, you should see CALL instructions get populated with the actual API name in x32dbg. Or there will be CALL instructions to registers like EAX, and once EIP reaches the CALL instruction you'll see where it's actually going to call. In this case, when hovering over it, you'll see MessageBoxA.

Now that you know how to reverse the basics, what about the harder stuff? Malware authors do their own bit of analysis too, so they know the APIs they dynamically load are visible in the strings. So what do they do? They use string encryption. That way, during static analysis, neither the imports nor the strings contain anything that identifies the loaded APIs. So how do you get around it? Well, GetProcAddress can't take an encrypted value, so the string has to be decrypted/deobfuscated before it's passed to that API. It's up to you whether you want to figure out the encryption/obfuscation for the purposes of writing an automated deobfuscation script. If you don't care and just want to see what is being passed in, just set a breakpoint on GetProcAddress since the argument HAS to be the decrypted value.

However, if you do want to find the string decryption routine, either to reverse engineer it or just to see the value before it's decrypted, then I have some tips. Firstly, work backwards from the GetProcAddress call. The author will either decrypt all strings at once or decrypt each string just before using it as an argument. Either way, the call to GetProcAddress is the last step. The other APIs (LoadLibrary, GetModuleHandle) will be visible, so look for any CALL instructions to an unknown function between GetModuleHandle and GetProcAddress. It's up to you to figure out whether that function is doing the decryption, but if it is, you'll see the encrypted string being passed in as one of the arguments and the return value should be the decrypted string (unless it stores the resulting string somewhere and simply returns a 1 or 0 for success/failure). If you don't see such a CALL between these APIs, then the sample most likely has a single function that decrypts all of the strings before doing anything else with them. So again, keep working backwards until you find it.
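As a rough illustration of the kind of "unknown function" you're hunting for, here's a minimal sketch of a single-byte XOR string decoder. Everything specific here is an assumption for demonstration: the key 0x5A, the function name xor_decode, and the encoded bytes are made up, and real samples use whatever scheme the author felt like.

```c
#include <stddef.h>

/* Hypothetical encoded API name: "MessageBoxA" with every byte XORed with 0x5A.
   This is what you'd see in the strings/data section instead of the plaintext. */
static const unsigned char ENC_MESSAGEBOXA[] = {
    0x17, 0x3F, 0x29, 0x29, 0x3B, 0x3D, 0x3F, 0x18, 0x35, 0x22, 0x1B
};

/* Hypothetical decoder: XORs each of the len bytes in enc with key and writes
   a NUL-terminated result into out, which must have room for len + 1 bytes. */
void xor_decode(const unsigned char *enc, size_t len, unsigned char key, char *out)
{
    for (size_t i = 0; i < len; i++)
        out[i] = (char)(enc[i] ^ key);
    out[len] = '\0';
}

/* In a Windows sample the decoded buffer would then feed the resolver, e.g.:
 *
 *     char name[sizeof ENC_MESSAGEBOXA + 1];
 *     xor_decode(ENC_MESSAGEBOXA, sizeof ENC_MESSAGEBOXA, 0x5A, name);
 *     GetProcAddress(hUser32, name);   // name is plaintext by this point
 */
```

The encrypted bytes go in as an argument and the plaintext name comes out just before the GetProcAddress call, which is the pattern to look for when working backwards, and it's also why a breakpoint on GetProcAddress always shows the decrypted value.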