Intro to Malware Analysis
The Beginning
Shortly after college, I started my first job as a Tier 1 Cyber Incident Response Team (CIRT) analyst. My day-to-day duties were primarily email analysis and customer support: our team handled calls to the CIRT, answered general security questions, and performed initial analysis for a variety of alerts. We escalated suspicious phishing links and attachments to our malware team for further analysis. As a new analyst with no prior security experience, I thought the malware analysis team’s role was awesome, and I was amazed by their knowledge and skill.
Over time I learned some basic analysis techniques. I could run quick checks to determine whether a file was suspicious, but I wanted to learn more. I learned about the SANS course for the GIAC Reverse Engineering Malware (GREM) certification and was lucky enough to take it a few years ago. GREM helped solidify my interest in malware analysis and opened up a world of possibilities. The course gave me a great baseline, a ton of common tools to use, and samples to practice on. Since taking it, I have spent over 6 months as a full-time malware analyst, and I still enjoy learning more about malware analysis in my current role, even though it is no longer one of my direct duties. I have a pile of malware books at home that I plan to work through, and I will turn that self-study into a series of blog articles as I learn and deepen my understanding.
I also took TCM Security’s Practical Malware Analysis and Triage course a few months ago. I will be revisiting the material and plan to take the exam by early 2024. I have found malware analysis resources to be somewhat scarce, and I believe TCM Security’s class is a great beginner course at an amazing price point. I will share my experiences with other training resources here as I work through them.
What is Malware Analysis?
You may be asking yourself: what is malware analysis, and why should I care about it? I believe malware analysis is a critical piece of any cyber defense team. Malware analysts are responsible for analyzing suspicious files and providing the team with reports that detail the malware’s capabilities, indicators of compromise, and recommendations for detection. Having a solid malware analyst on your team can improve your organization’s defensive posture, feed your threat intelligence program with meaningful tactical-level intelligence (signatures and threat hunting queries that analysts can use immediately to detect similar threats in your network), and give you an in-house capability to analyze suspicious files alongside automated tools. I’m sure that as AI and learning models progress, malware analysis automation will continue to expand. In my experience, though, automated tools often produce a large number of false positives, which makes it even more important for analysts to review the tool’s output.
I am also a firm believer that you need to understand the underlying mechanics of the tools you use to be efficient, just like back in school when we did math by hand before learning to use a calculator. If you run automated tools and they spit out a ton of garbage indicators, how will you know the difference? That is why I believe learning at least the basics of malware analysis is a super practical skill set to develop. I believe that continuing to build your knowledge is what separates a new analyst from a senior analyst: truly understanding the gaps in your tools and knowing which tool is best for which scenario. This is something I hope to get better at through this blog series.
Malware analysis skills can also be applied to the offensive side of cyber security. Understanding the methodologies that analysts and automated tools use to detect malware will improve your ability to evade detection. I believe malware analysis and exploit development share similar skill sets as well. As a malware analyst, you sometimes need to reverse engineer a binary to find out what it is doing. As an exploit developer or in another offensive engineering role, you reverse programs to identify security weaknesses, or you build tools that are difficult to analyze and detect. The skill sets are very similar, but they are applied with a different perspective and end goal in mind.
Stages of Malware Analysis
I love the 4 stages of malware analysis that Lenny Zeltser shares on his blog. Zeltser is the course author for GREM and founder of the REMnux toolkit for malware analysis, and his blog posts are an amazing resource for malware analysts. The stages are also a good way to approach learning malware analysis in chunks, so it doesn’t seem too daunting. After taking GREM, I realized that I only have a baseline understanding of malware and that there is so much more to learn. I will go over each stage briefly here, and we will dive deeper in future articles.
Stage 1. Automated Analysis
This is the easiest and fastest stage. You use automated tools to perform the analysis and then review the final report the tool provides. The skill level required is very low, but the output is often incorrect or includes false positives. Some types of malware may be impossible to analyze entirely with automated methods: encrypted or encoded malware may not return anything in an automated tool, and advanced malware may contain sandbox detection or evasion techniques. Even so, automated analysis provides a great baseline and should be part of everyone’s analysis process. Some popular tools are listed below.
Joe Sandbox
Cuckoo Sandbox
ANY.RUN
Stage 2. Static Properties Analysis
Static properties analysis provides a wealth of information about malicious files. It requires more understanding from the analyst to interpret the data, but you can often learn an incredible amount about a file just by viewing its static properties. Some of my favorite tools for static analysis are PEStudio, strings, and FLOSS. Static properties analysis consists of reading file metadata and other properties that are visible without running the file, such as the following:
Human-readable strings - Often you can find many basic indicators just by running strings on a file. These could include indicators of compromise (IOCs) like IP addresses, domains, registry keys, hard-coded commands, or other malware family-specific IOCs that help identify the sample.
File timestamps - Depending on the type of malware and the language it was written in, timestamps can sometimes show whether the sample was newly compiled or is an older version.
Entropy - This is a score that shows the randomness of the data in a file or a section of a file. The entropy score can help determine whether a file or section is packed or encoded.
File header information - This helps verify the type of file you are looking at, shows version information, and in some cases may show whether the file was packed with a specific packer.
Signature detections - Using a tool like YARA or the LOKI/THOR scanners, you can scan the file for pattern matches.
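To make two of the properties above more concrete, here is a minimal Python sketch of string extraction and entropy scoring. It is not taken from strings, FLOSS, or PEStudio; the function names (shannon_entropy, ascii_strings) and the sample bytes are my own illustrative assumptions, and real tools handle far more cases (Unicode strings, per-section entropy, obfuscated strings).

```python
import math
import re
from collections import Counter
from typing import List


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # how often each byte value occurs
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def ascii_strings(data: bytes, min_len: int = 4) -> List[str]:
    """Return runs of printable ASCII that are at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


if __name__ == "__main__":
    # Hypothetical sample bytes standing in for a real file's contents.
    sample = b"\x00\x01http://example.test/payload\x00ab\x00cmd.exe /c whoami\xff"
    for s in ascii_strings(sample):
        print(s)
    print(round(shannon_entropy(sample), 2))
```

To try this on a real file, you could read its contents with pathlib.Path(path).read_bytes() and pass the result to both functions. Entropy close to 8 bits per byte suggests compressed or encrypted data, while plain text and ordinary code usually score noticeably lower.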
Stage 3. Interactive (Dynamic) Analysis
This stage requires a sandboxed environment in which to analyze the sample while it runs. The analyst uses a variety of tools to provide the sample with fake resources and to monitor its behavior on the system. Dynamic analysis can also help analysts unpack samples: you run the sample and let it unpack itself. Sometimes feeding the sample resources like fake DNS, fake internet access, or other services will trigger behaviors you cannot easily find during static analysis. Some tools I commonly use:
Regshot
Process Explorer
Fakedns
INetSim
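Tools such as Regshot work on a before-and-after comparison: take a snapshot of the system, run the sample, take a second snapshot, and review what changed. The Python sketch below illustrates only that comparison idea, assuming a snapshot is a simple mapping of key path to value. It does not read the real registry, the function name diff_snapshots is my own, and every key and value shown is hypothetical.

```python
from typing import Dict

Snapshot = Dict[str, str]  # key path -> value (assumed representation)


def diff_snapshots(before: Snapshot, after: Snapshot) -> Dict[str, dict]:
    """Compare two snapshots and report added, removed, and changed keys."""
    added = {k: after[k] for k in after.keys() - before.keys()}
    removed = {k: before[k] for k in before.keys() - after.keys()}
    changed = {
        k: (before[k], after[k])
        for k in before.keys() & after.keys()
        if before[k] != after[k]
    }
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical snapshots taken before and after running a sample.
before = {
    r"HKCU\Software\Example\Setting": "1",
    r"HKCU\Software\Example\Old": "x",
}
after = {
    r"HKCU\Software\Example\Setting": "2",
    r"HKCU\Software\Microsoft\Windows\CurrentVersion\Run\Updater": r"C:\Users\Public\updater.exe",
}

report = diff_snapshots(before, after)
for kind, entries in report.items():
    for key, value in sorted(entries.items()):
        print(kind, key, value)
```

In this made-up example, the new value under the Run key is the kind of change worth a closer look, because Run keys are a common way for programs to start automatically at logon.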
Stage 4. Manual Code Review
Manual code review is usually seen as the hardest part of malware analysis. This stage commonly uses a debugger or disassembler to walk through the sample’s code. The method is extremely time-consuming but can provide the most information about a sample. I am still very weak at manual code review and at using debuggers in my own analysis process, and I really want to start digging deeper into other tools here. I view debugging as a Stage 3/4 area, since you are technically interacting with the sample but still need in-depth code knowledge. My favorite tools for code review are currently:
dnSpy for .NET executables
Ghidra for other types of executable files
Where To Go From Here
I think a lot of people picture manual code review when they talk about malware analysis, but in reality I believe it is only a small part of the work. You can find a ton of information about the most common samples today using stages 1-3, so if you are still learning or are not very familiar with disassemblers, don’t worry: you can still analyze malware while you work on improving your skills. This article is the start of a series I hope to publish covering malware analysis and discussing topics as I learn and grow in my analysis skills. I plan to post more walkthroughs with live samples and tool demos, share other training resources as I work through them, and use this space to document my progress for accountability. I hope you can learn something from my journey, and if there are any topics you would like to see, please let me know!