How Malware Works

Dall-E Image with prompt "scary looking malware with a dark ominous background"

I had a question on one of my posts on YouTube that made me realize that some people may not be super familiar with some of the terminology surrounding malware development and reverse engineering.

This isn't meant to shame the OP. It's more a failure on my part, and on the part of a lot of others who use technical terminology in articles and videos meant for a broad audience without properly explaining it.

This article is meant to be a way of backing up a bit and explaining how malware works in general. It's an incredibly generic definition, because malware comes in all kinds of shapes and sizes and is written to do many different things.

Broad Generalizations

Malware, in its most basic definition, is software that is planted on a machine, usually in a non-consensual (the user did not mean to install it) or less-consensual (the user meant to install something else, but instead installed malware) manner, and that takes actions on that machine, or on other machines networked with it, that would generally be described as malicious. Malicious activity can be anything from using your hardware to mine cryptocurrency for someone else to encrypting your files for ransom, as is the case with ransomware.

In general, malware is defined by the primary actions it is built to take, or is actively observed taking: if a piece of malware is primarily built to encrypt files and facilitate a ransom payment, it's primarily thought of as ransomware, even if it can also steal files. In some cases, malware is described based on its initial intrusion (or installation) vector. Malware that is installed less-consensually, by a user who intended to install something else, is often called a Trojan, because it masquerades as a piece of legitimate software, like the giant wooden horse of Troy. There are also more general-purpose pieces of malware whose primary purpose is just to give the attacker remote access to a target device: these are often lovingly called RATs, or Remote Access Trojans.

I prefer to use the most descriptive name possible, but often it's necessary to just default to calling something malware. In general, though, I prefer naming conventions that aptly describe the primary understood purpose of the malware: ransomware, wipers, RATs, etc. are fairly descriptive, while calling something a Trojan is less descriptive, even more so because that particular term is often used fairly flippantly. Implant is another term that describes very little.

Command-and-Control (C2)

Generally speaking, most malware is not entirely automated and relies on external infrastructure for some purpose. That purpose may be receiving commands from the attacker, sending exfiltrated data back to the attacker, or simply telling the attacker that a system has been successfully infected.

This infrastructure is generally called Command-and-Control infrastructure, often shortened to C2 or C&C. C2 infrastructure is usually a web server under the attacker's control that is used to communicate with the malware or the system infected by the malware. The attacker doesn't always own that server: sometimes it is a server that the attacker has hacked into in some manner.

One of the ways that defenders and researchers track malware is through its C2 infrastructure: if malware A and malware B both try to communicate with domain C, there is a good chance that malware A and malware B are somehow related, or are being used by the same attacker. This analysis is strengthened by victimology (the particular types of victims targeted by a given attacker or malware), as well as by particularities of the malware itself discovered via reverse engineering.
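To make the "malware A and malware B both talk to domain C" idea concrete from the defender's side, here is a minimal sketch in Python. It assumes you already have a list of domains each sample was observed contacting (from sandbox runs or network logs, for example). The function name shared_c2_links, the sample names, and the domains are all made up for illustration. Keep in mind that a shared domain is only a lead, not proof: shared hosting and sinkholes can link samples that have nothing to do with each other.

```python
from collections import defaultdict
from itertools import combinations


def shared_c2_links(sample_domains):
    """Map each pair of samples to the set of domains they both contacted.

    sample_domains: dict of sample name -> iterable of observed domains.
    (Hypothetical input shape, chosen for this illustration.)
    """
    # Invert the mapping: domain -> every sample seen contacting it.
    domain_to_samples = defaultdict(set)
    for sample, domains in sample_domains.items():
        for domain in domains:
            # Normalize case and a trailing dot so "C2.example.test." matches.
            domain_to_samples[domain.lower().rstrip(".")].add(sample)

    # Any domain contacted by two or more samples links those samples.
    links = defaultdict(set)
    for domain, samples in domain_to_samples.items():
        for a, b in combinations(sorted(samples), 2):
            links[(a, b)].add(domain)
    return dict(links)


# Made-up observations using reserved .test domains.
observations = {
    "malware_a": {"c2.example.test", "cdn.example.test"},
    "malware_b": {"C2.example.test."},
    "malware_d": {"other.example.test"},
}

for (a, b), domains in sorted(shared_c2_links(observations).items()):
    print(f"{a} <-> {b}: {sorted(domains)}")
```

With the sample data above, only malware_a and malware_b are linked, because they are the only two that contact the same domain. In practice you would weigh a link like this against victimology and reverse engineering findings before calling two samples related.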

For the malware developer, deciding how the malware is going to communicate with the C2 infrastructure is vital. If you want your malware traffic to blend in with other traffic on the network, you can use HTTP(S) to communicate with a web application-based C2. This means the traffic will look like the victim is just browsing the web. You can also write custom protocols to communicate over specific ports, or you can use other common protocols like DNS. The world (of networking) is your oyster.

Actions on Objectives

As I said before, most malware is categorized based on the actions it is meant to take against the victim. If you're writing ransomware, you need to figure out an encryption/decryption scheme. If you're writing a RAT, you need to build in functionality to execute commands and other actions and to communicate with your C2. If you're writing an infostealer, you need to find ways to access the information you want to steal and send it back to the C2.

Next, you want to find ways to actually exploit your access to the system: ransomware is useless if you can't receive payments and communicate with the victim. RATs are useless if you don't know what you want to do with your access. Infostealers are useless if you can't exploit the information you're stealing.

Frankly, for a defender, this is an important stage to focus on for detection and remediation, but waiting until the exploitation stage is a dangerous tactic. You want to catch the malware as soon as it lands on the victim system, or before. Attackers have a multitude of ways to circumvent protections at this stage: ransomware can use different libraries to encrypt files, infostealers can find a multitude of ways to access the target information, and all malware can use a variety of ways to communicate with its C2. The best way to prevent actions on objectives is to stop the malware long before it can start taking those actions. The second-best way is playing whack-a-mole with malware as it is actively trying to act on its objectives.