The Sound of Malware
By Trellix · June 23, 2022
Do, a debugger, you often use
Re, a reverse engineer
Mi, a name, I call myself
Anyways….
By now, you must be very thankful I reminded you of this famous song; I am sure it will be stuck in your head the rest of the day. You’re welcome!
Confused about how this relates to malware analysis?
In the world of malware and reversing, there are tools, scripts, and methods we use to investigate the relationship between malware families, detect new versions, and understand differences across malware samples. A good example of why we do this is to check whether the detection logic we wrote still works on a newer variant of the malware. What did the adversary change in the code, and will that impact the protection we provide to our customers? One approach is to compare older and newer samples to figure out which components are present in both. In some cases, we get ambitious and take a large set of samples that are classified as, for example, ransomware, and search for common denominators in the code to help identify new samples that fit the same classification.
Another classic method for code comparison is binary diffing. Using BinDiff in conjunction with IDA Pro, we create a database for each of the two malware samples and compare them. An overall similarity score is generated, which shows the similarities that exist between the samples, as shown in Figure 1.
In Figure 1, the BinDiff dashboard shows that the two samples have a similarity of 52 percent. Drilling further down into this dashboard, the differences and similarities are split into several categories. It is also possible to visualize individual functions, for example, to compare them and spot the exact difference. Adversaries tend to reuse code and optimize certain parts in newer releases. The important question is: where do these similarities exist? In the ‘generic’ code elements, or in the components that were used to create this malware?
There are many other methods to compare malware, such as fuzzy hashing or extracting code blocks and then comparing them. Recently, in our blog on DPRK ransomware families, we used graph technology and Hilbert curve mapping to discover the similarities shown in Figure 2.
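To make the idea of "extract blocks, then compare" a bit more concrete, here is a minimal sketch in Python. It is a deliberately simplified stand-in: it is not a real fuzzy-hashing algorithm and not the graph approach from our DPRK blog. It assumes that overlapping byte n-grams are an acceptable proxy for "code blocks"; the function names and the n-gram size of 4 are our own choices for illustration.

```python
def byte_ngrams(data: bytes, n: int = 4) -> set:
    """Return the set of all overlapping n-byte sequences in data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def jaccard_similarity(a: bytes, b: bytes, n: int = 4) -> float:
    """Share of distinct n-grams that two byte strings have in common (0.0 to 1.0)."""
    grams_a, grams_b = byte_ngrams(a, n), byte_ngrams(b, n)
    if not grams_a and not grams_b:
        # Both inputs are shorter than n bytes: treat them as identical.
        return 1.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)
```

A score like this only says how much content overlaps, not where the overlap sits, which is exactly the question raised above about generic code versus malware-specific components.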
We have frequently used code comparisons and visualizations but would it be possible to compare malware samples using a more abstract technique? What about sound?
A novel technique
Music, sound, and especially analog synths have always been a fascinating topic for me. As a veteran of the Dutch Navy, I had the pleasure of spending a week on a submarine listening to sonar sounds, watching frequency waterfalls, and filtering out ships versus animals; it was a phenomenal experience. Combining all of this with the desire to investigate and innovate new comparison methods, I wondered: would it be possible to compare malware samples based on their sound?
To start, we took two samples of the Conti ransomware for Linux, one released in May, the other one in June. From a BinDiff and code comparison perspective, there were minimal differences and an overall score of 99.8 percent equality. Would our experiment with sound show the same?
First, we had to convert each sample into an audio file so that it could be played and used for frequency analysis. We used the ‘cat’ command on the Linux command line and piped the output through an audio player to generate the sound:
$ cat malwarefile.bin | mplayer -cache 1024 -quiet -rawaudio samplesize=1:channels=1:rate=8000 -demuxer rawaudio -
This produces noisy audio, similar to a dial-up modem trying to connect to the Internet over a telephone line. By rerouting the audio from the headset jack to a mixer, we recorded the sound and exported it in both the .WAV (waveform) and .MP3 audio formats.
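For readers without a mixer at hand, the analog recording step could probably be skipped by writing the bytes straight into a WAV container. The sketch below is not how the recordings in this post were made. It assumes the same parameters as the mplayer command (one byte per sample, one channel, 8000 samples per second) and that treating each byte as an unsigned 8-bit sample is close enough to what the player does; the function names are ours.

```python
import wave


def bytes_to_wav(data: bytes, wav_path: str, rate: int = 8000) -> int:
    """Write raw bytes as 8-bit mono PCM audio; return the number of samples written."""
    with wave.open(wav_path, "wb") as wav:
        wav.setnchannels(1)      # mono, as in channels=1
        wav.setsampwidth(1)      # 1 byte per sample (WAV stores 8-bit audio as unsigned)
        wav.setframerate(rate)   # samples per second, as in rate=8000
        wav.writeframes(data)
    return len(data)


def file_to_wav(sample_path: str, wav_path: str, rate: int = 8000) -> int:
    """Read a binary file and store its bytes as audio samples in wav_path."""
    with open(sample_path, "rb") as handle:
        return bytes_to_wav(handle.read(), wav_path, rate)
```

Calling `file_to_wav("malwarefile.bin", "malwarefile.wav")` (the output name is only an example) would give a file that opens directly in Audacity. Because nothing passes through analog equipment, the result will not sound or look exactly like a recording taken from a headset jack.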
Loading the two .WAV files generated from the Conti ransomware Linux variants into our audio-analysis tools (Audacity and Sonic LineUp), we saw the spectrogram in Figure 3:
Studying Figure 3, one can see that while the two samples are almost identical, there is a minor difference at the end of the sound profile of the June sample. This matches what was discovered during binary code analysis.
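Since every byte becomes one audio sample at 8000 samples per second, a position in the sound maps directly back to an offset in the file. As a rough sketch of that relationship (our own illustration, not a feature of Audacity or Sonic LineUp), the following Python function compares two byte streams in one-second chunks and reports the time ranges that differ. It assumes that the two files are not shifted against each other; a single inserted byte would make every later chunk look different.

```python
def differing_regions(a: bytes, b: bytes, frame_size: int = 8000, rate: int = 8000):
    """Return (start_seconds, end_seconds) for each fixed-size chunk where a and b differ."""
    regions = []
    longest = max(len(a), len(b))
    for start in range(0, longest, frame_size):
        # Slices past the end of the shorter input are empty and count as different.
        if a[start:start + frame_size] != b[start:start + frame_size]:
            end = min(start + frame_size, longest)
            regions.append((start / rate, end / rate))
    return regions
```

For two nearly identical samples, a short list of regions near the end of the file would correspond to a small visual difference at the end of the spectrogram.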
Now, onto the next experiment. From our investigation on DPRK ransomware families, we discovered that the VHD ransomware sample and the BEAF sample had some code similarities but also a lot of differences. Would that also be possible to prove with sound?
Again, we created the required audio files for both samples. Since the binary malware samples were different in file size, the length of recording would be different for each sample. This was reflected when we loaded our audio files into Audacity.
The first thing we see is the difference in length, but the similarities in the waveforms also stand out. Using Sonic LineUp, we aligned the waveforms and frequency spectra as closely as possible.
The differences are not only visible in the waveforms: when we conduct a plot-spectrum frequency analysis of the two audio samples, they also present themselves clearly, as seen in Figure 6.
As one example of the identifiable differences, the VHD plot spectrum displays a lot more activity in the ranges above 7,000 Hz than the BEAF plot spectrum does.
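An observation such as "more activity above a certain frequency" can also be expressed as a number. The sketch below is our own rough approximation in Python, not what Audacity's plot spectrum computes: it uses a naive discrete Fourier transform on fixed-size frames and returns the fraction of spectral energy at or above a cutoff. The function names, the frame size of 256, and the cutoff are assumptions; the cutoff must be below half the sample rate of the audio being analyzed.

```python
import cmath
import math


def dft_power(frame):
    """One-sided power spectrum of a frame using a naive O(N^2) DFT."""
    n = len(frame)
    mean = sum(frame) / n                     # remove the constant (DC) offset
    centered = [x - mean for x in frame]
    power = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(centered))
        # Interior bins have a mirrored twin in the full spectrum, so count them twice.
        weight = 1 if k == 0 or 2 * k == n else 2
        power.append(weight * abs(acc) ** 2)
    return power


def energy_fraction_above(samples, rate, cutoff_hz, frame_size=256):
    """Fraction of spectral energy at or above cutoff_hz, summed over all full frames."""
    high = total = 0.0
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        for k, p in enumerate(dft_power(samples[start:start + frame_size])):
            total += p
            if k * rate / frame_size >= cutoff_hz:
                high += p
    return high / total if total else 0.0
```

The naive transform is far too slow for complete recordings; for real files an FFT library would be the practical choice. The sketch is only meant to show what is being measured when two plot spectra are compared.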
Converting malware samples to sound in order to compare them was an interesting and worthwhile learning experience. It also showed that things we found with traditional code analysis or visual analysis could be seen in audio analysis as well. Honestly, we did not expect this fun experiment to confirm what we observed with traditional code similarity and comparison methods. It was remarkable to discover that in the Conti case, where the traditional approach showed minimal changes, the sound conversion and frequency analysis demonstrated the same findings.
Wouldn’t it be nice if, instead of asking for ransom, the ransomware gangs started making music out of their code for us to enjoy on Spotify or Apple Music? We gave it a try and created the first version of Conti ransomware gang Techno, combining the generated audio of the Conti code with some tracks in GarageBand. Check it out and tell us what you think!
I’m curious if there’s more creative talent out there that can convert code to audio and make some nice music with it. Upload your mix and share it in our SoundCloud channel so we can add it to the playlist.