Features

To help malware researchers, SOC investigators, CERT analysts and others, Exalyze performs a number of automatic actions. The objective is to produce almost everything a reverse engineer would want to see at the start of an analysis. We also added the ability to pivot on this data and identify other potentially related samples. It is a “toolbox” that helps an analyst in many situations.

Malware analysis tools

Sequences Extraction

During fast analysis of malware samples, we often look for strings and external API usage, and how these are used in the sample functions or subfunctions.

The sequence extraction feature does this automatically by analyzing string cross-references and sensitive API usage.

The set of identified actions is then stored in the database and displayed for each sample.
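
As a rough illustration of the idea (not Exalyze's implementation), the sketch below groups string references and sensitive API calls by the function in which they occur, ordered by address. The input tuples, the SENSITIVE_APIS shortlist and the function names are all hypothetical; a real pipeline would obtain them from a disassembler.

```python
from collections import defaultdict

# Hypothetical shortlist of APIs treated as "sensitive" in this sketch.
SENSITIVE_APIS = {"CreateFileW", "WriteFile", "InternetOpenUrlW", "CreateProcessW"}


def extract_sequences(references):
    """Group references by function, ordered by address.

    `references` is an iterable of (function, address, kind, value) tuples,
    where kind is "string" or "api". Strings are always kept; API calls are
    kept only when they are on the sensitive shortlist.
    """
    sequences = defaultdict(list)
    for function, address, kind, value in sorted(references, key=lambda r: (r[0], r[1])):
        if kind == "api" and value not in SENSITIVE_APIS:
            continue  # skip API calls that are not considered sensitive
        sequences[function].append((kind, value))
    return dict(sequences)


# Made-up cross references, in the shape a disassembler might report them.
refs = [
    ("sub_401000", 0x401020, "api", "InternetOpenUrlW"),
    ("sub_401000", 0x401010, "string", "http://example.invalid/payload.bin"),
    ("sub_401000", 0x401040, "api", "CreateFileW"),
    ("sub_401000", 0x401030, "string", "%TEMP%\\update.exe"),
    ("sub_401000", 0x401050, "api", "WriteFile"),
    ("sub_401000", 0x401058, "api", "GetTickCount"),
    ("sub_401000", 0x401060, "api", "CreateProcessW"),
]

for function, steps in extract_sequences(refs).items():
    print(function)
    for kind, value in steps:
        print(f"  {kind:6} {value}")
```

Read top to bottom, the resulting sequence (URL, file path, file write, process creation) already suggests a downloader, which is the kind of quick reading the sequences view aims for.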

This type of quick analysis can lead to a fast (albeit partial) understanding of the malware's capabilities, which saves a lot of analysis time during reverse engineering!

In the example below, we can see that the sequence analysis helps to quickly identify the sample’s capabilities.

Example of sequence analysis — Can you guess what this malware is doing just by looking at the sequences view?

Capabilities Extraction

When manually analyzing samples, we often proceed to identify their capabilities by recognizing global features of the binary.

The analysis is mostly based on personal expertise and subjective assessments. For example, when we see a call to CreateProcessW, we assess: “This sample probably creates processes!”

Such individual calls, functions or pattern clusters may indicate distinct sample capabilities, and may be missed during manual analysis.

Upon sample submission, we conduct thorough functional analysis to identify specific capabilities, then we map these capabilities to their corresponding Tactics, Techniques, and Procedures (TTPs).

A capability summary is then compiled and included in the analysis report.

This executive overview provides valuable intelligence to analysts without reverse engineering expertise, enabling them to quickly understand the malware’s potential functionality.
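
To make the mapping step concrete, here is a minimal sketch that derives capabilities from a sample's imported APIs and attaches a MITRE ATT&CK technique ID to each. The rule table is hypothetical and deliberately tiny; Exalyze's actual rules and mappings are not described here and are certainly richer than a list of imports.

```python
# Hypothetical rules: a capability is reported when ALL listed APIs are imported.
# The technique IDs are real ATT&CK identifiers, but the mapping is illustrative.
CAPABILITY_RULES = [
    {"name": "Creates processes",
     "apis": {"CreateProcessW"}, "attack": "T1106"},
    {"name": "Injects code into another process",
     "apis": {"OpenProcess", "WriteProcessMemory", "CreateRemoteThread"}, "attack": "T1055"},
    {"name": "Modifies the registry",
     "apis": {"RegSetValueExW"}, "attack": "T1112"},
    {"name": "Downloads files",
     "apis": {"URLDownloadToFileW"}, "attack": "T1105"},
]


def summarize_capabilities(imports):
    """Return (capability, ATT&CK technique ID) pairs whose APIs are all imported."""
    found = set(imports)
    return [(rule["name"], rule["attack"])
            for rule in CAPABILITY_RULES
            if rule["apis"] <= found]


# Made-up import list for a sample.
sample_imports = ["CreateProcessW", "OpenProcess", "WriteProcessMemory",
                  "CreateRemoteThread", "GetTickCount"]

for capability, technique in summarize_capabilities(sample_imports):
    print(f"{technique:6} {capability}")
```

Requiring every API in a rule (set inclusion) rather than any single one reduces false positives for multi-step behaviors such as process injection.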

Note

For some of these capabilities, we also generate the corresponding MITRE ATT&CK TTPs and summarize them in a matrix such as the one shown below:

Entropy Map

By using a color-coded visualization of a malware sample, analysts can quickly recognize its structure. When this visualization is used as a preliminary triage tool, it significantly accelerates the initial assessment phase of malware analysis by highlighting anomalies that warrant deeper investigation.

Note

For each file, we generate an associated entropy bitmap based on the entropy of the sample, split into 256-byte chunks.
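
The per-chunk computation can be sketched as follows, assuming the usual Shannon entropy in bits per byte. The 256-byte chunk size comes from the note above; the HIGH_ENTROPY threshold and the sample bytes are assumptions made for illustration only.

```python
import math
from collections import Counter

CHUNK_SIZE = 256    # chunk size stated in the note above
HIGH_ENTROPY = 7.0  # assumed threshold for "probably packed or encrypted"


def shannon_entropy(chunk: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not chunk:
        return 0.0
    total = len(chunk)
    return sum((n / total) * math.log2(total / n) for n in Counter(chunk).values())


def entropy_map(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Return one entropy value per chunk of the input."""
    return [shannon_entropy(data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)]


# Synthetic data: padding, then two-letter text, then every byte value once.
data = b"\x00" * 256 + (b"A" * 128 + b"B" * 128) + bytes(range(256))

for index, value in enumerate(entropy_map(data)):
    flag = "high" if value >= HIGH_ENTROPY else ""
    print(f"chunk {index}: {value:.2f} {flag}")
```

Each value would then be mapped to a color to build the bitmap; long runs of chunks near 8 bits per byte are what make packed or encrypted regions stand out.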

The entropy map is represented with the following colors:

For example, an analyst can quickly identify packed or encrypted samples by spotting large sections of the binary with high entropy (colored in red). The two following figures show such an example by highlighting the differences between a sample and its packed version:

Yara Rule Matching

To help malware analysts characterize samples, Exalyze evaluates multiple sets of Yara rules and displays which rules matched the sample being analyzed.

The core ruleset from Yara Forge, which is a highly curated set of open source rules from various providers like Malpedia, ESET, Avast…
A public ruleset from Exatrack
Subscribed users can also access a set of advanced Yara rules from Exatrack

Matched Yara rules are available directly from the sample report view.

Yara Generation

Exalyze’s YARA generation works in four steps:

Identification of interesting strings to process.
Exclusion of strings already seen in a database of precomputed trusted binaries.
Disassembly of the sample and extraction of interesting parts of bytecode.
Exclusion of patterns already seen in a database of precomputed trusted binaries.

More than a hundred thousand samples populate the trusted database, but we can still miss some patterns, so we added a match check that detects when binaries share more than 60% of their patterns.

This process ensures that Yara generation is fast, and in our tests it works mostly fine :D
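
The string-based half of this process (the first two steps) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not Exalyze's generator: the TRUSTED_STRINGS set stands in for the database of precomputed trusted binaries, the sample bytes and rule name are made up, and the disassembly-based steps are left out entirely.

```python
import re

# Stand-in for the trusted database: strings already seen in benign binaries.
TRUSTED_STRINGS = {"This program cannot be run in DOS mode.", "KERNEL32.dll"}

# Printable ASCII runs of at least 6 characters (the minimum length is an assumption).
PRINTABLE = re.compile(rb"[\x20-\x7e]{6,}")


def extract_strings(data: bytes):
    """Step 1: collect candidate strings from the raw sample."""
    return [match.group().decode("ascii") for match in PRINTABLE.finditer(data)]


def filter_untrusted(strings, trusted):
    """Step 2: drop strings already seen in trusted binaries, keeping order."""
    kept = []
    for s in strings:
        if s not in trusted and s not in kept:
            kept.append(s)
    return kept


def render_rule(name, strings):
    """Render the remaining strings as a YARA rule."""
    if not strings:
        raise ValueError("no distinctive strings left to build a rule from")
    lines = [f"rule {name}", "{", "    strings:"]
    for i, s in enumerate(strings):
        escaped = s.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'        $s{i} = "{escaped}"')
    lines += ["    condition:", "        all of them", "}"]
    return "\n".join(lines)


# Made-up sample content.
sample = (b"MZ\x90\x00This program cannot be run in DOS mode.\x00\x00KERNEL32.dll\x00"
          b"evil-c2.example.invalid\x00\x01\x02Global\\MyMutex_1337\x00")

kept = filter_untrusted(extract_strings(sample), TRUSTED_STRINGS)
print(render_rule("generated_sample", kept))
```

Filtering against known-good content before writing the rule is what keeps generic strings, such as the DOS stub message, out of the generated signature.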

Similarity Analysis

Code similarity is a unique capability of Exalyze: it performs a comprehensive comparison between a sample and ALL other executables already analyzed.

Our similarity analysis engine is based on the Machoc hash which we published in 2016.

This methodology requires a full disassembly of each sample, followed by the generation of a Control Flow Graph (CFG) for each function. These CFGs are then hashed using MurmurHash, which creates a unique signature for each sample.

The code profiling approach is complementary to traditional hash searches such as imphash or richhash, because it enables analysts to find evolutions of malware families even when all other types of hashes don’t match.
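
The sketch below illustrates the general idea of hashing a function's CFG. It makes several assumptions: the CFG is given as a dictionary of basic-block addresses to successor addresses, the text encoding of the graph is a simplified one chosen for this example (not the official Machoc format), the 32-bit variant of MurmurHash3 is used, and the per-function hashes are joined with colons.

```python
def _rotl32(x: int, r: int) -> int:
    return ((x << r) | (x >> (32 - r))) & 0xFFFFFFFF


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """Pure-Python MurmurHash3 (x86, 32-bit variant)."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & 0xFFFFFFFF
    full = len(data) - len(data) % 4
    for i in range(0, full, 4):
        k = int.from_bytes(data[i:i + 4], "little")
        k = _rotl32((k * c1) & 0xFFFFFFFF, 15)
        k = (k * c2) & 0xFFFFFFFF
        h = _rotl32(h ^ k, 13)
        h = (h * 5 + 0xE6546B64) & 0xFFFFFFFF
    tail = data[full:]
    if tail:  # 1 to 3 trailing bytes
        k = int.from_bytes(tail, "little")
        k = _rotl32((k * c1) & 0xFFFFFFFF, 15)
        k = (k * c2) & 0xFFFFFFFF
        h ^= k
    h ^= len(data)
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & 0xFFFFFFFF
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & 0xFFFFFFFF
    h ^= h >> 16
    return h


def encode_cfg(cfg):
    """Encode {block_address: [successor addresses]} with address-independent labels.

    Blocks are renumbered 1..n in address order, so the same graph shape at a
    different load address produces the same text.
    """
    order = {addr: i + 1 for i, addr in enumerate(sorted(cfg))}
    return "".join(
        f"{order[addr]}:{','.join(str(order[s]) for s in sorted(cfg[addr]))};"
        for addr in sorted(cfg)
    )


def function_hash(cfg) -> int:
    return murmur3_32(encode_cfg(cfg).encode("ascii"))


def sample_signature(functions) -> str:
    """Join the per-function hashes into one signature string."""
    return ":".join(f"{function_hash(cfg):08x}" for cfg in functions)


# Two made-up functions; f1_relocated has the same shape as f1 at other addresses.
f1 = {0x1000: [0x1010, 0x1020], 0x1010: [0x1020], 0x1020: []}
f1_relocated = {0x5000: [0x5008, 0x5030], 0x5008: [0x5030], 0x5030: []}
f2 = {0x2000: [0x2000]}

print(encode_cfg(f1))
print(sample_signature([f1, f2]))
```

Because only the shape of the graph is hashed, recompiling or relocating a function without changing its control flow leaves its hash unchanged, which is what lets this kind of signature survive changes that break file-level hashes.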

For example, considering the code evolution below, we can see that most of the code is the same, but a few functions were added. Using our similarity analysis matching we can quickly identify this kind of match between a sample and thousands of others.

We gave a presentation about our malware similarity analysis algorithms at BotConf 2025!

Note

We establish a similarity threshold of 75% code correspondence to classify executables as related.
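
As an illustration of how such a threshold could be applied, the sketch below compares two samples by the share of function hashes they have in common. The metric used here (Jaccard index over sets of per-function hashes) is an assumption made for this example and may differ from the one Exalyze actually computes; the hash values are made up.

```python
SIMILARITY_THRESHOLD = 0.75  # threshold stated in the note above


def similarity(hashes_a, hashes_b) -> float:
    """Jaccard index between two collections of per-function hashes (assumed metric)."""
    a, b = set(hashes_a), set(hashes_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def are_related(hashes_a, hashes_b) -> bool:
    return similarity(hashes_a, hashes_b) >= SIMILARITY_THRESHOLD


# Made-up function hashes: the newer variant keeps all 8 functions and adds 2.
old_variant = [0x1A2B3C01, 0x1A2B3C02, 0x1A2B3C03, 0x1A2B3C04,
               0x1A2B3C05, 0x1A2B3C06, 0x1A2B3C07, 0x1A2B3C08]
new_variant = old_variant + [0x9F000001, 0x9F000002]
unrelated = [0x77000001, 0x77000002, 0x1A2B3C01]

print(similarity(old_variant, new_variant), are_related(old_variant, new_variant))
print(similarity(old_variant, unrelated), are_related(old_variant, unrelated))
```

In this example the two variants share 8 of 10 distinct function hashes, so they score 0.8 and are classified as related.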

Finding Similar Samples

When hunting for threats, we are often looking for variants of known malware families.

This unique capability is very useful for both threat hunters and SOC/CERT analysts, because if “the funny sample” you found is highly similar to known malware, it probably isn’t good news.

Sample Metadata Search

Exalyze lets you search for samples based on their extracted metadata, including:

Sample hashes
PE metadata (original file name)
Network Identifiers (extracted IPs, URLs, Domains)
Matched Yara rule (subscribed users only)

Exalyze offers a flexible query language that lets you combine multiple criteria to further refine a search. This allows threat hunters to easily pivot and extend their investigation.

For example, here is the list of all the samples that have the substring core in their original filename and refer to the IP address 4.0.0.0:
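
The logic of such a combined search can be sketched in plain Python. This is not Exalyze's query syntax, which is not reproduced here; the SampleMetadata record, the search function and the sample data are hypothetical, and the case-insensitive filename match is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class SampleMetadata:
    """Hypothetical record holding the metadata extracted from one sample."""
    sample_id: str
    original_filename: str
    ips: set = field(default_factory=set)


def search(samples, filename_contains=None, refers_to_ip=None):
    """Return samples matching every criterion that was provided (logical AND)."""
    results = []
    for sample in samples:
        if (filename_contains is not None
                and filename_contains.lower() not in sample.original_filename.lower()):
            continue
        if refers_to_ip is not None and refers_to_ip not in sample.ips:
            continue
        results.append(sample)
    return results


# Made-up index of analyzed samples.
index = [
    SampleMetadata("sample-1", "CoreService.exe", {"4.0.0.0", "10.1.2.3"}),
    SampleMetadata("sample-2", "corehelper.dll", {"192.0.2.10"}),
    SampleMetadata("sample-3", "updater.exe", {"4.0.0.0"}),
]

for hit in search(index, filename_contains="core", refers_to_ip="4.0.0.0"):
    print(hit.sample_id, hit.original_filename)
```

Each additional criterion narrows the result set, and any value found in a hit (a filename, an IP address) can become the starting point of the next search.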

Sensitive Sample User Matching

When facing an unknown threat, it helps to be able to get in touch with someone who has already dealt with a similar situation. Exalyze makes it easy for you to find someone who has already faced a “similar” sample (in terms of similarity analysis).

If you upload a sample with the sensitive confidentiality level and it matches one or more sensitive samples already present on the platform, Exalyze creates a match between you and the user who uploaded the matching sample.

This match grants each user access to the other’s contact information.

You can view all the matches concerning your profile by clicking on My user matches.
By contact information, we mean an additional contact email that you can choose whether or not to provide when registering or editing your user profile.
Providing a contact email is required to upload a sensitive sample.
For this contact email, we recommend using an email address that reasonably anonymizes you.
If you want to opt out of this feature, you can remove this contact email from your user profile at any time, and we will hide any matches concerning your profile.
The sensitive level of confidentiality offers the same guarantees as confidential (i.e. not visible to anyone except yourself or any member of your group, if you have one).