iDeath of eVoldemort

August 13, 2018

Fairy tales and fantasy stories have long dispelled the myth of the invincibility of global storybook power brokers and villains (as for us, for more than 20 years we’ve been busting the very same myth in cyberspace). Every Voldemort relies on the security of his diary, his ring, his snake, his… well, I guess you know all about the Horcruxes. And the success of your war on villainy, whether fairytale or virtual, depends on two key qualities: perseverance and intellect (meaning technology). Today I will tell you how perseverance and intellect, plus neural networks, machine learning, cloud security and expert knowledge, all built into our products, will keep you protected against potential future cyberthreats.

In fact, we have covered the technologies for protection against future cyberthreats before (more than once, a lot more than once, and even for laughs). Why are we so obsessed with them, you may wonder.

It’s because these technologies are exactly what makes robust protection different from fake artificial intelligence and products that use stolen information to detect malware. Identifying the code sequence using a known signature after the malware has already sneaked into the system and played its dirty tricks on the user? No one needs that. “A poultice on a wooden leg,” so to speak.

But anticipating cybervillains’ patterns of thought, apprehending the vulnerabilities they’ll find attractive, and spreading invisible nets capable of automatic, on-the-spot detection: only a few industry players are capable of that, sad but true. In fact, very few, according to independent tests. WannaCry, the decade’s largest epidemic, is a case in point: thanks to System Watcher technology, our products proactively protected our users against this cyberattack.

The key point is: One cannot have too much future cyberthreat protection. There is no emulator or big-data expert analysis system able to cover all of the likely threat vectors. Invisible nets should cover every level and channel as much as they can, keeping track of all objects’ activities on the system, to make sure they have no chance ever to cause trouble, while maintaining minimum use of resources, zero “false positives,” and one hundred percent compatibility with other applications to avoid blue screens of death.

The malware industry keeps developing, too. Cybervillains have taught (and continue to teach) their creations to effectively conceal themselves in the system: to change their structure and behavior, to turn to “unhurried” action modes (minimize the use of computing resources, wake up on schedule, lie low right after penetrating the target computer, etc.), to dive deep into the system, to cover up their traces, to use “clean” or “near-clean” methods. But where there is a Voldemort, there are also Horcruxes one can destroy to end his malicious being. The question is how to find them.

A few years ago, our products beefed up their arsenal of proactive technologies for protection against advanced cyberthreats by adopting an interesting invention (patent RU2654151). It employs a trainable model of object behavior to identify suspicious anomalies in the system with high accuracy, localize their source, and suppress even the most “prudent” of worms.

How does it work?

When active, any object leaves traces on a computer. Use of HDD, use of memory, access to system resources, transfer of files over the network — one way or another, every piece of malware will eventually manifest itself, even the most sophisticated; traces cannot be removed completely. Even attempts to mop up traces generate more traces, and so forth, recursively.

How do we make out which traces belong to legitimate applications and which ones to malware? While making sure the computer does not go under for lack of computing power? Here is how.

The antivirus product collects info about the apps’ activities (executed commands, their parameters, access to critical system resources, etc.) and uses that info to build behavioral models, detect anomalies, and calculate the maliciousness factor. But I want you to take a good look at the method used to achieve this. Remember, we are after speed of operation, not reliability alone. This is where math, or more specifically a mathematical digest, comes into play.

The resulting behavioral model gets packed very small — preserving the needed depth of object behavior information on one hand, and on the other, not requiring significant system resources. Even an onlooker closely monitoring computer performance will not register any sign of this technology.

Example (illustrative):

The maliciousness factor calculation relies on four external attributes:

  • Object type (executable/nonexecutable);
  • Size (over/under 100kB);
  • Source (downloaded from the Internet or unpacked from an archive on a USB flash drive);
  • Spread (more/fewer than 1,000 installations based on KSN statistics).

And four behavior attributes:

  • Whether the object transfers data over the network;
  • Whether the object reads data from the HDD;
  • Whether the object adds data to the registry;
  • Whether the object interacts with the user using a window interface.

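Each attribute is encoded as a single bit: “no” (0) or “yes” (1). For the either/or attributes, the first option listed counts as “yes”: an executable, a size over 100kB, an Internet download, and more than 1,000 installations each give 1.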

That said, the file app.exe, size 21kB, extracted from otherstuff.zip, detected on 2,113 computers, not reading HDD data, transferring data over the network, lacking a window interface, and adding data to the registry, will appear as:

1 0 0 1 1 0 1 0

If we present this as an 8-bit integer, we get 0b10011010 = 154. This is what we call a digest. But unlike classic hashing (e.g., MD5 or SHA-1), our digest technology is much smarter. In real life, thousands of object attributes are registered, each one resulting in multiple digests used by a trainable model to identify patterns. This produces as accurate a behavior pattern as we can get. And very quickly, too.
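For the curious, the eight-attribute example above can be reproduced in a few lines of Python. This is only a sketch of the toy illustration, not of the product's actual digest: the `ObservedObject` class, its field names, and the `digest` function are hypothetical names made up here, and the bit order simply follows the order of the two lists above.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class ObservedObject:
    """Hypothetical container for the eight illustrative attributes."""
    is_executable: bool        # object type: executable (1) / nonexecutable (0)
    over_100kb: bool           # size: over (1) / under (0) 100kB
    from_internet: bool        # source: Internet (1) / archive on USB drive (0)
    over_1000_installs: bool   # spread: more (1) / fewer (0) than 1,000
    sends_network_data: bool
    reads_disk: bool
    writes_registry: bool
    has_window: bool


def digest(obj: ObservedObject) -> int:
    """Pack the attributes into one integer, first attribute as the top bit."""
    value = 0
    for bit in astuple(obj):
        value = (value << 1) | int(bit)
    return value


# app.exe from the example: 21kB, from otherstuff.zip, seen on 2,113 computers
app = ObservedObject(
    is_executable=True,
    over_100kb=False,
    from_internet=False,
    over_1000_installs=True,
    sends_network_data=True,
    reads_disk=False,
    writes_registry=True,
    has_window=False,
)

print(format(digest(app), "08b"), digest(app))  # prints "10011010 154"
```

Packing yes/no answers into the bits of an integer keeps each record tiny and cheap to compare, which is the point of the illustration.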

The maliciousness factor is an altogether separate story; both malware and legitimate apps may demonstrate perfectly identical behavior. For example, many applications add data to the system folder. How do we tell which ones do this as part of their legitimate duties and which with malicious intent?

First of all, the factor has a cumulative effect or, to be more precise, it grows monotonically: it can rise over time but never falls. Over time, that allows for the detection of very low-profile malware without any false positives, because short bursts of suspicious activity (such as system registry modification, which occurs every time an application is installed) will not set off the antivirus. The resulting digest is fed through a “black box” trained neural network, which delivers a verdict on whether the object’s behavior is malicious or not.
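Here is a rough Python sketch of the cumulative idea only. Everything in it is an assumption for illustration: the `CumulativeFactor` class, the integer suspicion points, and the threshold of 100 are invented, and a plain running sum stands in for the trained neural network that produces the real verdicts.

```python
class CumulativeFactor:
    """Hypothetical per-object score that can grow but never shrink."""

    def __init__(self, threshold: int = 100) -> None:
        self.threshold = threshold  # assumed cut-off, chosen for illustration
        self.value = 0

    def update(self, suspicion: int) -> bool:
        """Add one observation's suspicion points and report the verdict."""
        # Negative inputs are clamped to zero so the factor never decreases.
        self.value += max(0, suspicion)
        return self.value >= self.threshold


# A legitimate installer: one short burst of registry writes.
installer = CumulativeFactor()
installer_flagged = installer.update(20)

# Low-profile malware: small suspicious steps, repeated over time.
stealthy = CumulativeFactor()
stealthy_flagged = False
for _ in range(7):
    stealthy_flagged = stealthy.update(15)

print(installer_flagged, stealthy_flagged)  # prints "False True"
```

The single burst stays well under the threshold, while the quiet but persistent object eventually crosses it, since its score has nowhere to go but up.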

And of course, the technology gains a lot from KSN — this cloud-based system enables the exchange of suspicious samples, their automated analysis, and refinement of the technology itself to improve the accuracy of its verdicts. The capabilities offered by KSN are used on an ongoing basis to tweak the neural network and have it trained by other algorithms and experts. This helps us to detect not only dangerous files but also networking sessions, components, and other nanoelements of the puzzle, which will ultimately lead us to eVoldemort.