One of the most important milestones on Kaspersky Lab’s way to becoming a renowned global player in the security industry was the release of the then-revolutionary Kaspersky Anti-Virus 6.0. Officially launched in 2006, the product was a roaring success on the global antivirus market, establishing Kaspersky as a technology leader for years to come. It would be immodest to call our product the best antivirus solution in the world, but a number of magazines and independent benchmarks did it for us.
The path to success was anything but smooth and straight – and hopefully someday some Hollywood writer will take it and run with it. For now, we will try to share our success story through photos, notes, and memories from the original team of developers. We hope this story will serve as a guiding example to young developers making new apps and services today – provided they are as possessed by the idea of becoming the best as the original creators of the “Six” were.
The success of the “Six” was rooted in the downright catastrophe of the previous version. In fact, this fifth version never saw the light of day as it was originally conceived.
To understand the essence of the catastrophe, one must travel back to 2002: Windows XP had barely hit the shelves; CPUs were able to finally scratch 1 GHz of clock rate; and the relatively young antivirus industry was yet to encounter a totally new variety of threats. All antivirus companies eagerly extended the capabilities of their products: at the time a competitive solution had to include a firewall, a constantly running file system monitor, and dozens of other features.
With a powerful scanner engine built back in the 1990s, the Kaspersky dev team admitted that packing the solution with even more new features would make it unbearably slow – even the then-existing V4.0 was readily condemned by users (allegations that ‘Kaspersky is slow’ were an integral part of PC folklore back then). That was why the development of the new V5.0 was approached with great care and attention to the core business essentials: a new CTO was appointed, a new development framework was employed, and a new antivirus architecture was chosen.
The company committed its entire pool of resources to the project. Still, within a year it became clear that following all these new development rules would not necessarily guarantee a competitive product. The resulting system, which imitated enterprise-class client-server apps (the architecture chosen by the CTO), was not able to meet the requirements the market imposed on antivirus products. It was slow and heavy, and the number of bugs was not decreasing as the team ran the tests. In fact, the number of bugs was increasing.
“I started to ask people, our company veterans, what they thought. They said it was all about the architecture. It was like a house of cards: fix one card, and you bring down the entire stack,” Eugene Kaspersky admitted. So it did not make any sense to continue with the project as it was. It had to be entirely demolished and built from scratch.
We can do it!
The Kaspersky Lab dev team split into two groups: one struggling to fix the product despite its unwisely chosen architecture, and another working to turn the previous V4.0 into a suitable new product.
At the same time, a group of four decided to make a completely new product that would not only comply with market requirements but be future-proof as well. The goal, which was set by the “Six” dev team, was easy to explain but hard to achieve. The new version had to prevent all viruses and threats from leaking into the system, it had to be fast, agile, and transparent, and… well, it had to be nice-looking.
“We just wanted to make the best product ever”, the “Six” developer team reminisces. It was a very small team facing a monumental task, or so it seemed to the other 200 staff members. Yet the small team on this challenging mission had reasons to be optimistic: the company founders, Eugene Kaspersky and Alexey De-Monderik, were at that time seeking alternatives for a new architecture – and they were to discover that the alternative already existed and had been invented by none other than the Kaspersky team.
Help coming from Prague
It must be said that two antivirus cores (so-called ‘engines’) were operating side by side under the hood of V4.0. The checking of files was done by the old, very capable V3.0 engine developed back in 1996 (and widely licensed by international companies, from G-Data to F-Secure). The newer task of filtering web traffic was handled by a brand new and powerful mechanism conceived in the course of a brainstorming session in Prague in 1998.
The engine ended up being dubbed ‘Prague,’ despite the fact that it was developed in Moscow by Andrey Doukhvalov, who had not attended the brainstorming session in the capital of the Czech Republic. The key ideas, however, originated in Prague, and Andrey joined the company to tweak the ideas and implement the Prague concept.
Prague was intended to become a purely antivirus core, but the goals set for it were so ambitious that the newly developed engine’s flexibility and ingenuity were enough to power even more complicated systems. The question of whether the entire product could be built on Prague weighed heavily on Kaspersky as he spoke to the developers. He recalls:
“Once I asked Victor Matyushenko how Prague was coping inside of the product, and he said it was ‘Solid as a rock!’. That was the turning point. Then I just formulated the question. I walked into the room where Graf [De-Monderik] and Petrovich [Doukhvalov] were working and asked them the question: ‘Why don’t we base the product entirely on Prague?’ Graf said something along the lines of ‘impossible-prague-is-not-intended-for-this,’ but Petrovich hesitated. The next morning he came up to our office with a little stack of paper, and he told me, ‘You know, I coded some use cases on Prague.’ Graf looked up at him and said, ‘We have to have a little chat.’ Then – after chatting – they joined me again and confirmed it was worth trying.”
The trial started with a very small team, who wrote the first lines of code that were later to become the “Six.”
“We started to look around in order to find people who could get creative and contribute something, and we gathered an extended team,” De-Monderik remembers. “Take Pavel Mezhuev, a coder: he was a newbie, yet bright. There was Mike Pavlyuschik, with whom we had long been working. He was capable of generating out-of-the-box ideas and concepts. I thought he was one of the most talented and industrious creators on the playground.”
After two months of open floor, experimental coding, we reached the decision that the project would become a commercial solution. Now all we needed was a project manager.
“Remember Nikolay Grebennikov from the office next door? He reads quite a lot, he is young and he is new to the company. Let’s take him!” Andrey Doukhvalov says, remembering a conversation he had with De-Monderik. Andrey Sobko, a driver software engineer, would join the party a bit later.
What is ‘Prague’
This section is of more interest to software engineers. The rest of our audience can freely skip it.
Even in the early 1990s, when the antivirus industry was just emerging, there were viruses undetectable by regular signatures. For example, a polymorphic virus, which encrypts its code differently with each infection, cannot be detected with the signature approach. As software became more complex, Internet penetration skyrocketed, and malware coders turned from having fun to offering their services on the market, malware became an increasingly sophisticated and diverse threat. Even with an engine that had capabilities beyond signature-based detection, as was the case at Kaspersky, the developers were forced to update the antivirus itself, not just the signature databases, whenever they encountered malware built on new principles. This significantly slowed down the reaction time to new viruses, and the success Kaspersky Lab enjoyed after becoming the first in the industry to cure the infamous CIH (Chernobyl) virus proved that reducing the reaction time was well worth every effort.
With that in mind, in 1998 Kaspersky suggested to colleagues that the time had come to go away and work out a new antivirus engine. Where could they go?
“The company was short of money,” Kaspersky said, “and we had to find the cheapest place near Moscow to leave the city and mull the problem over far from the noise and rush. The place had to be entirely disconnected. There wasn’t any Wi-Fi around at that time. The cheapest place turned out to be a European capital, namely, Prague.”
By brainstorming the new antivirus engine version, the Kaspersky Lab team reached the conclusion that an object-oriented approach was the best solution for the engine layer, that is to say, each analyzed file or object had to be dissected based on its structure, and the objects inside had to be detected, analyzed, and checked. The object management, in its entirety, had to be executed at run-time.
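The idea of dissecting an object at run-time and checking the objects nested inside it can be illustrated with a short Python sketch. This is not Prague's code or API: the `ScanObject` class, the `dissect` and `check` functions, the `make_zip` helper, and the byte marker are all hypothetical names invented for this illustration, and ZIP archives stand in for any container format.

```python
import io
import zipfile
from dataclasses import dataclass, field

# Hypothetical byte pattern standing in for a real detection rule.
MARKER = b"FAKE-MALWARE-MARKER"


@dataclass
class ScanObject:
    """One analyzed object: a file, or something found inside a file."""
    name: str
    data: bytes
    children: list["ScanObject"] = field(default_factory=list)


def dissect(obj: ScanObject) -> ScanObject:
    """Build the object tree at run-time: if obj is a ZIP archive,
    every member becomes a child object and is dissected in turn."""
    buffer = io.BytesIO(obj.data)
    if zipfile.is_zipfile(buffer):
        with zipfile.ZipFile(buffer) as archive:
            for info in archive.infolist():
                if info.is_dir():
                    continue
                child = ScanObject(f"{obj.name}/{info.filename}", archive.read(info))
                obj.children.append(dissect(child))
    return obj


def check(obj: ScanObject) -> list[str]:
    """Return the names of leaf objects that contain the marker.
    Containers are checked only through their members."""
    if not obj.children:
        return [obj.name] if MARKER in obj.data else []
    hits: list[str] = []
    for child in obj.children:
        hits.extend(check(child))
    return hits


def make_zip(files: dict[str, bytes]) -> bytes:
    """Helper for the demo: build a ZIP archive in memory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    return buffer.getvalue()


# Demo: a marked file hidden inside an archive inside another archive.
inner = make_zip({"payload.bin": b"xx" + MARKER})
outer = make_zip({"readme.txt": b"hello", "inner.zip": inner})
root = dissect(ScanObject("mail.zip", outer))
print(check(root))  # prints ['mail.zip/inner.zip/payload.bin']
```

The point of the sketch is only the shape of the approach: the tree of objects is not known in advance but is discovered while the file is being analyzed, and each discovered object goes through the same checks as the file that contained it.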
All existing object environments were discussed and rejected as inflexible, memory-devouring, or slow. An idea emerged in the course of discussion: to develop our own environment, which would include memory management capabilities and other service procedures and would give the antivirus the ability to dissect and analyze potentially malicious code in a fast and efficient manner.
The overarching idea was born in Prague, conceived by De-Monderik and Andrey Kryukov, and was supported by the first lines of code provided by Doukhvalov and Kryukov.
Then, for over a year, it was mainly Doukhvalov who continued to elaborate on Prague – this was the reason Kaspersky Lab hired him, after all. With his expertise as an architect, Doukhvalov ensured Prague would be flexible, scalable, and easily deployable into the product without introducing any architectural limitations. Ultimately, the goal was to build a multiplatform solution.
The object hierarchy was a bit challenging to debug, but a convenient inter-object message exchange system and a minimalist programming interface made Prague an easily integrated architecture to use on-demand where applicable.
“It was intended for a component approach,” Doukhvalov proudly notes. “That means a component could be added to an existing program. The system was very open, with opportunities to add elements and change its behavior patterns.”
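As a rough illustration of what a component approach with inter-object messaging looks like in general, here is a minimal Python sketch. It does not reproduce Prague's actual interface: `MessageBus`, `FileScanner`, `Quarantine`, the topic name, and the byte marker are assumptions made up for this example.

```python
from collections import defaultdict
from typing import Callable

Handler = Callable[[dict], None]


class MessageBus:
    """Hypothetical message exchange: components only know topics, not each other."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._handlers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> int:
        """Deliver the message to every subscriber; return how many received it."""
        handlers = list(self._handlers.get(topic, []))
        for handler in handlers:
            handler(message)
        return len(handlers)


class FileScanner:
    """Existing component: scans data and announces the verdict on the bus."""

    def __init__(self, bus: MessageBus) -> None:
        self._bus = bus

    def scan(self, path: str, data: bytes) -> int:
        # The marker is a stand-in for real detection logic.
        verdict = "infected" if b"FAKE-MALWARE-MARKER" in data else "clean"
        return self._bus.publish("scan.finished", {"path": path, "verdict": verdict})


class Quarantine:
    """Component added later: reacts to verdicts without any change to FileScanner."""

    def __init__(self, bus: MessageBus) -> None:
        self.items: list[str] = []
        bus.subscribe("scan.finished", self._on_scan_finished)

    def _on_scan_finished(self, message: dict) -> None:
        if message["verdict"] == "infected":
            self.items.append(message["path"])


bus = MessageBus()
scanner = FileScanner(bus)
delivered_before = scanner.scan("a.txt", b"hello")  # no listeners yet

quarantine = Quarantine(bus)  # plug a new component into the running program
scanner.scan("b.exe", b"..FAKE-MALWARE-MARKER..")
scanner.scan("c.txt", b"harmless")
print(delivered_before, quarantine.items)
```

In this sketch the scanner never references the quarantine component directly, which is the property the quote describes: elements can be added, and the program's behavior changed, without rewriting what is already there.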
A component-based architecture, compact and not resource-intensive, per common agreement, was the basis to introduce a series of radically new technologies to KAV 6.0. They were easily implemented. Moreover, when Prague was tweaked to serve as the basis for the entire product, and not only an antivirus engine, Pavel Mezhuev significantly contributed to fine-tuning the architecture:
“We implemented one more architectural solution: a model separating the business logic from the interface. Also, Doukhvalov and Mezhuev created a task manager which was able to control any single process within the product, and the interaction between processes was very simple”, states Nikolay Grebennikov, who was the project manager for KAV 6.0.
The principle of “Six”
Considering that the trial and the initial development stage were both carried out by a small group, it became clear that heavyweight project management approaches would not work for the team. As a result, an approach similar to SCRUM was implemented: the developers would be seated in an open space, continuously interacting – thus they readily covered all aspects of the development process. That was the way the “Six” development team worked.
Short profile: SCRUM
SCRUM is a project management approach for agile software development. It is based on the principle that the customer (user) does not necessarily know exactly what they need and may change the requirements in the course of the process. That means the development process consists of numerous consecutive cycles: building – demonstrating – analyzing feedback – updating the version.
But the SCRUM distribution of roles was significantly reviewed. Kaspersky has defined six roles:
The architect is a person – actively involved in the coding process – who knows what to build and how to build it.
There is no straightforward definition for this role, but the person is responsible for ensuring that certain solutions are brought to life. Perhaps more importantly, the technical designer must know how NOT to do things.
The inventor applies unconventional solutions to solve problems. In the case of “Six”, the problems were countless. The solution had to ultimately provide the highest level of protection yet consume the least amount of computing resources.
The role of project manager in SCRUM is not a strictly regulatory one. He controls the resource pool and deadlines, but he is not an immediate leader. He does not command the coders on what to do, but motivates them to take the lead themselves.
“The team was small; we did not even have a chief person at first,” Doukhvalov says. “The manager was planning and reporting, but the decisions we were taking on the project itself were collective.”
The product is created for clients, not for the dev team itself. It is vital to understand the expectations prospective users judge the product against, and how they are going to use it. While the operational principles are defined by those who understand the nature of the antivirus functionality, a thousand little things, such as settings, messaging, or UI, have to take user requirements into consideration.
Working under pressure, lack of sleep, in-group conflicts, instability… Someone has to ensure the environment in the room is friendly and productive. This role was assigned to Kaspersky himself, so he predictably combined it with the role of a sponsor, who provides funds and resources to the team and protects it from outside influence.
Frankly, there is one more role which is of vital importance for SCRUM projects: the scribe, who takes notes on the process. But this position was not occupied by anyone, and that created problems.
“We did not know why we did this or made that decision only half a year ago,” Kaspersky affirms.
According to the principle, the number of roles does not necessarily correspond to the number of team members. One role may be distributed amongst several people, while a single member can perform several roles.
“While preserving a formal organization, we acted as a single team, so there were blurred lines between roles: in particular, when brainstorming, people took over different roles”, Nikolay Grebennikov confesses. “Say, one person was coding, yet expressing his view of the design – and it counted. I myself was a Project manager but also participated in discussions – so those blurred lines really contributed to our success, as we cared about any single element of our project.”
According to De-Monderik, the coders were highly interchangeable: “Each member of the team was God of his domain, yet 50% of his skills were overlapping with someone else’s. Mike was able to code drivers if Sobko was not present, UI specialists could handle some engine-related tasks and vice versa. I could do designs instead of Max Yudanov, and Kolya Grebennikov at times could design skins as well”.
It is crucial to understand that each role takes the lead at a specific stage of the project. During the start phase, the architect is the central figure. The inventor comes into action during the middle stage of the development process, when the features are created and developed. And during the last stage, the key figure is the manager, as the project now has a lot of resources which require tight management for the team to be able to make the deadline.
In pursuit of the ideal
Given the ‘SCRUM’-ish approach and the overall ambitiousness of the project, “Six” did not have any static requirement list. According to the basic requirements, the product had to feature the following capabilities:
- Full-fledged support against current security threats;
- Optimized use of PC resources;
- Component-based infrastructure for better scalability;
- Easy adaptation of components to different platforms.
With these meta-requirements, the corresponding technical requirements for the product underwent a series of changes. As a result, the release was continuously postponed, but the team was able to develop a revolutionary solution that was a couple of years ahead of general market requirements and also superior to the previous version in terms of speed.
Following the launch of Kaspersky Anti-Virus 6.0, Maxim Yudanov, who was responsible for UI design, said: “One of the key differentiators of the project is the absence of a ‘set in stone’ requirement list. We made prototypes, discussed the product, updated the list of technical requirements and features, jotted the main points down on yellow Post-It notes and stuck them onto the monitors, forgot something, remembered something, started all over again, and asked for help from the audience (I mean the beta test community). I am sure the final product would not have been what it is now if we had intended to base the work on a traditional requirement list. If that were the case, we would have ended up creating the product as we imagined it at the start. I am sure that in that case, the product would have been much worse than the one we ended up with”.
Today, this approach is not novel. But ten years ago, applying this method to large-scale projects was revolutionary and unconventional. The major difference between so-called ‘extreme programming’ (the term widely used then; similar methods are now united under the ‘agile software development’ umbrella) and the bureaucratic CMM-style approach (which is now practically extinct) lies in the absence of a traditional requirement list which, once approved, serves as a Holy Bible and the sole basis of project work for years ahead. CMM might be the right approach for outsourced development projects, but for commercial projects it is useless.
Nikolay Grebennikov, now Kaspersky Lab’s CTO, agrees: “Had we first come up with a fixed set of features and allowed no changes along the way, we would not have had any vision of what exactly users needed and could not have hoped for such a degree of support from them. The first version of the build was not really usable and had a lot of problems. We spent a lot of time fixing the issues – five quarters between the alpha and the technical release. In today’s world, that is a luxury you cannot have, but at that time, it was a very useful experience”.
Kaspersky is quite straightforward about this: “When you develop innovations, get ready for continuous deadline violations.”
Life at work
The key members of the KAV 6.0 development team reminisce about that period of their lives with nostalgia. Despite the lack of sleep, of time spent with their families, and of free weekends, and despite great emotional stress, they saw the payoff in the progress and quality of the work.
One of the emails Nikolay Grebennikov wrote while the project was underway provides a rather poetic insight:
“At some point, it just stopped being a mere project. It was like engaging in some game of terrific scale and might, one you plunge into and live from the start to the final credits. On your way to work, you ride on the metro recalling the wins and fails of the previous game save, then you get to work, and then you think of how you can play it through to the next level. Having put your child to bed, you again live inside of the game where you are capable of all the things imagined, and everything is possible.”
“It was the time of my life, actually,” Kaspersky recalls. “Eyes shining with enthusiasm, Post-It notes, sleepless team members. It was a boiling soup of ideas and action.”
As the team expanded, the core of the dev group passed this team spirit on to the newcomers, as De-Monderik recalls:
“With all team members working well, we had to count on the ability of the ‘core group’ to spark enthusiasm. There was the ‘core group’; they had an overarching idea, they had the challenge: to make the best product ever. It was the key goal for us: Kolya [Grebennikov], Pavel Mezhuev, Doukhvalov, myself, Mike Pavlyuschik… we were able to pass our enthusiasm on to other team members. When everyone around you is working hard, and you are in the same room and see how it actually happens, you unknowingly try to commit as well”.
Even the project management was quite informal, and that bore fruit.
“If I remember well, at first we had status meetings,” De-Monderik said. “In the morning when the group used to gather in the room, Kolya used to deliver some kind of recap speech: we have such resources, we do such things today – he was so good at it. We had an enormous whiteboard where we wrote and drew our findings. As we were not a huge team, it was enough.”
Grebennikov acknowledges that the key takeaway from that experience was that the status meeting, as a formal method of organizing project work, is not the main thing. You should gather the team only if the meeting brings tangible benefits to the project.
Expanding the reach
As the project evolved, from September 2003 to March 2006, the team grew, and by the day of the release the group numbered almost 30 people. With more requirements and the transition to the ‘Alpha’ stage, the team came to include Maxim Yudanov, a designer, and software engineers Pavel Nechayev, Denis Guschin, Eugene Roschin, and Andrey Gerasimov. They brought a set of innovative features, including a ‘skin’-based UI and a built-in firewall. The group also included installation specialists and beta test supervisors. Yet one of the most decisive steps taken to modify and refine KAV 6.0 was an approach newly invented by the “Six” dev team: a forum-based beta test.
All the stakeholders agreed that it was thanks to the dedicated testers’ forum that Kaspersky Anti-Virus 6.0 turned out to be so well-designed and carefully tested. Open beta testing (which has since become common practice at Kaspersky Lab) was then a true innovation and a risk: competitors could have had an opportunity to learn about product features before the release.
“We took the matter seriously as a beta test effectively exposes the beta code to hackers and competitors,” Grebennikov said. “So people really took sides about it. The opponents had their say, and their basic arguments are listed above. Yet the supporters also presented a solid proof of their standpoint. Our resources were limited, with only two testers available while the rest of the testers were allocated to the V5.0 trials. Our product was created from scratch, and we needed to assign a massive group to run the tests. For the first time, we employed the regular build update approach, first on a weekly, then on a daily basis. Testing those builds on the forum allowed us to provide the highest quality of testing without involving significant internal resources.”
All the developers were actively engaged in the forum discussions with the testers.
“During the forum testing, the pool of testers included several thousand users, with some 500 people as the active core audience,” Nikolay Grebennikov adds. Every night Nikolay used to spend hours on the forum, at times falling asleep at the computer. The testers were eager to see each new build and used to run it every evening, at absolutely no cost to the company.
The forum residents provided both information on the bugs and suggestions on how to make the product better. A significant part of this collective feedback was taken into account, contributing to KAV 6.0’s value. Besides, the suggestions were not collected solely online. Kaspersky recalls that developers took regular strolls around the office, involving everyone, from sales managers to technical support personnel, in testing the beta version. Based on feedback from the staff, the product would undergo certain improvements: for instance, at the suggestion of the technical support team, switching the interface language to English was made possible with a single click.
Still, the improvements brought from the forum discussions came with a price – mainly translating to more deadline violations.
“Based on feedback from forum users, we received a huge list of suggested improvements, but at one moment I realized we could not add anything on top of what we already had, even if the proposal was compelling. We were under a stringent requirement to launch the release by Q2 2006. We made it just in the nick of time: at 6.30 pm on March 31”, Nikolay Grebennikov dramatically points out.
The bottom line is that our extended yet still quite small team developed a product with a phenomenally compact installer, a smooth user interface based on exchangeable skins, low impact on the PC’s performance, and, most importantly, a product literally packed with powerful and innovative features, including proactive protection capabilities to block suspicious applications based on their behavior patterns.
“The people at Symantec were floored when American magazines started to rate us as a gold-star product. We received top marks everywhere”, Eugene Kaspersky affirms with delight.
Thanks to established partner networks in Europe, the US, and China, the successful product immediately moved through the supply chain, and the best-selling rankings in online stores visibly went green.
The success of “Six” was due to a wisely selected architecture, which made it easy to deploy technical innovations and offered high performance, as well as to a development approach which was a 100% match for a small and enthusiastic team. That is, supposedly, the key takeaway from the work done to bring Kaspersky Anti-Virus 6.0 onto the market: to make your project a 100% success, both the architecture and the development techniques should fit the development team and the scale of the work.
Six roles and one coffee machine
“In the SCRUM playbook we passed around between us during that time I spotted an interesting rule, which I now apply in many spheres besides development”, Kaspersky states. “If something is getting in the way of the development process, it should be eliminated as a first priority. Period. It does not matter what it is. That also means, should a developer need anything, they should get it right away.”
“On day one of the project I asked: ‘What is your first priority requirement?’ ‘A coffee machine’, Petrovich replied. Next morning they were provided with an expensive, luxurious coffee machine. And we made it!”