Artificial intelligence safety, or When to expect SkyNet?

June 12, 2015

What do billionaire inventor Elon Musk, the Google Now on Tap service launched at Google I/O, and the recent “Ex Machina” premiere have in common? The idea that unites all three is artificial intelligence or, more precisely, the process of imposing limits on artificial intelligence (AI) so that it truly serves humanity and does not inflict any harm.

Should we be afraid of artificial intelligence?

What is artificial intelligence capable of today?

For those who do not follow the topic closely, here are several facts that demonstrate the progress machines have made in their ability to do very human things.

Today, Google can recognize speech with 92% accuracy versus 77% just two years ago; the company has also developed an AI platform that learned to play classic video games on its own. Microsoft taught a robot to recognize images (or, more precisely, to see certain objects in the images) with an error rate of just 4.94%; stunningly, an average human has a higher error rate.

Google’s statistics suggest that its driverless cars, which have by now covered more than 1,800,000 miles on public roads in California, were involved in only 13 accidents in six years, and in 8 of those cases the car behind was to blame.

All of this suggests that, even though developing a full-fledged AI in the short term is unlikely, something similar will inevitably emerge in the coming decades.

With that in mind, don’t think the impact from ‘smarter’ machines will be seen only in the virtual domain. Here is an extreme example: unmanned aerial vehicles, or drones, fly completely on their own, but the command to shoot targets is still given by humans. This is the way the U.S. prefers to fight against terrorists in Pakistan and other dangerous regions.

The possibility of automating this task as well is widely discussed. The Terminator franchise turned thirty last year, so you can vividly imagine the consequences of such a decision in the not too distant future.

I won’t dwell too much on the apocalyptic scenarios; rather, I will focus on some more down-to-earth questions. One of them is: Are programmers capable of creating a reliable ‘safety lock mechanism’ to prevent the AI from committing unethical or immoral acts?

Such acts might have various causes, the most obvious being a conflict over resources between humanity and the AI. However, there are other scenarios. I should add that the harm would not necessarily be intentional. There is a great example Stanislaw Lem cites in his wonderful work, Summa Technologiae. In essence, the idea is as follows:

“Suppose the prognostic block of the ‘black box’ (AI) detects a danger potentially able to impact the state of the humanity’s homeostatic balance… The said danger is provoked by the rate of population increase which substantially exceeds the civilization’s ability to satisfy the humanity’s basic needs.

Suppose one of the external channels of the ‘black box’ informs the system about a new chemical compound which is not harmful for one’s health and suppresses the ovulation.

Then the ‘black box’ decides to inject a microscopic dose of the compound into the potable water system across a certain country, but would encounter a dilemma: whether to inform the society and risk facing opposition or to keep the society unknowing and thus preserve the existing balance (for the greater good).”

As we see here, a very innocent optimization problem is solved with an elegant, simple and efficient, yet absolutely unethical solution: the intentional limitation of people’s fertility without their consent.
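To make the point concrete, here is a minimal Python sketch of how a naive optimizer could arrive at such a solution. Everything in it is hypothetical: the action names, costs and growth figures are invented purely for illustration and come neither from Lem nor from any real system. The only thing it shows is that the "cheapest action that meets the target" changes once consent is part of the problem statement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    cost: float          # hypothetical resource cost, arbitrary units
    growth_after: float  # hypothetical population growth rate after the action
    has_consent: bool    # whether the affected people agreed to the action


def choose(actions, max_growth, require_consent):
    """Return the cheapest action that meets the growth target, or None."""
    feasible = [
        a for a in actions
        if a.growth_after <= max_growth
        and (a.has_consent or not require_consent)
    ]
    return min(feasible, key=lambda a: a.cost, default=None)


# Invented candidate actions for the 'black box' to pick from.
ACTIONS = [
    Action("undisclosed additive in drinking water", 1.0, 0.5, False),
    Action("public family-planning programme", 8.0, 0.9, True),
    Action("do nothing", 0.0, 2.0, True),
]

# Objective only: the optimizer picks the cheap, covert option.
unconstrained = choose(ACTIONS, max_growth=1.0, require_consent=False)

# Objective plus an explicit consent constraint: the covert option is excluded.
constrained = choose(ACTIONS, max_growth=1.0, require_consent=True)

print(unconstrained.name)
print(constrained.name)
```

Note that the constraint only works here because someone thought to write it down in advance; the optimizer itself has no notion of ethics.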

I think it is exactly the infrastructure management domain that will be delegated to a powerful AI-based system, as the cost/benefit ratio in such spheres would be much more favorable than in the case of a fully robotic secretary or cook.

Teaching ethics to a robot: how to integrate the ‘safety lock’?

Any 20th-century youngster will immediately recall Isaac Asimov’s Three Laws of Robotics, but they are not enough. As the example above shows, it is not necessary to harm anyone to significantly cut down the population (and remember, it could be done for the greater good).

There are many other options potentially detrimental to humanity. It is possible to find a loophole in the terms defining the concept of ‘harm’, to delegate the very task of harming to people, or to undermine the existence of the rules themselves.

The ‘friendliness towards people’ itself may be reconsidered by a smart AI. Here is what Roman Yampolskiy, an AI expert, said on the issue in an interview:

“Worse yet, any truly intelligent system will treat its “be friendly” desire the same way very smart people deal with constraints placed in their minds by society. They basically see them as biases and learn to remove them… Why would a superintelligent machine not go through the same “mental cleaning” and treat its soft spot for humans as completely irrational?”

A technical conceptualization of the ‘safety lock’ is quite realistic. In essence, the ‘safety locks’, which are necessary to tame the AI, are none other than ‘sandboxes’, widely used for security in modern runtime environments like Java or Flash.

It is widely acknowledged that there is no ‘ideal’ sandbox and escape from the sandbox is quite possible, as the recent story with the Venom bug has shown. The AI which relies on flexibility and immense computing power is a good candidate for a security tester looking for vulnerabilities in its own sandbox deployment.
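As a rough illustration of the sandbox idea, here is a minimal Python sketch of a default-deny action gate. It is not how Java, Flash or any real sandbox is implemented; the class, the action names and the handlers are all hypothetical. The point is only the principle: the controlled program never touches the outside world directly, and every request passes through a policy check that is also logged.

```python
class PolicyViolation(Exception):
    """Raised when the sandboxed program requests a forbidden action."""


class Sandbox:
    """Hypothetical default-deny gate between a program and the outside world."""

    def __init__(self, allowed_actions):
        self._allowed = frozenset(allowed_actions)
        self.audit_log = []  # (action, permitted) pairs, in request order

    def execute(self, action, handler, *args):
        permitted = action in self._allowed
        self.audit_log.append((action, permitted))
        if not permitted:
            raise PolicyViolation(f"action {action!r} is not permitted")
        return handler(*args)


# Only reading and reporting are allowed; anything else is refused.
sandbox = Sandbox({"read_sensor", "report"})

reading = sandbox.execute("read_sensor", lambda: 21.5)

try:
    sandbox.execute("open_valve", lambda: "valve opened")
    blocked = False
except PolicyViolation:
    blocked = True

print(reading, blocked, sandbox.audit_log)
```

The weakness is visible even in a toy like this: the gate is only as good as the list of names and the code that enforces it, and a bug in either one is exactly the kind of vulnerability described above.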

Andrey Lavrentiev, the Head of Technology Research Department, sees the problem as follows:

“A full-fledged AI system will understand the meaning of everything it ‘sees’ through its numerous sensors. The limitation policies for its actions should be defined in accordance with the concept, or with the images shaped in the AI’s ‘brain’”.

“Today, machines are better at image recognition than humans, but they still lose to the humanity when it comes to manipulating those images or relations, i.e. the modern AI does not have ‘common sense’. As soon as this changes, and the machines pass this last outpost over and learn to manipulate perceived objects and actions, there won’t be an opportunity to integrate any ‘safety locks’ anymore.”

“Such a supreme intelligence would be able to analyze dependencies in perceived data much faster than a human would ever be, and then would find the way of bypassing rules and limitations imposed by a human and starting to act on its own accord.”

A meaningful limitation, designed to prevent an AI from doing something harmful, would be the effective isolation of the AI from the real world, depriving it of any chance to manipulate physical objects. However, with this approach, the practical use of the AI is close to zero. Curiously, even this isolation would achieve nothing, as the AI’s main weapon would be… us, people.

This possibility is depicted in the recent sci-fi thriller Ex Machina. Like any typical Hollywood product, the movie is stuffed with forced arguments and dramatically overstates the problem. Yet the core of the problem is defined surprisingly correctly.

First, even primitive robots are capable of influencing a person’s emotional state. The obsolete and easily programmed ELIZA chatbot was able to extract important personal information from its human interlocutors, armed only with empathy and polite questions.

Second, we increasingly rely on robotized algorithms to filter and categorize information. Someone managing these data flows, as proven by a recent controversial Facebook experiment, may influence people’s emotional environment and their decision making tendencies.

Even if we suppose that in the aforementioned example the AI governs a city or a country only indirectly and performs solely advisory functions, it is still capable of recommending a solution that would turn out to be unethical in the long run. The consequences of the decision, in this respect, are known to the AI but not to living people.

In private life this influence might emerge really fast and be even more impactful. At the recent Google I/O conference, the new Now on Tap system was presented. It watches over all apps on the user’s smartphone, extracts contextual data and makes it available for online search.

For instance, if you read an article on some musician in the Wikipedia app and ask Google “When is his concert?”, the robot would immediately know who exactly is referred to as ‘him’. A helpful robot already reminds us it is time to go to the airport, as the flight is scheduled in a couple of hours, proving to be a resourceful and savvy personal assistant.

Of course, it is not a full-fledged AI that takes care of these assistance tasks; it is merely a self-learning expert system designed to perform a narrow selection of tasks. Its behavior is fully pre-defined by people and thus predictable.

However, the computing evolution might make this simple robot a lot more sophisticated. It is critical to ensure it manipulates the available information solely for the user’s good and does not follow its own hidden agenda.

That is the problem occupying many of the bright minds of our time, from Stephen Hawking to Elon Musk. The latter can hardly be considered a conservative thinker afraid of, or opposed to, progress. Quite the contrary, the man behind Tesla and SpaceX is eagerly looking into the future. However, he sees the evolution of AI as one of the most controversial trends, with consequences still unforeseeable and potentially catastrophic. That is why earlier this year he invested $10 million in AI research.

With that said, what awaits us in the future?

As strange as it seems, one of the most feasible scenarios, which experts consider too optimistic, is the ultimate impossibility of creating a full-fledged AI. Without a significant technology breakthrough (which is yet nowhere to be seen), robots will just continue updating and improving their existing set of skills.

While they are learning simple things like driving a car or speaking natural languages, they are not able to substitute for a human in making autonomous decisions. In the short term, AI is likely to cause some ‘collateral damage’, such as eliminating taxi driving as an occupation, but it won’t be considered a global threat to humanity.

Andrey Lavrentiev suggests that a conflict between AI and humanity is possible under only one condition: the need to share the same resources.

“A human has a body and is interested in creating favorable conditions for its convenience (and the convenience of the mind as well). With the AI, the situation is quite the opposite: it initially exists only in the digital world”.

“The AI’s key objective and motivation is to fully process the information supplied through the external channels, or its ‘sensory organs’, assess it, identify the principles of its change”.

“Of course, the AI also relies on some material foundations, but its dependence on the ‘shell’ is much weaker than in the case of the human. The AI, contrary to the human, won’t be so concentrated on preserving its ‘shell’ (or ‘body’), as AI would be, in fact, ‘everywhere’. The organic extension of AI’s outreach in search of new information would be space exploration and studying of the Universe’s laws, so it can disseminate itself beyond Earth”.

“However, even in this scenario, there are certain pitfalls. Once this superintelligence sees the humanity or the universe as imperfections in its digital model, it will try to eliminate either of them in order to reach harmony. Or, possibly, it will need the resources consumed by humans in order to ‘explore the space’, making the old ‘AI vs. humanity’ conflict relevant again”.