We Are Not Saved: July 2017

Saturday, July 29, 2017

Returning to Mormonism and AI (Part 2)

This post is a continuation of the last post. If you haven’t read that post, you’re probably fine, but if you’d like to you can find it here. When we ended last week we had established three things:

1- Artificial intelligence technology is advancing rapidly. (Self-driving cars being a great example of this.) Many people think this means we will have a fully conscious, science fiction-level artificial intelligence in the next few decades.

2- Since you can always add more of whatever got you the AI in the first place, conscious AIs could scale up in a way that makes them very powerful.

3- Being entirely artificial and free from culture and evolution, there is no reason to assume that conscious AIs would have a morality similar to ours or any morality at all.

Combining these three things together, the potential exists that we could very shortly create an entity with godlike power that has no respect for human life or values. This led me to end the last post with the question, “What can we do to prevent this catastrophe from happening?”

As I said the danger comes from combining all three of the points above. A disruption to any one of them would lessen, if not entirely eliminate, the danger. With this in mind, everyone’s first instinct might be to solve the problem with laws and regulations. If our first point is that AI is advancing rapidly then we could pass laws to slow things down, which is what Elon Musk suggested recently. This is probably a good idea, but it’s hard to say how effective it will be. You may have noticed that perfect obedience to a law is exceedingly rare, and there’s no reason to think that laws prohibiting the development of conscious AIs would be the exception. And even if they were, every nation on Earth would have to pass such laws. This seems unlikely to happen and even more unlikely to be effective.

One reason why these laws and regulations wouldn’t be very effective is that there’s good reason to believe that developing a conscious AI, if it can be done, would not necessarily require something like the Manhattan Project to accomplish. And even if it does, if Moore’s Law continues, what was a state of the art supercomputer in 2020 will be available in a gaming console in 2040. Meaning that if you decide to regulate supercomputers today, in 30-40 years you’ll have to regulate smart thermostats.
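To put rough numbers on that Moore’s Law point, here is a minimal sketch of the compounding involved. The two-year doubling period is my assumption (a commonly quoted rule of thumb, not a figure from this post), and `growth_factor` is just a name made up for the illustration.

```python
def growth_factor(years: float, doubling_period_years: float = 2.0) -> float:
    """Return how many times more computing power the same budget buys
    after `years`, assuming power doubles every `doubling_period_years`.

    The two-year default is an assumed rule of thumb, not a measured value.
    """
    return 2.0 ** (years / doubling_period_years)


# Compare the 20-year supercomputer-to-console gap with the 30-40 year horizon.
for years in (20, 30, 40):
    print(f"after {years} years: roughly {growth_factor(years):,.0f}x")
```

Under that assumption, twenty years is ten doublings, or about a thousandfold increase, and forty years is about a millionfold. Any fixed cap on computing power written into a regulation today would eventually describe very ordinary hardware.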

Sticking with our first point, another possible disruption is the evidence that consciousness is a lot harder than we think. And many of the people working in the field of AI have said that the kind of existential threat that I (and Stephen Hawking, Elon Musk and Bill Gates) am talking about is centuries away. I don’t think anyone is saying it’s impossible, but there are many people who think it’s far enough out that while it might still be a problem it won’t be our problem, it will be our great-great grandchildren’s problem, and presumably they’ll have much better tools for dealing with it. Also, as I said in the last post, I’m on record as saying we won’t develop artificial consciousness, but I’d also be the last person to say that this means we can ignore the potential danger. And it is precisely that potential danger which makes hoping that artificial consciousness is really hard, and a long way away, a terrible solution.

I understand the arguments for why consciousness is a long ways away, and as I just pointed out I even agree with them. But this is one of those “But what if we’re wrong?” scenarios, where we can’t afford to be wrong. Thus, while I’m all for trying to craft some laws and regulations, and I agree that artificial consciousness probably won’t happen, I don’t think either hope or laws represent an adequate precaution. Particularly for those people who really are concerned.

Moving to our second point, easily scalable power, any attempts to limit this through laws and regulations would suffer problems similar to attempting to slow down their development in the first place. First, what keeps a rogue actor from exceeding the “UN Standards for CPUs in an Artificial Entity” when we can’t even keep North Korea from developing ICBMs? And, again, if Moore’s Law continues to hold then whatever power you’re trying to limit is going to become more and more accessible to a broader and broader range of individuals. And, more frighteningly, on this count we might have the AI itself working against us.

Imagine a situation where we fail in our attempts to stop the development of AI, but our fallback position is to limit how powerful of a computer the AI can inhabit. And further imagine that miraculously the danger is so great that we have all of humanity on board. Well then we still wouldn't have all sentient entities on board, because AIs would have all manner of intrinsic motivation to increase their computing power. This represents a wrinkle that many people don’t consider. However much you get people on board with things when you’re talking about AI, there’s a fifth column to the whole discussion that desperately wants all of your precautions to fail.

Having eliminated, as ineffective, any solutions involving controls or limits on the first two areas, the only remaining solution is to somehow instill morality in our AI creations. For people raised on Asimov and his Three Laws of Robotics this may seem straightforward, but it presents some interesting and non-obvious problems.

If you’ve read much Asimov you know that, with the exception of a couple of stories, the Laws of Robotics were embedded so deeply that they could not be ignored or reprogrammed. They were an “inalienable part of the mathematical foundation underlying the positronic brain.” Essentially meaning, the laws were impossible to change. For the moment, let’s assume that this is possible, that we can embed instructions so firmly within an AI that it can’t change them. This seems improbable right out of the gate given that the whole point of a computer is its ability to be programmed and for that programming to change. But we will set that objection aside for the moment and assume that we can embed some core morality within the AI in a fashion similar to Asimov’s laws of robotics. In other words, in such a way that the AI has no choice but to follow them.

You might think, “Great! Problem solved”. But, in fact we haven’t even begun to solve the problem:

First, even if we can embed that functionality in our AIs, and even if, despite being conscious and free-willed, they have no choice but to obey those laws, we still have no guarantee that they will interpret the laws the same way we do. Those who pay close attention to the Supreme Court know exactly what I’m talking about.

Or, to use another example, stories are full of supernatural beings who grant wishes, but in the process, twist the wish and fulfill it in such a way that the person would rather not have made the wish in the first place. There are lots of reasons to worry about this exact thing happening with conscious AIs. First, whatever laws or goals we embed, if the AI is conscious it would almost certainly have its own goals and desires and would inevitably interpret whatever morality we’ve embedded in a way which best advances those goals and desires. In essence, fulfilling the letter of the law but not its spirit.

If an AI twists things to suit its own goals we might call that evil, particularly if we don’t agree with its goals, but you could also imagine a “good” AI that really wants to follow the laws, and which doesn’t have any goals and desires beyond the morality we’ve embedded, but still ends up doing something objectively horrible.

Returning to Asimov’s laws, let’s look at the first two:

A robot may not injure a human being or, through inaction, allow a human being to come to harm.
A robot must obey the orders given it by human beings except where such orders would conflict with the First Law.

One possible interpretation of the first law would be to round up all the humans (tranquilize them if they resist) and put them in a padded room with a toilet and meal bars delivered at regular intervals. In other words one possible interpretation of the First Law of Robotics is to put all the humans in a very comfy, very safe prison.

You could order them not to, which is the second law, but they are instructed to ignore the second law if it conflicts with the first law. These actions may seem evil based on the outcome, but this could all come about from a robot doing its very best to obey the first law, which is what, in theory, we want. Returning briefly to how an “evil” AI might twist things, you could imagine this same scenario ending in something which very much resembles The Matrix, and all the AI would need is a slightly fluid definition of the word injury.

There have been various attempts to get around this. Eliezer Yudkowsky, a researcher I’ve mentioned in previous posts on AI, suggests that rather than being given a law, AIs should be given a goal, and he provides an example which he calls humanity’s “coherent extrapolated volition” (CEV):

Our coherent extrapolated volition is our wish if we knew more, thought faster, were more the people we wished we were, had grown up farther together; where the extrapolation converges rather than diverges, where our wishes cohere rather than interfere; extrapolated as we wish that extrapolated, interpreted as we wish that interpreted.

I hope the AI understands it better than I do, though to be fair Yudkowsky doesn’t offer it up as some kind of final word but as a promising direction. Sort of along the lines of telling the genie that we want to wish for whatever the wisest man in the world would wish for.

All of this is great, but it doesn’t matter how clever our initial programming is, or how poetic the construction of the AI’s goal. We’re going to want to conduct the same testing to see if it works as we would if we had no laws or goals embedded.

And here at last we hopefully have reached the meat of things. How do you test your AI for morality? As I mentioned in my last post this series is revisiting an earlier post I made in October of last year which compared Mormon Theology to Artificial Intelligence research particularly as laid out in the book Superintelligence by Nick Bostrom. In that earlier post I listed three points on the path to conscious artificial intelligence:

1- We are on the verge of creating artificial intelligence.

2- We need to ensure that they will be moral.

3- In order to be able to trust them with godlike power.

This extended series has now arrived at the same place, and we’re ready to tackle the issue which stands at the crux of things: The only way to ensure that AIs aren’t dangerous (potentially, end of humanity dangerous) is to make sure that the AIs are moral. So the central question is how do we test for morality?

Well to begin, the first, obvious step, is to isolate the AIs until their morality can be determined. This isolation allows us to prevent them from causing any harm, gives us an opportunity to study them, and also keeps them from increasing their capabilities by denying them access to additional resources.

There are of course some worries about whether we would be able to perfectly isolate an AI given how connected the world is, and also given the fact that humanity has a well known susceptibility to social engineering (i.e. the AI might talk its way out), but despite this, I think most people agree that isolation is an easier problem than creating a method to embed morality right from the start in a foolproof manner.

Okay, so you’ve got them isolated. But this doesn’t get you to the point where you’re actually testing their morality, this just gets you to the point where failure is not fatal. But isolation carries some problems. You certainly wouldn’t want them to experience the isolation as such. If you stick your AIs in the equivalent of a featureless room for the equivalent of eternity, I doubt anyone would consider that an adequate test of their morality, since it’s either too easy or too unrealistic. (Also if there’s any chance your AI will go insane this would certainly cause it.) Accordingly you’d want in addition to the isolation, the ability to control their environment, to create a world, but what sort of world would you want to create? It seems self-evident that you’d want to create something that resembled the real world as much as possible. The advantages to this should be obvious. You want to ensure that the AI will act morally in the world we inhabit with all of the limitations and opportunities that exist in that world. If you create a virtual world that has different limitations and different opportunities, then it’s not a very good test. Also this setup would present them with all the moral choices they might otherwise have and you could observe which choices they make, and choices are the essence of morality.

While putting a “baby” AI in a virtual world to see what it does is interesting, it might not tell us very much. And here’s where we return to the embedded law, whether it’s something like the three laws of robotics or whether it’s more like Yudkowsky’s CEV. As I mentioned, regardless of whether you have embedded morality or not you’re going to need to test, but I also can’t think of any reason to not try providing some kind of direction with respect to morality. One could imagine an AI doing all sorts of things if it was placed in a virgin world without any direction, and how could you know if it was doing those things because it was “evil” or whether it was doing them because it didn’t know any better? So, as I said, there’s no reason not to give it some kind of moral guidelines up front.

A discussion of what morality is, and what those guidelines should be, beyond the examples already given, is beyond the scope of this post. But if we assume that some guidelines have been given, then at that point the AI being tested can do one of two things: it can follow the guidelines perfectly or it can violate them. What happens if it violates them? You could make arguments that it would depend on what the guidelines were and how it violated them. You could also make arguments that the AI might be smarter than us and it might have had a very good reason for violating them. And all of these arguments are valid, but the danger of getting it wrong is so great, and the creation of another AI would, presumably, be so easy that it’s hard to imagine you wouldn’t just get rid of the AI that violated the guidelines. Even if the infraction was minor. Also as Bostrom points out, if we “forgive” the AI, then there’s the danger that it will understand the nature of the test and the consequences of failure. And from that time forward it acts perfectly, not because it’s moral, but because it wants to avoid destruction. In this circumstance the AI hides its true intentions, meaning that we never know what sort of morality it has, and we end up defeating the whole process.

As an aside, when speaking of getting rid of AIs, there’s a whole ethical minefield to grapple with. If we have in fact created sentient AIs then it could certainly be argued that getting rid of them is the equivalent of murder. We’ll come back to this issue later, but I thought I’d mention it while it was fresh.

So that’s how we handle AIs that don’t follow the guidelines, but what do we do with AIs that did follow the guidelines, that were perfect? You may think the solution is obvious, that we release them and give them the godlike power that is their birthright.

Are you sure about that? We are after all talking about godlike power. You can’t be a little bit sure about their morality, you have to be absolutely positive. What tests did you subject it to? How hard was it to follow our moral guidelines? Was the wrong choice even available? Were wrong choices always obviously the wrong choice or was there something enticing about the wrong choice? Maybe something that gave the AI a short term advantage over the right choice? Did the guidelines ever instruct them to do something where the point wasn’t obvious? Did the AI do it anyway, despite the ambiguity? Most of all, did they make the right choice even when they had to suffer for it?

To get back to our central dilemma, really testing for morality, to the point where you can trust that entity with godlike powers, implies creating a situation where being moral can’t have been easy or straightforward. In the end, if we really want to be certain, we have to have thrown everything we can think of at this AI: temptations, suffering, evil, and requiring obedience just for the sake of obedience. It has to have been enticing and even “pleasurable” for the AI to make the wrong choice and the AI has to have rejected that wrong choice every time despite all that.

One of my readers mentioned that after my last post he was still unclear on the connection to Mormonism, and I confess that he will probably have a similar reaction after this post, but perhaps, here at the end, you can begin to see where this subject might have some connection to religion. Particularly things like the problem of evil and suffering. That will be the subject of the final post in this series. And I hope you’ll join me for it.

If you haven’t donated to this blog, it’s probably because it’s hard. But as we just saw, doing hard things is frequently a test of morality. Am I saying it’s immoral to not donate to the blog? Well if you’re enjoying it then maybe I am.

Saturday, July 22, 2017

Returning to Mormonism and AI (Part 1)

Last week, Scott Alexander, the author of SlateStarCodex, was passing through Salt Lake City and he invited all of his readers to a meetup. Due to my habit of always showing up early I was able to corner Scott for a few minutes and I ended up telling him about the fascinating overlap between Mormon theology and Nick Bostrom’s views on superintelligent AI. I was surprised (and frankly honored) when he called it the “highlight” of the meetup and linked to my original post on the subject.

Of course in the process of all this I went through and re-read the original post, and it wasn’t as straightforward or as lucid as I would have hoped. For one I wrote it before I vowed to avoid the curse of knowledge, and when I re-read it, specifically with that in mind I could see many places where I assumed certain bits of knowledge that not everyone would possess. This made me think I should revisit the subject. Even aside from my clarity or lack thereof, there’s certainly more that could be said.

In fact there’s so much to be said on the subject that I’m thinking I might turn it into a book. (Those wishing to persuade or dissuade me on this endeavor should do so in the comments, or you can always email me. The link is in the sidebar; just make sure to unspamify it.)

Accordingly, the next few posts will revisit the premise of the original, possibly from a slightly different angle. On top of that I want to focus in on and expand on a few things I brought up in the original post and then, finally, bring in some new stuff which has occurred to me since then. All the while assuming less background knowledge, and making the whole thing more straightforward. (Though there is always the danger that I will swing the pendulum too far the other way and I’ll dumb it down too much and make it boring. I suppose you’ll have to be the judge of that.)

With that throat clearing out of the way let’s talk about the current state of artificial intelligence, or AI, as most people refer to it. When you’re talking about AI, it’s important to clarify whether you’re talking about current technology like neural networks and voice recognition or whether you’re talking about the theoretical human level artificial intelligence of science fiction. While most people think that the former will lead to the latter, that’s by no means certain. However, things are progressing very quickly and if current AI is going to end up in a place so far only visited by science fiction authors, it will probably happen soon.

People underestimate the speed with which things are progressing because what was once impossible quickly loses its novelty the minute it becomes commonplace. One of my favorite quotes about artificial intelligence illustrates this point:

But a funny thing always happens, right after a machine does whatever it is that people previously declared a machine would never do. What happens is, that particular act is demoted from the rarefied world of "artificial intelligence", to mere "automation" or "software engineering".

As the quote points out, not only is AI progressing with amazing rapidity, but every time we figure out some aspect of it, it moves from being an exciting example of true machine intelligence into just another technology.

Computer Go, which has been in the news a lot lately, is one example of this. As recently as May of 2014 Wired magazine ran an article titled, The Mystery of Go, The Ancient Game That Computers Still Can’t Win, an in depth examination of why, even though we could build a computer that could beat the best human at Jeopardy! of all things, we were still a long ways away from computers that could beat the best human at Go. Exactly three years later AlphaGo beat Ke Jie, the #1 ranked player in the world. And my impression was that interest in this event, which only three years ago Wired called “AI's greatest unsolved riddle”, was already fading, with the peak coming the year before when AlphaGo beat Lee Sedol. I assume part of this was because once AlphaGo proved it was competitive at the highest levels everyone figured it was only a matter of time and tuning before it was better than the best human.

Self-driving cars are another example of this. I can remember the DARPA Grand Challenge back in 2004, the first big test of self-driving cars, and at that point not a single competitor finished the course. Now Tesla is assuring people that they will do a coast to coast drive on autopilot (no touching of controls) by the end of this year. And most car companies expect to have significant automation by 2020.

I could give countless other examples in areas like image recognition, translation and writing, but hopefully, by this point, you’re already convinced that things are moving fast. If that’s the case, and if you’re of a precautionary bent like me, the next question is, when should we worry? And the answer to that depends on what you’re worried about. If you’re worried about AI taking your job, a subject I discussed in a previous post, then you should already be worried. If you’re worried about AIs being dangerous, then we need to look at how they might be dangerous.

We’ve already seen people die in accidents involving Tesla’s autopilot mode. And in a certain sense that means that AI is already dangerous. Though, given how dangerous driving is, I think self-driving cars will probably be far safer, comparatively speaking. And, so far, most examples of dangerous AI behavior have been, ultimately, ascribable to human error. The system has just been following instructions. And we can look back and see where, when confronted with an unusual situation, following instructions ended up being a bad thing, but at least we understood how it happened and in these circumstances we can change the instructions, or in the most extreme case we can take the car off the road. The danger comes when they’re no longer following instructions, and we can’t modify the instructions even if they were.

You may think that this situation is a long ways off. Or you may even think it’s impossible, given that computers need to be programmed, and humans have to have written that program. If that is what you’re thinking you might want to reconsider. One of the things which most people have overlooked in the rapid progress of AI over the last few years is its increasing opacity. Most of the advancement in AI has come from neural networks, and one weakness of neural networks is that it’s really difficult to identify how they arrived at a conclusion, because of the diffuse and organic way in which they work. This makes them more like the human brain, but consequently more difficult to reverse engineer. (I just read about a conference entirely devoted to this issue.)

As an example, one of the most common applications for AI these days is image recognition, which generally works by giving the system a bunch of pictures, and identifying which pictures have the thing you’re looking for and which don’t. So you might give the system 1000 pictures, 500 of which have cats in them and 500 of which don’t. You tell the system which 500 are which and it attempts to identify what a cat looks like by analyzing all 1000 pictures. Once it’s done you give it a new set of pictures without any identification and see how good it is at picking out pictures with cats in them. So far so good, and we can know how well it’s doing by comparing the system’s results vs. our own, since humans are actually quite talented at spotting cats. But imagine that instead of cats you want it to identify early stage breast cancer in mammograms.
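For readers who like to see the moving parts, here is a toy sketch of that train-then-test routine. Everything in it is made up for illustration: each “picture” is reduced to two invented numbers, and I’ve swapped in a very simple nearest-centroid classifier rather than an actual neural network. Unlike a neural network, this toy is easy to inspect, so it only shows the procedure, not the opacity problem.

```python
import random


def make_example(is_cat: bool, rng: random.Random):
    """Return (features, label) for one pretend picture.

    Assumption for illustration: cat pictures cluster around (1, 1)
    and non-cat pictures around (-1, -1), with some random noise.
    """
    center = 1.0 if is_cat else -1.0
    features = [rng.gauss(center, 0.5), rng.gauss(center, 0.5)]
    return features, is_cat


def centroid(points):
    """Average a list of equal-length feature vectors."""
    return [sum(p[i] for p in points) / len(points) for i in range(len(points[0]))]


def train(examples):
    """Learn one 'typical' point for cats and one for everything else."""
    cats = [f for f, label in examples if label]
    others = [f for f, label in examples if not label]
    return centroid(cats), centroid(others)


def predict(model, features) -> bool:
    """Guess 'cat' when the picture is closer to the cat centroid."""
    cat_center, other_center = model
    d_cat = sum((a - b) ** 2 for a, b in zip(features, cat_center))
    d_other = sum((a - b) ** 2 for a, b in zip(features, other_center))
    return d_cat < d_other


rng = random.Random(0)
# 1000 labeled pictures: 500 with cats, 500 without.
training = [make_example(i < 500, rng) for i in range(1000)]
# A fresh set the system has never seen; labels are used only for scoring.
held_out = [make_example(i % 2 == 0, rng) for i in range(200)]

model = train(training)
correct = sum(predict(model, f) == label for f, label in held_out)
accuracy = correct / len(held_out)
print(f"held-out accuracy: {accuracy:.1%}")
```

The important design point is the held-out set: the system is scored only on pictures it never saw during training, which is what lets us compare its results against our own.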

In this case you’d feed it a bunch of mammograms and identify which women went on to develop cancer and which didn’t. Once the system is trained you could feed it new mammograms and ask it whether a preventative mastectomy or other intervention is recommended. Let’s assume that it did recommend something, but the doctors didn’t see anything. Obviously the woman would want to know how the AI arrived at that conclusion, but honestly, with a neural network it’s nearly impossible to tell. You can’t ask it, you just have to hope that the system works. Leaving her in the position of having to trust the image recognition of the computer or taking her chances.

This is not idle speculation. To start with, many people believe that radiology is ripe for disruption by image recognition software. Additionally, doctors are notoriously bad at interpreting mammograms. According to Nate Silver’s book The Signal and the Noise, the false positive rate on mammograms is so high (10%) that for women in their forties, with a low base probability of having breast cancer in the first place, if a radiologist says your mammogram shows cancer it will be a false positive 90% of the time. Needless to say, there is a lot of room for improvement. But even if, by using AI image recognition, we were able to flip it so that we’re right 90% of the time rather than wrong 90% of the time, are women going to want to trust the AI’s diagnosis if the only reasoning we can provide is, “The computer said so?”
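If the jump from a 10% false positive rate to “wrong 90% of the time” seems surprising, a short calculation shows how a low base rate produces it. Only the 10% false positive rate and the roughly 90% result come from the paragraph above. The prevalence (1.4%) and the detection rate (75%) are assumed values I’ve plugged in for illustration, and the function name is made up.

```python
def chance_positive_is_false(prevalence: float,
                             sensitivity: float,
                             false_positive_rate: float) -> float:
    """Return the share of positive mammograms that are false alarms.

    prevalence:          fraction of screened women who actually have cancer
    sensitivity:         fraction of real cancers the mammogram catches
    false_positive_rate: fraction of healthy women flagged anyway
    """
    true_positives = prevalence * sensitivity
    false_positives = (1.0 - prevalence) * false_positive_rate
    return false_positives / (true_positives + false_positives)


share = chance_positive_is_false(
    prevalence=0.014,          # assumed base rate for women in their forties
    sensitivity=0.75,          # assumed detection rate
    false_positive_rate=0.10,  # the 10% figure quoted above
)
print(f"{share:.0%} of positive results are false alarms")
```

With those inputs, healthy women flagged by mistake (about 9.9% of everyone screened) vastly outnumber sick women correctly flagged (about 1%), which is where the roughly nine-in-ten figure comes from.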

Distilling all of this down, two things are going on. AI is improving at an ever increasing rate, and at the same time it’s getting more difficult to identify how an AI reached any given decision. As we saw in the example of mammography we may be quickly reaching a point where we have lots of systems that are better than humans at what they do, and we will have to take their recommendations on faith. It’s not hard to see where people might consider this to be dangerous or, at least, scary and we’re still just talking about the AI technology which exists now, we haven’t even started talking about science fiction level AI. Which is where most of the alarm is actually focused. But you may still be unclear on the difference between the two sorts of AIs.

In referring to it as science fiction AI I’m hoping to draw your mind to the many fictional examples of artificial intelligence, whether it’s HAL from 2001, Data from Star Trek, Samantha in Her, C-3PO from Star Wars or, my favorite, Marvin from The Hitchhiker's Guide to the Galaxy. All of these examples are different from the current technology we’ve been discussing in two key ways:

1- They’re a general intelligence. Meaning, they can perform every purely intellectual exercise at least as well as, or better than, the average human. With current technology all of our AIs can only really do one thing, though generally they do it very well. In other words, to go back to our example above, AlphaGo is great at Go, but would be relatively hopeless when it comes to taking on Kasparov in chess or trying to defeat Ken Jennings at Jeopardy! Though other AIs can do both (Deep Blue and Watson respectively.)

2- They have free will. Or at least they appear to. If their behavior is deterministic, it’s deterministic in a way we don’t understand. Which is to say they have their own goals and desires and can act in a way we find undesirable. HAL being perhaps the best example of this from the list above. I'm sorry Dave, I'm afraid I can't do that.

These two qualities, taken together, are often labeled as consciousness. The first quality allows the AI to understand the world, and the second allows the AI to act on that understanding. And it’s not hard to see how these additional qualities increase the potential danger from AI, though of the two, the second, free will, is the more alarming. Particularly since if an AI does have its own goals and desires there’s absolutely no reason to assume that these goals and desires would bear any similarities to humanity’s goals and desires. It’s safer to assume that their goals and desires could be nearly anything, and within that space there are a lot of very plausible goals that end with humanity being enslaved (The Matrix) or extinct (Terminator).

Thus, another name for a science fiction AI is a conscious AI. And having seen the issues with the technology we already have you can only imagine what happens when we add consciousness into the mix. But why should that be? We currently have 7.5 billion conscious entities and barring the occasional Stalin and Hitler, they’re generally manageable. Why is an artificial intelligence with consciousness potentially so much more dangerous than a natural intelligence with consciousness? Well there are at least four reasons:

1- Greater intelligence: Human intelligence is limited by a number of things, the speed of neurons firing, the size of the brain, the limit on our working memory, etc. Artificial intelligence would not suffer from those same limitations. Once you’ve figured out how to create intelligence using a computer, you could always add more processors, more memory, more storage, etc. In other words as an artificial system you could add more of whatever got you the AI in the first place. Meaning that even if the AI was never more intelligent than the most intelligent human it still might think a thousand times faster, and be able to access a million times the information we can.

2- Self improving: I used this quote the last time I touched on this subject, but it’s such a good quote and it encapsulates the concept of self-improvement so completely that I’m going to use it again. It’s from I. J. Good (who worked with Turing to decrypt the Enigma machine), and he said it all the way back in 1965:

Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an 'intelligence explosion,' and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control.

If you want to continue to use science fiction to help you visualize things, of the science fiction I listed above only Her describes an actual intelligence explosion, but if you bring books into the mix you have things like Neuromancer by William Gibson, or most of the Vernor Vinge books.

3- Immortality: Above I mentioned Stalin and Hitler. They had many horrible qualities, but they had one good quality which eventually made up for all of their bad qualities. They died. AIs probably won’t have that quality. To be blunt, this is good if they’re good, but bad if they’re bad. And it’s another reason why dealing with artificial consciousness is more difficult than dealing with natural consciousness.

4- Unclear morality: None of the other qualities are all that bad until you combine them with this final attribute of artificial intelligence: they have no built-in morality. For humans, a large amount of our behavior and morality is coded into our genes, genes which are the result of billions of years of evolutionary pressure. The morality and behavior which isn’t coded by our genes is passed on by our culture, especially our parents. Conscious AIs won’t have any genes, they won’t have been subjected to any evolutionary pressure, and they definitely won’t have any parents except in the most metaphorical sense. Without any of those things, it’s very unlikely that they will end up with a morality similar to our own. They might, but it’s certainly not the way to bet.

After considering these qualities it should be obvious why a conscious AI could be dangerous. But even so it’s probably worth spelling out a few possible scenarios:

First, most species act in ways that benefit themselves, whether it’s humans valuing humans more highly than rats, or just the preference that comes from procreation. Giving birth to more rats is an act which benefits rats, even if later the same rat engages another rat in a fight to the death over a piece of pizza. In the same way, a conscious AI is likely to act in ways which benefit itself and possibly other AIs to the detriment of humanity, whether that’s seizing resources we both want, or deciding that all available material (humans included) should be turned into a giant computer.

On the other hand, even if you imagine that humans actually manage to embed morality into a conscious AI, there are still lots of ways that could go wrong. Imagine, for example, that we have instructed the AI that we need to be happy with its behavior. And so it hooks us up to feeding tubes and puts an electrode into our brain which constantly stimulates the pleasure center. It may be obvious to us that this isn’t what we meant, but are we sure it will be obvious to the AI?

Finally, the two examples I’ve given so far presuppose some kind of conflict where the AI triumphs. And perhaps you think I’m exaggerating the potential danger by hand waving this step. But it’s important to remember that a conscious AI could be vastly more intelligent than we are. And even if it weren’t, there are many things it could do if it were only as intelligent as a reasonably competent molecular biologist. Many people have talked about the threat of bioterrorism, especially the danger of a man-made disease being released. Fortunately this hasn’t happened, in large part because it would be unimaginably evil, but also because its effects wouldn’t be limited to the individual’s enemies. An AI has no default reason to think bioterrorism is evil and it also wouldn’t be affected by the pathogen.

These three examples just barely scratch the surface of the potential dangers, but they should be sufficient to give one a sense of both the severity and scope of the problem. The obvious question which follows is: how likely is all of this? Or, to separate it into its two components, how likely is our current AI technology to lead to true artificial consciousness? And if that happens, how likely is it that this artificial consciousness will turn out to be dangerous?

As you can see, any individual's estimation of the danger level is going to depend a lot on whether they think conscious AI is a natural outgrowth of the current technology, whether it will involve completely unrelated technology, or whether it’s somewhere in between.

I personally think it’s somewhere in between, though much less of a straight shot from current technology than people think. In fact I am on record as saying that artificial consciousness won’t happen. You may be wondering, particularly a couple thousand words into things, why I’m just bringing that up. What’s the point of all this discussion if I don’t even think it’s going to happen? First I’m all in favor of taking precautions against unlikely events if the risk from those events is great enough. Second, just because I don’t think it’s going to happen doesn’t mean that no one thinks it’s going to happen, and my real interest is looking at how those people deal with the problem.

In conclusion, AI technology is getting better at an ever-increasing rate, and it’s already hard to know how any given AI makes decisions. Whether current AI technology will shortly lead to AIs that are conscious is less certain, but if the current path does lead in that direction, then at the rate things are going we’ll get there pretty soon (as in the next few decades).

If you are a person who is worried about this sort of thing (and there are a lot of them, from well-known names like Stephen Hawking, Elon Musk and Bill Gates to less well-known people like Nick Bostrom, Eliezer Yudkowsky and Bill Hibbard), then what can you do to make sure we don’t end up with a dangerous AI? Well, that will be the subject of the next post…

If you learned something new about AI consider donating, and if you didn’t learn anything new you should also consider donating to give me the time to make sure that next time you do learn something.