Central Ideas:
1 – It is possible that, compared to a brute-force replication of natural evolutionary processes, large efficiency gains can be achieved by designing a search process that has intelligence as its goal, using several obvious improvements over natural selection.
2 – Quality superintelligence is a system that is at least as fast as a human mind, and qualitatively much more intelligent.
3 – Artificial intelligence (AI) systems gradually become more capable and as a consequence can be increasingly used in the real world (in trains, cars, robots, etc.). In this broad automation, occasional accidents may occur. Some call for more supervision; others, for the development of better systems.
4 – If we classify AI as capital, then with the invention of machine intelligence that can fully replace human labor, wages would fall to the cost of running the machines, far below the income needed for human subsistence.
5 – The claim that a superintelligence should emerge before other potentially dangerous technologies, such as nanotechnology, rests on the observation that a superintelligence would reduce the existential risks linked to nanotechnology, whereas the reverse would not hold.
ABOUT THE AUTHOR:
Nick Bostrom is a Swedish philosopher known for his work on existential risk, the anthropic principle, the ethics of human enhancement, and the risks of superintelligence. He is a Professor at the University of Oxford (Faculty of Philosophy and the Oxford Martin School) and Director of the Future of Humanity Institute.
Chapter 1 – Past Developments and Current Capabilities
We begin by looking back. History, on its broadest scale, seems to present a sequence of distinct growth regimes, each faster than its predecessor. This pattern has been used to suggest that another (even faster) growth regime may be possible. However, we don’t put much emphasis on this observation – this is not a book about “technological acceleration,” “exponential growth,” or the many notions sometimes lumped together under the rubric of the “singularity.” Next, we go over the history of artificial intelligence. Further on, we examine the current capabilities of this field of research. Finally, we look at some of the recent expert opinion polls and reflect on ignorance about the timeline of future advances.
Now, it is important to note that the demarcation between artificial intelligence and computer programs in general is not so sharp. Some of the applications listed above [credit card approval or rejection, search engines, etc.] can be seen more as generic software applications than specifically as AI – although this brings us back to McCarthy’s maxim that when something works, it is no longer called AI. A more relevant distinction for our purposes is that between systems that have a limited range of cognitive capability (whether they are called “AI” or not) and systems that have more generally applicable problem-solving capability. Essentially, all systems currently in use are of the first type: limited.
However, many of them have components that may also play a role in general artificial intelligence or be in service of its development – components such as classifiers, search algorithms, planners, solvers, and representational frameworks.
Chapter 2 – Paths to Superintelligence
Machines are currently far inferior to humans in general intelligence. However, one day (as we have already suggested) they will be superintelligent. How do we get from the current stage to the stage of machine superintelligence? This chapter explores several possible technological paths. We will talk about artificial intelligence, brain emulations, biological cognition, human-machine interfaces, and also about networks and organizations. We will assess how plausible each of them is as a path to superintelligence. The existence of multiple paths increases the probability that superintelligence will be achieved by at least one of them.
The fact is that the computational resources required to simply replicate the relevant evolutionary processes that produced human-level intelligence are still far beyond our reach – and will remain so even if Moore’s law were to continue for another century. It is possible, however, that, compared to a brute-force replication of natural evolutionary processes, large efficiency gains could be achieved by designing a search process that has intelligence as its goal, using several obvious improvements over natural selection.
Even so, it is very difficult to predict the magnitude of the efficiency gains from such an artificial evolutionary process. We are not even able to say whether the gain would be five or 25 orders of magnitude.
Chapter 3 – Forms of Superintelligence
After all, what exactly do we mean by the term “superintelligence”? While we do not wish to get bogged down in a terminological mire, something needs to be said to clarify the concepts used here. This chapter identifies three different forms of superintelligence and argues that, in practically relevant respects, they are roughly equivalent. It will also be shown that the potential for intelligence on a machine substrate is much greater than on a biological substrate. Machines have a number of fundamental advantages that would give them overwhelming superiority. Biological humans, even if enhanced, would be outmatched.
Fast superintelligence is an intellect exactly like the human mind, but faster. Conceptually, it is the easiest form of superintelligence to analyze. Fast superintelligence can be defined as follows: a system that can do everything that human intellect is capable of doing but much faster.
The simplest example of fast superintelligence would be a complete emulation of the brain running on fast hardware. An emulation operating at 10,000 times the speed of a biological brain would be able to read a book in a few seconds and write a doctoral thesis in an afternoon.
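To see why these figures are plausible, here is a minimal back-of-the-envelope check; the reading time, the length of an “afternoon,” and the working-hours-per-year figure are illustrative assumptions, not numbers from the book:

    # Rough sanity check of the 10,000x speed-up claim
    # (all human-time estimates below are illustrative assumptions).
    speedup = 10_000

    book_hours_for_human = 10                        # assumed: ~10 hours to read a book
    book_seconds_wall_clock = book_hours_for_human * 3600 / speedup
    print(book_seconds_wall_clock)                   # 3.6 -> "a few seconds"

    afternoon_hours = 4                              # assumed length of an afternoon
    subjective_hours = afternoon_hours * speedup     # 40,000 hours of subjective work
    subjective_work_years = subjective_hours / 2000  # assuming ~2,000 working hours/year
    print(subjective_work_years)                     # 20.0 person-years, ample for a thesis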
Another form of superintelligence is a system that can achieve superior performance by aggregating a large number of smaller intellects.
Collective superintelligence: a system composed of a large number of smaller intellects such that its total performance significantly outperforms, in several general areas of knowledge, any current cognitive system.
Collective intelligence excels at solving problems that can be easily divided into parts so that the solutions to these subproblems can be found in parallel and independently verified [such as building a spaceship, managing a hamburger franchise, etc.].
Quality superintelligence: a system that is at least as fast as a human mind and qualitatively much more intelligent.
Like collective intelligence, quality intelligence is also a slightly nebulous concept; and in this case, the difficulty is compounded by the lack of experience with any variation in intelligence quality beyond the upper limit of the present human distribution.
Chapter 4 – The Kinetics of an Intelligence Explosion
Once machines have achieved some form of human equivalence in reasoning ability, how long would it then be before they achieve radical superintelligence? Would this be a slow, gradual, and prolonged transition? Or would it be sudden and explosive? This chapter analyzes the kinetics of the transition to superintelligence as a function of optimization power and system recalcitrance (its resistance to being improved). We consider what we know, or can reasonably assume, about the behavior of these two factors in the vicinity of human-level general intelligence.
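Bostrom condenses this dependence into a simple schematic relation (reproduced here from memory of the book’s Chapter 4, so treat the exact wording as approximate):

    Rate of change in intelligence = Optimization power / Recalcitrance

Here optimization power stands for the quality-weighted effort being applied to improving the system, and recalcitrance for how strongly the system resists further improvement: a takeoff is fast when applied effort grows while recalcitrance stays low or falls.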
Starting with intelligent software (an emulation or an AI), it is possible to amplify collective intelligence simply by adding computers to run more instances of the program.
Another way to extend fast intelligence would be to transfer the program to faster computers. Depending on the degree of parallelization allowed by the program, the fast intelligence could be amplified by running the program on more processors. This is most likely to be feasible for emulations whose architecture is highly parallelized, but several AI programs also have important subroutines that could benefit from massive parallelization. Amplifying quality intelligence through computational power may also be possible, but this is a less direct case.
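A standard way to quantify “the degree of parallelization allowed by the program” is Amdahl’s law; this is a general result from parallel computing rather than anything stated in the book, so take it as an illustrative sketch:

    # Amdahl's law: if a fraction p of the work can run in parallel,
    # n processors give at most this overall speed-up.
    def amdahl_speedup(p: float, n: int) -> float:
        return 1.0 / ((1.0 - p) + p / n)

    # A highly parallel, emulation-like workload benefits enormously from more hardware...
    print(amdahl_speedup(p=0.99, n=1000))   # ~91x
    # ...while a mostly serial program barely benefits at all.
    print(amdahl_speedup(p=0.50, n=1000))   # ~2x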
Chapter 5 – Decisive Strategic Advantage
A distinct question, though related to the question of kinetics, is whether there will be only one or many superintelligent powers. Could an intelligence explosion launch one project so far ahead of all the others as to make it capable of dictating the future? Or will progress be more uniform, unfolding on a broad front, with many projects underway and none of them securing a significant, permanent lead?
Some paths to superintelligence are resource-intensive, and so are likely to be maintained by heavily funded projects. Complete brain emulation, for example, requires different types of techniques and a range of equipment.
Improvements in biological intelligence and brain-computer interfaces also depend heavily on the scale of the project: while a small biotech firm might invent one or two drugs, achieving superintelligence along these paths (if it is at all possible) would probably require many varied inventions and tests, and thus the support of an industrial sector and a well-funded national program.
Achieving collective superintelligence through more efficient organizations and networks demands even more resources, which would involve much of the world economy.
Chapter 6 – Cognitive Superpowers
Suppose a superintelligent digital agent emerges and decides, for some reason, to take over the world: would it be able to do it? In this chapter, we will look at some of the powers that a superintelligence could develop and what it would be able to do with them. We will outline a scenario in which a superintelligent agent, initially mere software, could establish itself as a singleton. We will also make some observations regarding the relationship between power over nature and power over other agents:
It is important that we do not anthropomorphize superintelligence when thinking about its potential impacts. Such an anthropomorphic perspective would encourage unfounded expectations about the growth trajectory of an embryonic AI, as well as about the psychology, motivations, and capabilities of a mature superintelligence.
A very common assumption is that a superintelligent machine would be similar to a very intelligent but nerdy human being. We tend to imagine that such an AI would have a technical, logical kind of intelligence, but one that is neither intuitive nor creative.
These ideas come, most likely, from observation: we look at current computers and see that they are very good at calculating, have a great memory, and follow rules to the letter, but are oblivious to social context and subtleties, norms, emotions, and politics. This association is reinforced when we observe that people who work well with computers tend to be “nerds” themselves. It is then natural to assume that a more advanced computational intelligence will have similar attributes, but on a larger scale.
Chapter 7 – The Superintelligent Will
We have already seen that superintelligence could have a great ability to shape the future according to its own goals. But what might these goals be? What is the relationship between intelligence and motivation in an artificial agent? Here we develop two theses. The orthogonality thesis argues (with some caveats) that intelligence and end goals are independent variables: any level of intelligence could be combined with any end goal.
The instrumental convergence thesis holds that superintelligent agents possessing any among a diverse range of end goals will nevertheless pursue similar intermediate goals because they have common instrumental reasons for doing so. Combined, these two theses help us think about what a superintelligent agent would do:
Orthogonality thesis. Intelligence and goals are orthogonal. In other words, virtually any level of intelligence could, in principle, be combined with virtually any ultimate goal. If the orthogonality thesis seems problematic, this is perhaps due to the superficial resemblance it bears to some traditional philosophical positions that have been the subject of long debates. Once it is understood that it has a different and narrower scope, its credibility should increase. (For example, the orthogonality thesis does not presuppose Hume’s theory of motivation, nor does it presuppose that basic preferences cannot be irrational.)
Instrumental convergence thesis. Many instrumental values can be identified as convergent in the sense that their attainment would increase the chances of the agent’s goal being realized for a wide range of final goals and a wide range of situations, which suggests that these instrumental values are likely to be pursued by a broad range of situated intelligent agents.
Chapter 8 – Is Doom the Default Outcome?
We have seen that the connection between intelligence and final values is extremely loose. We have also seen a worrying convergence in instrumental values. This does not matter much in the case of weak agents, because they are easily controlled and unable to do much damage. However, we argued in chapter 6 that the first superintelligence might well gain a decisive strategic advantage. Its goals would then determine how humanity’s cosmic endowment would be used. We can now begin to see how threatening that prospect is.
Consider the following scenario. In the coming years and decades, AI systems gradually become more capable and as a consequence could be increasingly used in the real world: they could be used to operate trains, cars, domestic and industrial robots, and autonomous military vehicles.
We can assume that such automation mostly delivers the desired effects, but that its success is marked by occasional accidents – an autonomous truck crashes head-on into another vehicle or a military drone shoots innocent civilians. Investigations reveal that these incidents were caused by failures in the decision-making of AI systems.
This results in public opinion debates. Some call for greater oversight and regulation, others emphasize the need for research and development of better systems.
Chapter 9 – The Problem of Control
If we face the threat of existential catastrophe as the default outcome of an intelligence explosion, our thinking should immediately turn to the search for countermeasures. Is there any way to avoid this default outcome? Is it possible to plan for a controlled detonation? In this chapter, we will begin to look at the control problem, the particular principal-agent problem that arises with the creation of an artificial superintelligent agent.
We will differentiate between two general classes of methods that could potentially be used to address this problem – capability control and motivation selection – and examine several techniques specific to each class. We will also allude to the esoteric possibility of “anthropic capture.”
Motivation selection methods – Motivation selection may involve either explicitly formulating a goal or a set of rules to be followed (direct specification) or configuring the system so that it can discover an appropriate set of values on its own, using implicitly or indirectly formulated criteria (indirect normativity). One option within motivation selection would be to try to build the system so that it has modest and unambitious goals (domesticity). An alternative to creating a motivation system from scratch would be to take an agent that already has an acceptable motivation system and then increase its cognitive powers until it becomes a superintelligence, making sure that its motivation system does not become corrupted in the process (augmentation).
Chapter 10 – Oracles, Genies, Sovereigns, and Tools
Some might say, “Just build a system that answers questions!” or “Just build an AI that is a tool, not an agent!” But these suggestions do not make all safety concerns go away, and it is not a trivial matter to define what type of system would offer the best prospects for safety. We will consider four types, or “castes” – oracles, genies, sovereigns, and tools – and explain how each relates to the others. Each offers a different set of advantages and disadvantages in our quest to solve the control problem.
With advances in artificial intelligence, it would become possible for the programmer to offload part of the cognitive work required to figure out how to perform a given task. In an extreme case, the programmer could simply specify a formal criterion for what counts as success and leave it to the AI to find a solution. The AI would guide this search with a powerful set of heuristics and other methods for discovering structure in the space of possible solutions.
It would keep searching until it found a solution that satisfies the success criterion. The AI would then implement the solution or (in the case of an oracle) report it to the user.
We would enter a danger zone only if the methods used in this search for solutions became extremely broad and powerful: that is, when they began to amount to general intelligence – and, especially, to superintelligence.
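As a minimal sketch of the pattern described above (a formal success criterion plus heuristic-guided search over candidate solutions), consider the following; the names solve, meets_criterion, and propose_candidates are hypothetical placeholders, not terminology from the book:

    # Sketch of "specify a formal success criterion and let the system search".
    # meets_criterion encodes what counts as success; propose_candidates stands in
    # for the heuristics that explore the space of possible solutions.
    import itertools

    def solve(meets_criterion, propose_candidates, max_steps=1_000_000):
        for step, candidate in enumerate(propose_candidates()):
            if step >= max_steps:
                return None              # search budget exhausted, no solution found
            if meets_criterion(candidate):
                return candidate         # implement it, or (oracle-style) report it

    # Toy usage: find an integer n with n*n >= 10_000.
    print(solve(lambda n: n * n >= 10_000, lambda: itertools.count(0)))   # 100

The point of the sketch is only the division of labor: the programmer supplies the criterion, and the search supplies the solution.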
Chapter 11 – Multipolar Scenarios
We have seen (particularly in chapter 8) how threatening a unipolar outcome could be, one in which a single superintelligence gains a decisive strategic advantage and uses it to create a singleton. In this chapter, we will examine what might happen in a multipolar outcome, that is, a post-transition society with multiple competing superintelligent agents. Our interest in this class of scenarios is twofold. First, as mentioned in chapter 9, one might imagine that social integration offers a solution to the control problem.
We have already cited some of the limitations of such an approach, and this chapter will attempt to present a more complete picture. Second, even if no one were dedicated to creating the conditions for a multipolar outcome as a means of dealing with the control problem, this outcome may occur anyway. And what, then, would such an outcome look like? The competitive society that would result from it is not necessarily attractive, nor is it lasting:
Capital and welfare – One difference between horses and humans is that humans own capital. It is an empirical fact that the share of income going to capital has long been stable at around 30% (despite significant short-term fluctuations). This means that 30% of total global income is received as earnings by the owners of capital, while the remaining 70% is received by workers as wages. If we classify AI as capital, then with the invention of machine intelligence that can fully replace human labor, wages would fall to the cost of the machine substitutes, and – considering that machines would be very efficient – that cost would be quite low, far below the income needed for human subsistence.
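A minimal numerical sketch of this argument; the 30%/70% split comes from the text, but every other figure below is invented purely for illustration:

    # Illustrative only: apart from the 30/70 split, all numbers are made up.
    world_income = 100.0                          # arbitrary units
    capital_income = 0.30 * world_income          # ~30% historically accrues to capital
    labor_income = 0.70 * world_income            # ~70% is paid out as wages

    # If machines can do every job a human can, no employer will pay a human more
    # than it costs to run an equivalent machine, so that cost caps the wage.
    machine_cost_per_worker = 0.1                 # assumed: very cheap machine labor
    subsistence_income = 5.0                      # assumed human subsistence level
    equilibrium_wage = machine_cost_per_worker
    print(equilibrium_wage < subsistence_income)  # True: wages fall below subsistence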
Chapter 12 – Acquiring Value
Capability control is at best a temporary and auxiliary measure. Unless the plan is to keep superintelligence locked away forever, complete mastery of motivation selection will be required. But how can we install a value in an artificial agent so that it pursues that value as its final goal? While the agent is not yet intelligent, it may lack the capacity to understand or even represent any humanly meaningful value. Yet if we delay this procedure until the agent becomes superintelligent, it may be able to resist our attempts to interfere with its motivational system – and, as shown in chapter 7, it would have convergent instrumental reasons for doing so. This value-loading problem is hard, but it must be confronted.
We now come to an important but subtle approach to the value-loading problem. It consists in using the intelligence of the AI itself to learn the values we want it to have.
To do this, we must provide a criterion so that the AI can, at least implicitly, choose a suitable set of values. We could then construct the AI so that it would act according to its best estimates of these implicitly defined values. It would continually refine its estimates as it learned more about the world and gradually understood the implications of the value-determining criteria.
Unlike the scaffolding approach, which gives the AI an interim goal system and later replaces it with a different final goal, the value learning approach keeps the final goal unchanged throughout the AI’s development and operational phases. Learning does not change the goal; it changes the AI’s beliefs about the goal.
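A minimal sketch of the loop this describes: keep a probability distribution over candidate interpretations of the value-determining criterion, act on the current best estimate, and let new evidence refine the distribution while the final goal itself stays fixed. The structure below is a generic Bayesian update, not code from the book, and all names are hypothetical placeholders:

    # Learning changes the agent's beliefs about the goal, never the goal itself.
    def update_beliefs(beliefs, likelihood, observation):
        # Bayesian update over candidate value specifications.
        updated = {v: p * likelihood(observation, v) for v, p in beliefs.items()}
        total = sum(updated.values()) or 1.0
        return {v: p / total for v, p in updated.items()}

    def choose_action(actions, beliefs, utility):
        # Act on the current best estimate of the implicitly defined values:
        # maximize expected utility under the present beliefs.
        return max(actions, key=lambda a: sum(p * utility(a, v) for v, p in beliefs.items()))

The decision rule (maximize expected utility under the current beliefs) is what remains constant; only the beliefs fed into it move as the system learns.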
Chapter 13 – Choosing the Criteria for Choosing
Suppose we could install any arbitrary final value in an embryonic AI. The decision about which value to install could then have far-reaching consequences. Certain other basic design choices – concerning the AI’s decision theory and epistemology – could be similarly important. But foolish, ignorant, and limited as we are, how could we be trusted to make good design decisions? How could we choose without forever locking in the prejudices of the present generation? In this chapter, we will explore how indirect normativity might allow us to offload to the superintelligence itself much of the cognitive work involved in making these decisions while still keeping the outcome anchored in deeper human values.
Available options include causal decision theory (in its many variants) and evidential decision theory, along with newer candidates such as “timeless decision theory” and “updateless decision theory,” which are still under development. It is likely to be difficult to identify and articulate the right decision theory, and there will be some difficulty in trusting that we have made the right choice.
While the prospects of directly specifying an AI’s decision theory are better than those of directly specifying its final goals, we would still face a substantial risk of error. Many of the complications that could invalidate the currently best-known decision theories were discovered only recently, which suggests that there may be further problems that have not yet come to light.
The result of deploying a faulty decision theory in an AI could be disastrous, possibly leading to an existential catastrophe.
Chapter 14 – The Strategic Landscape
Now is the time to consider the challenge of superintelligence in a broader context. We would like to be able to orient ourselves in the strategic landscape well enough to know, at least, what general direction we should move in. This is, in fact, no simple matter. Here, in this penultimate chapter, we will introduce some general analytical concepts that help us think about long-term science and technology policy. Then we will apply these concepts to the question of machine intelligence:
The principle of differential technological development – Delay the development of dangerous and harmful technologies, especially those that increase the level of existential risk, and accelerate the development of beneficial technologies, especially those that reduce the existential risks imposed by nature or other technologies. In this way, a policy could be evaluated on the basis of how much differential advantage it gives to desirable forms of technological development compared to undesirable forms.
The claim that it is preferable for a superintelligence to emerge before other potentially dangerous technologies, such as nanotechnology, rests on the fact that a superintelligence would reduce the existential risks associated with nanotechnology, but the opposite would not occur. Consequently, if we create a superintelligence first, we will face only the risks associated with superintelligence; if, on the other hand, nanotechnology is created first, we will face the risks associated with nanotechnology and, after that, the risks associated with superintelligence as well. Even if the existential risks posed by superintelligence are very great, and even if superintelligence is the most dangerous of all these technologies, there could still be a case for hastening its development.
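A minimal sketch of this risk-ordering argument with purely illustrative probabilities (neither number comes from the book):

    # Illustrative probabilities only, not estimates from the book.
    p_si = 0.10     # assumed existential risk of the superintelligence transition
    p_nano = 0.05   # assumed existential risk of a nanotechnology transition

    # Superintelligence first: it is assumed to manage the later nanotech
    # transition, so only the superintelligence risk is faced.
    risk_si_first = p_si

    # Nanotechnology first: survive the nanotech risk, then still face the
    # superintelligence risk afterwards.
    risk_nano_first = 1 - (1 - p_nano) * (1 - p_si)

    print(risk_si_first)     # 0.10
    print(risk_nano_first)   # ~0.145, strictly larger whenever p_nano > 0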
Chapter 15 – The Turning Point
We find ourselves in a tangle of strategic complexity, surrounded by a dense fog of uncertainty. Although many considerations have been made, their details and interrelationships remain muddled and uncertain – and there may be other factors that we have not even thought about. What can we do about this difficult situation?
The intention, therefore, is to concentrate on problems that are not only important but urgent, in the sense that their solutions will be needed before the intelligence explosion. We should also be careful not to work on problems with negative value, those whose solution would be harmful. Some technical problems in artificial intelligence, for example, may have negative value, since solving them would accelerate the arrival of machine superintelligence without at the same time accelerating the development of the control methods that could make the machine intelligence revolution survivable and beneficial.
Another specific goal is the promotion of ‘best practices’ among AI researchers.
Any progress made on the control problem needs to be disseminated. Some forms of computational experimentation, particularly those involving strong recursive self-improvement, might also call for the use of capability control to mitigate the risk of an accidental takeoff.
While the actual implementation of safety methods is not especially relevant today, it will gradually become so as the state of the art advances. And it is not too soon to call on practitioners to declare a commitment to safety, including endorsing the principle of the common good and promising to step up safety efforts if and when the prospect of machine superintelligence begins to look more imminent.
FACT SHEET:
Original title: Superintelligence: Paths, Dangers, Strategies
Author: Nick Bostrom