Note: this post aims to help a particular subset of our audience understand the assumptions behind our work on science philanthropy and global catastrophic risks. Throughout, “we” refers to positions taken by the Open Philanthropy Project as an entity rather than to a consensus of all staff.
Two priorities for the Open Philanthropy Project are our work on science philanthropy and global catastrophic risks. These interests are related because—in addition to greatly advancing civilization’s wealth and prosperity—advances in certain areas of science and technology may be key to exacerbating or addressing what we believe are the largest global catastrophic risks. (For detail on the idea that advances in technology could be a driver, see “‘Natural’ GCRs appear to be less harmful in expectation” in this post.) For example, nuclear engineering created the possibility of nuclear war, but also provided a source of energy that does not depend on fossil fuels, making it a potential tool in the fight against climate change. Similarly, future advances in bioengineering, genetic engineering, geoengineering, computer science (including artificial intelligence), nanotechnology, neuroscience, and robotics could have the potential to affect the long-term future of humanity in both positive and negative ways.
Therefore, we’ve been considering the possible consequences of advancing the pace of development of various individual areas of science and technology in order to have more informed opinions about which might be especially promising to speed up and which might create additional risks if accelerated. Following Nick Bostrom, we call this topic “differential technological development.” We believe that our views on this topic will inform our priorities in scientific research, and to a lesser extent, global catastrophic risks. We believe our ability to predict and plan for future factors such as these is highly limited, and we generally favor a default presumption that economic and technological development is positive, but we also think it’s worth putting some effort into understanding the interplay between scientific progress and global catastrophic risks in case any considerations seem strong enough to influence our priorities.
The first question our investigation of differential technological development looked into was the effect of speeding progress toward advanced AI on global catastrophic risk. This post gives our initial take on that question. One idea we sometimes hear is that it would be harmful to speed up the development of artificial intelligence because not enough work has been done to ensure that when very advanced artificial intelligence is created, it will be safe. This problem, it is argued, would be even worse if progress in the field accelerated. However, very advanced artificial intelligence could be a useful tool for overcoming other potential global catastrophic risks. If it comes sooner—and the world manages to avoid the risks that it poses directly—the world will spend less time at risk from these other factors.
Curious about how to compare these two factors, I tried looking at a simple model of the implications of a survey of participants at a 2008 conference on global catastrophic risk organized by the Future of Humanity Institute at Oxford University. I found that speeding up advanced artificial intelligence—according to my simple interpretation of these survey results—could easily result in reduced net exposure to the most extreme global catastrophic risks (e.g., those that could cause human extinction), and that what one believes on this topic is highly sensitive to some very difficult-to-estimate parameters (so that other estimates of those parameters could yield the opposite conclusion). This conclusion seems to be in tension with the view that speeding up artificial intelligence research would increase risk of human extinction on net, so I decided to write up this finding, both to get reactions and to illustrate the general kind of work we’re doing to think through the issue of differential technological development.
In brief, this post will:
- Describe our simplified model of the consequences of speeding up the development of advanced AI on the risk of human extinction using a survey of participants at a 2008 conference on global catastrophic risk organized by the Future of Humanity Institute at Oxford University.
- Explain why, in this model, the effect of faster progress on artificial intelligence on the risk of human extinction is very unclear.
- Describe several of the model’s many limitations, illustrating the challenges involved with this kind of analysis.
We are working on developing a broader understanding of this set of issues, as they apply to the areas of science and technology described above, and as they relate to the global catastrophic risks we focus on.
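Whether faster progress toward advanced AI raises or lowers overall risk depends on factors such as:
- How large other global catastrophic risks are in comparison with risks from advanced artificial intelligence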
- How much advanced artificial intelligence would affect the other important risks
- How much (if any) less prepared we’d be for advanced artificial intelligence if the pace of progress in artificial intelligence were faster
To illustrate what might be involved in reasoning about this kind of problem, I made a simplified model using a survey of participants at a 2008 conference on global catastrophic risk organized by the Future of Humanity Institute at Oxford University (where I worked before joining GiveWell). We do not know the details regarding the specific participants, but would guess that participation in this survey selects for unusually high levels of concern about global catastrophic risk.
The model makes the following assumptions:
- It uses the conference’s estimates of the probability of extinction from various risks.
- It assumes the risk from possible extinction events other than advanced AI is evenly distributed from 2015 to 2100.
- It assumes that if advanced AI is developed safely, the remaining years of risks from other factors will become negligible.
- It focuses exclusively on extinction risk, which we believe is the main type of risk that the people most concerned about advanced AI have in mind.
The model outputs how much the extinction risk from advanced AI would have to increase (as a fraction of its initial value), as a result of arriving X years sooner, in order to offset the risk reduction from having X fewer years exposed to other risks. The model only considers what happens between 2015 and 2100.
The median participant in FHI’s conference estimated that the probability of human extinction by 2100 was 19%, with 5 percentage points of that risk coming from advanced artificial intelligence. In my model, those numbers yield the following table:
| Number of years advanced AI is sped up | Percentage point decrease in other risks | Fractional decrease in other risks | Fractional increase in risk from advanced AI required to offset decrease in other risks |
| --- | --- | --- | --- |
For example, this table says that if progress toward advanced AI were sped up by 5 years, it would reduce other risks by about 6% (or about 0.8 percentage points). Since the probability of extinction from advanced AI (5%) is lower than the total probability of extinction from other causes (14%), risk from advanced AI would have to increase by over 16% of its initial value to outweigh those 0.8 percentage points. Because the model spreads other risks evenly across years, the figures for 5 and 10 years are just multiples of the figures for one year.
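The arithmetic behind these figures can be sketched in a few lines of Python. This is an illustrative reconstruction from the assumptions listed above, not the calculation actually used for the post; the function name `speedup_tradeoff` and its parameter names are invented for this example.

```python
def speedup_tradeoff(years, total_risk=0.19, ai_risk=0.05, start=2015, end=2100):
    """Sketch of the simple model (names and structure are assumptions).

    Returns a tuple of:
      - percentage point decrease in non-AI extinction risk,
      - fractional decrease in non-AI extinction risk,
      - fractional increase in AI risk that would exactly offset that decrease.
    """
    other_risk = total_risk - ai_risk           # 19% - 5% = 14%
    fraction_avoided = years / (end - start)    # other risk is spread evenly over 85 years
    pp_decrease = other_risk * fraction_avoided * 100
    break_even = other_risk * fraction_avoided / ai_risk
    return pp_decrease, fraction_avoided, break_even


for years in (1, 5, 10):
    pp, frac, need = speedup_tradeoff(years)
    print(f"{years:>2} yr sooner: other risks fall {pp:.2f} pp ({frac:.1%}); "
          f"AI risk would need to rise {need:.1%} to offset")
```

Changing `total_risk` and `ai_risk` shows how the break-even point moves under estimates other than the survey medians.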
The model doesn’t say anything about how large the increase in risk from advanced AI would actually be if advanced AI were sped up by a given number of years, largely because I couldn’t think of any simple and reasonable way to model that. The model allows readers to check the implications of their own views on this topic.
One possible example for some context: if one believed that 20 additional years of research on risks from advanced AI would be needed in order to cut risk from advanced AI in half (from 5% to 2.5%), this would imply an average of 0.125 additional percentage points of risk from advanced AI per year of speedup. Since each year of speedup is estimated at ~0.165 percentage points of risk reduction from other causes (14 percentage points spread evenly over the 85 years from 2015 to 2100), this would imply that speedup is safety-improving for the average year. (The picture becomes substantially more complicated if one does not evenly divide the relevant probabilities across years.)
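The comparison in this example can be checked directly. The snippet below is a minimal sketch under the same even-spreading assumption; the helper names are hypothetical and chosen only for illustration.

```python
OTHER_RISK_PP = 19.0 - 5.0     # non-AI extinction risk by 2100, in percentage points
WINDOW_YEARS = 2100 - 2015     # the model only considers 2015 to 2100


def net_risk_change_per_year(ai_risk_cut_pp, research_years):
    """Net change in total risk (pp) per year of speedup.

    Assumes, as in the example, that `research_years` of preparation would cut
    AI risk by `ai_risk_cut_pp`, and that each lost year forgoes an equal share.
    """
    ai_cost = ai_risk_cut_pp / research_years       # added AI risk per year of speedup
    other_benefit = OTHER_RISK_PP / WINDOW_YEARS    # avoided non-AI risk per year
    return ai_cost - other_benefit


def break_even_research_years(ai_risk_cut_pp):
    """Research duration at which the two per-year effects cancel exactly."""
    return ai_risk_cut_pp * WINDOW_YEARS / OTHER_RISK_PP


print(f"net change: {net_risk_change_per_year(2.5, 20):+.3f} pp per year of speedup")
print(f"break-even research duration: {break_even_research_years(2.5):.1f} years")
```

A negative net change means speedup lowers total risk in this model. If the same halving of AI risk took fewer than about 15 years of research, the sign would flip.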
Overall, under the assumptions of this model, the net effect of faster progress toward advanced AI on total global catastrophic risk seems unclear.
To be a bit more general and illustrate the impact of alternative assumptions, here’s a table that looks at the relevant figures under different assumptions about the ratio of risk from advanced AI to all other risks. Each entry is the proportional increase in risk from advanced AI required to offset X years of faster AI progress, if you accept the above model and assume that the ratio of (risk from advanced AI : all other risks) is as specified. For example, the table says that if you think the ratio is 0.3:1, and you otherwise accept the model described above, then one year of faster progress toward advanced AI would leave the probability of human extinction before 2100 unchanged if it proportionally increased the risk from advanced AI by about 4%.
[Table: number of years development of advanced AI is sped up, by ratio of risk from advanced AI to all other risks]
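Under the model's assumptions, the break-even figure depends only on the number of years of speedup and the ratio of AI risk to other risks, so readers can compute it for any ratio they find plausible. The sketch below does this; the ratios listed are illustrative choices (only 0.3:1 comes from the text), and the function name is an assumption.

```python
WINDOW_YEARS = 2100 - 2015  # other risks are assumed evenly spread over these 85 years


def break_even_increase(years, ratio):
    """Proportional increase in AI risk that offsets `years` of speedup.

    `ratio` is (risk from advanced AI) / (all other risks). Derivation:
    avoided other risk = other * years / 85, divided by AI risk = ratio * other.
    """
    return (years / WINDOW_YEARS) / ratio


illustrative_ratios = (0.1, 0.3, 1.0, 3.0)  # assumed values, not taken from the post
print("years  " + "  ".join(f"{r:>6}:1" for r in illustrative_ratios))
for years in (1, 5, 10):
    cells = "  ".join(f"{break_even_increase(years, r):>8.1%}" for r in illustrative_ratios)
    print(f"{years:>5}  {cells}")
```

With the survey medians (5% and 14%), the ratio is about 0.36:1, which gives the same break-even figure of over 16% for a five-year speedup as the first table.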
This model is extremely simplified, and limitations include the following:
- Risks from future technology are not likely to be evenly spread across the century. They seem small right now, and presumably will be largest during some future time when technology has advanced. They might decline after that as prevention/response mechanisms improve. And each might occur before or after advanced artificial intelligence, in which case the effect of additional progress toward advanced AI on each risk could be much greater or much smaller. For example, if atomically precise manufacturing would by default be developed in 2060 and advanced AI would by default be developed in 2050, then additional progress toward advanced AI would not have any effect on risks from atomically precise manufacturing.
- It’s extremely hard to say how much additional preparation for advanced AI would reduce risks from advanced AI. It’s possible that years of preparation work will have increasing or diminishing returns, which could make later years more or less productive relative to earlier years. It’s possible preparation efforts will be more effective in years when AI development is further along. It’s possible that advanced AI is sufficiently distant, and preparation sufficiently subject to diminishing returns, that additional progress toward advanced AI relative to the status quo wouldn’t diminish the level of preparation appreciably.
- Artificial intelligence may not fully eliminate other risks to civilization, and the size of the reduction might vary by risk. For example, nuclear and other advanced weapons could still be used with great destructive potential, while biological weapons might pose much smaller risks.
- The same action (e.g. publishing a series of important papers) might speed up progress toward advanced AI by a significantly greater amount of time if advanced AI is coming relatively soon (e.g. in 20 years) than if its arrival is much farther in the future (e.g. more than 100 years away). Pushing forward progress toward advanced AI could also have importantly different consequences, in terms of reducing other risks or the adequacy of preparations for advanced AI, depending on which of these scenarios holds. For example, if advanced AI is coming much later, I would guess that (i) it’s more likely we’ll be adequately prepared when advanced AI arrives, and (ii) there will already be other advanced technologies that pose risks to civilization when advanced AI arrives.
- I have little confidence in the conference’s estimates of the probability of various catastrophic events. Readers can try their own estimates in our model if they prefer, or refer to the second table above for a rough sense of what their views would imply.
- The conference’s 5% estimate for risk from advanced AI presumably includes some assumptions about when advanced AI is likely to arrive and how much preparation is likely to be in place before then. I haven’t modeled this; I’ve merely laid out the implications of various assumptions about the impact of marginal speedup on marginal risk increase.
- The model does not include positive/negative outcomes other than extinction, such as global catastrophes or other positive/negative events with lasting consequences.
- There are other important considerations related to the pace of progress in artificial intelligence that I did not consider. For example, the rate of progress—and where the progress happens—could affect:
  - What country or organization obtains advanced capabilities first, which could positively or negatively affect overall risk. It could also affect the attitudes with which advanced AI is developed; for example, in today’s relatively stable and peaceful geopolitical environment, people seem particularly likely to prioritize safety before deploying advanced AI, relative to a possible world where geopolitics is more unstable and the incentives favor deploying advanced AI as soon as possible.
  - How advanced available hardware and robotics are when advanced artificial intelligence is developed, which may affect “hardware overhang” and how powerful advanced artificial intelligence is at early stages of its development.
  - How long it takes to move from it being possible to create advanced artificial intelligence to adequate safety measures being implemented (which could potentially be longer or shorter if advanced AI develops more quickly).
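To make the first limitation above more concrete, here is a rough sketch of how the timing of a risk changes the effect of a speedup. It assumes, purely for illustration, that a given risk exists only during a fixed window of years and disappears once advanced AI arrives safely; the function name and the example windows (other than the 2060 and 2050 dates used in the text) are hypothetical.

```python
def exposure_years_avoided(risk_start, risk_end, ai_year, speedup):
    """Years of exposure to one risk that are removed by a speedup.

    Assumptions (not from the original model): the risk is present only in
    [risk_start, risk_end), and safe advanced AI ends exposure on arrival.
    """
    def exposed_years(ai_arrival):
        # Exposure runs from the start of the risk window until AI arrives
        # or the window closes, whichever comes first.
        return max(0, min(ai_arrival, risk_end) - risk_start)

    return exposed_years(ai_year) - exposed_years(ai_year - speedup)


# The example from the text: a risk that begins in 2060, advanced AI by default in 2050.
print(exposure_years_avoided(2060, 2100, ai_year=2050, speedup=5))
# A hypothetical risk already present from 2030.
print(exposure_years_avoided(2030, 2100, ai_year=2050, speedup=5))
# A hypothetical risk that begins partway through the speedup interval.
print(exposure_years_avoided(2047, 2100, ai_year=2050, speedup=5))
```

In the first case the speedup removes no exposure at all, which matches the point made about atomically precise manufacturing; in the other two it removes some or all of the five years.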
We are working on developing a broader (though still relatively simple) picture of how large various global catastrophic risks are, how those risks depend on progress in different areas of science and technology, and how advances in different areas of science and technology interact with each other.
Thanks to the following people for reviewing a draft of this post and providing thoughtful feedback (this of course does not mean they agree with the post or are responsible for its content): Stuart Armstrong, Nick Bostrom, Paul Christiano, Owen Cotton-Barratt, Daniel Dewey, Eric Drexler, Holden Karnofsky, Luke Muehlhauser, Toby Ord, Anders Sandberg, and Carl Shulman.