Potential global catastrophic risk focus areas

Throughout the post, “we” refers to GiveWell and Good Ventures, who work as partners on GiveWell Labs. This post draws substantially on our recent updates on our investigation of policy-oriented philanthropy, including using much of the same language.

As part of our work on GiveWell Labs, we’ve been exploring the possibility of getting involved in efforts to ameliorate potential global catastrophic risks (GCRs), by which we mean risks that could be bad enough to change the very long-term trajectory of humanity in a less favorable direction (e.g. ranging from a dramatic slowdown in the improvement of global standards of living to the end of industrial civilization or human extinction). Examples of such risks could include a large asteroid striking earth, worse-than-expected consequences of climate change, or a threat from a novel technology, such as an engineered pathogen.

In our annual plan for 2014, we set a stretch goal of making substantial commitments to causes within global catastrophic risks by the end of this calendar year. We are still hoping to decide whether to make commitments in this area, and if so which causes to commit to, on that schedule. At this point, we’ve done at least some investigation of most of what we perceive as the best candidates for more philanthropic involvement in this category, and we think it is a good time to start laying out how we’re likely to choose between them (though we have a fair amount of investigative work still to do). This post lays out our current thinking on the GCRs we find most worth working on for GiveWell Labs.

Why global catastrophic risks?

We believe that there are a couple of features of global catastrophic risks that make them a conceptually good fit for a global humanitarian philanthropist to focus on. These map reasonably well to two of our criteria for choosing causes, though GCRs generally seem to perform relatively poorly on the third:

  • Importance. By definition, if a global catastrophe were to occur, the impact would be devastating. However, most natural GCRs appear to be quite unlikely, making the annual expected mortality from natural GCRs low (e.g., perhaps in the hundreds or thousands; more on the distinction between natural and anthropogenic GCRs below). The potential importance of GCRs comes both from novel technological threats, which could be much more likely to cause devastating impacts, and from considering the very long-term impacts of a low-probability catastrophe: depending on the moral weight one assigns to potential future generations, the expected harm of (even very unlikely) GCRs may be quite high relative to other problems.
  • Crowdedness. Because GCRs are generally perceived to have a very low probability, many other social agents that are normally devoted to protecting against risks (e.g. insurance companies, governments in wealthy countries) appear not to pay them much attention. This should not necessarily be surprising, since many of the benefits of averting GCRs seem to accrue to future generations, which cannot hold contemporary institutions accountable, and to the extent they accrue to present generations, they are distributed very widely, with no clear concentrated constituency that has an incentive to prioritize them. The possibility that a long time horizon may be required to justify investment in averting GCRs also seems to make them a good conceptual fit for philanthropy, which, as GiveWell board member Rob Reich has argued, is unusually institutionally suited to long time horizons. This makes it all the more notable that, with the key exception of climate change, most potential global catastrophic risks seem to receive little or no philanthropic attention (though some receive very significant government support). The overall lack of social attention to GCRs is not dispositive, but it suggests that if GCRs are genuinely worthy of concern, a new philanthropist aiming to address them may encounter some low-hanging fruit.
  • Tractability. The very low frequencies of GCRs suggest that tractability is likely to be a challenge. Humanity has little experience dealing with such threats, and it may be important to get them right the first time, which seems likely to be difficult. A philanthropist would likely struggle to know whether they were making a difference in reducing risks.

Our tentative conclusion on GCRs as a whole is that the balance of strong performance on the importance and crowdedness criteria outweighs low expected tractability, but we are open to revising that view on the basis of deeper explorations of particularly promising-seeming GCRs.

What we’ve done to investigate GCRs
We have published shallow investigations on both GCRs in general and a variety of specific (potential) GCRs.

We also have an investigation forthcoming on potential risks from artificial intelligence, and we commissioned former GiveWell employee Nick Beckstead to do a shallow investigation of efforts to improve disaster shelters to increase the likelihood of recovery from a global catastrophe. We are still hoping to conduct shallow investigations of nanotechnology, synthetic biology governance (aimed more at ecological threats than biosecurity), and the field of emerging technology governance, though we may not do so before prioritizing causes within GCRs.

Beyond the shallow level, we have done a deeper investigation on geoengineering research and continued our investigation of biosecurity through a number of additional conversations.

Our investigations have been far from comprehensive; we’ve prioritized causes we’ve had some reason to think were particularly promising, often because we suspected a lack of interest from other philanthropists relative to the causes’ humanitarian importance or because we encountered a specific idea from someone in our network.

We have also made attempts to have conversations with people who think broadly and comparatively about global catastrophic risks. As far as we can tell, most such people tend to be connected to the effective altruist community (to which we have strong ties and which tends to take a strong interest in GCRs). Many of our conversations with such people have been informal, but public notes are available from our conversations with Carl Shulman, a research associate at the Future of Humanity Institute, and Seth Baum, executive director of the Global Catastrophic Risk Institute.

General patterns in what we find promising
The following two general observations are major inputs into our thinking:

“Natural” GCRs appear to be less harmful in expectation.

After a number of shallow investigations, we’ve tentatively concluded that “natural” (i.e. not human-caused) GCRs seem to present smaller threats than “anthropogenic” (i.e. human-caused) GCRs. The specific examples we’ve examined and a general argument point the same direction.

The general argument for being more worried about anthropogenic GCRs is as follows. The human species is fairly old (Homo sapiens sapiens is believed to have evolved several hundred thousand years ago), giving us a priori reason to believe that we do not face high background extinction risk: if we had a random 10% chance of going extinct every 10,000 years, we would have been unlikely to have survived this long (0.9^30 = ~4%). Note that anthropic bias can make this kind of reasoning suspect, but this reasoning also seems to map well to available data about different potential GCRs, as discussed below (i.e., we do not observe natural risks that appear likely to cause human extinction). By contrast with “natural” risks, anthropogenic risks present us with potentially unprecedented situations, for which history cannot serve as much of a guide. Atomic weapons and biotechnology are only decades old, and some of the most dangerous technologies may be those that don’t yet exist. With that said, some “natural” risks could present us with somewhat unprecedented situations, due to the modern world’s historically high level of interconnectedness and reliance on particular infrastructure.
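
The survival arithmetic in the parenthetical above can be checked with a short sketch. It assumes that “several hundred thousand years” means roughly 300,000 years (30 periods of 10,000 years) and that the risk in each period is independent of the others; the function name is ours and purely illustrative.

```python
def survival_probability(risk_per_period: float, periods: int) -> float:
    """Chance of surviving `periods` consecutive periods, each carrying an
    independent extinction risk of `risk_per_period`."""
    return (1.0 - risk_per_period) ** periods


# Assumption: ~300,000 years of species history, split into 10,000-year periods.
periods = 300_000 // 10_000

# Hypothetical background risk from the text: 10% per 10,000 years.
p = survival_probability(0.10, periods)
print(f"Chance of surviving {periods} periods: {p:.1%}")
```

Under these assumptions the result is about 4%, matching the figure in the text. A lower assumed background risk, or a shorter assumed species history, would make survival to the present much less surprising.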

On the specifics of various “natural” GCRs:

  • Near earth asteroids. A 2010 U.S. National Research Council report estimates that the background annual probabilities of an impact as large as the one that is believed to have caused the extinction of the dinosaurs and a “possible global catastrophe” are 1/100 million and 1/700,000 respectively (PDF, page 19). NASA reports that it has tracked 93% of the near earth asteroids large enough to cause a “possible global catastrophe” and all of the ones as large as the one believed to have caused the extinction of the dinosaurs (and none of them are on track to hit Earth in the next few centuries), suggesting a residual possibility of a “possible global catastrophe” of ~1/100,000 during the next century (and likely lower). There may be a comparable remaining risk from comets—Vaclav Smil claims that “probabilities of the Earth’s catastrophic encounter with a comet are likely less than 0.001% during the next 50 years,” which would be about the same as the remaining asteroid risk—but our understanding is that comets are much harder to detect. As a result of the attention from NASA and the B612 Foundation, this cause also appears more “crowded” than others, though seemingly more tractable as well.
  • Large volcanic eruptions. Estimates of the frequency of volcanic eruptions large enough to count as global catastrophic risks differ by several orders of magnitude, but our current understanding is that volcanic eruptions large enough to cause major crop failures are likely to occur no more frequently than once every 10,000 years, and perhaps significantly less frequently (suggesting a <1% chance of such an eruption in the next century). Large volcanic eruptions may be much more of a cause for concern than asteroid strikes, but this cause performs relatively poorly on tractability, since our ability to predict eruptions is limited, and we are not currently capable of preventing an eruption.
  • Antibiotic resistance. Microbes are currently evolving to be resistant to antibiotics faster than new antibiotics are being developed, posing a growing public health threat. However, antibiotic resistance is unlikely to represent a threat to civilization, since humanity survived without antibiotics until ~1940, including during the period when most gains against infectious diseases were made. We also expect other actors to work to address antibiotic resistance as it continues to become a more pressing public health issue. (More at our writeup.)
  • Geomagnetic storms. The major threat from geomagnetic storms is to large-scale power infrastructure, but the risks are not well understood. A consultant who has contributed to many of the published reports on the topic contends that a worst-case, 1/200 year storm could result in a “years-long global blackout,” but other sources show less concern (e.g. modeling the impact of a ~200 year storm as a risk of a blackout for ~10% of the U.S. population for somewhere between 2 weeks and 2 years).
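
The per-century figures quoted in the list above for asteroids and volcanic eruptions can be roughly reproduced with the sketch below. It rests on two simplifying assumptions of ours, not of the cited sources: that each year’s risk is independent, and that the 7% of untracked near earth asteroids carry a proportional 7% share of the impact risk. The function and variable names are illustrative.

```python
def prob_within(annual_probability: float, years: int) -> float:
    """Chance of at least one event in `years`, assuming independent years."""
    return 1.0 - (1.0 - annual_probability) ** years


YEARS = 100  # "the next century"

# Asteroids: 1/700,000 annual chance of a "possible global catastrophe".
asteroid_century = prob_within(1 / 700_000, YEARS)

# Assumption: untracked asteroids (100% - 93%) carry a proportional share of risk.
untracked_share = 1.0 - 0.93
residual_asteroid = untracked_share * asteroid_century

# Volcanoes: upper-end frequency of one crop-failure-scale eruption per 10,000 years.
volcano_century = prob_within(1 / 10_000, YEARS)

print(f"Residual asteroid risk this century: about 1 in {1 / residual_asteroid:,.0f}")
print(f"Volcanic eruption risk this century: {volcano_century:.2%}")
```

With these inputs the residual asteroid risk comes out at about 1 in 100,000 and the volcanic risk at just under 1%, in line with the figures in the text. The volcanic figure is an upper bound on this reading, since the eruption frequency may be significantly lower.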

The only GCRs that receive large amounts of philanthropic attention are nuclear security and climate change.

We do not have precise figures aggregated across causes, but our impression is that climate change is an area in which hundreds of millions of dollars a year are spent by U.S. philanthropic funders, while philanthropic funding addressing nuclear security appears to be in the tens of millions.

We don’t know of philanthropic funding for any of the other GCRs exceeding the single digit millions of dollars per year.

Leading focus area contenders

The leading contenders described below are among the most apparently dangerous and potentially unprecedented GCRs (seemingly – to us – more worrisome than the “natural” GCRs listed above, though such a comparison is necessarily a judgment call). At the same time, all appear to have limited “crowdedness,” at least in terms of philanthropic attention, unlike nuclear security (and unlike most of the climate change space, though one of the contenders described below relates to climate change). They are discussed in the order I would pick between them if I had to pick today, though we have not decided how many we expect to commit to by the end of the year, and other GiveWell staff may disagree. Though these are the GCRs I would choose to work on if I were picking today, we don’t have high confidence that they represent the correct set. There are a number of questions (discussed below) that we hope to address before reaching a conclusion at the end of the year.


Biosecurity

By biosecurity, we mean the constellation of issues around pandemics, bioterrorism, biological weapons, and biotechnology research that could be used to inflict great harm (“dual use research”). Our understanding is that natural pandemics (especially flu pandemics) likely present the greatest current threat, but that the development of novel biotechnology could lead to greater risks over the medium or long term. We see this GCR as having a strong case for “importance” because it seems to combine relatively credible, likely, current threats with more speculative potential longer-term threats in a fairly coherent program area. The space receives significant attention from the U.S. government (with ~$5 billion in funding in 2012) but little from foundations: the Skoll Global Threats Fund is the only U.S. foundation we know to be engaging in this area currently, at a relatively low level, though the Sloan Foundation also used to have a program in this area. (We believe the distinction between government and philanthropic funding is at least potentially meaningful, as the two types of actors have different incentives and constraints; in particular, philanthropic funding could potentially influence a much larger amount of government funding.) Although we are not sure of the activities that would be best for a philanthropist to support, many people we spoke with argued that current preparedness is subpar and that there is significant room for a new philanthropic funder.

Although we have had a number of additional conversations since the completion of our shallow investigation, we continue to regard the question of what a philanthropist should fund within this broad issue as an open one. We expect to address it with a deeper investigation and a declared interest in funding.

Geoengineering research and governance

We see a twofold case for the importance of work on geoengineering research and governance:

Although solar geoengineering is in the news periodically, research on the science or governance appears to receive relatively little dedicated funding: our rough survey found about $10 million/year in identifiable support from around the world (mostly from government sources), and we are not aware of any institutional philanthropic commitment in the area (though Bill Gates personally supports some research in the area).

Our conversations have led us to believe that there is significant scientific interest in conducting geoengineering research and that funding is an obstacle, but, as with biosecurity, we do not have a very detailed sense of what we might fund. We’re wary of the concern that further geoengineering research could conceptually undermine support for emissions reductions, but we regard it as relatively unlikely, and also find it plausible that further research could contribute significantly to governance efforts.

We expect to address the question of what a philanthropist could support in this area with a deeper investigation and a declared interest in funding. Note that we don’t envision ourselves as trying to encourage geoengineering, but rather as trying to gain better information and governance structures for it, which could make the actual use more or less likely (and given the high potential risks of both climate change and geoengineering, we could imagine that shifting the probabilities in either direction – depending on what comes of more exploratory work – could do great good).

Potential risks from artificial intelligence

We are earlier in this investigation than in investigations of the above two causes, and have not yet produced a writeup. There is internal disagreement about how likely this cause is to end up as a priority; I don’t feel highly confident that it should be above some of the other contenders not discussed in depth here.

In brief, it appears possible that the coming decades will see substantial progress in artificial intelligence, potentially even to the point where machines come to outperform humans in many or nearly all intellectual domains, though it is difficult or impossible to make confident forecasts in this area. Such a scenario could carry great potential benefits, but could carry significant dangers (e.g. technological disemployment, accidents, crime, extremely powerful autonomous agents) as well. The majority of academic artificial intelligence researchers seem not to see the rapid development of powerful autonomous agents as a substantial risk, but to believe that there are some potential risks worth preparing for now (such as accidents in crucial systems or AI-enabled crime; see slides 20-22). However, some people, including the Machine Intelligence Research Institute and computer scientist Stuart Russell, feel that there are important things that should be done today to substantially improve the social outcomes associated with the rapid development of powerful artificial intelligence.

In general, my inclination would be to defer to the preponderance of expert opinion, but I think this area could potentially be promising for philanthropy partly because I have not seen a rigorous public assessment by credible AI researchers to support the (seemingly predominant) lack of concern over risks from the rapid development of powerful autonomous agents. Since this topic seems to be drawing increasing attention from some highly credentialed people, supporting such a public assessment seems like it could be valuable, even if the conclusion is that most researchers are right to not be concerned. The fact that a substantial portion of mainstream AI researchers also seem to think that more traditional risks from AI progress (e.g. accidents, crime) are worth addressing in the near term does increase my interest in the area, though not by much, since I don’t see those issues as GCRs, whereas the rapid development of powerful autonomous agents could conceivably be one. Should we decide to pursue this area further, I would guess that it would be at a lower level of funding than the other potential priority areas described above.

Note from Holden: I currently see this cause as more promising than Alexander does, to a fairly substantial degree. I agree that there are reasons, including the preponderance of expert opinion, to think that there is little preparatory work worth doing today; however, I see the stakes as large enough to justify work in this area even at a relatively low probability of having impact. I would like to see reasonably well-resourced, full-time efforts – with substantial input from mainstream computer scientists – to think about what preparations could be done for major developments in artificial intelligence, and my perception is that efforts fitting this description do not exist currently. We are currently working on trying to understand whether the seeming lack of activity comes from a place of “justified confidence that action is not needed now” or of “lack of action despite a reasonable possibility that action would be helpful now.” My current guess is that the latter is the case, and if so I hope to make this cause a priority.

We will be writing more on this topic in the future.

Why these three risks stand out

Generally speaking, the causes highlighted above (geoengineering, biosecurity, and, pending more investigation, potentially artificial intelligence) seem to us to have:

  • Greater potential for the most extreme direct harms (extreme enough to make a substantial change to the long-term trajectory of civilization likely) relative to other risks we’ve looked at, with the exception of nuclear weapons (an area that we perceive as more “crowded” than these three).
  • Very difficult to quantify, but potentially reasonably high (1%+), risk of such extreme harm in the next 50-100 years.
  • Very little philanthropic attention.

Our guess is that most other candidate risks would, upon sufficient investigation, appear less worth working on than at least one of our top candidates – due to presenting less potential for harm, less tractability, or more crowdedness, while being roughly comparable on other dimensions. That said, (a) the specific assessment of artificial intelligence is still in progress and we don’t have internal agreement on it, as discussed above; (b) we have low confidence in our working assessment, and plan both to do more investigation and to seek out more critical viewpoints on our current priorities.

Topics for further investigation

While I currently see the three potential GCRs discussed above as the leading contenders for GCR focus areas, there are a number of questions we would like to answer before committing.

Our shallow investigations have generated the following follow-up questions:

  • Our current understanding is that major volcanic eruptions are currently neither predictable nor preventable, making this cause apparently rather intractable. To what extent could further research help remedy these shortcomings, and are there other ways a philanthropist could help address the risk from a large volcanic eruption?
  • How do risks from comets compare to the remaining risks from untracked near earth asteroids? Our understanding is that these risks are likely to be an order of magnitude or two lower than volcanic eruption risks that would cause similar harm, but we aren’t sure how they compare in tractability. What could be done about potential risks from comets?
  • How credible are existing estimates of the potential harm of geomagnetic storms? In particular, how do experts assess the risks to the power grid from a rare geomagnetic event? How prepared are power companies for geomagnetic storms?
  • Are there any important gaps in current funding for efforts to improve nuclear security?

In addition, we are still hoping to conduct shallow investigations of nanotechnology, synthetic biology governance (aimed more at ecological threats than biosecurity), and the field of emerging technology governance as a whole, which we think could potentially be competitive with some of the risks described as potential focus areas.

Update on GiveDirectly

Three members of GiveDirectly’s board of directors (Paul Niehaus, Michael Faye, and Chris Hughes) are planning to start a for-profit technology company, Segovia, aimed at improving the efficiency of cash transfer distributions in the developing world. Segovia plans to sell software to developing-country governments for use in implementing their cash transfer programs.

This development was announced today (though we have been aware of and discussing it with GiveDirectly for some time). Some discussion is available at today’s post on the Development Channel blog.

GiveDirectly and Segovia will work out of the same office space in New York City.

Dr. Niehaus, who has been our primary contact at GiveDirectly and has unofficially played the role of GiveDirectly’s full-time Executive Director, will continue to devote significant time to GiveDirectly and serve as its President with primary responsibility for GiveDirectly. He will be co-employed by Segovia and has told us that he may spend up to 20% of his time on Segovia. Dr. Faye will become Segovia’s president. (Previously, both Dr. Niehaus and Dr. Faye have had full-time jobs outside of GiveDirectly, though they have had substantial responsibilities at GiveDirectly.)

We think this development is simultaneously a potentially very positive one broadly – bringing the possibility of greatly leveraged positive impact on the world – and one that raises new issues and risks for GiveDirectly and its donors.

We think these issues and risks (discussed further below) are noteworthy but ultimately similar in magnitude to, or smaller than, similar risks that exist for our other present and past recommended charities. We plan to continue recommending GiveDirectly as a top charity and continue to see it as an outstanding giving opportunity.

Note that we have discussed all of these issues with Dr. Niehaus and Dr. Faye – they have reviewed a draft of this post – and we believe they are aware of all of the issues we discuss below.

This post focuses on the following:

  • What costs and benefits does this decision pose for GiveDirectly right now?
  • What additional issues could arise in the future, particularly potential conflicts of interest between Segovia and GiveDirectly?
  • Why have Dr. Niehaus, Dr. Faye, and Mr. Hughes decided to serve developing country governments and why are they using a for-profit-company structure?
  • What effect will this have on our recommendation of GiveDirectly?

We have not tried to formulate a view on Segovia’s possible impact because this does not seem directly relevant to GiveWell or our donors. Based on what Dr. Niehaus and Dr. Faye have told us, we believe it’s plausible that given (a) the amount of money governments transfer to recipients and (b) the amount of money that may be lost by those programs due to negligence and/or corruption, Segovia could be very impactful and may represent some of the “upside” we hoped to see from GiveDirectly.

What costs and benefits does this decision pose for GiveDirectly right now?

We discuss several potential negative impacts Segovia could have on GiveDirectly; we also discuss potential positive impacts.

  1. What impact will Segovia have on key staff’s time allocation to GiveDirectly?
  2. Will Segovia’s existence affect the intensity with which GiveDirectly leadership work to maximize GiveDirectly’s impact?
  3. Will Segovia directly affect GiveDirectly’s ability to absorb and distribute funds to recipients?
  4. Will the general public react negatively to this announcement in a way that affects GiveDirectly’s ability to raise funds or otherwise distracts it from its core work?
  5. What benefits might Segovia have for GiveDirectly?

What impact will Segovia have on key staff’s time allocation to GiveDirectly?

Dr. Niehaus and Dr. Faye told us that they expect the following changes to staff time allocations due to Segovia:

  • Paul Niehaus, GiveDirectly’s President, had previously been splitting his time between GiveDirectly and his academic position at University of California at San Diego. Pending the university’s approval, he hopes to take a one-year leave of absence from his academic position to enable co-employment at GiveDirectly and Segovia. During this one-year leave of absence, he expects that the total amount of time he devotes to GiveDirectly will increase slightly and that he will spend a maximum of 20% of his time on Segovia.
  • Michael Faye, Segovia’s president and a member of GiveDirectly’s Board of Directors, had previously worked at a management consulting firm but spent significant time on GiveDirectly. He has now taken a leave of absence from his job and intends to spend the vast majority of his time on Segovia while still offering time to GiveDirectly. He expects the time he spends on GiveDirectly to increase. More on this below.
  • Melissa Harpool, Outreach Coordinator, will split her time between GiveDirectly and Segovia. Her current primary role is managing schedules, and the people whose schedules she manages will now be splitting time between Segovia and GiveDirectly. She had previously been full-time at GiveDirectly.

Dr. Niehaus and Dr. Faye told us that relevant staff track their time allocation to projects and will be able to share whether or not they have hit the targets described above.

Will Segovia affect the intensity with which GiveDirectly leadership work to maximize GiveDirectly’s impact?

Dr. Niehaus told us that he retains his ambitions for and commitment to GiveDirectly’s long term impact, but splitting attention between two organizations is difficult, especially when both are growing rapidly and likely to face significant obstacles.

It is plausible that, given GiveDirectly’s and Segovia’s overlapping leadership, staff, and office space, those involved with both might see Segovia as the more exciting opportunity. We believe that this could reduce GiveDirectly leadership’s ambition for the organization or the quality of the mental effort they dedicate to maximizing its impact.

Will Segovia directly affect GiveDirectly’s ability to absorb and distribute funds to recipients?

Assuming that GiveDirectly staff meets the time targets described above, we don’t think Segovia will have a direct impact on GiveDirectly’s ability to absorb and distribute funds to recipients.

Will GiveDirectly receive a negative response from the general public that affects its ability to raise funds or otherwise distracts it from its core work?

We continue to see GiveDirectly as an outstanding giving opportunity and plan to continue recommending it to donors. That said, we are not confident about how others will react and remain concerned about the impact that the general public’s reaction might have on GiveDirectly’s future fundraising prospects.

Dr. Niehaus and Dr. Faye told us that they have attempted to reduce the likelihood that the response is negative by speaking at length with media in advance of the announcement so that stories written about their decision present a reasonable perspective on this new development. They have also communicated with their major donors and report that they have not encountered negative reactions.

What benefits will Segovia provide for GiveDirectly?

Potential benefits include:

  • GiveDirectly will receive an equity stake in Segovia, which could result in GiveDirectly’s receiving additional funding in the future. The size of the stake is not yet determined. Dr. Niehaus, Dr. Faye, and Mr. Hughes are currently discussing the size of this stake with potential investors.
  • The technology Segovia is planning to develop would likely be helpful to GiveDirectly. Segovia would give this technology to GiveDirectly without charge.
  • As discussed above, Paul Niehaus has been based in San Diego and the rest of GiveDirectly staff is in New York. Dr. Faye has been employed full-time at a management consulting firm. Dr. Niehaus will be spending half his time in New York and hopes to take leave from his academic position, and Dr. Faye will now be working full-time out of the same office. Dr. Niehaus’s co-location with the rest of GiveDirectly staff will likely improve his ability to manage other staff. Dr. Faye’s co-location with Dr. Niehaus and other GiveDirectly staff may also increase his contribution to GiveDirectly. (Dr. Faye has told us that the time he has spent on GiveDirectly has increased since he took leave of absence from his job.)
  • Mr. Hughes intends to significantly increase his work on advocating for cash transfers. This should benefit both Segovia and GiveDirectly.

What additional issues could arise in the future, particularly potential conflicts of interest between Segovia and GiveDirectly?

There may be cases where GiveDirectly has to consider actions that would maximize its impact but might harm Segovia’s interests. GiveDirectly board members (Paul Niehaus, Michael Faye, and Chris Hughes) will hold equity stakes in Segovia, so their financial interests could come into conflict with their roles as Directors of GiveDirectly. We see the following possible conflicts of interest:

  • GiveDirectly’s board members’ financial interest in Segovia could lead them to use GiveDirectly as a means to promote Segovia. This could be via using Segovia’s software even if it’s not well suited to GiveDirectly’s needs, or otherwise using contacts/meetings that might take place due to GiveDirectly (e.g., government, academic or media contacts) to promote Segovia’s offering.
  • Segovia will also have (a) investors and (b) staff who hold significant financial stakes in Segovia, which could lead to conflicts between maximizing profit and maximizing impact.
  • If Segovia were bidding on a contract with a particular government, would GiveDirectly avoid offering its service in the same area/to the same government so that Segovia would have an easier path to a sale?

We have spent significant time with Paul Niehaus and some time with Michael Faye and Chris Hughes over the past few years, and we believe they have good intentions.

In addition, Dr. Niehaus, Dr. Faye, and Mr. Hughes hope to identify investors whose primary motivation is social impact, and believe that choosing investors wisely is a priority. They have also told us that they plan to expand GiveDirectly’s board to 6-7 directors, 3-4 of whom have no overlap with Segovia. Dr. Niehaus told us that overlapping directors would recuse themselves from votes that involve conflicts.

Why have Dr. Niehaus, Dr. Faye, and Mr. Hughes decided to serve developing country governments and why are they using a for-profit-company structure?

Dr. Niehaus and Dr. Faye believe that Segovia’s product is one that governments will want to purchase, and the product will have significant social impact. They have had a long-standing interest in working directly with governments.

In November 2013, Dr. Niehaus and Dr. Faye told us of their hope that GiveDirectly would work with government-run cash transfer programs. We discuss this possibility in our review of GiveDirectly, relying on a summary of a conversation we had with them at the time.

Dr. Niehaus and Dr. Faye told us recently that they had initially hoped governments would transfer funds directly to/through GiveDirectly. The developing-country governments that GiveDirectly spoke with preferred technology to fully outsourcing implementation, saying that they already had a significant number of individuals employed to implement their cash transfer programs. Instead, governments asked for software that could improve their operations, which Segovia now aims to provide.

GiveDirectly still believes it will have opportunities to implement government programs, but Dr. Niehaus and Dr. Faye have come to the conclusion that there will be many more cases where governments want technology alone.

Dr. Niehaus and Dr. Faye pointed us to a World Economic Forum report estimating that developing-country governments distribute $400 billion in transfers each year. Dr. Niehaus and Dr. Faye have also told us that data showing leakage rates (i.e., the share of funds that never reaches the intended recipients) of 50% or more are not uncommon for large public-sector transfer programs. (More information about these sources in this footnote.) They believe that governments will see that purchasing Segovia’s product will save them money by allowing them to transfer more money to recipients at lower overall cost.

We find the above explanation of Segovia’s potential impact plausible but have not tried to vet it as we don’t think our take on it has direct relevance to GiveDirectly or the donors who use our research.

We have the impression that the belief that Segovia could have great social impact is the primary driver of Dr. Niehaus’s, Dr. Faye’s, and Mr. Hughes’ desire to start Segovia.

Why has GiveDirectly settled on this corporate structure as opposed to another structure?

Dr. Niehaus, Dr. Faye, and Mr. Hughes had initially expected to undertake this project as part of GiveDirectly’s existing non-profit structure but told us that they decided on the structure of a for-profit, independent company for three reasons:

  1. Recruiting. We spoke with the recruiting firm that GiveDirectly retained for this search, and the person who led the search told us that recruiting top technology talent was slow. In some cases, the engineers GiveDirectly contacted were not interested in working for a non-profit. Even when GiveDirectly offered compensation packages competitive with for-profit companies, some engineers balked when they saw the negative attention that the media and donors give to high salaries in the non-profit sector. Dr. Niehaus, Dr. Faye, and Mr. Hughes place high priority on recruiting the very best possible talent, so while they feel they could have reasonable success recruiting as a non-profit, they see the improved recruiting prospects associated with a for-profit to be a major consideration.
  2. Investment. GiveDirectly told us that there are investors who would support Segovia as a for-profit entity but would not be interested in supporting GiveDirectly, the non-profit.
  3. Legal advice. GiveDirectly received legal advice that an independent for-profit company is the most straightforward way to avoid jeopardizing GiveDirectly’s tax exempt status.

What effect will this have on our recommendation of GiveDirectly?

We do not expect the existence of Segovia to change our recommendation of GiveDirectly. We expect GiveDirectly to continue to successfully distribute cash to very poor individuals in the developing world, and believe that the issues and risks described above are smaller than, or at worst similar in importance to, those that exist with all of our other recommended charities.

We will continue to follow GiveDirectly closely and report on its progress.

We have written previously about the “upside” we saw in GiveDirectly. We think that Segovia may be one example of that “upside” — Dr. Niehaus and Dr. Faye, partly through their work on GiveDirectly, saw an opportunity for significant social impact and are now pursuing it. However, we think the attention they will now pay to Segovia likely diminishes the upside of future donations to GiveDirectly.

Footnote: On the World Economic Forum report described above, Dr. Niehaus wrote, “I have some questions about the methodology but believe the basic message that it is big and has problems.” On the leakage rates, he wrote, “India’s two largest social programs are the employment scheme (NREGS) and ration scheme (TPDS). For NREGS, the best nationally representative leakage estimate is by Imbert and Papp (published in R. Khera, editor, The battle for employment guarantee. Oxford University Press, 2011) who estimate that between 44% and 58% of participation reported in official figures is fictitious. This likely understates leakage in dollar figures since people who do work are often underpaid, but nationally representative data on earnings are not to the best of my knowledge available. For TPDS, the most recent nationally representative figures I know of are from the 2004-2005 NSS and are discussed in work by Svedberg in EPW who reports a national average estimate of 54% leakage of grains intended for the poor.”

Sequence Thinking vs. Cluster Thinking

Note: this is an unusually long and abstract post whose primary purpose is to help a particular subset of our audience understand our style of reasoning. It does not contain substantive updates on our research and recommendations.

GiveWell – both our traditional work and GiveWell Labs – is fundamentally about maximization: doing as much good as possible with each dollar you donate. This introduces some major conceptual challenges when making certain kinds of comparisons – for example, how does one compare the impact of distributing bednets in sub-Saharan Africa with the impact of funding research on potential high-risk responses to climate change, attempts to promote better collaboration in the scientific community or working against abuse of animals on factory farms?

Our approach to making such comparisons strikes some as highly counterintuitive, and noticeably different from that of other “prioritization” projects such as Copenhagen Consensus. Rather than focusing on a single metric that all “good accomplished” can be converted into (an approach that has obvious advantages when one’s goal is to maximize), we tend to rate options based on a variety of criteria using something somewhat closer to (while distinct from) a “1=poor, 5=excellent” scale, and prioritize options that score well on multiple criteria. (For example, see our most recent top charities comparison.)

We often take approaches that effectively limit the weight carried by any one criterion, even though, in theory, strong enough performance on an important enough dimension ought to be able to offset any amount of weakness on other dimensions. Relatedly, we look into a broad variety of causes, broader than can seemingly be justified by a consistent and stable set of values. Many others in the effective altruist community seem to have a strong and definite opinion on questions such as “how much animals suffer compared to humans,” such that they either prioritize animal welfare above all else or dismiss it entirely. (Similar patterns apply to views on the moral significance of the far future.) By contrast, we give simultaneous serious consideration to reducing animal suffering, reducing risks of global catastrophic events, reforming U.S. intellectual property regulation, global health and nutrition and more, and think it’s quite likely that we’ll recommend giving opportunities in several of these areas, while never resolving the fundamental questions that could (theoretically) establish one such cause as clearly superior to the others.

I believe our approach is justified, and in order to explain why – consistent with the project of laying out the basic worldview and epistemology behind our research – I find myself continually returning to the distinction between what I call “sequence thinking” and “cluster thinking.” Very briefly (more elaboration below),

  • Sequence thinking involves making a decision based on a single model of the world: breaking down the decision into a set of key questions, taking one’s best guess on each question, and accepting the conclusion that is implied by the set of best guesses (an excellent example of this sort of thinking is Robin Hanson’s discussion of cryonics). It has the form: “A, and B, and C … and N; therefore X.” Sequence thinking has the advantage of making one’s assumptions and beliefs highly transparent, and as such it is often associated with finding ways to make counterintuitive comparisons.
  • Cluster thinking – generally the more common kind of thinking – involves approaching a decision from multiple perspectives (which might also be called “mental models”), observing which decision would be implied by each perspective, and weighing the perspectives in order to arrive at a final decision. Cluster thinking has the form: “Perspective 1 implies X; perspective 2 implies not-X; perspective 3 implies X; … therefore, weighing these different perspectives and taking into account how much uncertainty I have about each, X.” Each perspective might represent a relatively crude or limited pattern-match (e.g., “This plan seems similar to other plans that have had bad results”), or a highly complex model; the different perspectives are combined by weighing their conclusions against each other, rather than by constructing a single unified model that tries to account for all available information.

A key difference with “sequence thinking” is the handling of certainty/robustness (by which I mean the opposite of Knightian uncertainty) associated with each perspective. Perspectives associated with high uncertainty are in some sense “sandboxed” in cluster thinking: they are stopped from carrying strong weight in the final decision, even when such perspectives involve extreme claims (e.g., a low-certainty argument that “animal welfare is 100,000x as promising a cause as global poverty” receives no more weight than if it were an argument that “animal welfare is 10x as promising a cause as global poverty”).

Finally, cluster thinking is often (though not necessarily) associated with what I call “regression to normality”: the stranger and more unusual the action-relevant implications of a perspective, the higher the bar for taking it seriously (“extraordinary claims require extraordinary evidence”).

I’ve tried to summarize the difference with the following diagram. Variation in shape size represents variation in the “certainty/robustness” associated with different perspectives, which matters a great deal when weighing different perspectives against each other for cluster thinking, but isn’t an inherent part of sequence thinking (it needs to be explicitly modeled by inserting beliefs such as “The expected value of this action needs to be discounted by 90%”).


I don’t believe that either style of thinking fully matches my best model of the “theoretically ideal” way to combine beliefs (more below); each can be seen as a more intellectually tractable approximation to this ideal.

I believe that each style of thinking has advantages relative to the other. I see sequence thinking as being highly useful for idea generation, brainstorming, reflection, and discussion, due to the way in which it makes assumptions explicit, allows extreme factors to carry extreme weight and generate surprising conclusions, and resists “regression to normality.” However, I see cluster thinking as superior in its tendency to reach good conclusions about which action (from a given set of options) should be taken. I have argued the latter point before, using a semi-formal framework that some have found convincing, some believe has flaws, and many have simply not engaged due to its high level of abstraction. In this post, I attempt a less formalized, more multidimensional, and hopefully more convincing (more “cluster-style”) defense. Following that, I lay out why I think sequence thinking is important and is probably more undersupplied on a global scale than cluster thinking, and discuss how I try to combine the two in my own decision-making. Separately from this post, I have also published a further attempt to formalize the underlying picture of an idealized reasoning process. 

By its nature, cluster thinking is hard to describe and model explicitly. With this post, I hope to reduce that problem by a small amount – to help people understand what is happening when I say things like “I see no problem with your reasoning, but I’m not placing much weight on it anyway” or “I think that factor could be a million times as important as the others, but I don’t want to give it 100x as much attention,” and what they can do to change my mind in such circumstances. (The general answer is to reduce the uncertainty associated with an argument, rather than simply demonstrating that no explicit flaws with the argument are apparent.)

In the remainder of this post, I:

  • Elaborate on my definitions of sequence and cluster thinking.
  • Give a variety of arguments for why one should expect cluster thinking to result in superior decisions.
  • Briefly note and link to a new page (published alongside this post) that attempts to formalize, to some degree, the “idealized thought process” I’m envisioning and how it reproduces key properties of cluster thinking.
  • Lay out some reasons that I find sequence thinking valuable, even if one accepts that cluster thinking results in superior decisions, and defend the idea of switching between “sequence” and “cluster” styles for different purposes. I believe sequence thinking is superior not only for purposes of discussion and reflection (due to its transparency), but also for reaching the sort of deep understanding necessary for intellectual progress, and for generating novel insights that can become overwhelmingly important.
  • Briefly discuss why cluster thinking can be confusing and challenging to deal with in a discussion, and outline how one can model and respond to cluster-thinking-based arguments that are often perceived as “conversation stoppers.”
  • Close with a brief discussion of how I try to combine the two in my own thinking and actions.

Before I continue, I wish to note that I make no claim to originality in the ideas advanced here. There is substantial overlap with the concepts of foxes and hedgehogs (discussed by Philip Tetlock); with the model combination and adjustment idea described by Luke Muehlhauser; with former GiveWell employee Jonah Sinick’s concept of many weak arguments vs. one relatively strong argument (and his post on Knightian uncertainty from a Bayesian perspective); with former GiveWell employee Nick Beckstead’s concept of common sense as a prior; with Brian Tomasik’s thoughts on cost-effectiveness in an uncertain world; with Paul Christiano’s Beware Brittle Arguments post; and probably much more.

Defining Sequence Thinking and Cluster Thinking
Say that we are choosing between two charities: Charity A vaccinates children against rotavirus, and Charity B does basic research aiming to improve the odds of eventual space colonization. Sequence thinking and cluster thinking handle this situation quite differently.

Sequence thinking might look something like:

Charity A spends $A per child vaccinated. Each vaccination reduces the odds of death by B%. (Both A and B can be grounded somewhat in further analysis.) That leaves an estimate of (B/A) lives saved per dollar. I will adjust this estimate down 50% to account for the fact that costs may be understated and evidence may be overstated. I will adjust it down another 50% to account for uncertainties about organizational competence.

Charity B spends $C per year. My best guess is that it improves the odds that space colonization eventually occurs by D%. I value this outcome as the equivalent of E lives saved, based on my views about when space colonization is likely to occur, how many human lives would be possible in this case, and how I value these lives. (C, D, and E can be grounded somewhat in further analysis.) That leaves an estimate of (D*E)/C lives saved per dollar. I will adjust this estimate down 95% to account for my high uncertainty in these speculative calculations. I will adjust it down another 75% to account for uncertainties about organizational competence, which I think are greater for Charity B than Charity A; down another 80% to account for the fact that expert opinion seems to look more favorably on Charity A; and down another 95% to account for the fact that charities such as Charity A generally have a better track record as a class.

After all of these adjustments, Charity B comes out better, so I select that one.
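The arithmetic of this sequence-style comparison can be sketched in a few lines. This is only an illustration: the values chosen for A, B, C, D, and E are hypothetical placeholders (and B and D are written as fractions rather than percentages), while the adjustment percentages are the ones named above.

```python
from math import prod

def sequence_estimate(base_lives_per_dollar, downward_adjustments):
    """Multiply a best-guess estimate by every discount in a single chain."""
    kept_fractions = [1.0 - adj for adj in downward_adjustments]
    return base_lives_per_dollar * prod(kept_fractions)

# Hypothetical placeholder inputs (not estimates from this post).
A = 50.0          # dollars per child vaccinated
B = 0.005         # reduction in odds of death per vaccination
C = 1_000_000.0   # Charity B's spending in dollars per year
D = 1e-9          # improvement in the odds of space colonization
E = 1e18          # value of that outcome, in lives-saved equivalents

# Charity A: two 50% downward adjustments.
charity_a = sequence_estimate(B / A, [0.50, 0.50])

# Charity B: 95%, 75%, 80%, and 95% downward adjustments.
charity_b = sequence_estimate(D * E / C, [0.95, 0.75, 0.80, 0.95])

better = "B" if charity_b > charity_a else "A"
```

With these placeholder inputs, Charity B still comes out ahead: even after four steep discounts, the single very large value E dominates the product.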

Cluster thinking might look something like:

Explicit expected-value calculations [such as the above] imply a strikingly good cost-per-life-saved for Charity A, and I think the estimate is unlikely to be terribly mistaken. That’s a major point in favor of Charity A. Similar calculations imply good cost-per-life-saved for Charity B, but this is a much more uncertain estimate and I don’t put much weight on it. The fact that Charity B comes out ahead even after trying to adjust for other factors is a point in favor of Charity B. In addition, Charity A seems like a better organization than Charity B, and expert opinion seems to favor Charity A, and organizations such as Charity A generally have a better track record as a class, and all of these are signals I have a fair amount of confidence in. Therefore, Charity A has more certainty-weighted factors in its favor than Charity B.
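One crude way to picture this weighing is sketched below. It is a hypothetical formalization, not a definitive one (cluster thinking is hard to model explicitly): each perspective casts a vote for an option, its claimed magnitude is clipped at a cap, and the result is scaled by a robustness score. The function name, the cap, and all numbers are assumptions made for illustration.

```python
def cluster_decision(perspectives, cap=1.0):
    """Sum robustness-weighted votes, clipping each claimed magnitude at `cap`.

    Each perspective is a tuple:
    (favored option, claimed magnitude, robustness between 0 and 1).
    """
    scores = {}
    for option, magnitude, robustness in perspectives:
        weight = robustness * min(magnitude, cap)
        scores[option] = scores.get(option, 0.0) + weight
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical perspectives mirroring the example above.
perspectives = [
    ("A", 1.0, 0.8),     # expected-value estimate for A: strong and fairly robust
    ("B", 5000.0, 0.1),  # expected-value estimate favors B by a lot, but is very uncertain
    ("A", 1.0, 0.6),     # Charity A seems like the better organization
    ("A", 1.0, 0.6),     # expert opinion favors Charity A
    ("A", 1.0, 0.6),     # Charity A's class has the better track record
]

choice, scores = cluster_decision(perspectives)

# Raising the uncertain perspective's claimed magnitude changes nothing,
# because the cap prevents magnitude from buying extra weight.
bigger_claim = [("B", 5e9, 0.1) if p[0] == "B" else p for p in perspectives]
choice_bigger, scores_bigger = cluster_decision(bigger_claim)
```

Under these assumptions Charity A wins, and the outcome moves only if the robustness of the uncertain perspective rises, not if its claimed magnitude does.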

Note that this distinction is not the same as the distinction between explicit expected value and holistic-intuition-based decision-making. Both of the thought processes above involve expected-value calculations; the two thought processes consider all the same factors; but they take different approaches to weighing them against each other. Specifically:

  • Sequence thinking considers each parameter independently and doesn’t do any form of “sandboxing.” So it is much easier for one very large number to dominate the entire calculation even after one makes adjustments for e.g. expert opinion and other “outside views” (such as the track record of the general class of organization). More generally, it seems easier to reach a conclusion that contradicts expert opinion and other outside views using this style. This style also seems more prone to zeroing in on a particular category of charity as most promising: for example, often one’s estimate of the value of space colonization will either be high enough to dominate other considerations or low enough to make all space-colonization-related considerations minor, even after many other adjustments are made.
  • The two have very different approaches to what some call Knightian uncertainty (also sometimes called “model uncertainty” or “unknown unknowns”): the possibility that one’s model of the world is making fundamental mistakes and missing key parameters entirely. Cluster thinking uses several models of the world in parallel (e.g., “Expert opinion is correct”, “The track record of the general class of an organization predicts its success”, etc.) and limits the weight each can carry based on robustness (by which I mean the opposite of Knightian uncertainty: the feeling that a model is robust and unlikely to be missing key parameters); any chain of reasoning involving high uncertainty is essentially disallowed from making too much difference to the final decision, regardless of the magnitude of effect it points to. Sequence thinking involves the use of a single unified framework for decision analysis and by default it treats “50% probability that a coin comes up heads” and “50% probability that Charity B will fail for a reason I’m not anticipating” in fundamentally the same way. When it does account for uncertainty, it’s generally by adjusting particular parameters (for example, increasing “0.00001% chance of a problematic error” to “1% chance of a problematic error” based on the chance that one’s calculations are wrong); after such an adjustment, it uses the “highly uncertain probabilities adjusted for uncertainty” just as it would use “well-defined probabilities,” and does not disallow the final calculation from carrying a lot of weight.

Robustness and uncertainty

For the remainder of this piece, I will use the term robustness to refer to the “confidence/robustness” concept discussed immediately above (and “uncertainty” to refer to its opposite). I’m aware that I haven’t defined the term with much precision, and I think there is substantial room for sharpening its definition. One clarification I would like to make is that robustness is not the same as precision/quantifiability; instead, it is intended to capture something like “odds that my view would remain stable on this point if I were to gain more information, more perspectives, more intelligence, etc.” or “odds that the conclusion of this particular mental model would remain qualitatively similar if the model were improved.”

Regression to normality

A final important concept, which I believe is loosely though not necessarily related, is that of regression to normality: the stranger and more unusual the implications of an argument, the more “robustness” the supporting arguments need to have in order for it to be taken seriously. One way to model this concept is to consider “Conventional wisdom is correct and what seems normal is good” to be one of the “perspectives” or “mental models” weighed in parallel with others. This concept can potentially be modeled in sequence thinking as well, but in practice does not seem to be a common part of sequence thinking.

A couple more clarifications

Note that sequence thinking and cluster thinking converge in the case where one can do an expected-value calculation with sufficiently high robustness. “Outside view” arguments inherently involve a substantial degree of uncertainty (there are plenty of examples of expert opinion being wrong, of longstanding historical trends suddenly ending, etc.) so a robust enough expected-value calculation will carry the decision in both frameworks.

Note also that cluster thinking does not convert “uncertain, speculative probabilities” automatically into “very low probabilities.” Rather, it de-weights the conclusions of perspectives that overall contain a great deal of cumulative uncertainty, so that no matter what conclusion such perspectives reach, the conclusion is not allowed to have much influence on one’s actions.

Summary of properties of sequence thinking and cluster thinking

| | Sequence thinking | Cluster thinking |
| --- | --- | --- |
| Basic structure | Tries to combine all relevant beliefs into a prediction using one model (“If A, B, C, … N, then X”) | Weighs different mental models, each implying its own prediction (“A implies X; B implies ~X; C implies X; … therefore X”) |
| How much can a high-uncertainty parameter affect the conclusion? | One big enough consideration can outweigh all others, even if it’s an uncertain “best guess” | Any conclusion reached using uncertain methods has limited impact on the final decision |
| “Inside views” (laying out a causal chain) vs. “outside views” (expert opinion, “regression to normality,” historical track record of superficially similar decisions, etc.) | No obvious way of integrating inside and outside views; integration is often done via ad hoc adjustments and inside views often end up dominating the decision | High-uncertainty inside views are usually dominated by outside views no matter what conclusions they reach |

Why Cluster Thinking?
When trying to compare two very different options (such as vaccinations and space colonization), it seems at first glance as though sequence thinking is superior, precisely because it allows huge numbers to carry huge weight. The practice of limiting the weight of uncertain perspectives can have strange-seeming results such as (depending on robustness considerations) giving equal weight to “Charity A seems like the better organization” and “Charity B’s goal is 200 billion times as important.” In addition, I find cluster thinking far more difficult to formalize and describe, which can further lower its appeal in public debates about where to give.

Below, I give several arguments for expecting cluster thinking to produce better decisions. It is important to note that I emphasize “better decisions” and not “correct beliefs”: it is often the case that one reaches a decision using cluster thinking without determining one’s beliefs about anything (other than what decision ought to be made). In the example given in the previous section, cluster thinking has not reached a defined conclusion on how likely space colonization is, how valuable space colonization would be, etc. and there are many possible combinations of these beliefs that could be consistent with its conclusion that supporting Charity A is superior. Cluster thinking often ends up placing high weight on “outside view” pattern-matching, and often leads to conclusions of the form “I think we should do X, but I can’t say exactly why, and some of the most likely positive outcomes of this action may be outcomes I haven’t explicitly thought of.”

The arguments I give below are, to some degree, made using different vocabularies and different styles. There is some conceptual overlap between the different arguments, and some of the arguments may be partly equivalent to each other. I have previously tried to use sequence-thinking-style arguments to defend something similar to cluster thinking (though there were shortcomings in the way I did so); here I use cluster-thinking-style arguments.

Sequence thinking is prone to reaching badly wrong conclusions based on a single missing, or poorly estimated, parameter

Sequence-style reasoning often involves a long chain of propositions that all need to be reasonable for the conclusion to hold. As an example, Robin Hanson lays out 10 propositions that cumulatively imply a decision to sign up for cryonics, and believes each to have probability 50-80%. However, if even a single one ought to have been assigned a much lower probability (e.g., 10^-5) – or if he’s simply failed to think of a missing condition that has low probability – the calculation is completely off.
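To see how fragile such a chain is, consider a minimal sketch with ten hypothetical probabilities in the 50-80% range. These are illustrative numbers, not Robin Hanson’s actual estimates, and the choice of which step to correct is arbitrary.

```python
from math import prod

# Hypothetical probabilities for a ten-step chain, each between 50% and 80%.
chain = [0.8, 0.7, 0.6, 0.5, 0.8, 0.7, 0.6, 0.5, 0.8, 0.7]
p_as_estimated = prod(chain)

# The same chain, except that one step "ought" to have been 10^-5.
corrected = chain.copy()
corrected[3] = 1e-5
p_corrected = prod(corrected)

# How far off the original conclusion was, as a multiplicative factor.
overestimate_factor = p_as_estimated / p_corrected

print(f"as estimated: {p_as_estimated:.4f}")
print(f"corrected:    {p_corrected:.2e}")
print(f"overestimated by a factor of about {overestimate_factor:,.0f}")
```

Replacing a single 50% step with 10^-5 shrinks the final probability by a factor of 50,000, while the other nine estimates are untouched.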

In general, missing parameters and overestimated probabilities will lead to overestimating the likelihood that actions play out as hoped, and thus overestimating the desirability of deviating from “tried and true” behavior and behavior backed by outside views. Correcting for missed parameters and overestimated probabilities will be more likely to cause “regression to normality” (and to the predictions of other “outside views”) than the reverse.

Cluster thinking is more similar to empirically effective prediction methods

Sequence thinking presumes a particular framework for thinking about the consequences of one’s actions. It may incorporate many considerations, but all are translated into a single language, a single mental model, and in some sense a single “formula.” I believe this is at odds with how successful prediction systems operate, whether in finance, software, or domains such as political forecasting; such systems generally combine the predictions of multiple models in ways that purposefully avoid letting any one model (especially a low-certainty one) carry too much weight when it contradicts the others. On this point, I find Nate Silver’s discussion of his own system and the relationship to the work of Philip Tetlock (and the related concept of foxes vs. hedgehogs) germane:

Even though foxes, myself included, aren’t really a conformist lot, we get worried anytime our forecasts differ radically from those being produced by our competitors.

Quite a lot of evidence suggests that aggregate or group forecasts are more accurate than individual ones … “Foxes often manage to do inside their heads what you’d do with a whole group of hedgehogs,” Tetlock told me. What he means is that foxes have developed an ability to emulate this consensus process. Instead of asking questions of a whole group of experts, they are constantly asking questions of themselves. Often this implies that they will aggregate different types of information together – as a group of people with different ideas about the world naturally would – instead of treating any one piece of evidence as though it is the Holy Grail. (The Signal and the Noise, pg 66)

In sequence thinking, a single large enough number can dominate the entire calculation. In consensus decision making, a person claiming radically larger significance for a particular piece of the picture would likely be dismissed rather than given special weight; in a quantitative prediction system, a component whose conclusion differed from others’ by a factor of 10^10 would be likely to be the result of a coding error, rather than a consideration that was actually 10^10 times as important as the others. This comes back to the points made by the above two sections: cluster thinking can be superior for its tendency to sandbox or down-weight, rather than linearly up-weight, the models with the most extreme and deviant conclusions.
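The contrast between linear up-weighting and sandboxing can be illustrated with a toy ensemble. The model outputs below are hypothetical, and the median and clipped mean are stand-ins I have chosen for “not letting one model carry too much weight”; they are not a description of how any particular forecasting system works. The clipping helper assumes positive values.

```python
from statistics import mean, median

# Hypothetical outputs from five models estimating the same positive quantity.
# One model disagrees with the rest by a factor of roughly 10^10.
model_outputs = [1.2, 0.9, 1.1, 1.0, 1.0e10]

# Linear combination: the extreme model dominates the result.
linear = mean(model_outputs)

# Median: the extreme model counts as one vote and no more.
robust = median(model_outputs)

def clipped_mean(values, max_ratio=10.0):
    """Average after clipping each value to within max_ratio of the median."""
    mid = median(values)
    low, high = mid / max_ratio, mid * max_ratio
    return mean(min(max(v, low), high) for v in values)

clipped = clipped_mean(model_outputs)
```

The plain mean ends up in the billions, whereas the median and the clipped mean stay close to what the other four models say, which is the behavior the paragraph above attributes to consensus processes and quantitative prediction systems.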

A cluster-thinking-style “regression to normality” seems to prevent some obviously problematic behavior relating to knowably impaired judgment

One thought experiment that I think illustrates some of the advantages of cluster thinking, and especially cluster thinking that incorporates regression to normality, is imagining that one is clearly and knowably impaired at the moment (for example, drunk), and contemplating a chain of reasoning that suggests high expected value for some unusual and extreme action (such as jumping from a height). A similar case is that of a young child contemplating such a chain of reasoning. In both cases, it seems that the person in question should recognize their own elevated fallibility and take special precautions to avoid deviating from “normal” behavior, in a way that cluster thinking seems much more easily able to accommodate (by setting an absolute limit to the weight carried by an uncertain argument, such that regression to normality can override it no matter what its content) than sequence thinking (in which any “adjustments” are guessed at using the same fallible thought process).

The higher one’s opinion of one’s own rationality relative to other people, the less appropriate the above analogy becomes. But it can be easy to overestimate one’s own rationality relative to other people (particularly when one’s evidence comes from analyzing people’s statements rather than e.g. their success at achieving their goals), and some component of “If I’m contemplating a strange and potentially highly consequential action, I should be wary and seek robustness (not just magnitude) in my justification” seems appropriate for nearly everyone.

Sequence thinking seems to tend toward excessive comfort with “ends justify the means” type thinking

Various historical cases of violent fanaticism seem somewhat fairly modeled as sequence thinking gone awry: letting one’s decisions become dominated by a single overriding concern, which then justifies actions that strongly violate many other principles. (For example, justifying extremely damaging activities based on Marxist reasoning.) Cluster thinking is far from a complete defense against such things: the robustness of a perspective (e.g., a Marxist perspective) can itself be overestimated, and furthermore a “regression to normality” can encourage conformism with highly problematic beliefs. However, the basic structure of cluster thinking does set up more hurdles for arguments about “the ends” (large-magnitude but speculative down-the-line outcomes) to justify “the means” (actions whose consequences are nearer and clearer).

I believe that invoking “the ends justify the means” (justifying near and clear harms by pointing to their further-out effects) is sometimes the right thing to do, and is sometimes not. Specifically, I think that the worse the “means,” the more robust (and not just large in claimed magnitude) one’s case for “the ends” ought to be. Cluster thinking seems to accommodate this view more naturally than sequence thinking.

(Related piece by Phil Goetz: Reason as memetic immune disorder)

When uncertainty is high, “unknown unknowns” can dominate the impacts of our actions, and cluster thinking may be better suited to optimizing “unknown unknown” impacts.

Sequence thinking seems, by its nature, to rely on listing the possible outcomes of an action and evaluating the action according to its probability of achieving these outcomes. I find sequence thinking especially problematic when I specifically expect the unexpected, i.e., when I expect the outcome of an action to depend primarily on factors that haven’t occurred to me. And I believe that the sort of outside views that tend to get more weight in cluster thinking are often good predictors of “unknown unknowns.” For example, obeying common-sense morality (“ends don’t justify the means”) heuristics seems often to lead to unexpected good outcomes, and contradicting such morality seems often to lead to unexpected bad outcomes. As another example, expert opinion often seems a strong predictor of “which way the arguments I haven’t thought of yet will point.”

It’s hard to formalize “expecting unknown unknowns to be the main impact of one’s action” in a helpful way within sequence thinking, but it’s a fairly common situation. In particular, when it comes to donations and other altruistic actions, I expect the bulk of the impact to come from unknown unknown factors including flow-through effects.

Broad market efficiency

Another way of thinking about the case for cluster thinking is to consider the dynamics of broad market efficiency. As I stated in that post:

the more efficient a particular market is, the higher the level of intensity and intelligence around finding good opportunities, and therefore the more intelligent and dedicated one will need to be in order to consistently “beat the market.” The most efficient markets can be consistently beaten only by the most talented/dedicated players, while the least efficient ones can be beaten with fairly little in the way of talent and dedication.

When one is considering a topic or action that one knows little about, one should consider the broad market to be highly efficient; therefore, any deviations from the status quo that one’s reasoning calls for are unlikely to be good ideas, regardless of the magnitude of benefit that one’s reasoning ascribes to them. (An amateur stock trader should generally assume his or her opinions about stocks to be ill-founded and to have zero expected value, regardless of how strong the “inside view” argument seems.) By contrast, when one is considering a topic or action that one is relatively well-informed and intelligent about, contradicting “market pricing” is not as much of a concern.

This is a special case of “as robustness falls, the potential weight carried by an argument diminishes – no matter what magnitudes it claims – and regression to normality becomes the stronger consideration.”

Sequence thinking seems to over-encourage “exploiting” as opposed to “exploring” one’s best guesses

I expect this argument to be least compelling to most people, largely because it is difficult for me to draw convincing causality lines and give convincing examples, but to me it is a real argument in favor of cluster thinking. It seems to me that people who rely heavily on sequence thinking have a tendency to arrive at a “best guess” as to what cause/charity/etc. ought to be prioritized, and to focus on taking the actions that are implied by their best guess (“exploiting”) rather than on actions likely to lead to rethinking their best guess (“exploring”). I would guess that this is because:

  • To the extent that sequence thinking highlights opportunities for learning, it tends to focus on a small number of parameters that dominate the model, and these parameters are often the least tractable in terms of learning more (for example, the value of space colonization). It thus seems often to encourage continued debate on largely intractable topics. Cluster thinking highlights many consequential areas of uncertainty and promises returns to clearing up any of them, leading to more traction on learning and more reduction in “unknown unknowns” over time.
  • Sequence thinking has a tendency to make different options seem to differ more in value, while cluster thinking tends to make it appear as though any high-uncertainty decision is a “close one” that can be modified with more learning. I believe the latter tends to be a more helpful picture.
  • Cluster thinking tends to have heavier penalties for uncertainty, due to its feature of not allowing the magnitude of a model parameter to overwhelm adjustments for uncertainty. When people are promoting speculative arguments, having to contend with and persuade “cluster thinkers” seems to cause them to do more investigation, do more improving of their arguments, and generally do more to increase the robustness of their claims.

In the domains GiveWell focuses on, it seems that learning more over time is paramount. We feel that much of the effective altruist community tends to be quicker than we are to dismiss large areas as unworthy of exploration and to focus in on a few areas.

Formal framework reproducing key qualities of cluster thinking

Cluster thinking, despite its seeming inelegance, is in some ways a closer match to what I see as the “idealized” thought process than sequence thinking is. On a separate page, I have attempted to provide a formal framework describing this “idealized” thought process as I see it, and how this framework deals with extreme uncertainty of the kind we often encounter in making decisions about where to give.

According to this framework, formally combining different mental models of the world has a tendency to cap the decision-relevance of highly uncertain lines of reasoning – the same tendency that distinguishes cluster thinking from sequence thinking. For more, see my full writeup on this framework, which I have confined to another page because it is long and highly abstract.

Writeup on modeling extreme model uncertainty
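To make the "capping" tendency concrete, here is a minimal sketch using a textbook normal-normal Bayesian update. It is not the framework from the writeup linked above; it is a stand-in built on two labeled assumptions: the prior is centered on 0 (standing for "normal" outcomes), and the error in an estimate grows in proportion to the size of its claim (the hypothetical `relative_error` parameter). The function name and the default values are also invented for the example.

```python
def adjusted_estimate(claim, prior_sd=1.0, relative_error=1.0):
    """Posterior mean under a normal prior centered on 0 and a normally
    distributed estimate whose standard error scales with the claim."""
    prior_var = prior_sd ** 2
    noise_var = (relative_error * claim) ** 2  # assumption: error scales with claim
    weight_on_claim = prior_var / (prior_var + noise_var)
    return weight_on_claim * claim


# Hypothetical claims of increasing size.
for claim in (0.5, 1.0, 10.0, 1e10):
    print(claim, adjusted_estimate(claim))
```

Under these assumptions the adjusted estimate never exceeds `prior_sd / (2 * relative_error)`, and beyond that peak a larger claim yields a smaller adjusted estimate. A different error model would give a different curve, so this should be read as one way such a cap can arise rather than as a general result.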


Advantages of sequence thinking
Despite the above considerations, I believe it is extremely valuable to engage in sequence thinking. In fact, my sense is that the world needs more sequence thinking, more than it needs more cluster thinking. While I believe that cluster thinking is more prone to making the correct decision between different possible (pre-specified) actions, I believe that sequence thinking has other benefits to offer when used appropriately.

To be clear, in this section when I say “engaging in sequence thinking” I mean “working on generating and improving chains of reasoning along the lines of explicit expected-value calculations,” or more generally, “Trying to capture as many relevant considerations as possible in a single unified model of the world.” Cluster thinking includes giving some consideration and weight to the outcomes of such exercises, but does not include generating them. Many of the advantages I name have to do with the tendency of sequence thinking to underweight, or ignore, “outside views” and crude pattern-matches such as historical patterns and expert opinion, as well as “regression to normality”; sequence thinking can make adjustments for such things, but I generally find its method for doing so unsatisfactory, and feel that its greatest strengths come when it does not involve such adjustments.

Sequence thinking can generate robust conclusions that then inform cluster thinking

There are times when a long chain of reasoning can be constructed that has relatively little uncertainty involved (it may involve many probabilistic calculations, but these probabilities are well-understood and the overall model is robust).

The extreme case of this is in some science and engineering applications, when sequence thinking is all that is needed to reach the right conclusion (I might say cluster thinking “reduces to” sequence thinking in these cases, since the sequence-thinking perspective is so much more robust than all other available perspectives).

A less extreme case is when someone simply puts a great deal of work into doing as much reflection and investigation as they can of the parameters in their model, to the point where they can reasonably be assumed to have relatively little left to learn in the short to medium term. People who have reached such status have, in my opinion, good reason to assign much less uncertainty to their sequence-thinking-generated views and to place much more weight on their conclusions. (Still, even these people should often assign a substantial amount of uncertainty to their views.)

There are many times when I have underestimated the weight I ought to place on a sequence-thinking argument because I underestimated how much work had gone into investigating and reflecting on its parameters. I have been initially resistant to many ideas that I now regard as extremely important, such as the greater cost-effectiveness of developing-world as opposed to developed-world aid, the potential gains to labor mobility, and views of “long-term future” effective altruists on the most worrying global catastrophic risks, all of which appeared to me at first to be based on naïve chains of logic but which I now believe to have been more thoroughly researched – and to have less uncertainty around key parameters – than I had thought.

Sequence thinking is more favorable to generating creative, unconventional, and nonconformist ideas

I often feel that people in the effective altruist community do too little regression to normality, but I believe that most people in the world do far too much. Any thinking style that provides a “regression to normality”-independent way of reaching hypotheses has major advantages.

Sequence thinking provides a way of seeing where a chain of reasoning goes when historical observations, conventional wisdom, expert opinion and other “outside views” are suspended. As such, it can generate the kind of ideas that challenge long-held assumptions and move knowledge forward (the cases I list in the immediately previous section are some smaller-scale examples; many scientific breakthroughs seem to fit in this category as well). Sequence thinking is also generally an important component in the formation of expert opinion (more below), which is usually a major input into cluster thinking.

Sequence thinking is better-suited to transparency, discussion and reflection

I generally find it very hard to formalize and explain what “outside views” I am bringing to a decision, how I am weighing them against each other, and why I have the level of certainty I do in each view. Many of my outside views consist of heuristics (i.e., “actions fitting pattern X don’t turn out well”) that come partly from personal experiences and observations that are difficult to introspect on, and even more difficult to share in ways that would let others comprehend and informedly critique them.

Sequence thinking tends to consist of breaking a decision down along lines that are well-suited to communication, often in terms of a chain of causality (e.g., “This action will lead to A, which will lead to B, which will lead to outcome-of-interest C if D and E are also true”). This approach can be clumsy at accommodating certain outside views that don’t necessarily apply to a particular sub-prediction (for example, many heuristics are of the form “actions fitting pattern X don’t turn out well for reasons that are hard to visualize in advance”). However, sequence thinking usually results in a chain of reasoning that can be explicitly laid out, reflected on, and discussed.

Consistent with this, I think the cost-effectiveness analysis we’ve done of top charities has probably added more value in terms of “causing us to reflect on our views, clarify our views and debate our views, thereby highlighting new key questions” than in terms of “marking some top charities as more cost-effective than others.” I have often been pushed, by people who heavily favor sequence thinking, to put more work into clarifying my own views, and I’ve rarely regretted doing so.

Sequence thinking can lead to deeper understanding

Partly because it is better-suited to explicit discussion and reflection, and partly because it tends to focus on chains of causality without deep integration of poorly-understood but empirically observed “outside view” patterns, sequence thinking often seems necessary in order to understand a particular issue very deeply. Understanding an issue deeply, to me, includes (a) being able to make good predictions in radically unfamiliar contexts (thus, not relying on “outside views” that are based on patterns from familiar contexts); (b) matching and surpassing the knowledge of other people, to the point where “broad market efficiency” can be more readily dismissed.

In my view, people who rely heavily on sequence thinking often seem to have inferior understanding of subjects they aren’t familiar with, and to ask naive questions, but as their familiarity increases they eventually reach greater depth of understanding; by contrast, cluster-thinking-reliant people often have reasonable beliefs even when knowing little about a topic, but don’t improve nearly as much with more study. At GiveWell, we often use a great deal of sequence thinking when exploring a topic (less so when coming to a final recommendation), and often feel the need to apologize in advance to the people we interview for asking naïve-seeming questions.

In order to reap this benefit of sequence thinking, one must do a good job stress-testing and challenging one’s understanding, rather than being content with it as it is. This is where the “incentives to investigate” provided by cluster thinking can be crucial, and this is why (as discussed below) my ideal is to switch between the two modes.

Other considerations

Sequence thinking can be a good antidote to scope insensitivity, since it translates different factors into a single framework in which they can be weighed against each other. I do not believe scope insensitivity is the only, or most important, danger in making giving decisions, but I do find sequence thinking extremely valuable in correcting for it.

Many seem to believe that sequence thinking is less prone to various other cognitive biases, and in general that it represents an antidote to the risks of using “intuition” or “system 1.” I am unsure of how legitimate this view is. When making decisions with high levels of uncertainty involved, sequence thinking is (like cluster thinking) dominated by intuition. Many of the most important parameters in one’s model or expected-value calculation must be guessed at, and it often seems possible to reach whatever conclusion one wishes. Sequence thinking often encourages one to implicitly trust one’s intuitions about difficult-to-intuit parameters (e.g., “value of space colonization”) rather than trusting one’s more holistic intuitions about the choice being made – not necessarily an improvement, in my view.

Cluster thinking and argumentation
I’ve argued that cluster thinking is generally superior for reaching good conclusions, but harder to describe and model explicitly. While I believe transparency of thought is useful and important, it should not be confused with rationality of thought.

I’ve sometimes observed an intelligent cluster thinker, when asked why s/he believes something, give a single rather unconvincing “outside view” related reason. I’ve suspected, in some such cases, that the person is actually processing a large number of different “outside views” in a way that is difficult to introspect on, and being unable to cite the full set of perspectives with weights, returns a single perspective with relatively (but not absolutely) high weight. I believe this dynamic sometimes leads sequence thinkers to underestimate cluster thinkers.

One of my hopes for this piece is to help people better understand cluster thinking, and in particular, how one can continue to make progress in a discussion even after a seemingly argument-stopping comment like “I see no problem with your reasoning, but I’m not placing much weight on it anyway” is made.

In such a situation, it is important to ask not just whether there are explicit problems with one’s argument, but how much uncertainty there is in one’s argument (even if such uncertainty doesn’t clearly skew the calculation in one direction or another) and whether other arguments, using substantially different mental models, give the same conclusion. When engaging with cluster thinking, improving one’s justification of a probability or other parameter – even if it has already been agreed to by both parties as a “best guess” – has value; citing unrelated heuristics and patterns has value as well.

To give an example, many people are aware of the basic argument that donations can do more good when targeting the developing-world poor rather than the developed-world poor: the developing-world poor have substantially worse incomes and living conditions, and the interventions charities carry out are commonly claimed to be substantially cheaper on a per-person or per-life-saved basis. However, many (including myself) take these arguments more seriously on learning things like “people I respect mostly agree with this conclusion”; “developing-world charities’ activities are generally more robustly evidence-supported, in addition to cheaper”; “thorough, skeptical versions of ‘cost per life saved’ estimates are worse than the figures touted by charities, but still impressive”; “differences in wealth are so pronounced that ‘hunger’ is defined completely differently for the U.S. vs. developing countries”; “aid agencies were behind undisputed major achievements such as the eradication of smallpox”; etc. The function of such findings isn’t necessarily to address specific objections to the basic argument, but rather to put its claims on more solid footing – to improve the robustness of the argument.

The balance I try to strike
As implied above, I believe sequence thinking is valuable for idea generation, reflection and discussion, while cluster thinking is best for making the final choice between options. I try to use the two types of thinking accordingly. GiveWell often puts a great deal of work into understanding the causal chain of a charity’s activities, estimating the “cost per life saved,” etc., while ultimately being willing to accept some missing links and place limited weight on these things when it comes to final recommendations.

However, there are also times in which I let sequence thinking dominate my decisions (not just my investigations), for the following reasons.

One of the great strengths of sequence thinking is its ability to generate ideas that contradict conventional wisdom and easily observable patterns, yet have some compelling logic of their own. For brevity, I will call these “novel ideas” (though a key aspect of such ideas is that they are not just “different” but also “promising”). I believe that novel ideas are usually flawed, but often contain some important insight. Because the value of new ideas is high, promoting novel ideas – in a way that is likely to lead to stress-testing them, refining them, and ultimately bringing about more widespread recognition of their positive aspects – has significant positive expected value. At the same time, a given novel idea is unlikely to be valid in its current form, and quietly acting on it (when not connected to “promoting” it in the marketplace of ideas, leading to its refinement and/or widespread adoption) may have negative expected value.

One example of this “novel ideas” dynamic is the charities recommended by GiveWell in 2006 or 2007: GiveWell at that time had a philosophy and methodology with important advantages over other resources, but it was also in a relatively primitive form and needed a great deal of work. Supporting GiveWell’s recommendations of that time – in a way that could be attributed to GiveWell – led to increasing attention and influence for GiveWell, which was evolving quickly and becoming a more sophisticated and influential resource. However, if not for GiveWell’s ongoing evolution, supporting its recommended charities would not have had the sort of expected value that it naively appeared to (according to our over-optimistic “cost per life saved” figures of the time). (Note that this paragraph is intended to give an example of the “novel ideas” dynamic I described, but does not fit the themes of the post otherwise. Our recommendations weren’t purely a product of sequence thinking but rather of a combination of sequence and cluster thinking.)

For me, a basic rule of thumb is that it’s worth making some degree of bet on novel ideas, even when the ideas are likely flawed, when it’s the kind of bet that (a) facilitates the stress-testing, refinement, and growing influence of these ideas; (b) does not interfere with other, more promising bets on other novel ideas. So it makes sense to start, run, or support an organization based on a promising but (because dependent on sequence thinking, and in tension with various outside views) likely flawed idea … if (a) the organization is well-suited to learning, refining, and stress-testing its ideas and growing its influence over time; (b) starting or supporting the organization does not interfere with one’s support of other, more promising novel ideas. It makes sense to do so even when cluster thinking suggests that the novel idea’s conclusions are incorrect, to the extent that quite literal endorsement of the novel idea would be “wrong.”

When we started GiveWell, I believed that we were likely wrong about many of the things that seemed to us from an inside, sequence-thinking view to be true, but that it was worth acting on these things anyway, because of the above dynamic. (I am referring more to our theories about how we could influence donors and have impact than to our theories about which charities were best, which we tried to make as robust as we could, while realizing that they were still quite uncertain.) We believed we were onto some underappreciated truth, but that we didn’t yet know what it was, and were “provisionally accepting” our own novel ideas because we could afford to do so without jeopardizing our overall careers and because they seemed to be the novel ideas most worth making this sort of bet on. We expected our ideas to evolve, and rather than taking them as true we tried to stress-test them by examining as many different angles as we could (for example, visiting a recommended charity’s work in the field even though we couldn’t say in advance which aspect of our views this would affect). There were other novel ideas that we found interesting as well, but incorporating them too deeply into our work (or personal lives) would have interfered with our ability to participate in this dynamic.

The above line of argument justifies behavior that can seem otherwise strange and self-contradictory. For example, it can justify advocating and acting to some degree on a novel idea, while not living one’s life fully consistently with this idea (e.g., working to promote Peter Singer’s ideas about the case for giving more generously, while not actually giving as much as his ideas would literally imply one should). When considering possible actions including “avoiding factory-farmed meat,” “giving to the most apparently cost-effective charity,” etc., I am always asking not only “Does this idea seem valid to me?” but “Am I acting on this idea in a way that promotes it and facilitates its evolution, and does not interfere with my promotion of other more promising ideas?” As such, I tend to change my own behavior enough to reap a good portion of the benefits of supporting/promoting an idea but not as much as literal acceptance of the idea would imply. I have a baseline level of stability and conservatism in the way I live my life, which my bets on novel ideas are layered on top of in a way that fits well within my risk tolerance.

Promoting a sequence-thinking-based idea in a cluster-thinking-based world leads to examining the idea from many angles, looking for many unrelated (or minimally related) arguments in its favor, and generally working toward positive evolution of the idea. The ideal, from my perspective, is to use cluster thinking to evaluate the ultimate likely validity of ideas, while retaining one’s ability to (without undue risk) promote and get excited about sequence-thinking-generated ideas that may eventually change the world.

For one with few resources for idea promotion and exploration, this may mean picking a very small number of bets. For one who expects to influence substantial resources – as GiveWell currently does – it is rational to simultaneously support/promote work in multiple different causes, each of which could be promising under certain assumptions and parameters (regarding how much value we should estimate in the far future, how much suffering we should ascribe to animals, etc.), even if the assumptions and parameters that would support one cause contradict those that would support another. When choosing between causes to support, cluster thinking – rather than choosing one’s best-guess for each parameter and going from there – is called for.

Reflections on a Site Visit in Myanmar (Burma)

This is a cross-post from Good Ventures’ blog Give & Learn. It was co-authored by Cari Tuna, Good Ventures Co-Founder, and Natalie Crispin, GiveWell Research Analyst.

We recently traveled to Myanmar to visit a project Good Ventures is supporting to help prevent the spread of drug-resistant malaria. This post shares some observations from the trip and general thoughts about the value of site visits versus other ways of learning about the impact of one’s giving.

About the project

To recap, this project aims to rapidly replace one type of malaria treatment in Myanmar (AMT) with another (ACT) to reduce the risk of drug resistance, which could have a devastating effect on global malaria control efforts if left unchecked.

The project is being carried out by Population Services International (PSI) with support from the Gates Foundation, the UK Department for International Development (DFID) and Good Ventures. We contributed to the project as part of our effort to learn from other major funders through co-funding.

The project involves selling subsidized ACTs to the largest pharmaceutical distributor in Myanmar, sending “product promoters” to private drug providers to promote the appropriate use of ACTs, and piloting a project to promote the use of rapid diagnostic tests (RDTs) among such providers. It also involves encouraging the Myanmar government to follow through on its commitment to end the importation of AMTs.

The project got off to a slow start in 2012, but progress has accelerated over the course of 2013. The latest data show that the ratio of AMT to ACT in the market has decreased from about 20:1 in 2012 to about 1:2 in 2013. The subsidy has been largely passed on to patients: the price of a full course of ACT is less than or equal to the price of a typical (partial) dose of AMT in 94% of drug outlets. Due to quickly declining malaria rates in the country, PSI Myanmar estimates that the project has enough funding to continue into 2016, 18 months longer than originally planned. (A more detailed update is forthcoming.)
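For readers who find market shares easier to compare than ratios, the short sketch below converts the two AMT-to-ACT ratios quoted above into AMT's share of the two treatments combined. The function name is made up for the example, and because the source ratios are themselves approximate ("about 20:1", "about 1:2"), the resulting percentages are rough as well.

```python
def amt_share(amt_parts, act_parts):
    """AMT's share of AMT and ACT combined, given an AMT:ACT ratio."""
    return amt_parts / (amt_parts + act_parts)


print(f"2012: {amt_share(20, 1):.1%}")  # 2012: 95.2%
print(f"2013: {amt_share(1, 2):.1%}")   # 2013: 33.3%
```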

Observations on the site visit

We spent five days with PSI Myanmar for the annual donor review of the project, which was also attended by representatives of the Gates Foundation and DFID. We spent the first day in PSI Myanmar’s office in Yangon reviewing the project’s progress and potential risks to its continued success and sustainability. We spent the next three days traveling through Mon State and interviewing people working at various levels of the supply chain, including itinerant drug vendors, village pharmacists, drug wholesalers, and representatives of the large pharmaceutical distributor, AA Medical Products. (We’ll refer to these three days as the “field visit.”) We spent the last day in Yangon reflecting on the visit with PSI and DFID. We’re deeply grateful to PSI Myanmar for hosting us and organizing the trip.

We’ll post detailed notes from our travels soon. In the meantime, we wanted to share some miscellaneous observations and reflections:

  • A major goal of ours in co-funding this project — and attending the annual review — was to learn about how funders such as the Gates Foundation and DFID operate.
    • These funders took a similar approach to the site visit as we would have taken on our own. They asked everyone we interviewed copious questions, both directly and indirectly related to the project. They tended to avoid leading questions and tried to ask the same questions in multiple ways, in order to increase their odds of getting an unbiased view of the situation.
    • We were impressed by Louise Mellor from DFID and her efforts to establish a positive rapport with the people we interviewed. She encouraged everyone in our entourage to introduce themselves by name at each stop. After we had asked our questions, she often offered the interviewees words of encouragement or thanks for the role they were playing to help prevent drug resistance. (Previously, Louise told me that establishing a positive rapport is important for helping people feel comfortable speaking freely.) We plan to be more intentional about establishing a positive rapport with interviewees on future site visits.
    • We were also impressed by the Gates Foundation’s Tom Kanyok, who volunteered to take a rapid diagnostic test (RDT) at a general store we visited in Mon State. The test involved the general store owner pricking his finger for a blood sample. It was helpful to see an RDT performed live.
    • DFID’s process for evaluating the project’s progress involved both a formal review of pre-determined, quantified indicators and unstructured reflection on what could be improved. At the end of our visit, DFID staff provided PSI with both overall feedback on how they believed the project was going (“a very positive review”) and some recommendations for PSI to consider over the next few months, such as studying the feasibility of reducing the ACT subsidy, exploring the use of local languages on drug packages, and providing more information to donors on issues raised during the review. DFID will also produce a formal report on the review (update: recently published here). In addition to DFID staff who are based in Myanmar, a DFID economic advisor, who is based in London and not normally involved in the project, attended the annual review, in order to provide an outside perspective.
    • DFID staff noted that while field observations cannot be treated as representative evidence, site visits help to provide a reality check on one’s assumptions, surface potential problems, and allow for relationship building and evaluation of project management. They said DFID is trying to incorporate more site visits into its project monitoring.
    • We were struck by how often Tom Kanyok from the Gates Foundation raised the topic of malaria elimination and eradication. (Representatives of the Gates Foundation previously told us that malaria eradication is a focus of the foundation. PSI told us that governments in the region, with support from the World Health Organization, have adopted the strategy of eliminating P. falciparum malaria, and believe it's the only long-term solution for antimalarial drug resistance.)
  • ACTs were available in all but one of the 10 drug outlets we visited. The shopkeeper at the outlet that was out of stock told us that she had sold the last pack the previous day. At each outlet, we asked the provider whether he or she sold other antimalarial drugs, and we looked for such drugs on display. No provider said they carried oral AMTs. At one wholesale outlet, the shop owner told us that he did not sell oral AMTs, but we subsequently found them on display. The shop owner then became worried that he was in trouble. (It’s not against the law to sell oral AMTs, only to import them, but there seemed to be rumors that selling them was banned as well. What’s more, we were traveling with a representative of the Myanmar Ministry of Health, which could have increased the shop owner’s concern.)
  • From what we saw, the pilot program to encourage use of rapid diagnostic tests (RDTs) appeared to be going well. All of the pilot participants with whom we spoke reacted positively to questions about the program, and the shopkeeper who performed a demonstration test for us did so competently. She had completed only a few tests before our visit.
  • One issue that emerged as a concern to donors during the field visit — but wasn’t emphasized as a major risk to the project’s success during PSI’s earlier presentation — was the upcoming expiration of ACTs in the market. At the start of the project, PSI had to place its first order of ACTs with limited information about the size of the market. As a result, and because of declining malaria rates in Myanmar, PSI ordered more ACTs than have been needed. It had stock that would expire, unused, in March and April 2014. At the time of the site visit, PSI was waiting to begin selling a new batch of drugs, and replacing expired drugs in the market with new drugs, to minimize wastage. The issue surfaced during the field visit when a DFID participant noticed that all the ACTs we encountered in the market were set to expire in March or April. Donors raised the concern that PSI may be waiting too long to replace the drugs and that expired drugs may trickle down to harder-to-reach areas as a result. This prompted a conversation that led PSI to begin the replacement process slightly earlier than planned, after obtaining approval from the donors. PSI notes that it's not alone in having expired drug stock — all agencies in Myanmar, including the Ministry of Health, are currently seeing rapidly decreasing disease transmission resulting from aggressive control efforts, which has led them to need fewer drugs than originally anticipated. PSI also notes that overstock is greatly preferable to understock, which could lead to market demand for sub-standard products.
  • Because we were traveling with a large entourage (around 10 people) and making mostly pre-scheduled stops, it was difficult to know whether we were seeing an unbiased picture of circumstances on the ground. We do note that the purpose of the field visit was not to assess the project’s overall success, which is more appropriately done by looking at representative monitoring data rather than a small sample of anecdotal observations.
    • For instance, we were struck by the abundance of promotional materials for the PSI-subsidized ACTs on display at the outlets we visited, including at outlets where providers reported relatively low malaria caseloads. Many of the materials looked relatively new. A PSI representative assured us that extra materials were not hung up in anticipation of our visit. This may well have been the case; PSI staff said they had recently undertaken a big marketing push.
  • We were struck by the complexity of the operating environment in which the project is taking place. This served as a general reminder of the large number of ways in which a project can fall short of its objectives, including ways that would be difficult for a donor who is not highly informed to predict. In this case, complexities of the operating environment include violent conflict in some parts of the country where drug resistance is of particular concern, conflicting policies in different government departments, dissatisfaction with the goals of the project among some government physicians, a second common malaria parasite that is treated with a different drug regimen, and working conditions and vector behavior that make certain interventions, such as bednets, somewhat less effective for at-risk populations. 

General thoughts on the value of site visits

As we research potential focus areas, we’re often advised to “get out into the field” in order to understand the work we’re funding, or considering funding, better. We agree with the notion that site visits can be valuable for learning. That said, we’ve found that such visits are helpful for some — but not all — types of learning. They are not a substitute for desk research, though they can complement such research in important ways. Site visits also take a great deal of time to conduct, and hosting them requires a significant investment on the part of the nonprofit. These are trade-offs we take into account when deciding how to prioritize our time to maximize our learning.

What are site visits good for?

  • Field observations can provide a valuable reality check on our assumptions.
  • They can be helpful for raising questions and surfacing potential problems that may not have appeared important based on reviewing formal reports and monitoring data. In this case, for example, the issue of expiring drugs became a greater concern for donors because we encountered them in the market.
  • Field visits often involve spending an extended amount of time with project managers and staff — and in this case, other donors — including unstructured time on the road and eating meals, in addition to scheduled stops. This time is not only helpful for relationship building; it also allows us to ask many more questions than we could over the phone or by e-mail. We’ve consistently found this to be one of the biggest benefits to participating in such visits.
  • Field visits help us to develop a fuller picture of the context in which a nonprofit is working, including complexities that may be hard to appreciate from afar.
  • We’ve found that communicating about a project is easier after spending a lot of time immersed in the details of the work.
  • Lastly, visiting with beneficiaries of a project, as we did in our 2012 visit to GiveDirectly in Western Kenya, can lead to a greater emotional connection to the work.

What aren’t site visits good for?

In most cases, site visits do not seem to be an appropriate tool for learning how a project is going in general, because they only allow for a small number of anecdotal observations, which are not necessarily representative of the situation overall. Furthermore, despite our best efforts not to ask leading questions or prime interviewees to respond in certain ways, we’ve found it hard to know whether we’re getting a fully accurate view of circumstances on the ground in the places we’ve visited. This is to be expected and doesn’t diminish the other benefits of conducting site visits. But it does point to the importance of representative monitoring data and rigorous, independent evaluation in learning about whether the work we’re funding is succeeding in meeting its goals.

Potential U.S. policy focus areas

Throughout the post, “we” refers to GiveWell and Good Ventures, who work as partners on GiveWell Labs.

Previously, we laid out our basic framework and reasoning for selecting U.S. policy causes to focus on for GiveWell Labs. This post goes through the specific causes that we’re most likely to commit to (and are accordingly performing in-depth investigations of, with some preliminary grantmaking, at the moment).

A few preliminary notes:

  • This post does not offer the same sort of thoroughness and comprehensiveness that people might be accustomed to from our research on top charities. Part of this is because this post is more preliminary than our charity recommendations are, and much of the purpose is to elicit feedback and determine which questions most merit further investigation. (We are not yet committing to causes, and are aiming to do so near the end of the calendar year.) However, we also feel that as we move forward on GiveWell Labs, “accomplishing as much good as possible” and “thoroughly examining every question one might ask” will come into conflict, and the former is more important to us. We are aiming for transparency in the sense of making clear what sort of support we have for all of our major beliefs and statements; we are not aiming for comprehensiveness of investigation.
  • Some of our beliefs at this point come from intuitions that we’re unable to trace back to a particular source – intuitions that have come from the aggregate of many conversations as well as generally following and discussing policy-related topics. We try to make clear what our beliefs are based on, to the extent we’re able, and hope the ensuing discussion will highlight the areas where we have the most work to do in re-examining the bases for our statements.

We are trying to evaluate causes to “commit” to (as discussed previously), and “committing” could end up meaning many different sorts of things in terms of what sort of work we support. In a given cause, we could end up focusing on (a) supporting better research to determine optimal policy; (b) supporting information, education, and advocacy to push for particular policies; (c) working within an already-changing policy landscape and trying to affect the details of how policies change; (d) something else. We’ve tried to assess the importance, tractability, and crowdedness of causes with this broad potential mandate in mind, and to focus on causes that seem to be quite broadly important/tractable/uncrowded rather than simply presenting an opportunity for a specific narrow intervention.

As discussed previously, the causes we find most promising generally stand out on at least one of our three key criteria: tractability, importance, and crowdedness. As such, our discussion of causes is organized by criterion – we discuss which causes stand out on each dimension, followed by discussion of other causes that we find worth discussing for other reasons.

A few key resources that provide partial support for much of the reasoning in this post:

Windows of opportunity: outstanding tractability
As discussed previously, it can be very difficult to predict whether and when a policy area might become tractable (i.e., when it might become possible for advocacy infrastructure to play a major role in how policy develops in that area) in the long run. Paying too much attention to very short-run tractability (for example, what issue is in the news or being debated in Congress at the moment) seems inappropriate given the nature of what we’re trying to do: pick areas to commit to and build infrastructure in for several years.

With that said, we’ve come across a few causes that seem to present unusual “windows of opportunity,” in which something highly relevant in the political landscape seems to be changing in a way that could make the issue unusually prone to change for the next several years (and possibly beyond), and we could imagine our involvement helping to shape the specific way in which changes play out.

Perhaps the best example we’ve seen is the criminal justice policy space, which we’ve done a medium-depth writeup on. This space came up as promising early in our conversations with generalists, and was particularly emphasized by Steven Teles. There has long been a humanitarian argument (generally emphasized by people on the political left) for the importance of reducing unnecessary incarceration and the suffering associated with it; what seems to have changed relatively recently, however, is a combination of historically high incarceration rates, declining crime rates, and state budget difficulties – accompanied by a growing interest among the political right in reducing incarceration rates if it can be done without reducing public safety (e.g., Right on Crime). Between the excitement we saw about tractability and the concrete opportunities we saw to support promising-seeming approaches, we saw this cause as a good one for our first medium-depth investigation in the policy arena; having investigated further and made some grants, we believe there are many promising underfunded approaches, real opportunities to influence policy, and reasonably high humanitarian stakes. More at our writeup on this topic.

Other “window of opportunity” causes that have come up:

  • Public opinion on marijuana over the last ~15 years has shifted dramatically, leading to state-level changes in drug policy and seeming potential for more change. Good Ventures has done some funding and had a fair number of conversations in this space, with GiveWell providing support; in addition, there is some overlap between criminal justice and drug policy, particularly the research by Mark Kleiman that we are supporting. We don’t see this cause as particularly crowded, and we see work on designing good regulation (such as Prof. Kleiman’s research) as having a great deal of room for more funding. However, GiveWell’s perception at this moment (though open to revision) is that the net humanitarian benefits of marijuana legalization are unlikely to be as high as for potential reforms in the criminal justice space, and that the aspects of this space that don’t touch on either marijuana legalization or criminal justice policy do not necessarily have much “window of opportunity.”
  • Public opinion has also been shifting on the topic of same-sex marriage, though the people we’ve spoken with have often expressed the sentiment that we’re “in the endgame” at this point, and that the entrance of an additional major funder wouldn’t be likely to have much impact.
  • A couple of people have raised the possibility that surveillance – e.g., policy around what sorts of information U.S. security agencies collect – is becoming a more dynamic area, due to recent revelations (e.g., Edward Snowden’s leaks) and changes in what’s technologically possible. Our impression, based on our general perceptions of what is at stake and common attitudes toward this issue, is that this area is likely to be or become relatively crowded, and that we are unlikely to see opportunities with comparable humanitarian significance to what we’ve seen in the criminal justice space.

We have largely relied on impressions from our conversations with generalists in order to identify the most promising “window of opportunity” causes. Of these, our view is that criminal justice reform is the most promising, having equal or greater humanitarian significance and equal or lesser crowdedness compared to the others.

Ambitious longshots: outstanding importance
There are a couple of obstacles to identifying policy areas with outstanding importance:

  • In order to assess potential humanitarian gains, one must have some idea of what sort of policy change is possible. This seems very hard to assess, and we hope that gaining more intimate familiarity with specific policy areas will give us a better idea of how to do so.
  • The calculations we’ve done attempting to compare the significance of different policy areas are extremely non-robust, and arguably have little value-added on top of intuition. With that said, it does seem to us that one can reasonably distinguish between “contenders for the most important policy area,” “policy areas that have fairly small implications” and “somewhere in between.”

Another challenge here is that from what we’ve seen, the causes that seem like the strongest contenders for “most important” tend to have relatively poor, or at least highly ambiguous, scores on the other two criteria (more details below). We haven’t seen any such cause where we are (at this early stage) convinced of a clear opening for a philanthropist and an opportunity to make tangible progress.

With that said, we see some compelling reasons to get deeply involved with at least one “ambitious longshot” cause, even if the prospects for change seem doubtful and/or the space seems relatively crowded:

  • It’s possible that our assessments of the other two criteria are highly unreliable. It’s possible that the right attitude is: “Any cause has potential for change over the long run, and any cause has plenty of space for a new philanthropist to add value, if one is sufficiently committed. What’s most knowable is which policy areas have the highest stakes.” In particular, if we get deeply involved in a space that initially seems “crowded,” we may discover that there is more value to add than we would have guessed.
  • As discussed in a later section, we are thinking about the long-term goal of promoting a broader political platform. More broadly, we hope to see dramatically higher money moved and influence over time. With this in mind, it is probably worth attacking the question, “What causes would we encourage massively more people to support – with their giving and with their talents – if we could?” For that question, focusing on a relatively overlooked cause of extreme importance – rather than simply on causes that currently seem to present opportunities for major gains with relatively small investments – seems valuable.

What follows is the set of causes that we believe to have overwhelming humanitarian importance (in the sense that an imaginable policy change would create large amounts of economic value and/or affect large numbers of people significantly). They are listed in order of how promising we find them, taking into account other criteria (tractability, crowdedness). Note that the way we’re using “importance” here attempts, when feasible, to incorporate not just the size of the problem, but the likely impact of an improvement in policy if the improvement could be implemented. (In other words, a major problem may still fall short on “importance” if it seems unlikely that one could identify a change in legislation with large expected impact on the problem.) With that said, there are many cases in which we know very little about the details of possible policy fixes, and try to approximate “importance” based primarily on the size of the problem and very rough intuition about how policy change might affect it.

We have created a collection of back-of-the-envelope estimates on the likely impact of policy reform in different areas, which informs the comments below in general, though we do not place much confidence in the particular estimates.

Labor mobility

It appears to us that moving from a lower-income country to a higher-income country can bring about enormous increases in a person’s income (e.g., multiplying it several-fold), dwarfing the effect of any direct-aid intervention we’re aware of. As such, labor mobility seems to us to be an enormously high-stakes issue, whether based on our own back-of-the-envelope calculations for possible legislative changes, academic estimates that sufficient increases in immigration could create value on the order of 50% of world GDP, or just the observation that changes on a per-person-affected basis are impressive.

Additionally, it appears to us that there is relatively little attention paid to this cause in some sense: the humanitarian benefits of migration seem to receive little discussion and emphasis generally, we have not identified any other philanthropic funding focused on labor mobility as an anti-poverty issue, and we note that immediately prior to our involvement, Michael Clemens’s work on this issue at the Center for Global Development was in the relatively unusual position of not having specific private support (though it had been supported previously).

With that said, there is another sense in which this cause is quite “crowded”: U.S. immigration policy more broadly is an extremely salient and heavily contested issue, with significant philanthropic involvement as well as interest in allowing more migration from the business community. The debates taking place at the moment seem to center mostly around the treatment of undocumented immigrants, with labor mobility as a secondary issue. Thus, the question of how “crowded” this space is – and what a new funder might be able to contribute – remains very much an open question for us, and one that we are trying to address with deeper investigation and declared interest in funding.

There are a couple of other challenges with this area:

  • There is a high degree of controversy over this issue. We hope to conduct a thorough review of arguments and counterarguments, which we have done some work on but have not yet completed.
  • While we generally feel that “political tractability” is difficult to predict past the short term, immigration seems to be a particularly charged issue where the fundamental obstacles to change may be extremely strong.

Macroeconomic policy

Macroeconomic policy appears to be an area with extraordinarily high stakes, in that a small number of decisions can arguably have substantial effects on national (and global) growth and unemployment.

Our aim in this space would likely be focused on generating better information and new ideas, rather than coming down on one side or the other of a partisan battle.

The question of whether we ought to consider this space “crowded” is a difficult one.

  • Arguments to consider this space “crowded”:
  • Arguments to consider this space “uncrowded”:
    • We’re not aware of any major philanthropic funder that has made this area a top priority.
    • The early conversations we’ve had have given us some reason to think that there are certain types of research that are not supported by existing infrastructure.
    • A number of people we spoke to noted the outsize influence of the Fed in monetary policy and macroeconomic research, and argued that support for more independent research and thinking could be useful.
    • We’re not aware of work in this area that focuses on formulating workable legislation to improve on the status quo (e.g., working out the details of “automatic stabilizers” that could be budget-neutral over time, which could be an important criterion for winning support or acceptance from both the right and left). The new Hutchins Center may end up working in this area.
    • It generally seems to us that the importance of macroeconomic policy is underappreciated and rarely discussed outside of a few narrow circles (in particular, the academic field of macroeconomics and a particular set of bloggers and journalists).

We are currently conducting deeper investigation accompanied by readiness to provide funding. We expect to learn more about what gaps and opportunities exist.

Foreign aid and global poverty

We have had a number of conversations about the policy landscape around issues directly affecting the global poor, such as the U.S. foreign aid budget and allocation, trade policy related to the developing world, etc. (We have unfortunately not been able to publish notes from a number of these conversations; others are linked to from our shallow writeup on this topic.)

We see this general cluster of issues as having potentially overwhelming importance, because of the direct relevance to the global poor (whose numbers and degree of poverty both exceed those of the U.S. population).

Our impression is that there is a substantial amount of philanthropic involvement in this area, and a relatively strong infrastructure that analyzes and advocates for policies that benefit the global poor. This infrastructure includes The Center for Global Development (CGD), a think tank we perceive as highly intelligent and effective in developing new ideas; we have supported CGD and may increase our level of support over time, but we also note that CGD has expressed a lack of desire to expand much further. It also includes the ONE Campaign (with a budget of roughly $30 million/year, supported by the Gates Foundation and others) and a network of large aid organizations. It has been argued to us that this infrastructure has been highly successful in preventing cuts to foreign aid despite recent concern over budget balance, and we find this a strong argument. The Bill and Melinda Gates Foundation has made influencing policy in this space a clear priority.

As such, we aren’t sure how much can be accomplished by a new funder in this space, at least at the level we’re currently contemplating (in the $5-25 million per year range). A few possibilities we’ve considered:

  • One issue that we perceive as receiving relatively little attention within the anti-global-poverty community is labor mobility, discussed above.
  • We also have the impression that there is relatively little advocacy – in the U.S. policy arena (as opposed to the community around multilateral funders, etc.) – around allocating foreign aid for maximum humanitarian “bang for the buck.” This could mean, e.g., advocating for relatively more to be spent on proven cost-effective global health programs relative to more expensive and/or less proven programs; it could also mean advocating for more structural reforms, of the sort promoted by CGD (e.g., Cash on Delivery Aid). We see some reason to believe work along these lines could do more harm than good (by undermining the attempt to preserve/expand the total level of aid), and we believe that other funders in this space (particularly the Gates Foundation) recognize the importance of these issues, so we are not highly optimistic about pursuing this line of reasoning, but we may do so if capacity permits.

Because we perceive the infrastructure in this space as relatively strong and successful, we’ve considered providing funding and spending time in this area as a way of learning more about what a strong advocacy infrastructure looks like.

Improving democracy

We’ve been following the Hewlett Foundation’s evolving initiative on aiming to improve the general functioning of the U.S. democratic system, particularly with regard to the highly polarized current environment. We have reviewed an early report on this initiative (not public) and spoken with Daniel Stid and Hewlett Foundation President Larry Kramer about it.

It seems to us that federal politics are currently deeply dysfunctional, and we could imagine enormous gains (though it is hard to lay out the likely specifics of such gains) if we could help ameliorate this issue. However, “size of the problem” is only one part of our definition of “importance.” The other part – “likely impact of hoped-for legislative reforms” – is much less clear for us. It seems to us that past attempts at reforming the political system as a whole haven’t clearly done more good than harm (see, for example, points 1 and 2 at Wonkblog’s discussion of U.S. political dysfunction, which I see as a good concise summary of the major potential factors overall). Reviewing the fairly broad list of potential interventions laid out by Hewlett (in its not-yet-released document, and summarized to some degree in our conversation notes), we are ambivalent regarding what the likely impact of legislative reforms would be, assuming political victory.

“Crowdedness” is somewhat difficult to assess for this cause. The Hewlett Foundation seems likely to make it a real priority, and to try to interest other foundations in it too, which could dramatically increase the amount of philanthropic investment. It’s hard to say, at this point, to what degree this will happen and how much space (and what sort of space) will remain for us to potentially fill.

Overall, we are glad to see that the Hewlett Foundation is taking on what we believe is one of the world’s most pressing issues, and we plan to follow its work with interest. At this time, we see greater likelihood of getting heavily involved in (in the sense of “committing” to) the causes listed above, though that may change as we continue to follow Hewlett’s work.

Climate change

We have done a shallow-depth investigation of climate change, an area that gets a great deal of philanthropic attention compared to all of the above causes. The potential impact of climate change mitigation is enormous, but not (by our estimates, based on mainstream projections) clearly larger than that of other causes we’ve classified as “ambitious longshots.” We do see a case that climate change deserves special attention because of its potential as a global catastrophic risk: there is a risk that mainstream projections are badly off and that the consequences will be much worse than currently projected. We will discuss this aspect of climate change (and the interventions we feel are most appropriate to deal with this relatively low-probability, high-impact scenario) in an upcoming discussion of global catastrophic risks.

Tax policy

Tax policy, like macroeconomic policy, has theoretically huge economic stakes and a good deal of attention from intellectuals. We see it as having substantially more attention from funders and nonprofits, and (likely as a consequence) fewer gaps in the work done by intellectuals (particularly with regard to developing workable policy proposals). We also see less room for impact from new academic research on related matters, as the main bottleneck to improved policy seems to be politics (in particular, resistance from groups like Americans for Tax Reform to changes that would involve new taxes or reduced tax expenditures) rather than knowledge. We have done a shallow investigation of this area and will be writing it up in the future.

Green fields: outstanding “room for more philanthropy”
We’ve identified a small number of causes that seem to have at least moderate importance and potential tractability, while being extremely “empty” – very little infrastructure in place pushing for what we would see as positive policy change.

One such area is what Steven Teles calls “rent seeking.” The broad idea is that there are some industries in which government regulation has been captured in a manner that makes it unnecessarily difficult and expensive to provide a service, so the existing providers of this service benefit from inefficiently low levels of competition. Consequently, existing providers tend to push for preserving and expanding such regulation. A classic example would be that of taxis: an artificially restricted supply of taxi medallions makes it artificially difficult and expensive to become a taxi driver, and the existing medallion owners have an interest in continuing to artificially restrict the supply. This dynamic results in unnecessarily high taxi costs, low taxi supply, and fewer job options for people who would consider being taxi drivers. It’s been claimed that similar dynamics apply, to varying degrees, to a broad range of occupations, both lower-skilled and higher-skilled (such as doctors, dentists, lawyers, and accountants).

Prof. Teles believes that there is little in the way of concentrated advocacy groups to counteract “rent seeking” in occupational licensing (by arguing for less protective regulation and more permissiveness in who can e.g. drive taxis), and that even creating a small advocacy infrastructure could make a big difference in combating artificial supply restrictions. Most importantly, a small number of victories at the local and/or state level could (he argues) raise the general profile of these issues, create a model for people in other areas, and lead to “compounding” policy change at the state and local level. We expect that efforts focusing on higher-skilled occupations would have quite a different profile than efforts focusing on lower-skilled occupations, and we do not have a strong sense of which is likely to be more promising.

We have had an initial conversation with the Institute for Justice about this topic, and may look into it further.

Other causes in this category:

  • Zoning reform to enable more construction and urban density. It seems possible that there is a currently excessive level of regulation held up by those (property owners) who benefit from a restriction in supply of housing, business space, etc. While specific developers may advocate heavily on behalf of specific projects, it seems to us (from initial conversations) that there is very little advocacy infrastructure making the public-interest case for general increases in how much development is allowed.
  • Incentives for organ donation. GiveWell Senior Research Analyst Alexander Berger has an unusual degree of familiarity with this area. It appears to us that there is practically no work being done on finding, and promoting, ethical and safe ways to provide incentives for organ donation, something that could have large health benefits and save a significant amount of money for the health system.

Other causes of interest
We are interested in a few other causes that don’t fit into any of the above categories.

  • We’ve had some conversations about the idea of improving the general quality of policy analysis available to state-level governments, where there may be a type of void that doesn’t exist at the federal level. This is a very preliminary idea at the moment and we will likely be writing more about it.
  • We are investigating the treatment of animals in industrial agriculture at a medium-depth level, due largely to a particular interest on the part of one of our employees. This is a cause that preliminarily appears relatively “uncrowded,” and according to some moral frameworks could be seen as having enormous importance as well. We’re also intrigued by the possibility that Steve Teles raised of working more generally toward accountability of industrial agriculture companies on a broad array of issues; this could have implications for animal welfare, climate change, antibiotic resistance, farm subsidies, and potentially nutrition as well.
  • Intellectual property reform could present a rare combination of unusually high tractability (see notes from our conversation with the Electronic Frontier Foundation), unusually high uncrowdedness (see our writeup on software patent reform, though intellectual property reform need not confine itself to software), and reasonably high importance (though we’ve had a good deal of trouble estimating this last piece). There could also be connections with trade policy, as mentioned in our conversations with Steven Teles.

Some major issue areas that we are less likely to prioritize
There are other issue areas that we may investigate at some point, though we consider them less promising than the issues listed above.

  • U.S. education generally is a popular area among philanthropists, and the education policy space generally seems to be heavily influenced by the agendas of three major foundations: Gates, Broad and Walton (references to this in conversation notes here and here). At the levels of funding we’re currently contemplating, we have difficulty imagining that we could substantially contribute to or alter this agenda.
  • Health care policy is highly important, and there is arguably some degree of “window of opportunity” to affect the specifics of how the system changes in reaction to the recent passage of the Affordable Care Act. However, our loose impression is that (a) this is the major priority of one of the major U.S. foundations (Robert Wood Johnson) and that (b) more generally, this area seems highly crowded and we haven’t become aware of any likely promising angles that we could take on.
  • A very broad area of policy, with potentially very far-reaching repercussions, is the issue of inequality and the question of the extent to which (and manner in which) U.S. governments, at the federal, state and local level, should redistribute wealth. Our impression is that this cause gets far more attention from philanthropists, nonprofits and intellectuals than questions about helping the global poor, who we feel are more numerous and benefit more from redistribution relative to the U.S. poor. We have not come across any aspects of this broad space that seem appealing by the criteria we’ve laid out.
  • Our impression is that environmental issues (aside from climate change, discussed in an earlier section) also receive a great deal of attention from philanthropy, and are not particularly likely to be of comparable humanitarian significance.
  • Trade policy is another major policy area. The main potential benefits we see to working on trade policy pertain to the impact on the developing world, so we’re inclined to classify it with the set of developing-world-oriented policy areas discussed above. Generally, it seems to us (based on loose impressions and no particular source) that there is reasonably strong infrastructure in place representing most relevant perspectives in trade policy, though we have not done a shallow investigation and may in the future.
  • Defense policy seems clearly important, and we aren’t aware of much in the way of advocacy infrastructure pushing to reduce unnecessary military expenditures and unnecessary military engagements. We plan to investigate this area at some point, but intuitively feel that philanthropy is unlikely to have much impact on this front and that the causes discussed earlier are more promising.

Other causes we may focus on in the future, but are not including in the categories above
Helping to strengthen a broad political platform. It can be argued that the strongest impact of philanthropic engagement with policy has been long-term promotion and development of a movement. (For instance, Steve Teles has notably made this argument with respect to the conservative legal movement.) Rather than picking individual policy issues in which to invest, a philanthropist with interests in a number of causes and a clear set of values might achieve more by promoting their general values, along with the people and organizations that share them (since much of the long-term benefit of investment in a given area may come in the form of empowering the particular individuals who receive support, who may go on to other things). However, we do not feel that our values are broadly shared by any existing, easily located major political movements.

In particular, we generally favor policy focused on benefiting low-income and otherwise disadvantaged people, even when it involves active government – an attitude often associated with the U.S. political “left” – but we place particularly high value on the developing world. Additionally, we place high emphasis on the value of economic growth and innovation (which we feel are likely to benefit future people). In the long term, we could imagine exploring the possibility of helping to promote a political platform consistent with these values and trying to find, connect, and support people and organizations supporting this platform. We’re aware that people who share these values will have many disagreements over policy, but feel that there could nonetheless be major benefits to laying out, and promoting, a platform that emphasizes both global humanitarianism and economic development.

We think of this as a long-term possibility with highly uncertain value. We are doing some very preliminary work now to explore the idea, but feel that more direct engagement with specific issues will make us better-informed, better-connected, and overall better-positioned to explore such a possibility further down the line.

Policy related to global catastrophic risks. We are treating “global catastrophic risks” as a separate category of work at the moment, and we will be writing more later this year about our likely priorities in that category. So far, we haven’t identified clear cases in which a particular policy change seems highly important for one of what we consider the most important global catastrophic risks (other than climate change, discussed above), though this may change. We’re looking to build our general capacity for policy-oriented philanthropy by working on other causes, and will hopefully be well-positioned to do relevant policy-oriented work if and when it becomes important to do so.

Policy related to scientific research. We see policy around scientific research (for example, the budget, mandate and policies of the National Institutes of Health) as potentially extremely important, but at this time we don’t feel that we have strong enough scientific advisory capacity to have a good grasp on the relevant issues. We are building our scientific advisory capacity via separate projects, and will be writing about this more in the future. Again, we will hopefully be well-positioned to do relevant policy work if and when it becomes important to do so.

Other categories. This post has focused exclusively on our medium-term plans for U.S. policy. We continue to explore a broad variety of other sorts of philanthropic work, which we will be writing about in the future.

Bottom line and our plans from here
We’ve spent a good deal of time investigating potential focus areas in U.S. policy, and we have a very large number of questions remaining. There are many causes that we have much to learn about on many dimensions, including both questions like “How should policy change and why?” and questions like “How can a philanthropist increase the odds of a particular policy change?” One of the aims of this post is to stimulate discussion and help determine which questions are most important to focus on. Our hope is to finalize “commitments” to causes by the end of this calendar year.

Our current working agenda is as follows:

Deep investigations of cause areas: looking actively for funding opportunities and being highly open to funding them.

  • We are exploring both labor mobility and macroeconomic policy at this level.
  • We have done a fair amount of work on criminal justice reform, and are pausing our investigation of it for the moment.
  • In addition to finding funding opportunities, we are also interested in (a) doing thorough reviews of academic literature to assess the best arguments on each side of the relevant policy debates; (b) trying to substantially refine our “importance” estimates after gaining more context. Both (a) and (b) could be substantial projects, and we are likely to do them only for causes that we do deep investigations of and seriously consider committing to.
  • Depending on our capacity and on the results of lower-depth investigations, we may do this sort of “deep investigation” of other causes as well.

Medium-depth investigations of cause areas: having 5+ conversations per cause area to get a good sense of the overall landscape.

  • We are hoping to explore the “rent seeking” and “zoning” causes discussed above (under “Green fields”) at this level.
  • We are also conducting a number of conversations on factory farming, currently with a focus on animal welfare implications.
  • There are several other cases in which we have done a medium level of investigation, including foreign aid and organ donation (in the latter case, we feel we have a strong understanding of the issue largely through Alexander Berger’s personal background, as mentioned above).
  • We are likely to do a future investigation on improving the general quality of policy analysis available to state-level governments, which we will be writing about more in the future.
  • Other causes we may investigate at this level include tax policy and intellectual property reform.

Shallow-level investigations of cause areas: having a few conversations to get a basic picture of an area. We hope to look into some of the causes we have done little investigation of, such as health care policy. However, this area is a lower priority than the above, and we aren’t sure whether we’ll get to it this year (whereas we do expect to make significant progress on all of the above points).

Hiring. Having a decent sense of our likely interests, we are working on hiring U.S.-policy-specific staff, so that when we do make commitments, we’ll have the staff available to execute on them. We have a major hire starting in June whom we will be writing more about in the future.

Limited time and capacity. At the moment, we are executing on the above agenda; if and when we complete currently-in-progress items and have more capacity, we may promote some causes from the “medium” to the “deep” level of investigation or (less likely) from “shallow” to “medium.” However, around the end of the calendar year, we expect to use whatever information and staff we have at that time to make commitments.

A journalist visits GiveDirectly villages in Kenya

In February, Jacob Kushner, a journalist living in Kenya, contacted us. We have long been interested in seeing more substantive coverage of philanthropy, so we were excited to talk to him.

As a pilot project, Mr. Kushner decided to visit villages in which GiveDirectly had distributed some of its earliest cash transfers. We spoke with Mr. Kushner several times to offer thoughts and feedback, but we encouraged him to write about whatever he found (positive or negative about GiveDirectly). We also put him in touch with GiveDirectly to confirm that staff there were amenable to this project.

Mr. Kushner completed his trip in April, and his full article follows. He also shared his full interview notes with us, which we’ve posted here.

We’ve summarized what we took away from his article here. Carolina Toth, Manager, People and Partnerships at GiveDirectly, responds here.

When giving out cash to the poor, what happens when some are left behind?
A closer look at whether GiveDirectly’s cash transfers stoke community tension in Western Kenya

By Jacob Kushner

For several years now, the charity GiveDirectly has experimented with different ways of deciding who among Western Kenya’s rural poor should receive cash transfers. It’s an important consideration, because $1,000 means a lot to the families that receive it—and it can mean a lot of disappointment to the families that don’t. Last month I traveled to Western Kenya to speak with both lots, and I found that the discrepancy did not go unnoticed in their communities.

Over the past three years, GiveDirectly has run five different transfer programs in Siaya, with different metrics for selecting recipients. I interviewed recipients from three of those cohorts:

  • The Google Cohort (approximately 850 ‘thatch-roof only’ recipients whose transfers were completed in October 2013)
  • The 200k Cohort (approximately 200 ‘thatch-roof only’ recipients whose transfers were completed in January 2013)
  • The 2M cohort (approximately 2,000 recipients divided into ‘thatch-roof only’ villages and ‘saturation’ villages (in which nearly everyone is eligible) who have received one major transfer and will receive the second and final one in July 2014).

In a follow-up to a randomized controlled trial, GiveDirectly asked residents if they’d heard any complaints about GiveDirectly in their community. Sixty-four percent of respondents in Siaya County answered “yes,” as did 48 percent of those in the “Google” cohort (in Rarieda it was 28 percent).

Fewer than 6 percent of respondents in all four groups said shouting or angry arguments had ensued because of the transfers, and fewer than 4 percent said they’d experienced crime, theft or violence or felt threatened as a result. Virtually no one said they’d argued with family members over how to spend the money, and no more than 7 percent in any group said their village elder had approached them asking for money.

Carolina Toth, Kenya Field Director for GiveDirectly, explained the results of a series of informal community group meetings in which GiveDirectly led residents in a discussion of who should be eligible for transfers.

Sixty-two percent of respondents in thatch-only villages said they’d heard complaints relating to ineligible households, compared with 46 percent in saturation villages. Thirty percent of those in thatch-only villages said they’d heard complaints about different criteria being used across different villages, compared with only 4 percent in saturation villages.

GiveDirectly concluded that the strongest takeaway from the discussions is that poorer ‘thatched’ households are more deserving but also that certain households that have mabati or permanent houses are deserving of the transfers as well. When asked about their own villages, residents preferred the saturation method. When asked about other villages, they preferred thatch-only. No one thought it would be “bad” if cash were given to some wealthier households.

Because recipients in saturation villages have yet to receive their second transfer (due in July), it’s too early to draw definite conclusions. But this and other previous reports leave several questions unanswered:

To the extent that community tension may result in the wake of cash disbursements, how does that tension actually unfold? Who are the parties and what are some examples? Most importantly, what do non-recipients in those communities think about the fairness of the selection process? Do they feel stigmatized for not having received the money, and how does their perception of whether animosity resulted from the cash transfers compare with that of the recipients themselves?

In April I made a reporting trip to Siaya County to interview recipients and non-recipients in the communities where GiveDirectly has made those disbursements. Over three days I interviewed 15 people, asking whether they were happy with GiveDirectly’s selection process and whether any tension arose in their communities as a result of it.

I interviewed some recipients from each of the three cohorts and also interviewed recipients in both the ‘saturation’ and ‘thatch’ divisions of the 2M cohort. I interviewed four non-recipients, at least one in each of the three cohorts.

My interviews seemed to reflect many of the conclusions of the RCT and subsequent follow-up interviews and meetings. No one reported intra-family arguments about how to spend the money or being coerced by a spouse or family member to spend it in a particular way. Only one recipient said he’d originally disagreed with his spouse but that they eventually came to a mutual agreement. No one reported theft or that their own money had gone to waste in any way.

But 12 of the 15 respondents did indicate that some amount of tension had arisen in their community as a result of some people having received money while others had not. By far the most tangible conflict mentioned to me occurred in the 200k cohort in the village of Koga.

There, the village elder did not receive a cash transfer. He was, however, consulted by GiveDirectly staff to assist in a tour of the boundaries of the village so GiveDirectly could identify eligible households, for which he was given a small token payment as compensation for his time. But in the words of one recipient there, “there was a scandal.” The elder “had conspired (to enlist) some households that were outside the area and had better houses, with the understanding that they would give him some money.”

GiveDirectly staff say the elder seems to have directed residents who lived in tin-roof houses to “squat” in vacant thatch-roofed houses in order to receive the money. Subsequently, the assistant chief, with the support of the other village council members, dismissed the elder from his position.

When I spoke with the elder, he confirmed that he had misrepresented certain households in the village so they would be enrolled in the program. He justified that decision saying, “I was the village elder and I was working for the (entire) community.”

He said tension resulted when the initial disbursements were made and some families, including his own, were left out.

“I felt degraded by my community members. They were laughing at me that I didn’t receive any help even though I was the leader of the community. I was so humiliated.” He said the incident led him to ‘resign’ after more than 35 years of serving as an elder in Koga (he is 62 years old).

The second most tangible takeaway was the resentment and frustration expressed by the four non-recipients I interviewed. One woman in a “saturation” village was visibly angry as she described how she was not selected because the living room in her tin-roof house is cemented, even though her other rooms are not. A man in Koga said he was cheated out of a transfer:

“The time the GiveDirectly team was working in the village, they came to my home but at that time I was grazing cattle outside the compound and I saw them in my sister-in-law’s house. I was curious. But due to how relations within households go sour, my sister told the GiveDirectly team that I had left and was never around.”

Despite an appeal he said he made to GiveDirectly field staff, this man did not receive a transfer. He says his economic situation is similar to that of the other recipients:

“I live in a house like this—(a) grass thatch house. I have children in school and I struggle to pay their fees. Some of my children for lack of funds have to be supported by my relatives in other areas, in Nairobi. I have only two cattle.”

GiveDirectly staff pointed out that “targeting” is a universal problem in development aid. Other methods used to select recipients (such as letting communities vote on who should receive transfers, or requiring people to go to some lengths to prove they are indeed poor) have major drawbacks: cronyism, and excessive bureaucracy and burdens, respectively. As an alternative, GiveDirectly employs another common method that uses easy-to-observe characteristics such as roof style to judge how wealthy or poor a household is. According to GiveDirectly’s own research, less than 5 percent of people in the 2M cohort villages complained, legitimately or otherwise, of being unfairly excluded. (In comparison, a recent study of the Kenya Hunger Safety Net Program found an exclusion error rate of 46 percent.)

The man in Koga who says he was unfairly excluded also expressed sympathy for the Koga village elder. “I would not be happy with what has happened to him, because the feeling he has now at losing his job is the same feeling I have at not getting the money. I feel bad for him because I am also going through some pain.”

The man also aired some critiques as to how some people in the community spent their money.

“I saw some beneficiaries, the way they misbehaved when they got the money, and that made me feel it is important that recipients receive training on how to spend it. For example there are people who wasted it on drinking sprees, and others bought items that they didn’t understand how they would maintain. For example, one bought a motorbike and used it for a few months, but now it is unused and has not really helped him.”

Indeed, several interviewees mentioned the need for training to accompany the transfer process. GiveDirectly currently does not provide training or advise recipients as to how they should spend their money. GiveDirectly does, however, provide a brochure that lists different possible categories of expenditure such as home construction, business, and farming. GiveDirectly is considering experiments in which brochures also list the average returns that previous beneficiaries earned on each category of investment.

After completing the interviews, I asked Carolina Toth, the GiveDirectly field director, what she made of it all. I asked Toth what she thought about the village elder scandal in Koga: that a man who had served as elder for 35 years lost that position not because he violated a community custom, but simply because he broke a rule imposed by GiveDirectly.

“The village elder more often than not is one of the richer members of the community,” Toth said. As to his “previous feelings of entitlement to benefit from whatever is happening … I don’t think that’s an expectation we want to uphold.”

Toth and I also discussed the consequences for individuals who are excluded in a community where most residents receive the cash.

“It’s definitely a psychological event in their lives,” Toth said. “But we know from the (randomized controlled trial) that there are huge spillover effects to the people who didn’t receive.”

When I asked Toth about the man who says he missed out on the transfer because his sister-in-law misinformed the GiveDirectly staff that he was not living in the village, Toth said it’s certainly true that some people get left out by mistake. But she said such cases are rare. As to the woman with the cemented living room who didn’t receive cash even though the rest of her home is not yet cemented, Toth said the GiveDirectly field staff can only make decisions based upon what they see—and that the distinction between a cemented house and a non-cemented house is not always entirely clear under such circumstances.

The vast majority of people who aren’t selected, said Toth, are skipped because they are of marginally higher socioeconomic standing, so the money would be less useful to them.

“What is the value of $250 given to a family that’s richer? Wouldn’t that be more valuable in the hands of people who are really poor?” Toth asked. “We have a mission of giving to the extreme poor, so by excluding some people who are not in the extreme poor, you are able to reach more extreme poor.”

Ultimately, the question any cash transfer implementer must decide is, “Is the possibility that community tension may result from a non-universal disbursement so great or concerning that transfers should be made to all residents in a village, despite the opportunity cost that fewer of the even poorer people in other villages will receive any cash?”

Thus far GiveDirectly has answered that question in the negative. With certain exceptions (such as allowing communities to nominate a pre-determined number of otherwise unqualified people for the disbursements) and with increased nuance (by considering more advanced criteria than simply thatch versus tin roofs and indoor plastering), GiveDirectly intends to continue excluding those residents who do not qualify as the poorest of the poor.

Jacob Kushner is a journalist based in Nairobi. He reports on foreign aid and investment in Africa, human rights and the extractives sector.